developer-tools · May 22, 2026 · 7 min read

How to Scrape Apple App Store Data in 2026

A practical guide to pulling iOS app metadata, reviews, ratings, top charts and ASO keywords from Apple's public endpoints — without a headless browser or login.

Apple never shipped a real public App Store API. The iTunes Search API exists, but it’s intentionally thin — no reviews, no top charts, no rating histograms, no privacy labels, and a rate limit that punishes anyone who tries to use it for real ASO work. So the App Store data ecosystem is built on a patchwork of undocumented JSON endpoints that the App Store web client and the iOS app itself call. This guide walks through what those endpoints actually expose, how to read them, and where the rough edges are if you want production-grade iOS app intelligence in 2026.

What’s worth extracting

The App Store surfaces far more structured data than the official Search API admits. Once you’re hitting the right endpoints, you can pull:

App details — display name, subtitle, full localized descriptions, bundle ID, numeric track ID, current version, release notes, file size, minimum OS, supported devices, content rating.
Pricing — current price, currency, and in-app-purchase tiers, per storefront (this varies a lot by country).
Ratings — average rating and the full rating histogram (how many 5-star vs 1-star), which the Search API hides entirely.
Reviews — review title, body, star rating, author, version reviewed, and timestamp, paginated per storefront.
Top charts — free, paid and grossing charts by category and device (iPhone vs iPad).
Search & autocomplete — ranked search results plus the autocomplete suggestion list, which is the raw material for keyword research.
Similar apps & developer catalogs — “you might also like” plus a developer’s full published app list.
Privacy labels — the App Privacy disclosures (data linked to you, data used to track you, etc.).
Version history — past versions with release notes and dates.

For ASO work, the high-value combination is autocomplete suggestions plus search rankings plus the rating histogram. For competitive intelligence, it’s app details plus version history plus reviews over time.

How the data is exposed

This is not a “fight the bot wall” story like Booking.com or LinkedIn. Apple’s app catalog data lives behind public HTTP endpoints that return JSON, and the main obstacles are rate limiting and undocumented response shapes, not TLS fingerprinting. Three layers are in play:

iTunes Search/Lookup API (itunes.apple.com/search, itunes.apple.com/lookup) — official, documented, but limited to roughly 20 requests/minute per IP and missing reviews/charts/histograms.
The amp-api / web App Store endpoints (amp-api.apps.apple.com) — the JSON the store website consumes. Richer, but requires a bearer token that the web page mints, and the response schema is deeply nested and changes without notice.
The RSS-style endpoints (itunes.apple.com/<country>/rss/...): used for some chart and review feeds, and they return JSON or XML depending on the path.

The practical headaches:

Rate limits per IP. Hammer the Search API and you get throttled, then temporarily blocked. Spreading requests across a proxy pool and pacing them is the entire game.
Storefront sprawl. There are 115+ country storefronts, each with its own pricing, localization and chart rankings. A US lookup tells you nothing about Germany’s rankings.
Token freshness. The richer amp-api endpoints need a short-lived authorization token; managing its lifecycle is fiddly.
Schema depth. The raw responses are deeply nested relationship graphs. Flattening them into export-ready rows is most of the engineering work.

Because it’s pure HTTP with no browser, throughput is high and cold starts are sub-3 seconds — the cost is in the plumbing, not in headless Chrome.
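
The pacing described above can be sketched as a small minimum-interval limiter. This is only an illustration: the `Pacer` class and its parameters are hypothetical names, the 20-per-minute figure is the rough limit quoted earlier, and a real job would likely keep one pacer per outbound IP or proxy.

```python
import time


class Pacer:
    """Spaces calls so that at most `max_per_minute` start per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock          # injectable so the pacer can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None    # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._clock()
        self._next_allowed = now + self.min_interval
        return delay


# Usage sketch: call pacer.wait() immediately before each request.
# pacer = Pacer(max_per_minute=20)
# for url in urls:
#     pacer.wait()
#     ...  # issue the request here
```

Spacing requests evenly, rather than sending a burst of 20 and then sleeping, keeps each IP well clear of the throttle threshold.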

Endpoint structure

A few request shapes worth knowing:

# Lookup by numeric track ID (official)
https://itunes.apple.com/lookup?id=389801252&country=us

# Search (official)
https://itunes.apple.com/search?term=meditation&country=us&entity=software&limit=50

# Autocomplete suggestions (undocumented)
https://search.itunes.apple.com/WebObjects/MZSearchHints.woa/wa/hints?q=medit

# Top charts (RSS-style)
https://itunes.apple.com/us/rss/topfreeapplications/genre=6017/limit=100/json

Genre IDs (like 6017 for Education) and the device split between iPhone and iPad charts are the kind of undocumented constants you only learn by reverse-engineering the web client.
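
As a rough sketch, the request shapes above can be built with a few small helpers. The function names are made up for illustration. `fetch_json` assumes the endpoint answers a plain GET with a JSON body, which may not hold for every path or storefront.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def lookup_url(track_id, country="us"):
    """Official Lookup endpoint for one numeric track ID."""
    return "https://itunes.apple.com/lookup?" + urlencode(
        {"id": track_id, "country": country}
    )


def search_url(term, country="us", limit=50):
    """Official Search endpoint, restricted to apps (entity=software)."""
    params = {"term": term, "country": country, "entity": "software", "limit": limit}
    return "https://itunes.apple.com/search?" + urlencode(params)


def top_free_url(country, genre_id, limit=100):
    """RSS-style top free chart for one storefront and genre."""
    return (
        f"https://itunes.apple.com/{country}/rss/topfreeapplications/"
        f"genre={genre_id}/limit={limit}/json"
    )


def fetch_json(url, timeout=10):
    """Plain GET that parses the body as JSON (no retries, no pacing)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage sketch (performs a network call, so it is left commented out):
# data = fetch_json(lookup_url(389801252, "us"))
```

Using `urlencode` rather than string concatenation keeps search terms with spaces or non-ASCII characters correctly escaped.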

▶ Run the Apple App Store API — 10 endpoints in one actor: app details, search, reviews, top charts, similar apps, autocomplete, rating histograms, privacy labels and version history. Pure HTTP, sub-3s cold start, 115+ storefronts. Pay per result.

Build it yourself vs. use a managed actor

If you only need a single app’s details once, the official Lookup API in two lines of code is fine. The moment you want reviews, charts, autocomplete or multi-storefront coverage, the build cost climbs:

Building from scratch — reverse-engineer the amp-api token flow, learn the genre-ID constants, handle per-storefront pacing, flatten ten different nested response shapes, and re-fix it every time Apple tweaks a payload. That’s a week minimum, plus ongoing babysitting.
Using a managed actor — pick an endpoint, pass IDs or search terms, get flat rows. The token handling, rate-limit pacing and schema flattening are already solved.

For ASO and app-intelligence teams the deciding factor is usually the ten-endpoints-in-one design: you don’t want to maintain ten separate scrapers when one actor covers details, reviews, charts and keywords.

Schema design for downstream use

For app intelligence, normalize into a flat per-app row and keep reviews (and, at scale, the histogram) as separate tables. A clean app record, with illustrative values and the histogram shown inline for readability:

{
  "track_id": 389801252,
  "bundle_id": "com.tinyspeck.chatlyio",
  "name": "Slack",
  "developer": "Slack Technologies, Inc.",
  "developer_id": 618783545,
  "category": "Business",
  "price": 0,
  "currency": "USD",
  "storefront": "us",
  "version": "25.5.10",
  "average_rating": 4.7,
  "rating_count": 412883,
  "rating_histogram": { "5": 360120, "4": 28110, "3": 9001, "2": 4500, "1": 11152 },
  "min_os": "16.0",
  "content_rating": "4+",
  "release_notes": "Bug fixes and performance improvements.",
  "scraped_at": "2026-05-22T09:00:00Z"
}
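
A minimal sketch of producing a row like this from one entry in a Lookup response's `results` array follows. The camelCase source keys (`trackId`, `artistName` and so on) are assumed from what the Lookup API has typically returned, so verify them against a live response. The histogram is left empty because, as noted earlier, the official API does not expose it. `flatten_lookup_result` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def flatten_lookup_result(result, storefront, scraped_at=None):
    """Map one Lookup result dict onto the flat app row used in this guide."""
    if scraped_at is None:
        scraped_at = datetime.now(timezone.utc)
    stamp = scraped_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "track_id": result.get("trackId"),
        "bundle_id": result.get("bundleId"),
        "name": result.get("trackName"),
        "developer": result.get("artistName"),
        "developer_id": result.get("artistId"),
        "category": result.get("primaryGenreName"),
        "price": result.get("price"),
        "currency": result.get("currency"),
        "storefront": storefront,           # not in the payload, so supplied by the caller
        "version": result.get("version"),
        "average_rating": result.get("averageUserRating"),
        "rating_count": result.get("userRatingCount"),
        "rating_histogram": None,           # needs a different endpoint
        "min_os": result.get("minimumOsVersion"),
        "content_rating": result.get("contentAdvisoryRating"),
        "release_notes": result.get("releaseNotes"),
        "scraped_at": stamp,
    }
```

Using `.get()` throughout means a missing field becomes `None` instead of raising, which matters when Apple drops or renames a key without notice.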

Schema choices worth making early:

Always store storefront on every row. The same app has different rankings, prices and reviews in each of 115+ countries — without the storefront key your data is meaningless.
Keep track_id as the join key, not the name. Display names are localized and change.
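Store the full histogram and the rating count, not just the average. A 4.5 average from 50 ratings and a 4.5 from 500K ratings are very different signals, and the distribution shows whether that average comes from broad agreement or from a split between 5-star and 1-star ratings.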
Snapshot rankings with a timestamp. Chart positions and autocomplete order move daily — a ranking without a date is noise.
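
To show why the stored histogram is worth more than the average alone, here is a small sketch that derives the count, mean and per-star share from it. `histogram_stats` is a hypothetical helper, and the input is the histogram from the example record above.

```python
def histogram_stats(hist):
    """Derive total count, mean rating and per-star share from a histogram.

    `hist` maps star values ("1".."5", as strings or ints) to rating counts.
    """
    total = sum(hist.values())
    if total == 0:
        return {"count": 0, "average": None, "share": {}}
    weighted = sum(int(stars) * n for stars, n in hist.items())
    return {
        "count": total,
        "average": weighted / total,
        "share": {stars: n / total for stars, n in hist.items()},
    }


stats = histogram_stats(
    {"5": 360120, "4": 28110, "3": 9001, "2": 4500, "1": 11152}
)
# stats["count"] matches the record's rating_count (412883);
# stats["share"]["1"] is the fraction of 1-star ratings.
```

Tracking the 1-star share across snapshots often surfaces a bad release sooner than the average, which moves slowly once the rating count is large.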

Typical use cases

What people actually do with this data:

ASO platforms — keyword research from autocomplete, track search-rank movements over time, alert on competitor metadata changes (new keywords stuffed into the subtitle, etc.).
App-intelligence dashboards — bulk-collect metadata, pricing, ratings and screenshots across countries to benchmark a portfolio.
Review analytics — aggregate reviews for sentiment analysis, detect rating spikes after a release, monitor user complaints.
Market research — benchmark categories and top charts, compare regional performance via rating histograms.
Privacy & compliance — audit App Privacy Label disclosures for vendor risk assessments.
AI agents & RAG — feed curated app lists and review content into retrieval pipelines so an LLM can answer “what’s the best budgeting app on iOS” with grounded data.

Cost math

At pay-per-event pricing — a tiny per-start fee plus a fraction of a cent per result — the economics favor volume. A daily ASO job tracking 200 competitor apps with rating histograms and top-10 reviews each runs a few thousand results per day. Over a month that’s tens of thousands of rows for low single digits in dollars, including the compute.
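
The arithmetic behind that estimate can be sketched as follows. Two assumptions are made here that the article does not state. First, each tracked app is counted as 12 results per day (one details row, one histogram row and ten reviews). Second, both prices are placeholders chosen only to make the sum concrete. They are not the actor's real rates.

```python
def monthly_results(apps, results_per_app, days=30):
    """Total result rows produced by one daily job over a month."""
    return apps * results_per_app * days


def monthly_cost(results, runs, per_result_usd, per_start_usd):
    """Pay-per-event cost: per-result charges plus one start fee per run."""
    return results * per_result_usd + runs * per_start_usd


rows = monthly_results(apps=200, results_per_app=12)  # 2,400 rows per day
cost = monthly_cost(
    rows,
    runs=30,
    per_result_usd=0.00005,  # placeholder price, not a real rate
    per_start_usd=0.001,     # placeholder price, not a real rate
)
print(f"{rows} rows/month, about ${cost:.2f}")
```

Swap in the published per-result and per-start prices to get a real figure. The structure of the calculation stays the same.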

Compared to a paid ASO SaaS at $50–500/month with seat limits and export caps, owning the raw feed is cheap. Compared to building it yourself, you’re skipping the amp-api token reverse-engineering and the per-storefront pacing work entirely.

Common pitfalls

Search API rate limits bite fast. The official endpoint throttles around 20 req/min per IP. Without proxy rotation and pacing, a bulk job dies early.
One storefront is not “the App Store.” Charts and prices differ widely between storefronts such as US, DE and JP. Always loop over the storefronts you care about.
Review pagination is shallow. Apple caps how deep you can page reviews per storefront; for deep review history you combine storefronts and refresh over time.
Genre IDs are undocumented and occasionally shift. Hardcoding them works until Apple reshuffles a category.
Privacy labels lag. The App Privacy section updates on app submission, not continuously; treat it as a point-in-time disclosure.
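
One way to guard against the storefront pitfall is to stamp every parsed chart row with its storefront and snapshot time at parse time. The sketch below assumes the RSS-style JSON nests entries under `feed.entry`, with the app ID in `id.attributes["im:id"]` and the name in `im:name.label`. That shape is undocumented and should be checked against a live response. `chart_rows` is a hypothetical helper.

```python
def chart_rows(payload, storefront, chart, scraped_at):
    """Turn one chart payload into ranked rows tagged with storefront and time."""
    entries = payload.get("feed", {}).get("entry", [])
    if isinstance(entries, dict):
        # Defensive: a one-item feed may arrive as an object instead of a list.
        entries = [entries]
    rows = []
    for rank, entry in enumerate(entries, start=1):
        attrs = entry.get("id", {}).get("attributes", {})
        rows.append(
            {
                "storefront": storefront,
                "chart": chart,
                "rank": rank,                 # position in the feed, 1-based
                "track_id": attrs.get("im:id"),
                "name": entry.get("im:name", {}).get("label"),
                "scraped_at": scraped_at,
            }
        )
    return rows


# Usage sketch: `payloads` would map each storefront code to its fetched JSON.
# all_rows = []
# for storefront, payload in payloads.items():
#     all_rows += chart_rows(payload, storefront, "topfreeapplications", snapshot_time)
```

Because rank is derived from feed order, a truncated or reordered response changes rankings silently, so it is worth keeping the raw payload alongside the parsed rows.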

Wrapping up

The Apple App Store has no honest public API, but its web and iOS clients call a rich set of JSON endpoints you can read directly. If you need one app’s details once, the official Lookup call is enough. If you need reviews, charts, autocomplete keywords, rating histograms and privacy labels across 115+ storefronts — and you’d rather not maintain the token flow and per-storefront pacing yourself — a managed actor that bundles all ten endpoints is the faster path to a clean feed.

▶ Open the Apple App Store API on Apify — apps, reviews, ratings, charts and ASO keywords from one actor. No browser, no login. Pay per result, start on Apify’s free monthly credit.