Shopify Store Data Without the Admin API: Product and Merchant Intel
Extract Shopify product and merchant data without the Admin API using the public /products.json and /collections.json endpoints, with a clean schema.
If you have tried to read another company’s Shopify catalog, you already know the Admin API is a dead end. It requires a token issued by that store, which means you can only ever query stores you own. For competitive research, merchant lead-gen, or price monitoring, that is the wrong shape entirely. The good news: every Shopify store quietly exposes its catalog as public JSON. This guide covers how to pull Shopify store data without the Admin API — the public /products.json and /collections.json surface, what it gives you, where it stops, and a clean schema to land it in.
Why the Admin API cannot do this
Shopify’s Admin API is excellent and entirely beside the point for outside research:
- It authenticates against a single store’s access token. You install an app on a store, the merchant grants scopes, and you get a token bound to that shop. There is no token that lets you read a competitor’s catalog.
- It is built for managing a store, not observing the market. Orders, customers, inventory, fulfillment — all gated behind ownership and scopes, as they should be.
- The Storefront API is also store-scoped. It is public-facing but still keyed per store and intended for that store’s own headless front end.
So “get all products for someone else’s shop” has no official answer through Shopify’s APIs. But Shopify, by design, serves the catalog publicly to power storefronts — and that public surface is what you read when you need Shopify store data without the Admin API.
The endpoints every Shopify store exposes
Two URLs work on essentially every store on the platform, no auth, no key:
https://<store-domain>/products.json
https://<store-domain>/collections.json
/products.json returns the catalog as structured JSON: each product with its title, handle, vendor, product type, tags, created/updated timestamps, the full variants array (price, SKU, options, availability), the images array, and options (size, color, and so on). /collections.json returns the store’s collections, and /collections/<handle>/products.json returns the products within a specific collection.
This is not a scraping trick that works by accident: the storefront itself serves these endpoints, which is why the response format has stayed stable.
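As a minimal sketch, the three URL shapes above can be built from a messy list of domains like this. The helper names (normalize_domain, products_url, collections_url, collection_products_url) are hypothetical and not part of any Shopify library; only the paths come from the endpoints described above.

```python
from urllib.parse import quote, urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce 'https://Shop.example.com/some/path' or 'shop.example.com' to a bare host."""
    raw = raw.strip()
    # urlsplit only recognizes the host when a scheme is present, so add one if missing.
    if "://" not in raw:
        raw = "https://" + raw
    return urlsplit(raw).netloc.lower()


def products_url(domain: str) -> str:
    """URL of the store-wide public catalog."""
    return f"https://{normalize_domain(domain)}/products.json"


def collections_url(domain: str) -> str:
    """URL of the store's public collection list."""
    return f"https://{normalize_domain(domain)}/collections.json"


def collection_products_url(domain: str, handle: str) -> str:
    """URL of the products inside one collection, identified by its handle."""
    # Handles are normally URL-safe slugs; quoting is a defensive assumption.
    safe_handle = quote(handle, safe="")
    return f"https://{normalize_domain(domain)}/collections/{safe_handle}/products.json"
```

Normalizing first means the same store is not fetched twice because it appeared once as a bare domain and once as a full URL.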
How the public surface works
A few mechanics matter when you build against it:
- Pagination. /products.json pages with ?limit=250&page=N. 250 is the hard ceiling per page. You increment page until you get an empty products array. Some stores also honor a cursor-style since_id, but page is universal.
- Prices are strings. Variant prices come as decimal strings ("29.99"), not floats. Parse them deliberately and keep the store’s currency separately; the JSON does not always restate the currency on every variant.
- Availability is per variant. Each variant has an available boolean. A product can be partly in stock; do not collapse it to a single product-level flag.
- Handles are your stable key. The product handle and id are durable; titles change. Use them for dedupe and for building the canonical product URL (/products/<handle>).
- Detecting Shopify. If you are not sure a domain runs Shopify, request /products.json; a valid JSON catalog response is the confirmation. You can also check for the X-ShopId header or the cdn.shopify.com asset host.
What the public surface deliberately does not give you: inventory quantities (only the boolean), orders, customers, draft/unpublished products, or anything behind the merchant’s login. That is the line between public catalog data and private store data, and it is a sensible one to respect.
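The pagination rule above (request limit=250, increment page, stop on an empty products array) can be sketched with only the standard library. This is an illustration under stated assumptions, not a hardened client: iter_products and fetch_json are hypothetical names, the User-Agent string is an arbitrary placeholder, and the max_pages guard is an added safety limit that the endpoint itself does not define.

```python
import json
from typing import Callable, Iterator
from urllib.request import Request, urlopen

PAGE_LIMIT = 250  # per-page ceiling described above


def fetch_json(url: str) -> dict:
    """Plain GET that decodes the body as JSON. The User-Agent is a placeholder."""
    req = Request(url, headers={"User-Agent": "catalog-research-sketch/0.1"})
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)


def iter_products(domain: str,
                  fetch: Callable[[str], dict] = fetch_json,
                  max_pages: int = 1000) -> Iterator[dict]:
    """Yield every product dict from a store's public catalog, page by page."""
    for page in range(1, max_pages + 1):
        url = f"https://{domain}/products.json?limit={PAGE_LIMIT}&page={page}"
        products = fetch(url).get("products", [])
        if not products:
            # An empty array marks the end of the catalog.
            return
        yield from products
```

Passing the fetch function in as a parameter keeps the paging logic testable without network access, and makes it easy to swap in a client with retries or proxies later.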
Rate limits and how to live with them
There is no documented quota on /products.json because it is a storefront asset, but it is fronted by Shopify’s CDN and the store’s own protections:
- Keep concurrency civil — a few parallel requests per store. CDNs will throttle bursts.
- For a list of many stores, parallelize across domains rather than hammering one store’s pagination.
- Cache the collections call; it changes far less often than prices.
- Some stores sit behind bot protection (password pages, Cloudflare challenges, or a disabled JSON endpoint). Treat a non-JSON response as “this store opted out” and move on rather than retrying forever.
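One way to apply the "non-JSON means opted out" rule is to classify each response before deciding what to do next. The function name and the three labels below are hypothetical. Treating HTTP 429 and 5xx as "retry later" is an assumption added here, not something the endpoint documents.

```python
import json


def classify_response(status: int, content_type: str, body: bytes) -> str:
    """Return 'catalog', 'opted_out', or 'retry_later' for one /products.json response."""
    # Assumption: throttling and server errors are temporary, so they are worth a later retry.
    if status == 429 or status >= 500:
        return "retry_later"
    # Password pages and bot challenges typically come back as HTML, not JSON.
    if status != 200 or "json" not in content_type.lower():
        return "opted_out"
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return "opted_out"
    if isinstance(payload, dict) and isinstance(payload.get("products"), list):
        return "catalog"
    return "opted_out"
```

A store classified as opted_out is dropped from the run. A retry_later store can be re-queued with a delay instead of being retried in a tight loop.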
A clean output schema
Flatten one row per variant, carrying the product and merchant context:
{
  "store_domain": "example-shop.com",
  "platform": "shopify",
  "product_id": 7421567890123,
  "product_handle": "classic-canvas-tote",
  "title": "Classic Canvas Tote",
  "vendor": "Example Brand",
  "product_type": "Bags",
  "tags": ["canvas", "tote", "new-arrival"],
  "product_url": "https://example-shop.com/products/classic-canvas-tote",
  "variant_id": 42567890123456,
  "variant_title": "Natural / Large",
  "sku": "TOTE-NAT-L",
  "price": 39.0,
  "currency": "USD",
  "available": true,
  "options": { "Color": "Natural", "Size": "Large" },
  "image_url": "https://cdn.shopify.com/s/files/1/.../tote.jpg",
  "created_at": "2025-11-02T10:00:00Z",
  "updated_at": "2026-05-30T14:21:00Z",
  "scraped_at": "2026-06-07T12:00:00Z"
}
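Parse price to a number at ingestion, keep options as an object, and store both created_at and updated_at. A changed updated_at is a cheap signal that the product was edited and its price is worth re-checking, though on its own it does not prove the price moved.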
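A flatten step that produces rows in this shape might look like the sketch below. It assumes the product JSON carries id, handle, options (each with a name), images (each with a src), and variants with price, available, and option1 to option3; verify those field names against a real response before relying on them. flatten_product is a hypothetical name, and currency is passed in by the caller because it is tracked separately from the variant data. Decimal is a choice made here to keep price comparisons exact; convert to float or string when serializing to JSON.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import List, Optional


def flatten_product(product: dict, store_domain: str, currency: str,
                    scraped_at: Optional[str] = None) -> List[dict]:
    """Return one row per variant, carrying product and store context."""
    if scraped_at is None:
        scraped_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Assumed layout: option names on the product, values on each variant as option1..option3.
    option_names = [opt.get("name") for opt in product.get("options") or []]
    images = product.get("images") or []
    image_url = images[0].get("src") if images else None

    rows = []
    for variant in product.get("variants") or []:
        values = [variant.get(f"option{i}") for i in (1, 2, 3)]
        options = {n: v for n, v in zip(option_names, values) if v is not None}
        rows.append({
            "store_domain": store_domain,
            "platform": "shopify",
            "product_id": product["id"],
            "product_handle": product["handle"],
            "title": product.get("title"),
            "vendor": product.get("vendor"),
            "product_type": product.get("product_type"),
            "tags": product.get("tags") or [],
            "product_url": f"https://{store_domain}/products/{product['handle']}",
            "variant_id": variant["id"],
            "variant_title": variant.get("title"),
            "sku": variant.get("sku"),
            # Prices arrive as strings such as "39.00"; Decimal parses them exactly.
            "price": Decimal(str(variant["price"])),
            "currency": currency,
            # Availability stays per variant; it is never collapsed to a product flag.
            "available": bool(variant.get("available")),
            "options": options,
            "image_url": image_url,
            "created_at": product.get("created_at"),
            "updated_at": product.get("updated_at"),
            "scraped_at": scraped_at,
        })
    return rows
```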
Use cases
- Competitive price monitoring. Snapshot a rival’s catalog daily and diff on updated_at and price to catch repricing and promotions.
- Merchant lead generation. Detect Shopify stores in a niche, size them by catalog breadth and vendor list, and prospect accordingly.
- Assortment and trend research. Aggregate product types and tags across many stores to see what a category is stocking.
- Dropship and reseller sourcing. Pull catalogs with variants, SKUs, and availability to seed a product feed.
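For the price-monitoring case, a daily diff over two per-variant snapshots can be sketched as follows. diff_snapshots is a hypothetical helper that assumes each row has variant_id, price, and available fields in the flattened shape shown earlier. It compares price and availability directly, so updated_at can serve as an optional pre-filter rather than as the comparison itself.

```python
from typing import Dict, Iterable, List


def diff_snapshots(previous: Iterable[dict], current: Iterable[dict]) -> List[dict]:
    """Compare two per-variant snapshots and list what changed between them."""
    prev_by_id: Dict[int, dict] = {row["variant_id"]: row for row in previous}
    changes: List[dict] = []
    seen = set()

    for row in current:
        vid = row["variant_id"]
        seen.add(vid)
        old = prev_by_id.get(vid)
        if old is None:
            changes.append({"variant_id": vid, "change": "new_variant"})
            continue
        if row["price"] != old["price"]:
            changes.append({"variant_id": vid, "change": "price",
                            "old": old["price"], "new": row["price"]})
        if row["available"] != old["available"]:
            changes.append({"variant_id": vid, "change": "availability",
                            "old": old["available"], "new": row["available"]})

    # Variants present yesterday but missing today were removed or unpublished.
    for vid in prev_by_id:
        if vid not in seen:
            changes.append({"variant_id": vid, "change": "removed_variant"})
    return changes
```

Keying on variant_id rather than title follows the earlier point that ids and handles are durable while titles change.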
Build it yourself vs. a managed actor
The endpoints make a basic build genuinely easy — a paginated fetch and a flatten. The work that piles up is the long tail: stores with bot protection, the string-price and per-variant-availability gotchas, Shopify detection across a messy list of domains, and respecting stores that disabled the JSON endpoint. A managed actor handles that tail and the proxy rotation. For one store, build it. For a monitoring or lead-gen pipeline across hundreds of stores, the managed route is what keeps it running.
Common pitfalls
- Treating prices as floats from the start: they arrive as strings, and mis-parsing them silently corrupts every comparison.
- Collapsing variant availability: a product is not “out of stock” just because one size is.
- Assuming every domain is Shopify: verify with the JSON endpoint before trusting the schema.
- Ignoring the 250 cap: set limit=250 explicitly or you page slowly at the default.
- Expecting inventory counts: you only get the boolean; quantities are private.
Wrapping up
You cannot read a competitor’s store through the Admin API, and you do not need to. Shopify store data without the Admin API is right there in /products.json and /collections.json — the same public catalog JSON that powers every storefront. Paginate it, parse the string prices, flatten per variant, and respect the line where public catalog ends and private store data begins. Build it yourself for a single shop, or let a managed scraper handle detection, pagination, and proxies across the whole market.