lead-generation · May 30, 2026 · 6 min read

How to Scrape Product Hunt Daily Launches in 2026

A practical guide to extracting Product Hunt launches via the official GraphQL API — votes, topics, makers, media and descriptions, by day, date range or topic filter.

Product Hunt is where new software shows up first — the daily feed of launches is a live index of what builders are shipping, what’s gaining traction, and who’s behind it. For startup researchers, VCs, directory builders and competitive-intelligence teams, that feed is a goldmine, but pulling it cleanly takes a little more than scraping the homepage. Product Hunt exposes an official GraphQL API, and using it the right way — with proper authentication, pagination and field selection — gives you structured launch records without fighting any HTML. This guide covers what the API exposes, how to query it, and how to turn daily launches into a clean dataset.

What’s worth extracting

For each launch (a “post” in Product Hunt’s model), the GraphQL API returns a rich, nested record:

Product metadata — name, tagline, description, the post URL and the product URL.
Engagement — vote count, comment count, review count and average rating.
Timing — creation and featured dates, plus a PST-aware launch date (Product Hunt’s “day” runs on Pacific time, which matters for daily snapshots).
Makers — maker profiles: display names, usernames, headlines, social handles, follower counts, and redaction detection (some profiles are hidden).
Topics — the topic taxonomy each post is tagged with (ids, names, slugs) — e.g. “Artificial Intelligence”, “Developer Tools”, “SaaS”.
Media — image and video URLs (screenshots, demo clips).
Comments — optionally, the top-level comments on the post.

That’s a nested, ready-to-consume record per launch — everything you’d want for analytics, monitoring or enrichment, without any HTML parsing.

Why the GraphQL API is the right tool

Product Hunt is a JavaScript-heavy site, so scraping its rendered HTML is fragile. It is also unnecessary: the official GraphQL API v2 returns exactly the structured data you want. The Product Hunt Daily Launches Scraper, an Apify actor, uses that API directly:

No browser, no HTML parsing. It issues a single parameterized GraphQL query and reads JSON back. No headless Chrome, no brittle CSS selectors.
Bearer-token authentication, handled for you. The API requires a Bearer token. The actor supports a built-in token, a pre-obtained token, or an OAuth client-credentials exchange — so authentication is solved without you registering and managing your own app for basic use.
Cursor-based pagination. The API returns 20 posts per page with a cursor for the next page. The actor walks the cursor automatically to cover a full day, a date range, or a topic.
Configurable sort order. Sort by votes, recency or featured order depending on whether you want the day’s winners or the full chronological list.
Optional proxying. Available if you need it, but the API path generally doesn’t require an anti-bot stack.

Because the actor calls an official API rather than scraping pages, it is both more stable and more polite than parsing the site. The built-in token handling is why the actor “works out of the box” with no setup.
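The Bearer-token request and cursor walk described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the actor's code: the endpoint URL, the query text, the field names (votesCount and friends), the Relay-style pageInfo shape and the helper names post_graphql and iter_posts are all assumptions to check against the API's published schema. The transport is passed in as a function so the paging logic can be exercised without a network call.

```python
import json
import urllib.request
from typing import Callable, Iterator, Optional

# Assumed endpoint for the v2 GraphQL API; confirm against the official docs.
API_URL = "https://api.producthunt.com/v2/api/graphql"

# Assumed query shape: a Relay-style `posts` connection with cursor paging.
POSTS_QUERY = """
query Posts($first: Int!, $after: String) {
  posts(first: $first, after: $after) {
    edges { node { id name tagline votesCount } }
    pageInfo { endCursor hasNextPage }
  }
}
"""


def post_graphql(token: str, query: str, variables: dict) -> dict:
    """Send one GraphQL request with a Bearer token and return the parsed JSON."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_posts(fetch: Callable[[dict], dict], page_size: int = 20) -> Iterator[dict]:
    """Yield post nodes page by page until the API reports no next page."""
    cursor: Optional[str] = None
    while True:
        payload = fetch({"first": page_size, "after": cursor})
        connection = payload["data"]["posts"]
        for edge in connection["edges"]:
            yield edge["node"]
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        # Resume from where the previous page ended.
        cursor = page_info["endCursor"]
```

With a real token, the two pieces would be wired together as iter_posts(lambda v: post_graphql(token, POSTS_QUERY, v)). Keeping the fetch function separate also gives you one place to add pacing or retries later.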

How the query modes work

You drive the actor in one of three modes:

Daily snapshot — pull the launches featured on a specific day (respecting the PST launch-day boundary).
Date range — pull every launch across a custom window for trend analysis or backfill.
Topic / category filter — pull launches tagged with a specific topic, e.g. all “Artificial Intelligence” launches, for a niche directory or tracker.

In every mode the actor runs one parameterized GraphQL query, pages through the 20-per-page results via cursor, and emits one structured record per launch.
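One way to picture the three modes is as different variable sets fed to the same query. The sketch below is illustrative only: the variable names postedAfter, postedBefore and topic are assumptions about the API's arguments (including whether the bounds are inclusive), and build_variables and pacific_day_bounds_utc are hypothetical helpers, not part of the actor.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Optional
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_day_bounds_utc(day: date) -> tuple[str, str]:
    """Return the start and end of a Pacific calendar day as UTC ISO-8601 strings."""
    start = datetime.combine(day, time.min, tzinfo=PACIFIC)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=PACIFIC)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        start.astimezone(timezone.utc).strftime(fmt),
        end.astimezone(timezone.utc).strftime(fmt),
    )


def build_variables(
    start_day: Optional[date] = None,
    end_day: Optional[date] = None,
    topic_slug: Optional[str] = None,
) -> dict:
    """Build query variables for a daily snapshot, a date range, or a topic filter."""
    variables: dict = {}
    if start_day is not None:
        # A single day is just a range whose first and last day coincide.
        last_day = end_day or start_day
        variables["postedAfter"] = pacific_day_bounds_utc(start_day)[0]
        variables["postedBefore"] = pacific_day_bounds_utc(last_day)[1]
    if topic_slug:
        variables["topic"] = topic_slug
    return variables
```

For example, build_variables(date(2026, 5, 30)) describes a daily snapshot, passing an end_day as well describes a date range, and build_variables(topic_slug="artificial-intelligence") describes a topic filter. The modes can also be combined.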

▶ Run the Product Hunt Daily Launches Scraper — votes, topics, makers, media and descriptions via the official GraphQL API. Daily snapshots, date ranges or topic filters. No API key setup needed — works out of the box.

Schema design for downstream use

When the data lands in your warehouse or directory, you want it shaped for trend analysis and per-product drill-down. A clean per-launch record:

{
  "post_id": "412093",
  "name": "Acme AI Notetaker",
  "tagline": "Meeting notes that write themselves",
  "description": "Acme records, transcribes and summarizes...",
  "post_url": "https://www.producthunt.com/posts/acme-ai-notetaker",
  "product_url": "https://acme.ai",
  "votes": 842,
  "comments": 96,
  "reviews": 31,
  "rating": 4.7,
  "created_at": "2026-05-30T07:01:00Z",
  "launch_date_pst": "2026-05-30",
  "topics": [
    { "name": "Artificial Intelligence", "slug": "artificial-intelligence" },
    { "name": "Productivity", "slug": "productivity" }
  ],
  "makers": [
    {
      "display_name": "Jane Doe",
      "username": "janedoe",
      "headline": "Founder @ Acme",
      "twitter": "janedoe",
      "followers": 5400,
      "redacted": false
    }
  ],
  "media": {
    "images": ["https://ph-files.imgix.net/..."],
    "videos": []
  },
  "scraped_at": "2026-05-30T12:00:00Z"
}

A few schema choices worth making early:

Use launch_date_pst as your daily key, not created_at. Product Hunt’s leaderboard day is Pacific time. Bucketing on UTC created_at splits a single PH “day” across two calendar dates.
Keep topics and makers as arrays. Launches carry multiple topics and often multiple makers; flattening loses the relationships you’ll want for topic trends and founder tracking.
Store votes with scraped_at. Vote counts climb throughout the launch day; a count is only meaningful with the time it was captured. Re-scrape to track the trajectory.
Respect redacted. Some maker profiles are intentionally hidden. Honor the flag rather than trying to backfill hidden data.
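To make the first of these choices concrete, here is a small sketch of deriving launch_date_pst from a UTC created_at value and bucketing records by it. The function names are hypothetical, and the sketch assumes timestamps arrive as ISO-8601 strings in UTC, as in the sample record above.

```python
from collections import defaultdict
from datetime import datetime
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def launch_date_pacific(created_at: str) -> str:
    """Convert a UTC ISO-8601 timestamp to the Pacific calendar date (YYYY-MM-DD)."""
    # Swap a trailing "Z" for an explicit offset so fromisoformat accepts it
    # on Python versions older than 3.11 as well.
    utc_moment = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return utc_moment.astimezone(PACIFIC).date().isoformat()


def bucket_by_launch_day(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by their Pacific launch date rather than their UTC date."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        buckets[launch_date_pacific(record["created_at"])].append(record)
    return dict(buckets)


print(launch_date_pacific("2026-05-30T07:01:00Z"))  # 2026-05-30 (00:01 Pacific)
print(launch_date_pacific("2026-05-30T05:30:00Z"))  # 2026-05-29 (22:30 Pacific, previous day)
```

The second call shows the split the text warns about: a UTC timestamp dated May 30 that belongs to the May 29 launch day.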

Typical use cases

What customers actually do with Product Hunt launch data:

Startup and product research — track daily launches to spot emerging categories and market trends.
AI / tooling directories — aggregate launches by topic into a catalog or newsletter (the “every AI launch this week” play).
Competitive intelligence — benchmark new entrants by votes and category against your own product’s launch.
VC and investor deal sourcing — monitor high-engagement launches and surface the makers behind them as leads.
Developer-tool scouting — discover new APIs, dev tools and open-source launches early.
Marketing and PR monitoring — find partnership and coverage opportunities the day they launch.
Digests and trend analysis — build “top launches” newsletters or run launch benchmarking over custom date ranges.

The common thread is timeliness and the maker layer. The launches plus the maker profiles (handles, follower counts) are what turn a feed into a lead list or a trend signal.

Cost math for the managed approach

Because it’s an official-API actor with no browser, compute per launch is minimal. Under this actor’s pricing, results are emitted at no per-row charge, so cost is dominated by the small per-run start fee. A typical daily snapshot is a few dozen launches, and a busy day on Product Hunt is on the order of a hundred-plus posts. Even a year-long backfill across a date range stays cheap because each record is a lightweight JSON read.
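As a rough back-of-the-envelope check on request volume, the arithmetic can be sketched as below. The figure of 100 posts per day is an illustrative round number taken from the "hundred-plus" estimate above, not a measured value, and requests_needed is a hypothetical helper.

```python
import math

PAGE_SIZE = 20  # posts per page, as stated earlier in this article


def requests_needed(posts_per_day: int, days: int = 1) -> int:
    """Estimate API requests when each day is paged through separately."""
    return math.ceil(posts_per_day / PAGE_SIZE) * days


print(requests_needed(100))       # 5 requests for one busy day
print(requests_needed(100, 365))  # 1825 requests for a year-long backfill
```

Even the year-long case is a couple of thousand small JSON requests, which is why pacing for rate limits matters more than compute.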

What you avoid by using a managed actor rather than building your own:

Registering a Product Hunt API application and managing the OAuth token lifecycle
Writing the parameterized GraphQL query and selecting the right nested fields
Walking the cursor across 20-per-page results for full days and date ranges
Handling the PST launch-day boundary correctly

Common pitfalls

A few things to know before wiring Product Hunt data into production, whether you build or buy:

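The PST day boundary. This trips up almost everyone. Always anchor “today’s launches” to Pacific time, not your server’s timezone or UTC. Note that “PST” is used loosely here: Pacific time is UTC-8 in winter and UTC-7 during daylight saving, so convert with a named time zone such as America/Los_Angeles rather than a fixed offset.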
Vote counts are a moving target. A launch’s votes at 8am PST and 11pm PST are very different. Snapshot at a consistent time, or capture the trajectory, but don’t compare counts taken at random times.
API rate limits. The official API has rate limits tied to your token. Large backfills need pacing; the actor handles this, but it’s why a date-range run takes longer than a single day.
Redacted makers. Hidden maker profiles return limited fields. Don’t treat a redacted profile as missing data — it’s intentional.
Topic slugs vs. names. Topic display names can change wording; the slug is the stable identifier for building a topic-filtered tracker.
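Two of these pitfalls, moving vote counts and redacted makers, lend themselves to a short sketch. It assumes records shaped like the sample schema above (post_id, votes, scraped_at, makers with a redacted flag); the function names are hypothetical.

```python
from collections import defaultdict


def vote_trajectories(snapshots: list[dict]) -> dict[str, list[tuple[str, int]]]:
    """Group snapshots by post_id and order each series by capture time."""
    series: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for snap in snapshots:
        series[snap["post_id"]].append((snap["scraped_at"], snap["votes"]))
    # ISO-8601 UTC strings in one consistent format sort chronologically as text.
    return {post_id: sorted(points) for post_id, points in series.items()}


def visible_makers(record: dict) -> list[dict]:
    """Keep only makers whose profile is not flagged as redacted."""
    return [m for m in record.get("makers", []) if not m.get("redacted", False)]
```

Comparing two launches then means comparing points captured at the same time of day within each trajectory, not whichever counts happen to be latest.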

Wrapping up

Product Hunt’s official GraphQL API makes clean launch data accessible; the work lies in authentication, cursor pagination, field selection and the Pacific-time day boundary. If you need an occasional snapshot, you can wire up the API yourself. If you want a daily, topic-aware feed for a directory, newsletter, competitive-intelligence dashboard or deal-sourcing pipeline, a managed actor that handles the token and pagination delivers it out of the box.

▶ Open the Product Hunt Daily Launches Scraper on Apify — official GraphQL API, no setup, full maker and topic data. Daily, date-range or topic modes. Start with Apify’s free monthly credit.