logiover
jobs · Jun 3, 2026 · 6 min read

How to Scrape Lever Job Postings & Salary Data in 2026

A guide to Lever’s public Postings API: how to pull jobs, structured salary, workplace type and content blocks from 5,000+ Lever-powered career pages over plain HTTP.

If you want clean, structured hiring data from real companies — Netflix, Spotify, Shopify, Mixpanel, Lyra Health, and 5,000+ others — the smartest place to look isn’t a job aggregator. It’s the ATS those companies post through. Lever exposes a public Postings API behind every Lever-powered career page, and it returns far cleaner data than scraping the rendered job board: structured salary blocks, separated qualification and responsibility lists, workplace type, and proper location arrays with ISO country codes. This guide covers how that API is shaped, how to filter it, and how to turn it into a hiring-data pipeline.

Why scrape the ATS, not the aggregator

Aggregators re-host job data, lose structure in the process, and lag the source. Going straight to Lever’s Postings API gets you the canonical record:

  • No auth, no browser. It’s a public JSON API over plain HTTP. No headless Chromium, no login, no token rotation.
  • Already structured. Lever returns the posting as data, not as a rendered page you have to re-parse. Salary, location, team, level — all discrete fields.
  • Two instances covered. Lever runs a global instance and an EU instance (for data residency). A complete scrape has to know about both; the Lever Postings API actor on Apify, which the rest of this guide refers to as "the actor", handles global and EU career pages.

The trade-off: you scrape one employer at a time (per Lever-powered career site), so coverage comes from batching many sites — which the actor does concurrently.
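As a rough illustration of what "public JSON over plain HTTP" means in practice, here is a minimal standard-library sketch. The two base URLs and the `mode=json` query parameter are assumptions about how the global and EU instances are addressed, so verify them before relying on this; `postings_url` and `fetch_postings` are hypothetical helper names, not part of the actor.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base URLs for the two Lever instances; check them before use.
LEVER_BASES = {
    "global": "https://api.lever.co/v0/postings",
    "eu": "https://api.eu.lever.co/v0/postings",
}


def postings_url(site: str, instance: str = "global") -> str:
    """Build the listing URL for one employer's career site."""
    base = LEVER_BASES[instance]  # raises KeyError for an unknown instance
    return f"{base}/{quote(site, safe='')}?mode=json"


def fetch_postings(site: str, instance: str = "global", timeout: float = 10.0) -> list:
    """Fetch the raw postings for one site (this performs a network call)."""
    with urllib.request.urlopen(postings_url(site, instance), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the URL builder separate from the fetch makes the global/EU choice explicit per employer, which is the detail that is easiest to forget.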

What’s worth extracting

One normalized, flat record per posting. The fields that matter:

  • Identity & timing — posting ID, created/updated timestamps, posting URL and application URL.
  • Taxonomy — employer, team/department, role title, commitment (full-time/contract), seniority level.
  • Location — a multi-location array (some roles list several offices) plus ISO country data.
  • Workplace type — on-site / hybrid / remote, as a structured field, not buried in prose.
  • Content — HTML and plaintext variants of the opening, body and additional sections, plus a concatenated full description.
  • Structured lists — qualifications, responsibilities and other sections separated out — not a single text blob you have to re-segment.
  • Structured salary — a salary block with currency, interval (annual/hourly) and numeric min/max, alongside the plaintext salary description.

Missing fields come back as null or empty arrays rather than vanishing, so your downstream schema stays stable.
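A minimal sketch of that "stable schema" idea: every output key is always present, with `None` or an empty list standing in for missing data. The raw field names read here (`text`, `categories`, `allLocations`, `salaryRange`, `descriptionPlain`, `hostedUrl`, `applyUrl`, `workplaceType`) are assumptions about the raw payload, and `normalize_posting` is a hypothetical helper, not the actor's own code.

```python
def normalize_posting(raw: dict, employer: str) -> dict:
    """Flatten one raw posting into a record with a fixed set of keys."""
    categories = raw.get("categories") or {}
    salary = raw.get("salaryRange") or None  # treat {} the same as missing
    return {
        "posting_id": raw.get("id"),
        "employer": employer,
        "title": raw.get("text"),
        "team": categories.get("team"),
        "commitment": categories.get("commitment"),
        # Always a list, even when the posting names no location.
        "locations": list(categories.get("allLocations") or []),
        "workplace_type": raw.get("workplaceType"),
        # Nullable block: undisclosed salary stays None, the row is kept.
        "salary": None if salary is None else {
            "currency": salary.get("currency"),
            "interval": salary.get("interval"),
            "min": salary.get("min"),
            "max": salary.get("max"),
        },
        "description_text": raw.get("descriptionPlain"),
        "posting_url": raw.get("hostedUrl"),
        "apply_url": raw.get("applyUrl"),
    }
```

Because the key set never changes, an empty input such as `normalize_posting({}, "acme")` still yields a full record whose values are all `None` or `[]` apart from the employer.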

How the filtering works: server-side vs client-side

This is the part worth understanding, because it determines what you can do efficiently:

Server-side filters (forwarded to Lever, narrow the fetch at the source):

team / department
location
commitment
level

Use these to fetch only the slice of an employer’s postings you care about — Lever does the narrowing.
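One way to picture the server-side filters is as query parameters appended to the listing request. The sketch below assumes the parameter names match the filter names listed above and that repeating a key expresses several accepted values; both are assumptions to check against Lever's documentation, and `server_filter_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Filters this guide lists as server-side; parameter names are assumed.
SERVER_SIDE_KEYS = ("team", "department", "location", "commitment", "level")


def server_filter_query(**filters) -> str:
    """Build a query string from server-side filters only."""
    unknown = set(filters) - set(SERVER_SIDE_KEYS)
    if unknown:
        # Anything else (salary, title keyword, ...) must be filtered client-side.
        raise ValueError(f"not a server-side filter: {sorted(unknown)}")
    pairs = [("mode", "json")]
    for key in SERVER_SIDE_KEYS:
        value = filters.get(key)
        if value is None:
            continue
        values = [value] if isinstance(value, str) else list(value)
        pairs.extend((key, v) for v in values)
    return urlencode(pairs)
```

Rejecting unknown keys up front keeps a typo such as `min_salary=` from being silently ignored by the server.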

Client-side filters (applied after fetch, on the normalized records):

title keyword         e.g. only "engineer" roles
workplace type        e.g. only remote
ISO country code      e.g. only DE, NL
minimum salary        e.g. only disclosed >= 120000

These run on the fetched records — handy when Lever doesn’t offer a server-side equivalent (it has no “minimum salary” server filter, for instance).
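The client-side filters can be sketched as a generator over already-normalized records. The record keys (`title`, `workplace_type`, `countries`, `salary`) follow the schema used in this guide; `client_filter` is a hypothetical helper. One design choice is an assumption on my part: a minimum-salary filter drops postings with no disclosed salary, matching the "only disclosed" wording above.

```python
from typing import Iterable, Iterator, Optional


def client_filter(
    records: Iterable[dict],
    title_keyword: Optional[str] = None,
    workplace_type: Optional[str] = None,
    countries: Optional[Iterable[str]] = None,
    min_salary: Optional[float] = None,
) -> Iterator[dict]:
    """Yield only the records that pass every filter that was supplied."""
    wanted = {c.upper() for c in countries} if countries else None
    for rec in records:
        if title_keyword and title_keyword.lower() not in (rec.get("title") or "").lower():
            continue
        if workplace_type and rec.get("workplace_type") != workplace_type:
            continue
        if wanted and not wanted.intersection(rec.get("countries") or []):
            continue
        if min_salary is not None:
            floor = (rec.get("salary") or {}).get("min")
            # Undisclosed salary cannot satisfy a minimum, so it is skipped.
            if floor is None or floor < min_salary:
                continue
        yield rec
```

For example, `client_filter(records, title_keyword="engineer", countries=["DE", "NL"], min_salary=120000)` combines three of the filters from the list above in one pass.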

Two operational modes

  • Site-wide listing — enumerate all published postings for an employer (with native offset pagination).
  • Per-posting detail refresh — re-fetch specific posting IDs you’re tracking, to detect status changes (still open? closed? edited?).
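Both modes can be sketched without committing to a particular HTTP client by injecting the fetch functions. `fetch_page(skip, limit)` and `fetch_one(posting_id)` are hypothetical callables you would supply; the sketch assumes a short page marks the end of the listing and that `fetch_one` returns `None` once a posting is no longer published.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_postings(fetch_page: Callable[[int, int], list], page_size: int = 100) -> Iterator[dict]:
    """Site-wide listing: walk offset pages until a short page appears."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:  # short (or empty) page: nothing left
            return
        skip += page_size


def refresh_tracked(
    fetch_one: Callable[[str], Optional[dict]],
    last_seen: Dict[str, str],
) -> Dict[str, str]:
    """Detail refresh: classify each tracked posting ID.

    last_seen maps posting_id to the updated_at value stored earlier.
    """
    statuses = {}
    for posting_id, seen_updated_at in last_seen.items():
        current = fetch_one(posting_id)
        if current is None:
            statuses[posting_id] = "closed"
        elif current.get("updated_at") != seen_updated_at:
            statuses[posting_id] = "edited"
        else:
            statuses[posting_id] = "open"
    return statuses
```

The refresh function touches only the IDs on the watchlist, which is why it is cheaper than re-listing a whole site.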

The actor also handles concurrency and automatic backoff on transient and rate errors, so batching many employers at once stays stable.
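For readers building their own client, concurrency with backoff could look roughly like the following. This is a generic exponential-backoff-with-jitter sketch, not a description of the actor's internals; `TransientError`, `with_backoff` and `fetch_many` are hypothetical names, and the caller's `fetch_site` function is expected to raise `TransientError` for timeouts and rate-limit responses.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


def with_backoff(fn, *args, retries: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Call fn(*args), retrying on TransientError with growing delays."""
    for attempt in range(retries + 1):
        try:
            return fn(*args)
        except TransientError:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


def fetch_many(sites, fetch_site, max_workers: int = 8) -> dict:
    """Fetch several employer sites concurrently; returns site -> result."""
    sites = list(sites)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda site: with_backoff(fetch_site, site), sites)
        return dict(zip(sites, results))
```

Threads are a reasonable fit here because the work is I/O-bound; capping `max_workers` is what keeps a large employer batch from tripping rate limits in the first place.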

Run the Lever Postings API actor — concurrent multi-site scraping of 5,000+ Lever career pages, structured salary blocks, separated qualification/responsibility lists, global + EU instances. Pure HTTP, no auth.

Schema design for downstream use

A clean per-posting record:

{
  "posting_id": "a1b2c3d4-...",
  "employer": "mixpanel",
  "title": "Senior Backend Engineer",
  "team": "Engineering",
  "commitment": "Full-time",
  "level": "Senior",
  "locations": ["New York", "Remote - US"],
  "countries": ["US"],
  "workplace_type": "hybrid",
  "salary": {
    "currency": "USD",
    "interval": "year",
    "min": 165000,
    "max": 210000,
    "description": "$165,000 - $210,000 + equity"
  },
  "lists": {
    "qualifications": ["5+ years backend", "Go or Python", "..."],
    "responsibilities": ["Own the ingestion pipeline", "..."]
  },
  "description_text": "About the role...\n\n...",
  "posting_url": "https://jobs.lever.co/mixpanel/a1b2c3d4",
  "apply_url": "https://jobs.lever.co/mixpanel/a1b2c3d4/apply",
  "created_at": "2026-05-20T00:00:00Z",
  "updated_at": "2026-06-01T00:00:00Z"
}

Schema choices worth making early:

  • Keep the salary block structured. The numeric min/max with currency and interval is the whole reason to use the API over an aggregator. Store the components, not just the description string — that’s what powers compensation benchmarking.
  • Preserve the separated lists. Qualifications and responsibilities arrive pre-split. Keep them separate — it’s a gift to any RAG/LLM ingestion that wants clean, sectioned context.
  • Store both created_at and updated_at. Hiring-velocity and “is this still open” analysis depends on the timestamps; the detail-refresh mode exists precisely to keep them current.
  • Treat countries (ISO) as the geo join key. Free-text locations vary (“Remote - US”, “NYC”); the ISO array is what you filter and group on reliably.
  • Expect nulls. Many employers don’t disclose salary; keep the field nullable and don’t drop the row.
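To show why the structured salary block and the ISO country array are worth keeping, here is a small benchmarking sketch over records shaped like the example above. `salary_midpoints_by_country` is a hypothetical helper; using the midpoint of min and max as the comparison figure is one simple choice, not the only sensible one.

```python
from collections import defaultdict
from statistics import median


def salary_midpoints_by_country(records, currency: str = "USD", interval: str = "year") -> dict:
    """Median salary midpoint per ISO country, for one currency and interval."""
    buckets = defaultdict(list)
    for rec in records:
        salary = rec.get("salary")
        # Rows without a disclosed range stay in the dataset but are skipped here.
        if not salary or salary.get("min") is None or salary.get("max") is None:
            continue
        # Never mix currencies or hourly with yearly figures in one bucket.
        if salary.get("currency") != currency or salary.get("interval") != interval:
            continue
        midpoint = (salary["min"] + salary["max"]) / 2
        for country in rec.get("countries") or []:
            buckets[country].append(midpoint)
    return {country: median(values) for country, values in buckets.items()}
```

None of this is possible from the description string alone, which is the practical argument for storing the components.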

Typical use cases

  • Recruitment intelligence — real-time pipelines of company hiring data; benchmark comp by role, level and region using the structured salary block.
  • HR-tech / ATS analytics — hiring velocity, role mix, remote/hybrid distribution, salary-disclosure trends across employers.
  • LLM / RAG ingestion — feed agents the plaintext content and pre-separated qualification/responsibility lists for cleaner retrieval context.
  • Job boards & aggregators — ingest Lever career pages at scale with multi-site batching and EU coverage.
  • Competitive sales intelligence — detect headcount growth, new team formation, and geographic expansion from posting timestamps and taxonomy; a company spinning up a new team is a buying signal.
  • Operational recipes — scheduled daily remote-job feeds across employers; filter engineering roles above a salary threshold in a target country; refresh tracked posting IDs for status changes; fetch from EU-only employers.

Cost math

Pricing is pay-per-event: a small run-start fee plus a per-result charge. Because the actor is pure HTTP, with no browser and no proxy by default, the compute per posting is minimal and runs are fast.

The cost scales with how many employers and postings you pull. A daily feed across a few dozen tracked employers is low single-digit dollars per run; the per-posting detail-refresh mode lets you keep a watchlist current without re-listing entire sites. Compared to building your own multi-site Lever client with pagination, backoff and salary-block parsing — plus keeping up with Lever’s global/EU split — the actor removes the boilerplate and the maintenance.
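The pricing model reduces to one line of arithmetic: run cost = start fee + results × per-result fee. The sketch below only encodes that formula; `estimate_run_cost` is a hypothetical helper, and this guide does not state the actual fees, so look them up on the actor's pricing page and pass them in.

```python
def estimate_run_cost(results: int, start_fee: float, per_result_fee: float) -> float:
    """Pay-per-event estimate: one run-start fee plus a charge per result."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return start_fee + results * per_result_fee


# Placeholder fees for illustration only; these are NOT real prices.
example = estimate_run_cost(results=1000, start_fee=0.01, per_result_fee=0.002)
```

Multiplying the per-run figure by runs per month gives the number to compare against the cost of maintaining your own client.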

Common pitfalls

  • Forgetting the EU instance. Employers with EU data residency live on a separate Lever instance. Scrape only the global one and you’ll silently miss those companies. Make sure both are covered.
  • Relying on free-text salary. Some postings put pay only in the description prose, not the structured block. Use the structured salary when present, fall back to parsing the description, and flag which source you used.
  • Treating one employer as “the data.” Lever is per-employer; coverage is the union of many career pages. Build your employer list deliberately.
  • Ignoring updated_at. A posting created six months ago and never touched looks identical to one created six months ago and edited yesterday if you only read created_at. Use both for velocity analysis.
  • Over-trusting workplace_type. It’s structured, which is great, but employers self-tag inconsistently (“remote” that’s actually “remote within one country”). Cross-check against the location array.
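The salary pitfall above (structured block first, prose as a fallback, with the source flagged) might be handled along these lines. The regular expression is a deliberately narrow assumption: it only recognizes dollar ranges such as "$165,000 - $210,000" or "$120k to $150k", and `resolve_salary` is a hypothetical helper working on the record shape used in this guide.

```python
import re

# Matches "$165,000 - $210,000", "$120k to $150k" and similar dollar ranges.
_RANGE = re.compile(
    r"\$\s*(\d[\d,]*)(k?)\s*(?:[-–]|to)\s*\$?\s*(\d[\d,]*)(k?)",
    re.IGNORECASE,
)


def _to_number(digits: str, k_suffix: str) -> int:
    value = int(digits.replace(",", ""))
    return value * 1000 if k_suffix else value


def resolve_salary(record: dict) -> dict:
    """Prefer the structured block; otherwise parse prose; always flag the source."""
    salary = record.get("salary")
    if salary and salary.get("min") is not None:
        return {"min": salary["min"], "max": salary.get("max"), "source": "structured"}
    match = _RANGE.search(record.get("description_text") or "")
    if match:
        return {
            "min": _to_number(match[1], match[2]),
            "max": _to_number(match[3], match[4]),
            "source": "parsed",
        }
    return {"min": None, "max": None, "source": None}
```

Keeping the `source` flag lets downstream analysis weight or exclude parsed figures, since prose parsing will always be less reliable than the structured block.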

Wrapping up

Going straight to Lever’s Postings API beats scraping rendered job boards where it counts: cleaner structure, real salary numbers, separated qualification/responsibility lists, and no browser or auth. The work is in the plumbing: covering the global and EU instances, batching thousands of employers concurrently, paginating, backing off on rate errors, and normalizing the salary block. A maintained actor that does all of that turns 5,000+ career pages into one consistent, query-ready feed.

Open the Lever Postings API actor on Apify — structured jobs and salary data from 5,000+ employers, global + EU, pure HTTP. Pay-per-event. Start on Apify’s free monthly credit.
