Blog

Working notes from inside the AI training industry

Rates, rubrics, red-teaming, and what frontier labs actually pay for in 2026. Plain reads — no fluff, no hype, just what we wish we'd known when we started hiring experts directly for AI training work.

Technical · 9 min read

Agentic evaluations: what frontier labs need from evaluators in 2026

Half the briefs landing on our roster aren't "pick A or B" anymore — they're 40-step model trajectories with tool calls, browser actions, and stack traces. Here's what agentic eval actually looks like, what it pays, and which evaluator skills transfer.

May 22, 2026 · David Park

Playbook · 6 min read

How to read a rubric before you accept a brief

The highest-leverage thing an evaluator does isn't the work; it's choosing which briefs to accept. Here's a five-minute pre-acceptance read that catches the briefs that will waste your time on appeals and identifies the ones that pay cleanly.

May 18, 2026 · Sophia Reyes

Industry · 8 min read

How AI training pay actually works in 2026

Most articles quote a single hourly rate. In reality, pay splits into three tiers: crowd workers at $8–25, specialists at $30–60, and credentialed experts at $75–150. Here's how the tiers actually break down, and what moves you between them.

May 16, 2026 · Elena Lange

Playbook · 11 min read

Red-teaming LLMs: a working guide for new evaluators

Six attack categories, sample probes for each, and the unspoken rules that separate a $25/hr crowd reviewer from a $90/hr safety specialist. Written for people who can already prompt a model fluently but are new to adversarial work.

May 14, 2026 · David Park

Careers · 7 min read

Why doctors, lawyers, and engineers earn the most as AI evaluators

The expert tier exists because frontier labs need ground truth, not consensus. If you can answer "is this medical advice safe?" or "does this contract clause survive challenge?", you're not a crowd worker — you're a regulator the model trains against.

May 10, 2026 · Sophia Reyes

Technical · 10 min read

RLHF, DPO, GRPO: the alphabet soup of preference data, demystified

What each method asks of a human rater, why the rubrics differ, and how to spot which one you're actually being paid to label for. A working guide for evaluators who want to read the room before they accept a brief.

May 6, 2026 · David Park