Senior Technical Program Manager (TPM) at mpathic
Seattle, WA, US / San Francisco, CA, US / Remote

Our Story

mpathic is building the future of empathetic, trustworthy AI. Grounded in behavioral science and human-centered design, our technology delivers AI systems that are safe, aligned, and emotionally intelligent. As enterprises race to adopt AI, we believe the companies that win will be those that build trust first.

We are building a high-quality AI Safety Team to evaluate and strengthen advanced AI systems. Our work focuses on making models reliable, auditable, and scalable—so safety work can move fast without relying on heroics or sacrificing quality.


Position Overview

We’re looking for a Senior Technical Program Manager (TPM) to lead end-to-end AI Safety Human Data Programs.

This role sits at the intersection of:

  • Human data operations
  • Trust & Safety policy development
  • Rubric and taxonomy development
  • AI evaluation and benchmarking
  • Red teaming and edge-case discovery


You will own programs that design, generate, evaluate, and scale high-quality human data — ensuring outputs are reliable, auditable, and actionable for Trust & Safety and ML teams.


This is not an infrastructure ML role. It is an ops and systems-building role focused on human signal, policy operationalization, and scalable evaluation. You will build the systems, workflows, and quality controls that allow clinicians, policy experts, and ML teams to collaborate efficiently at scale.


This is a full-time role (Seattle or Bay Area preferred, remote eligible), reporting to the Head of AI Safety or Evaluation Programs / GM.


What You’ll Accomplish

In your first 60–90 days you’ll…

  • Take ownership of one or more active AI safety human data programs, with a sharp focus on execution and quality
  • Lead the team on timelines, prioritization, and risk management
  • Establish clear program milestones, throughput targets, and quality benchmarks
  • Audit and improve existing annotation and QA workflows
  • Deliver measurable improvements in quality, scalability, or cycle time
  • Align program outputs with Trust & Safety and ML partner needs


In your first year, you’ll…

  • Own multiple concurrent human data programs across safety domains, ensuring consistent quality, prioritization, and delivery standards across initiatives
  • Establish durable and scalable systems for data generation, benchmarking, red teaming, and evaluation
  • Partner with our clinical leads to build reusable policy, rubric, and taxonomy frameworks that scale across customers and use cases
  • Reduce cost and lead time through smarter task design, workflow optimization, and capacity planning
  • Launch reporting dashboards linking human data outputs to policy insights, model improvements, and measurable safety gains
  • Implement governance standards that ensure auditability and reproducibility across programs
  • Serve as the internal point of accountability for human data program execution, ensuring that strategic accounts are delivered on time, at quality, and aligned with executive sponsor expectations


You’ll Thrive in This Role If You…

Have 6+ years of experience in:

  • Leading complex, cross-functional technical programs in fast-moving or ambiguous environments
  • Managing expert, contractor, or vendor-based review programs, with a working understanding of throughput, calibration, and QA tradeoffs
  • Owning timelines, prioritization, and delivery accountability across multiple parallel workstreams
  • Building systems and teams that scale
  • Managing technical programs and human data pipelines
  • Operationalizing human data workflows, including expert and vendor-based review


And have experience:

  • Building and scaling human data, Trust & Safety, or evaluation operations that require structured workflows, quality controls, and governance
  • Navigating the realities of AI evaluation, red teaming, model benchmarking, or human-in-the-loop systems
  • Operating independently with strong judgment while escalating risks early and clearly
  • Scaling annotation, evaluation, or red teaming programs
  • Leading cross-functional programs involving policy, product, and engineering
  • Working with researchers and QA leads to implement quality systems
  • Working with LLM evaluation, alignment, or model benchmarking
  • Managing fast-paced, high-demand delivery environments

You are especially strong at:

  • Turning ambiguous safety goals into structured execution plans with clear milestones and risk management
  • Building repeatable systems, templates, and playbooks that scale across teams and use cases
  • Balancing quality, speed, and cost
  • Setting expectations and communicating clearly with both technical and non-technical stakeholders
  • Maintaining calm, clarity, and decisiveness in high-pressure or high-visibility environments


What You’ll Do

Own End-to-End Human Data Programs

This role operates in a fast-moving, high-demand environment with overlapping campaigns and tight delivery timelines. The ideal candidate thrives under pressure and maintains quality while moving quickly.

  • Lead safety data programs end to end, from rubric development → data generation → annotation → QA → reporting
  • Define milestones, SLAs, staffing plans, and delivery timelines
  • Own prioritization, risk management, and cross-functional alignment
  • Manage parallel workstreams across internal teams and expert contributors
  • Identify and mitigate execution risk early

Drive Human Data Quality & Reliability

  • Design data workflows that balance nuance, speed, and consistency
  • Implement QA tiers and sampling strategies
  • Manage drift, quality metrics, and throughput performance
  • Ensure auditability, reproducibility, and scalable program governance

Cross-Functional Alignment

  • Partner with clinical QA, reviewers, and trainers to ensure successful execution of human data projects
  • Support ML teams with structured evaluation signal for fine-tuning and benchmarking
  • Collaborate with Engineering to improve tooling and workflow automation
  • Deliver executive-ready reporting on program performance, risks, and impact

Cross-Functional Collaboration

Work closely with:

  • TPMs and Evaluation Leads — delivery execution, workflows, escalation systems
  • Clinical & Behavioral Science Experts — rubric grounding, psychological frameworks
  • QA Leadership — agreement metrics, gold sets, drift monitoring
  • Engineering / Product — tooling support for review, audit trails, and escalation queues
  • Customer Delivery — ensuring findings are interpretable and trustworthy


We value calm execution, clinical rigor, operational excellence, and scalable systems that make high-quality work sustainable.


Compensation & Benefits

  • Base Salary (US): $140,000–$200,000 (band depends on seniority, scope, and number of customer programs or pods owned)
  • Equity: Yes
  • Benefits: We offer 100% company-funded health, dental, and vision insurance for full-time employees. We also offer a 401(k), well-being programs, and flexible paid time off.
  • Remote-first
  • Mission-driven work focused on AI safety, trust, and operational rigor


Apply Even If You Don’t Check Every Box

If you’re excited about bringing clinical judgment, training excellence, and quality systems into AI safety evaluation work—and want to help ensure emotionally grounded AI systems are safe and trustworthy—we’d love to hear from you.