Urgently hiring
Hours: Full-time, Part-time
Location: Santa Rosa, California

Job Description

Job Title: Research Engineer – AI Post‑Training, Safety & Alignment

Location: San Francisco Bay Area


About Us

We’re an independent research team working at the cutting edge of AI safety and post‑training research. Our mission is to make frontier models more reliable, interpretable, and aligned with human values, and we’re looking for experienced research engineers to help us push that frontier.


The Role

You’ll design and run experiments across the post‑training stack, from RLHF and preference optimization to scalable oversight and interpretability. We’re particularly interested in engineers who can bridge the gap between theoretical alignment research and robust, production‑level experimentation.


Responsibilities

  • Implement and evaluate novel post‑training techniques (RLHF, DPO, constitutional methods, etc.)
  • Collaborate with alignment researchers on safety‑critical model evaluations
  • Develop tooling, datasets, and metrics for measuring model behavior and robustness
  • Contribute to open‑source or internal research efforts with publishable outcomes


You Might Be a Fit If You:

  • Have experience at top AI labs (e.g., Anthropic, DeepMind, OpenAI, or similar research orgs)
  • Are fluent in modern deep learning frameworks (PyTorch, JAX, or equivalent)
  • Have hands‑on experience with large‑scale model training or fine‑tuning pipelines
  • Care deeply about building systems that do what we intend


Why Join Us

  • Work alongside top minds in alignment and post‑training research
  • Shape the direction of safe, aligned AI development
  • Flexible setup (remote, hybrid, or co‑location)
  • Competitive compensation and meaningful ownership


Apply for a confidential chat!


Posting ID: 1222135151 · Posted: 2026-02-24