About CaML
Compassion Aligned Machine Learning (CaML) improves how today’s frontier models treat animals. We work empirically: we identify concrete failures in how models reason about animal welfare, build interventions that measurably correct them, and validate the results on benchmarks. We care about work that is concrete, measurable, and robust to further training.
The research agenda
The research is hands-on and near-term: you find real problems in how the models that exist today treat animals, and you build interventions that measurably fix them. It is not abstract philosophy, heavy theory, or a bet on architectures a decade away. The kinds of problems you’d work on:
- Models absorb anti-animal bias from their training data and act on it when asked. The work is finding training and data interventions that measurably shift that behavior, and showing the shift survives later fine-tuning.
- We can’t improve what we can’t measure. Animal welfare is thinly covered by today’s evaluations, so part of the work is building benchmarks rigorous enough for frontier labs to actually use.
- Values added to a model can be trained back out. The work is understanding which interventions survive realistic post-training and which don’t.
Every project ends in a concrete artifact: a benchmark, a dataset, a trained model, or a paper that others can build on. You’d help set the direction, not just execute it.
What you’d do
- Read and digest AI safety papers to keep our agendas grounded in the field
- Design and run interventions like the ones above, then write and publish the results
- Lead research agendas from question to result
- Build relationships with academics across AI safety: finding collaborators, gathering feedback, and bringing outside expertise into our work
You should have
- A solid understanding of the AI safety field and its open problems
- Strong Python and hands-on ML engineering skills: you’ve trained or fine-tuned a model yourself
- A bias toward concrete, empirical work over abstract theorizing
- Familiarity with hyperstition, self-fulfilling alignment, and personas
- Concrete ideas for how to contribute papers to the emerging AI × animals field
Bonus
- Your name already on an arXiv ML paper
The details
- $80,000 / year, full-time
- Remote-friendly, with San Francisco preferred
- You’ll need working hours that substantially overlap with London or California time zones to collaborate with our researchers
- We’re unable to sponsor visas for this role