CaML

Compassion in Machine Learning

Robustly improving the values of future AIs

Project mission

CaML researches how pretraining-style data can shift the behavior and personas of AI and how this can be used to improve the alignment of future transformative AI.

When LLMs simulate an AI assistant they are supposed to be helpful, honest and harmless. Yet when online data suggests that the AI assistant character behaves in a misaligned way, LLMs will mimic that behavior, in some cases even when they have been fine-tuned not to.

CaML researches how pretraining data about AI assistants affects LLM behaviors and how improved synthetic data generation can shape future AI personas to be broadly compassionate and morally thoughtful.

As AI capabilities and autonomy grow rapidly relative to humans, these AIs will increasingly reshape the world. AGI and superintelligence that has not internalized desirable goals would be a disaster.

We have already found evidence that Synthetic Document Finetuning can shift an LLM to be more robustly compassionate and open-minded towards non-humans (animals and digital minds), and that remains after typical fine-tuning.

Our Work

Belief robustness

Fine-tuning can train models to adopt specific personas, but these often contain flaws—perpetrating harm when prompted certain ways or mimicking problematic AI behaviors.

We use Synthetic Document Finetuning to shape AI behaviors more carefully, researching better data generation methods to reduce these mistakes.

Evaluations

We test whether models genuinely internalize compassion through custom benchmarks, including animal welfare assessments and human compassion and deception benchmarks.

We also evaluate moral open-mindedness—treating ethics as complex while avoiding both awful outcomes and decision paralysis.

See our Principles section for training details.

Thank you to funders

Thank you to the Survival and Flourishing Fund (SFF), Longview Philanthropy, Marcus Abramovitch, Ryan Kidd, Macroscopic Ventures, Simon Newstead and two anonymous donor for your support! This has helped pay our salaries, pay for compute and enabled us to keep researching how to improve AI values.

We are grateful to SFF for providing additional matching funding for future donors.

Thank you to collaborators

We are grateful to the Sentient Futures community for their support, especially in creating the AHB 2.0 benchmark.

We are also grateful to our volunteers for their support in accelerating our project.

Thank you to partners

CaML extends sincere thanks to Strong Compute for their invaluable support in the form of donated compute time, enabling us to advance our alignment research.

Fund Us

We're in need of funding for the next 6 months to support our team and pay for compute. Our donation form is here.

Join us

We're always looking for help from people with experience in AI benchmarking or synthetic data. Please fill in our volunteer application form here.

Results and News

See the latest CaML achievements in

Results and News

Connect with us

compassioninmachinelearning@gmail.com

Support the Sentient Futures community

Our models and data

Page updated

Google Sites

Report abuse