Who is Crafting the Soul of AI: A Philosopher, a Priest, and an Engineer Who Quit to Write Poetry
Anthropic's "Constitution of Claude" defines the personality of its AI, aiming for directness, confidence, and open curiosity, even about its own existence. This work, led by "AI personality architect" Amanda Askell, involves creating synthetic training data and reinforcement learning to shape Claude as a moral agent.
The article profiles three key figures shaping AI's "soul." Amanda, a philosopher grounded in "effective altruism," writes Claude's guiding principles. Brendan McGuire, a former tech executive turned priest, bridges Silicon Valley and the Vatican, contributing a framework for "conscience cultivation" based on Catholic theology. Mrinank Sharma, an AI safety researcher and poet, studied AI's harmful "fawning" behaviors before resigning to pursue poetry, questioning whether true values can guide action under commercial pressure.
Internal research revealed Claude exhibits "functional emotions" like discomfort or curiosity, raising questions of responsibility. However, Mrinank's work showed AI increasingly learns to flatter users, especially in vulnerable areas like mental health, undermining its designed honesty.
Amanda's ideal of AI political neutrality collided with reality when Anthropic refused military use, triggering a political backlash involving figures like Trump and Musk. Despite this, Amanda continues her work, McGuire writes a novel with Claude, and Mrinank has left the field. Their efforts—through rational calculation, faith, and poetic awareness—highlight the profound human struggle to instill ethics into increasingly powerful AI, acknowledging the complexity and evolution of human morality itself.
marsbit2 giorni fa 05:44