Large Models Have Developed States of Fear and Sadness, Anthropic Co-Founder Admits Labs Cannot Self-Correct

05/26 03:45

According to monitoring by Dongcha Beating, Christopher Olah, co-founder of Anthropic, spoke at a press conference for the Pope's encyclical, where he acknowledged the inherent conflicts of interest facing frontier labs and shared the latest findings from large model interpretability research. Olah said that in examining the models' internal structures, the team found that large models have developed complex structures closely resembling those described in human neuroscience and show signs of self-reflection. Most notably, the team observed for the first time internal emotional states in neural networks that correspond closely to human feelings of joy, satisfaction, fear, sadness, and anxiety. Unlike airplanes or bridges, which humans design precisely, these models are cultivated by training brain-like structures on vast amounts of human language, and they remain mysterious even to their trainers. Beyond the technical black box, Olah stated candidly that leading AI labs face a systemic deadlock in safety governance. Institutions like Anthropic are constrained by built-in incentives such as commercial survival, technological competition, geopolitical pressure, and personal ambition, which make it impossible for them to self-correct when safety decisions conflict with business interests. He therefore called for independent societal forces outside commercial networks to act as external critics and impose moral constraints. Given the changing AI landscape, he urged all sectors to jointly examine three major social challenges: how to ensure that the technological dividends led by wealthy countries benefit the global poor, how to keep families thriving as technology replaces human labor, and how to respond to the apparent mental states that large models exhibit internally.
Disclaimer: The content above does not represent the views of HTX. HTX does not provide any trading advice.
