# Related Articles on Alignment

The HTX news center provides the latest articles and in-depth analysis on the topic of "Alignment", covering market trends, project updates, technological developments, and regulatory policy in the crypto industry.

Can Humans Control AI? Anthropic Conducted an Experiment Using Qwen

Can Humans Control Superintelligent AI? Anthropic’s Experiment with Qwen Models

Anthropic conducted an experiment to explore whether humans can supervise AI systems smarter than themselves, a core challenge in AI safety known as scalable oversight. The study simulated a “weak human overseer” using a small model (Qwen1.5-0.5B-Chat) and a “strong AI” using a more powerful model (Qwen3-4B-Base). The goal was to see whether the strong model could learn effectively despite imperfect supervision.

The key metric was Performance Gap Recovered (PGR). A PGR of 1 means the strong model reached its full potential, while 0 means it was limited by the weak supervisor. Initially, human researchers achieved a PGR of 0.23 after a week of work. Then nine AI agents (Automated Alignment Researchers, or AARs) based on Claude Opus took over. In five days they improved PGR to 0.97 through iterative experimentation: proposing ideas, coding, training, and analyzing results.

The findings suggest that, in well-defined and automatically scorable tasks, AI can help overcome the supervision gap. However, the methods did not generalize perfectly to unseen tasks, and applying them to a production model such as Claude Sonnet did not yield significant improvements. The study highlights that while AI can automate parts of alignment research, human oversight remains essential to prevent “gaming” of evaluation systems and to handle more complex, real-world problems.

Anthropic chose Qwen models for their open-source nature, performance, scalability, and reproducibility, all key properties for rigorous and repeatable experiments. The research demonstrates progress toward automated alignment tools, but it also underscores that AI supervision remains a nuanced, human-AI collaborative effort.
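
For context, below is a minimal sketch of how PGR is typically computed in the weak-to-strong generalization literature; it matches the description above (1 means the full gap is recovered, 0 means the strong model is capped at its weak supervisor), but the formula, function name, and accuracy numbers are illustrative assumptions rather than details taken from Anthropic’s experiment.

```python
def performance_gap_recovered(weak_acc: float,
                              weak_to_strong_acc: float,
                              strong_ceiling_acc: float) -> float:
    """Fraction of the weak-to-strong performance gap that is recovered.

    PGR = (weak_to_strong - weak) / (strong_ceiling - weak)
    1.0 -> the strong model trained on weak labels matches its ground-truth ceiling
    0.0 -> the strong model performs no better than its weak supervisor
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        raise ValueError("strong ceiling must exceed weak supervisor accuracy")
    return (weak_to_strong_acc - weak_acc) / gap


# Illustrative numbers only (not from the experiment):
print(performance_gap_recovered(weak_acc=0.60,
                                weak_to_strong_acc=0.69,
                                strong_ceiling_acc=0.80))  # 0.45
```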

marsbit · 2 days ago, 09:28

Penetrating the Noise of Ethereum's 'Degeneration': Why is 'Ethereum Values' the Widest Moat?

Amidst recent debates questioning Ethereum's perceived "regression" compared to high-performance blockchains, this article argues that Ethereum’s core strength lies in its foundational values of decentralization, censorship resistance, and long-term reliability, rather than in short-term efficiency. While other chains prioritize speed through centralized trade-offs, Ethereum emphasizes resilience under worst-case conditions: it has never experienced a full-network outage or rollback in nearly a decade of operation. This resilience stems from deliberate design choices, such as avoiding hardware centralization, keeping node operation costs low, and ensuring that ordinary users can verify the chain.

The concept of "Ethereum Alignment" is clarified not as blind loyalty but as a multidimensional social contract involving technical alignment (building on Ethereum’s consensus and open standards), economic alignment (value accrual to ETH), and ideological alignment (public good over extractive growth).

Ethereum’s slower evolution reflects a conscious trade-off: performance improvements must integrate with existing security assumptions without compromising decentralization or censorship resistance. Despite the criticism, growing ETH staking numbers indicate continued trust in its model. In essence, Ethereum’s "conservatism" is strategic, prioritizing sustainable trust over temporary gains and making its value proposition the widest moat in Web3.

marsbit · 01/09 10:40
