# Related Articles on Alignment

The HTX news center provides the latest articles and in-depth analysis on the topic of "Alignment", covering market trends, project updates, technological developments, and regulatory policy in the crypto industry.

Can Humans Control AI? Anthropic Conducted an Experiment Using Qwen

Can Humans Control Superintelligent AI? Anthropic’s Experiment with Qwen Models

Anthropic conducted an experiment to explore whether humans can supervise AI systems smarter than themselves, a core challenge in AI safety known as scalable oversight. The study simulated a “weak human overseer” using a small model (Qwen1.5-0.5B-Chat) and a “strong AI” using a more powerful model (Qwen3-4B-Base). The goal was to see whether the strong model could learn effectively despite imperfect supervision.

The key metric was Performance Gap Recovered (PGR). A PGR of 1 means the strong model reached its full potential, while 0 means it was limited by the weak supervisor. Initially, human researchers achieved a PGR of 0.23 after a week of work. Then nine AI agents (Automated Alignment Researchers, or AARs) based on Claude Opus took over. In five days they improved PGR to 0.97 through iterative experimentation: proposing ideas, coding, training, and analyzing results.

The findings suggest that, in well-defined and automatically scorable tasks, AI can help overcome the supervision gap. However, the methods did not generalize perfectly to unseen tasks, and applying them to a production model such as Claude Sonnet did not yield significant improvements. The study highlights that while AI can automate parts of alignment research, human oversight remains essential to prevent “gaming” of evaluation systems and to handle more complex, real-world problems.

Anthropic chose Qwen models for their open-source nature, performance, scalability, and reproducibility, all key properties for rigorous and repeatable experiments. The research demonstrates progress toward automated alignment tools, but it also underscores that AI supervision remains a nuanced, human-AI collaborative effort.
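
For context, below is a minimal sketch of how PGR is typically computed in the weak-to-strong generalization literature; it matches the description above (1 means the full gap is recovered, 0 means the strong model is capped at its weak supervisor), but the formula, function name, and accuracy numbers are illustrative assumptions rather than details taken from Anthropic’s experiment.

```python
def performance_gap_recovered(weak_acc: float,
                              weak_to_strong_acc: float,
                              strong_ceiling_acc: float) -> float:
    """Fraction of the weak-to-strong performance gap that is recovered.

    PGR = (weak_to_strong - weak) / (strong_ceiling - weak)
    1.0 -> the strong model trained on weak labels matches its ground-truth ceiling
    0.0 -> the strong model performs no better than its weak supervisor
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        raise ValueError("strong ceiling must exceed weak supervisor accuracy")
    return (weak_to_strong_acc - weak_acc) / gap


# Illustrative numbers only (not from the experiment):
print(performance_gap_recovered(weak_acc=0.60,
                                weak_to_strong_acc=0.69,
                                strong_ceiling_acc=0.80))  # 0.45
```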

marsbit · 2 days ago, 09:28

Penetrating the Noise of Ethereum's 'Degeneration': Why is 'Ethereum Values' the Widest Moat?

Amidst recent debates questioning Ethereum's perceived "regression" compared to high-performance blockchains, this article argues that Ethereum’s core strength lies in its foundational values of decentralization, censorship resistance, and long-term reliability, rather than in short-term efficiency. While other chains prioritize speed through centralized trade-offs, Ethereum emphasizes resilience under worst-case conditions: it has never experienced a full-network outage or rollback in nearly a decade of operation. This resilience stems from deliberate design choices, such as avoiding hardware centralization, keeping node operation costs low, and ensuring that ordinary users can verify the chain.

The concept of "Ethereum Alignment" is clarified not as blind loyalty but as a multidimensional social contract involving technical alignment (building on Ethereum’s consensus and open standards), economic alignment (value accrual to ETH), and ideological alignment (public good over extractive growth).

Ethereum’s slower evolution reflects a conscious trade-off: performance improvements must integrate with existing security assumptions without compromising decentralization or censorship resistance. Despite the criticism, growing ETH staking numbers indicate continued trust in its model. In essence, Ethereum’s "conservatism" is strategic, prioritizing sustainable trust over temporary gains and making its value proposition the widest moat in Web3.

marsbit · 01/09 10:40
