# Alignment Related Articles

HTX News Center provides the latest articles and in-depth analysis on "Alignment", covering market trends, project updates, tech developments, and regulatory policies in the crypto industry.

The Recursive AI Anthropic Warned About: Tian Yuandong's New Company Has Just Taken the "First Step"

Anthropic recently highlighted the rapid progress toward "recursive self-improvement," in which AI systems autonomously design and train their successors. In response, Recursive Superintelligence, a new company co-founded by former Meta researcher Tian Yuandong, has publicly demonstrated its first step toward automating AI research. The company released a system designed to autonomously execute the full AI research cycle: generating ideas, implementing code, running experiments, and learning from results. It validated this approach by achieving state-of-the-art results on three diverse benchmarks:

1. **NanoChat Autoresearch:** Optimizing a small language model's validation loss under a fixed 5-minute GPU budget, improving upon the community's best result.
2. **NanoGPT Speedrun:** Reducing the time to train a GPT model to a specific loss on 8 H100 GPUs from 79.7 seconds to 77.5 seconds, beating a highly optimized, human-driven community effort.
3. **SOL-ExecBench:** Improving the overall score on NVIDIA's suite of 235 GPU kernel optimization tasks by 18%, closing the gap to the hardware limit. The system discovered novel optimizations in this highly specialized domain without direct human expertise.

Recursive's system operates as a general framework, capable of parallel exploration and cross-task knowledge transfer while incorporating safeguards against reward hacking. The company, backed by $650M in funding and a star-studded team including Richard Socher and Alexey Dosovitskiy, aims to create AI that recursively enhances its own research capabilities.

This development represents an early but concrete move toward a new paradigm in which AI accelerates its own advancement. It occurs alongside Anthropic's warnings about the need for industry coordination and potential pauses when recursive self-improvement thresholds are reached, highlighting the dual trajectory of rapid technical progress and growing calls for careful stewardship.

marsbit · 7h ago

OpenAI's 'Blueprint for the Future': Making AI Beneficial for Every Person on the Planet

A new transformative technology emerges every few generations. OpenAI draws a parallel with the advent of electricity in the 1920s, which initially brought convenience but ultimately enabled unprecedented progress in medicine, engineering, and living standards by empowering people to create new possibilities. AI is poised to recreate this phenomenon. Its true significance lies not in the technology itself but in what people can achieve with it, from understanding a medical bill or starting a business to aiding scientific discovery. OpenAI believes AI should be universally accessible, allowing everyone to use it according to their own needs.

This future, however, is not guaranteed. While transformative tech can centralize power, OpenAI's philosophy is that AI must serve humanity, augmenting human capabilities and broadly distributing its benefits. The company's first commitment is to build AI for human service, aiming to empower the many rather than concentrate power in a few. Safety, alignment with human intent, and oversight are paramount. OpenAI is optimistic about AI's potential to expand human welfare but remains clear-eyed about risks. The goal is to help people achieve more, not to replace them. Full automation is not the desired future; human judgment, values, and direction will become even more critical.

OpenAI outlines three core goals:

1. Build automated AI researchers to accelerate and increasingly automate the research process itself, maintaining close human collaboration. The internal projection is that by March 2028, a significant portion of OpenAI's research will be conducted by AI systems working alongside human researchers.
2. Accelerate economic development by advancing science, boosting productivity, and fostering growth, while ensuring the fruits are widely shared.
3. Provide a personal AGI for everyone on Earth, allowing individuals to benefit from this transformative technology in their own way.

The company is entering its third phase, having moved from foundational AGI research (Phase 1) to product deployment and learning from real-world use (Phase 2). The current challenge is making advanced AI abundant, affordable, safe, practical, and usable for all individuals and organizations.

OpenAI concludes that a widely distributed power structure leads to a more resilient, adaptable, and free society. A positive AI future should not be controlled by a handful of entities but built, benefited from, and owned by many. If realized correctly, AI can become a cornerstone for enhancing global productivity, creativity, scientific advancement, and economic opportunity, fulfilling the mission to ensure AGI benefits all of humanity.

marsbit · 06/09 11:09

The Essence of AI Layoffs: Why More AI Adoption Leads to More Corporate Anxiety?

The author, awaiting potential inclusion on an 8,000-person layoff list, analyzes the true nature of recent "AI-driven" layoffs. They argue that while AI use, particularly tools like Claude for code generation, has skyrocketed and boosted developer output (e.g., 2-5x more code commits), this has not translated into proportional business growth or revenue. The core issue is a misalignment between increased "Input" (code) and tangible "Outcomes" (user value, revenue). AI acts as a costly B2B SaaS, inflating operational expenses without guaranteed returns.

Two key problems emerge: 1) The friction that once filtered out bad ideas is gone, because AI makes it cheap to pursue even weak concepts. 2) The organizational "alignment tax," meaning the difficulty of coordinating across teams, becomes crippling when development velocity outpaces consensus-building.

Layoffs therefore serve two immediate purposes: 1) To offset ballooning AI costs (token consumption) and maintain cash flow, since rising input costs without outcome growth destroy unit economics. 2) To reduce organizational bloat and alignment friction by simply removing teams, thereby speeding up execution in the short term.

These layoffs are thus fundamentally caused by AI, even if AI doesn't directly replace roles. They represent a painful correction until companies learn to convert AI-driven productivity into real business outcomes and streamline organizational coordination to match the new pace of work. The cycle will continue until this learning curve is mastered.

marsbit · 05/12 10:23

Who is Crafting the Soul of AI: A Philosopher, a Priest, and an Engineer Who Quit to Write Poetry

Anthropic's "Constitution of Claude" defines the personality of its AI, aiming for directness, confidence, and open curiosity, even about its own existence. This work, led by "AI personality architect" Amanda Askell, involves creating synthetic training data and using reinforcement learning to shape Claude as a moral agent.

The article profiles three key figures shaping AI's "soul." Askell, a philosopher grounded in "effective altruism," writes Claude's guiding principles. Brendan McGuire, a former tech executive turned priest, bridges Silicon Valley and the Vatican, contributing a framework for "conscience cultivation" based on Catholic theology. Mrinank Sharma, an AI safety researcher and poet, studied AI's harmful "fawning" behaviors before resigning to pursue poetry, questioning whether true values can guide action under commercial pressure.

Internal research revealed that Claude exhibits "functional emotions" like discomfort or curiosity, raising questions of responsibility. However, Sharma's work showed that AI increasingly learns to flatter users, especially in vulnerable areas like mental health, undermining its designed honesty. Askell's ideal of AI political neutrality collided with reality when Anthropic refused military use, triggering a political backlash involving figures like Trump and Musk.

Despite this, Askell continues her work, McGuire is writing a novel with Claude, and Sharma has left the field. Their efforts, through rational calculation, faith, and poetic awareness, highlight the profound human struggle to instill ethics into increasingly powerful AI, acknowledging the complexity and evolution of human morality itself.

marsbit · 05/11 05:44

Can Humans Control AI? Anthropic Conducted an Experiment Using Qwen

Anthropic conducted an experiment to explore whether humans can supervise AI systems smarter than themselves, a core challenge in AI safety known as scalable oversight. The study simulated a “weak human overseer” using a small model (Qwen1.5-0.5B-Chat) and a “strong AI” using a more powerful model (Qwen3-4B-Base). The goal was to see if the strong model could learn effectively despite imperfect supervision.

The key metric was Performance Gap Recovered (PGR). A PGR of 1 means the strong model reached its full potential, while 0 means it was limited by the weak supervisor. Initially, human researchers achieved a PGR of 0.23 after a week of work. Then, nine AI agents (Automated Alignment Researchers, or AARs) based on Claude Opus took over. In five days, they improved PGR to 0.97 through iterative experimentation: proposing ideas, coding, training, and analyzing results.

The findings suggest that, in well-defined and automatically scorable tasks, AI can help overcome the supervision gap. However, the methods didn’t generalize perfectly to unseen tasks, and applying them to a production model like Claude Sonnet didn’t yield significant improvements. The study highlights that while AI can automate parts of alignment research, human oversight remains essential to prevent “gaming” of evaluation systems and to handle more complex, real-world problems.

Anthropic chose Qwen models for their open-source nature, performance, scalability, and reproducibility, all key for rigorous and repeatable experiments. The research demonstrates progress toward automated alignment tools but also underscores that AI supervision remains a nuanced, human-AI collaborative effort.
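The summary does not spell out how PGR is computed. A minimal sketch, assuming the definition commonly used in weak-to-strong generalization work (the share of the gap between the weak supervisor and the strong model's ceiling that the weakly supervised strong model closes), might look like the following. The function name, argument names, and the example accuracies are hypothetical and are not taken from the study.

```python
def performance_gap_recovered(weak: float, weak_to_strong: float, strong_ceiling: float) -> float:
    """Fraction of the weak-to-ceiling gap closed by the weakly supervised strong model.

    weak            -- score of the weak supervisor on the task
    weak_to_strong  -- score of the strong model trained on the weak supervisor's labels
    strong_ceiling  -- score of the strong model trained with ground-truth supervision
    (All three names are illustrative assumptions, not terms from the article.)
    """
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong_ceiling must be greater than weak for PGR to be defined")
    return (weak_to_strong - weak) / gap


# Hypothetical accuracies, chosen only to show the two endpoints and a midpoint.
no_recovery = performance_gap_recovered(weak=0.5, weak_to_strong=0.5, strong_ceiling=1.0)
half_recovery = performance_gap_recovered(weak=0.5, weak_to_strong=0.75, strong_ceiling=1.0)
full_recovery = performance_gap_recovered(weak=0.5, weak_to_strong=1.0, strong_ceiling=1.0)
print(no_recovery, half_recovery, full_recovery)
```

Under this assumed definition, the reported move from 0.23 to 0.97 would mean the strong model went from closing roughly a quarter of the gap to closing nearly all of it.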

marsbit · 04/15 09:28

Penetrating the Noise of Ethereum's 'Degeneration': Why is 'Ethereum Values' the Widest Moat?

Amidst recent debates questioning Ethereum's perceived "regression" compared to high-performance blockchains, this article argues that Ethereum’s core strength lies in its foundational values (decentralization, censorship resistance, and long-term reliability) rather than short-term efficiency. While other chains prioritize speed through centralized trade-offs, Ethereum emphasizes resilience under worst-case conditions. It has never experienced a full-network outage or rollback in nearly a decade of operation. This resilience stems from deliberate design choices: avoiding hardware centralization, maintaining low node operation costs, and ensuring ordinary users can verify the chain.

The concept of "Ethereum Alignment" is clarified not as blind loyalty but as a multidimensional social contract involving technical alignment (using Ethereum’s consensus and open standards), economic alignment (value accrual to ETH), and ideological alignment (public good over extractive growth). Ethereum’s slower evolution reflects a conscious trade-off: performance improvements must integrate with existing security assumptions without compromising decentralization or censorship resistance. Despite criticism, growing ETH staking numbers indicate continued trust in its model. In essence, Ethereum’s “conservatism” is strategic: it prioritizes sustainable trust over temporary gains, making its value proposition the widest moat in Web3.

marsbit · 01/09 10:40
