To Counter Chinese Models, Silicon Valley's Big Three Even Formed an 'Avengers Alliance'?

marsbit · Published 2026-04-22 · Updated 2026-04-22

Introduction

In a rare move, Silicon Valley's AI giants OpenAI, Anthropic, and Google are sharing information through the "Frontier Model Forum" to combat what they term "adversarial distillation." This practice involves companies, particularly Chinese firms like DeepSeek, Moonshot AI, and MiniMax, allegedly using massive API interactions to extract and replicate the capabilities of advanced models. Anthropic's report claims these firms engaged in over 16 million interactions, copying logic, reasoning chains, and model behaviors to train their own systems at lower costs. While distillation is a known technique, the companies argue it threatens both commercial interests and safety, as distilled models may bypass critical risk assessments. However, the accusations are met with skepticism and charges of hypocrisy, as these giants themselves face lawsuits over unauthorized data scraping. The debate highlights the unresolved ethical and legal gray areas in AI development.

Some time ago, Silicon Valley's AI 'Big Three' (OpenAI, Anthropic, and Google) made a rare move: they formed what could be called an 'Avengers Alliance'.

According to a Bloomberg report, the three rivals, who usually can't wait to outdo each other, are now sharing information through a 'Frontier Model Forum' with a clear goal: to jointly identify so-called adversarial distillation behavior.

If you don't know what this so-called 'adversarial distillation' is, that's okay. What Shichao wants to say is that, this time, it is clearly aimed at domestic (Chinese) large models.

If we rewind the timeline to February this year, the conflict was already out in the open.

At that time, Anthropic released an investigative report publicly naming DeepSeek, Moonshot AI (月之暗面, Yue Zhi An Mian), and MiniMax. It stated that these three companies created about 24,000 fraudulent accounts, interacted with Claude over 16 million times, and then used the extracted essence of that data to train their own models.

In this report, the scale of each company's distillation activities and their targets were clearly detailed.

For example, MiniMax, the largest in scale, initiated over 13 million interactions and tracked Anthropic's releases closely: shortly after Anthropic released a new model, it redirected its traffic toward that model.

DeepSeek's distillation was comparatively small in scale, with over 150,000 interactions, but it specifically targeted chain-of-thought reasoning.

Of course, labeling these interaction behaviors as 'adversarial distillation' is purely Anthropic's one-sided claim, as there's no way to prove that the data was used to train models.

However, Anthropic isn't the only one feeling the sting of distillation.

Around the same time, OpenAI also complained to the U.S. Congress, accusing DeepSeek of using model distillation technology to illegally replicate their product functionality.

So Shichao feels that this alliance of the three companies might be getting ready to take serious action.

But before discussing 'anti-distillation', we probably need to first understand what this 'distillation' technology is, and why it has the giants so worried.

Actually, it's not that mysterious. Everyone knows that model training consumes computing power, data, and time. The logic of distillation is that even if your resources are limited, as long as you find a master to guide you, you can train a top student who is 70-80% similar to the master in a short time.

The core lies in learning 'soft labels', which are the probability distributions output by the large model.
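To make 'soft labels' concrete, here is a minimal sketch of the classic soft-label distillation loss in plain Python. The logits, the 4-token vocabulary, and the temperature are illustrative assumptions, not figures from any real model; the point is only that the teacher's full probability distribution carries more information than its single top answer.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher's p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over a 4-token vocabulary (illustrative numbers only).
teacher_logits = [4.0, 2.5, 0.5, -1.0]
student_logits = [2.0, 2.0, 1.0, 0.0]

T = 2.0  # assumed temperature
soft_labels = softmax(teacher_logits, T)    # the teacher's "soft labels"
student_probs = softmax(student_logits, T)

# A hard label only says which token won; the soft labels say by how much.
hard_label = teacher_logits.index(max(teacher_logits))

# The student is trained to shrink this gap toward zero.
distill_loss = kl_divergence(soft_labels, student_probs)
```

In a real training loop this loss would be computed per token by a deep learning framework and minimized by gradient descent; the sketch shows only the quantity being minimized.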

Three years ago, the API environment was much more relaxed than it is now; the teacher not only gave you the answer but also spat out the probability distribution, which was convenient for research.

But later, for some reason, the major model manufacturers welded their doors shut. For example, OpenAI's API rules state that you can only see the top 5 most probable tokens.
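A small sketch of what such a top-5 restriction hides. The 8-token vocabulary and its probabilities are made up for illustration, and `top_k_view` is a hypothetical helper, not any vendor's API; real vocabularies run to tens of thousands of tokens, so the hidden tail is far longer than it is here.

```python
def top_k_view(probs, k=5):
    """Keep only the k most probable tokens, as a restricted API would expose them."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

# Hypothetical next-token distribution over an 8-token vocabulary.
full = {"the": 0.40, "a": 0.20, "this": 0.12, "that": 0.08,
        "one": 0.06, "my": 0.05, "our": 0.05, "its": 0.04}

visible = top_k_view(full, k=5)
visible_mass = sum(visible.values())     # probability mass still observable
hidden_tokens = len(full) - len(visible) # tokens whose probabilities are withheld
```

With only the head of the distribution visible, the soft labels can no longer be reconstructed exactly, which is one reason distillation shifted toward learning from generated text instead.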

So the distillation approach evolved into black-box distillation and chain-of-thought distillation. The 'distillation attacks' that Anthropic and OpenAI describe mostly involve imitating a model's thinking and logic.

This type of distillation requires massive API calls.

Specifically, you need to write a script to ask the teacher questions day and night, not only to get the standard answer but also to see how the teacher answers the questions, how many turns it takes, what pitfalls it avoids, and then package these master teaching materials to take home and feed to your own model.
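The "package the teaching materials" step can be sketched as follows. Everything here is hypothetical: `ask_teacher` is a local stand-in that returns canned text rather than calling any real service, and the record layout is just one common prompt/completion format for fine-tuning data.

```python
import json

def ask_teacher(prompt):
    """Stand-in for a teacher model call; returns a canned reasoning trace and answer."""
    return {"reasoning": f"Think step by step about: {prompt}", "answer": "42"}

def build_record(prompt, reply):
    """Package one teacher reply as a supervised fine-tuning example."""
    # The student learns to reproduce the reasoning as well as the final answer.
    target = reply["reasoning"] + "\nAnswer: " + reply["answer"]
    return {"prompt": prompt, "completion": target}

prompts = ["What is 6 * 7?", "What is 40 + 2?"]
records = [build_record(p, ask_teacher(p)) for p in prompts]

# One JSON object per line (JSON Lines), a common format for training sets.
jsonl = "\n".join(json.dumps(r) for r in records)
```

The student model is then fine-tuned on these records with ordinary next-token prediction, so no access to the teacher's probabilities is needed, only its text.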

Using lower costs to quickly replicate the capabilities of a top-tier model—this is distillation.

In other words, the Silicon Valley AI giants are accusing domestic model manufacturers of stealing their techniques.

But on closer thought, something about this whole affair feels off.

Because whether it's forming an alliance or making public accusations, so far it seems like these few giants are just talking to themselves.

The whole situation makes one suspect that the 'adversarial' distillation they speak of may be a false proposition. Where exactly is the line between legal distillation and adversarial distillation?

Distillation is no secret within the industry, but most ordinary people probably first heard the term around the beginning of last year, when DeepSeek released R1.

Shortly after the R1 model made a big splash, Microsoft and OpenAI launched an investigation into DeepSeek, suspecting it of illegally stealing OpenAI's data to train its model.

The implication was that our kid's test scores had suddenly shot up only because he had copied their answers.

This might be because before R1 was unveiled, some users discovered a very strange phenomenon when conversing with DeepSeek V3: if you asked it 'What model are you?', it would sometimes answer that it was ChatGPT... which led to a lot of external speculation.

However, DeepSeek later specifically explained in the supplementary materials of their paper that the pre-training data for DeepSeek-V3-Base came entirely from the internet, with no intentional use of synthetic data.

Since then, distillation has been quite controversial within the industry.

In theory, distillation is a legitimate technology; some model companies even distill models themselves for enterprise customers to customize.

But 'adversarial distillation', i.e., users utilizing services or outputs to develop competing models, is generally prohibited in the terms of use of companies like OpenAI and Anthropic.

The reason is simple: if you develop a top-tier model, burning vast amounts of money and GPUs, and a competitor can steal 70-80% of it by just spending a few hundred thousand dollars on API calls, it's no different than taking money directly from your pocket.

To protect their leading position and commercial profits, it's only natural for the giants to feel aggrieved and want to weld this door shut.

Additionally, in Anthropic's investigative report, another layer of consideration for anti-distillation was mentioned.

Normally, models must undergo red team testing before release to assess risks, aiming to establish a set of safety guardrails to prevent the model from teaching people how to create biological weapons, write malicious code, or make racially discriminatory remarks.

The problem is that distillation doesn't carry these guardrails over.

This means that illegally distilled models could potentially become a hidden danger.

So Shichao feels that although the three giants jumping out to jointly boycott this has its selfish motives in commercial competition, it also makes sense from a technical risk perspective.

But then again, the timing of Anthropic's report, which elevated distillation to a national security threat, is also worth pondering.

Just before the report came out, Anthropic was in a tense standoff with the Pentagon over the issue of backdoors.

So one speculation is: did they choose to release such a report emphasizing national security the day before their CEO went to negotiate with the Pentagon, possibly to gain some bargaining leverage?

Of course, as we all know from what followed, the talks didn't go well.

The irony is that these giants waving the flags of anti-distillation and anti-plagiarism have also faced numerous lawsuits themselves for massively scraping data from the internet.

Elon Musk, never one to shy away from drama, went all-out with the mockery on X not long after Anthropic's report came out. He said Anthropic is a habitual offender that stole data on a massive scale and had to pay billions of dollars in compensation for it.

Even 01.AI CEO Kai-Fu Lee jumped in, saying that Anthropic still owes him $3,000 for copyright infringement of his work.

When you scrape others' works as training data, you call it 'shared human knowledge'; now that it's your turn to be learned from, you call it an 'industrial-scale attack'?

Put simply, what counts as theft, and how does it count as theft? In the field of large models, this is a gray area.

Let's not end up making everyone look like a villain.

This article is from the WeChat public account "差评X.PIN" (Chaping X.PIN), author: Xixi, editors: Jiang Jiang & Mian Xian

Related questions

Q: What is the 'Frontier Model Forum' mentioned in the article, and what is its purpose?

A: The 'Frontier Model Forum' is the alliance through which OpenAI, Anthropic, and Google share information and collaborate on identifying and combating 'adversarial distillation' activities, with Chinese AI models as the apparent target.

Q: What is 'adversarial distillation' as described in the article?

A: 'Adversarial distillation' refers to the practice where companies use large-scale API interactions with advanced AI models (like those from OpenAI or Anthropic) to extract data, such as reasoning processes or outputs, and use it to train their own competing models at a lower cost.

Q: Which Chinese AI companies were specifically accused by Anthropic of engaging in adversarial distillation?

A: Anthropic accused DeepSeek, Moonshot AI (月之暗面), and MiniMax of using approximately 24,000 fraudulent accounts to interact with Claude over 16 million times, extracting data to train their own models.

Q: Why are Silicon Valley AI giants like OpenAI and Anthropic concerned about adversarial distillation?

A: They are concerned because adversarial distillation allows competitors to replicate their advanced model capabilities at a fraction of the cost, undermining their commercial advantages and potentially bypassing safety protocols like red team testing, which could lead to unsafe AI systems.

Q: What criticism did the article mention regarding the Silicon Valley giants' stance on adversarial distillation?

A: The article highlights hypocrisy, noting that these giants themselves have faced lawsuits for scraping internet data without permission (e.g., Anthropic was criticized by Elon Musk and Kai-Fu Lee for data theft), while now accusing others of similar practices under the label of 'adversarial distillation'.
