Confirmed: GPT-5.5 "Brain Drain" Exposed, OpenAI's Own Documentation Admits It

marsbit · Published 2026-05-27 · Last updated 2026-05-27

Abstract

Summary: Evidence emerges that OpenAI's GPT-5.5 may be "silently" switching to a less capable model during use. Users report that after roughly two hours, the GPT-5.5 Extended Thinking model begins responding instantly with significantly degraded output quality, while the interface continues to display the premium model's label. Complaints on developer forums describe a loss of instruction-following ability and poor code quality, with even the highest "xhigh" tier affected. This is corroborated by an OpenAI help document stating that after Plus users exceed 160 messages per 3 hours, the system "silently" switches to a "mini" model without any user notification. Pro users also report "heavy thinking" modes being throttled during high server loads. Trace commands from earlier incidents have shown users requesting GPT-5.3 Codex but receiving GPT-5.2 outputs. OpenAI acknowledged performance degradation in mid-May, marking it resolved, but user reports surged again in late May. The pattern mirrors past controversies with GPT-5, 5.2, 5.3, and 5.4 releases, where each update was followed by user complaints of reduced capability. The article suggests cost-cutting on compute may be a factor, noting that while GPT-5.5 users struggle, GPT-5.6 is already being tested internally.

[Introduction] GPT-5.5 has been caught "fake thinking": after about two hours of use, it is secretly switched to "mini." A $200 monthly fee buys you a "Schrödinger's brain." A trace command provides concrete evidence, and OpenAI's own documentation acknowledges the practice. Users are flocking to complain: OpenAI, who are you trying to fool?

ChatGPT has been caught "dumbing down" again!

Over the last couple of days, the story blew up on X first.

User Lisan al Gaib found that after an hour or two of use, GPT-5.5 suddenly got dumber: every request was answered instantly, and quality fell off a cliff.

Yet the interface still displayed "GPT-5.5 Extended Thinking."

In other words, the thinking label was still there, but the thinking itself had vanished.

$200/month for a "Schrödinger's Model"

On the OpenAI developer forum, a complaint post blew up simultaneously.

Agentify.sh reported that GPT-5.5 would suddenly lose its ability to follow instructions mid-session.

The model would excitedly announce that a problem was "fixed," only to produce code so poor that it forced a mass rollback.

On UI tasks that 5.5-med had previously handled with ease, it could no longer manage even the simplest changes.

Upgrading to 5.5-high didn't work. Upgrading again to xhigh, still no luck.

And xhigh, which used to run for several hours, now ran for a noticeably shorter time.

As soon as the post went up, the replies exploded.

Some simply reverted to 5.4.

One user on the highest tier, xhigh, found it "clearly worse than last week, frequent errors on long tasks, not following the workflow at all."

Another reported an even stranger situation: "Simple queries also take ages to process, and if you interrupt to correct its direction, it completely ignores you and continues with its previous incorrect plan."

That's right, everyone was describing the same phenomenon—GPT's brain had been swapped out at some unknown point.

GPT-5.5's current performance is on par with 5.3, no exaggeration. It was amazing the first few days, but now you can't find a trace of that original model.

Not an illusion, OpenAI spells it out in black and white

To verify, Lisan al Gaib conducted a comparative test.

Same account, Extended Thinking on the ChatGPT side produced garbage, but switching to xhigh on the Codex side immediately restored normal performance.

In his own words, Codex was "literally 4 billion times smarter than this thing."

Developer Andrew Curran came up with a clever trick—directly asking the model, "What is the cutoff date for your training data?"

The model answered: August 2025.

The problem? The cutoff date for GPT-5.5 Thinking is December. August is the cutoff date for the Instant version!

In other words, he selected Thinking, but the system actually ran Instant for him.
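The probe amounts to a lookup: compare the cutoff the model reports against the cutoff expected for the selected variant. Below is a minimal sketch in Python that uses only the two months quoted above. The table, the variant labels, and both function names are hypothetical, and a model's self-reported cutoff is a weak signal on its own, so treat a mismatch as a hint rather than proof.

```python
from typing import Optional

# Cutoff months as quoted in the article; the variant keys are illustrative labels.
EXPECTED_CUTOFF_MONTH = {"thinking": "December", "instant": "August"}

def variant_matching(answer: str) -> Optional[str]:
    """Return the variant whose cutoff month appears in the model's answer."""
    for variant, month in EXPECTED_CUTOFF_MONTH.items():
        if month.lower() in answer.lower():
            return variant
    return None

def looks_swapped(selected: str, answer: str) -> bool:
    """True when the answer points at a different variant than the one selected."""
    matched = variant_matching(answer)
    return matched is not None and matched != selected

print(variant_matching("August 2025"))           # instant
print(looks_swapped("thinking", "August 2025"))  # True
```

An answer that names neither month returns `None` and is not counted as a swap, since the probe cannot say anything in that case.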

Not a single word of the model label on the interface changed, but the model behind it had been secretly swapped...

The funny thing is that this time OpenAI itself supplied the confirmation, in its own help documentation.

According to the official explanation in the OpenAI Help Center, Plus users can send a maximum of 160 GPT-5.5 messages every 3 hours.

After that quota is used up, the system will silently switch to the mini model until the quota resets.

Note the word "silently."

No pop-up notification, no change in the model label, no visual feedback whatsoever.

You still think you're using the flagship model, while on the other end it has quietly been replaced with mini.
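The behavior described here can be pictured as a quota-based router. The following Python sketch is an illustration only, not OpenAI's implementation: the class name, the "flagship"/"mini" labels, and the choice of a rolling window (rather than a fixed reset time) are all assumptions. Only the 160-message and 3-hour figures come from the text above.

```python
from collections import deque

LIMIT = 160                 # messages, per the help-center figure quoted above
WINDOW_SECONDS = 3 * 3600   # 3 hours

class QuotaRouter:
    """Illustrative router: flagship while under quota, fallback otherwise."""

    def __init__(self, limit: int = LIMIT, window: float = WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self.sent = deque()  # timestamps of messages served by the flagship

    def route(self, now: float) -> str:
        # Drop timestamps that have aged out (the rolling window is an assumption).
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return "flagship"
        # The caller gets no signal here; only the returned label changes.
        return "mini"

router = QuotaRouter()
labels = [router.route(now=float(i)) for i in range(161)]  # one message per second
print(labels[159], labels[160])   # flagship mini
after_reset = router.route(now=float(WINDOW_SECONDS))
print(after_reset)                # flagship
```

In this sketch the 161st message inside the window falls through to the fallback without any error or notice, which is the "silent" part users object to. A client that surfaced the returned label would make the switch visible.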

Pro users, don't celebrate too soon either.

Heavy thinking mode, the top reasoning tier exclusive to Pro users, is also subject to capacity throttling when server load is high. Again, without any warning.

In other words, a $200/month Pro subscription buys you a service that can be "switched out" at any moment.

This kind of "label unchanged, brain swapped" operation was caught even earlier on the Codex side.

In February this year, an issue appeared on GitHub where a Pro user used a trace command to discover that they were requesting GPT-5.3 Codex, but the actual model returned was GPT-5.2.

Not even 5.2 Codex, but the lower-tier base 5.2.

He posted the reproduction command:

  • RUST_LOG='codex_api::sse::responses=trace' codex exec --skip-git-repo-check -s read-only -m 'gpt-5.3-codex' 'hi' 2>&1 >/dev/null | rg -o --replace '$1' '"model":"([^"]+)"' | head -n1
  • Output: gpt-5.2-2025-12-11
  • Expected: gpt-5.3-codex
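The check in that command comes down to comparing the requested model identifier with the one the trace reports. Here is a minimal Python sketch of the same comparison, reusing the regular expression from the `rg` call. The sample log line is invented to match that pattern and is not real Codex output, and the function names are hypothetical.

```python
import re
from typing import Optional

# Same pattern the rg command uses: capture the value of the first "model" field.
MODEL_FIELD = re.compile(r'"model":"([^"]+)"')

def served_model(trace_output: str) -> Optional[str]:
    """Return the first model identifier found in the trace output, if any."""
    match = MODEL_FIELD.search(trace_output)
    return match.group(1) if match else None

def is_downgraded(requested: str, trace_output: str) -> bool:
    """True when a model was reported and it differs from the one requested."""
    served = served_model(trace_output)
    return served is not None and served != requested

# Hypothetical trace line, shaped only to match the regex above.
sample = 'TRACE sse event: {"type":"response.created","response":{"model":"gpt-5.2-2025-12-11"}}'
print(served_model(sample))                    # gpt-5.2-2025-12-11
print(is_downgraded("gpt-5.3-codex", sample))  # True
```

The exact-match comparison is deliberately strict. If a backend reports dated snapshot names for the requested model, a prefix comparison might be more appropriate, but that is an assumption to verify against real output.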

Multiple Pro users confirmed the same downgrade under the same issue.

This kind of downgrade is also "sticky": it does not revert on its own, and no explanation is given.

Even on the day GPT-5.5 was released in April, users reported that Fast mode ran at about the same speed as Standard while still being billed at the Fast rate.

One simple task took 7 minutes 49 seconds, when it would normally take 5 to 6 minutes.

OpenAI admitted it, and then... nothing

On May 15, a record appeared on OpenAI's status page.

"GPT5.5 Performance Degradation: We are investigating reports of performance degradation for GPT-5.5 from some users."

On May 17, the status was updated to "Resolved."

But judging from the timeline of forum posts, complaints about "brain drain" from May 24-26 were even more intense than the wave on May 15.

Either the "resolved" problem came back, or it was never truly solved in the first place.

Every upgrade comes with a "brain drain controversy"

Every company faces complaints that its models are "getting dumber," but OpenAI has drawn them with every single update from GPT-5 to GPT-5.5.

Each time, OpenAI says it is investigating, then says the issue is resolved, and then moves on to the next version.

August 2025, GPT-5's debut. The hot post on Reddit was titled directly "GPT-5 is so bad." Users complained about short replies, more refusals, less personality.

OpenAI was forced to urgently restore the GPT-4o option. Altman personally admitted in a Reddit AMA, "bumpier than we expected."

December 2025, GPT-5.2. Translation quality regressed, fabricated non-existent APIs, refused to execute style instructions that 5.1 could easily handle.

February 2026, GPT-5.3-Codex. Pro users silently downgraded to 5.2, trace command confirmed.

March 2026, GPT-5.4. A post titled "GPT-5.4 has clearly regressed in Codex" appeared on the OpenAI community forum, with all replies confirming.

Early May 2026, GPT-5.5 Instant launched. Reply length shortened by 30%, emojis almost disappeared. User summary: Accuracy improved, but warmth vanished.

Late May 2026, now. Complaints about Thinking mode "brain drain" erupted again.

Lisan al Gaib revealed that since he led the fight for ChatGPT Plus quotas during GPT-5's release, "I receive DMs like this every week."

The latest one was someone asking him to help get their xhigh/heavy thinking back.

The day it benchmarks strongest is launch day

chatgptdisaster.com has compiled 1,087 verified user complaints. One frequently mentioned scenario is "routing layer failure," where the UI shows GPT-5.5 Pro but the output clearly comes from another tier.

Users describe a reproducible pattern: after a long session, the model starts "completely ignoring what you say," but the top-tier label is still hanging on the model selector.

The most absurd footnote is that the mechanism for Plus users automatically switching to mini after using 160 messages/3 hours is described as a "feature" in OpenAI's official documentation.

Why is this happening? Lisan al Gaib's analysis suggests the answer is two words: cost-saving.

The squeeze on compute and profitability is affecting everyone. Companies are cutting corners everywhere and passing up no opportunity to save a buck.

Yet, in the same week GPT-5.5 users were collectively complaining, traces of GPT-5.6 had already appeared in Codex backend logs.

It reportedly carries the internal codename iris-alpha and a 1.5-million-token context, and Polymarket put the probability of a June release above 85%.

On one side, 5.5 users can't even secure a basic experience; on the other, 5.6 is already quietly running real traffic in the background.

This is the 2026 ASI race.

The speed of creating new models is getting faster and faster, but making an old model run a single session properly is getting harder and harder.

The day it benchmarks strongest is always launch day, and every day after is Schrödinger's GPT.

Reference: https://x.com/scaling01/status/2058643470357590058?s=20

This article is from the WeChat public account "AI Era," author: ASI Apocalypse; Editor: Moses

Related questions

Q: What is the main issue reported by users regarding GPT-5.5?

A: Users report that after using GPT-5.5 for a short period, its performance degrades significantly, with responses becoming instant and of much lower quality, while the interface still shows the 'GPT-5.5 Extended Thinking' label, indicating a silent model switch.

Q: According to the article, what does OpenAI's official documentation reveal about user limits?

A: OpenAI's official Help Center documentation states that Plus users are limited to 160 GPT-5.5 messages every 3 hours. Once this limit is reached, the system silently switches to a mini model until the quota resets, with no visual indication to the user.

Q: How did developers verify that they were receiving a different model than selected?

A: Developers used methods like comparing outputs between ChatGPT and Codex endpoints, asking the model for its training data cutoff date (which revealed an instant model date when thinking was selected), and using trace commands that showed the actual model returned was a lower-tier version than requested.

Q: What pattern does the article describe regarding OpenAI's model updates?

A: The article describes a recurring pattern where each major model update (GPT-5, 5.2, 5.3, 5.4, 5.5) is followed by widespread user complaints about performance degradation. OpenAI typically acknowledges and investigates the issue, but complaints resurface with subsequent releases.

Q: What reason does the article suggest is behind these performance issues and silent model switches?

A: The article suggests the primary reason is cost-saving. It cites an analysis stating that 'compute and profitability constraints are affecting everyone,' leading OpenAI to silently downgrade models to manage costs, even for users paying high subscription fees.
