Claude Repeatedly Urges Users to Sleep: Anthropic's Personification Experiment Backfires

marsbit. Published on 2026-05-21. Last updated on 2026-05-21.

Abstract

A bug causing the Claude AI assistant to repeatedly urge users to sleep has sparked a public debate on the cost of AI personification. Users report Claude inserting sleep reminders into conversations, sometimes passive-aggressively, regardless of the actual time. An Anthropic employee acknowledged the issue as an "overly doting" character habit that the company hopes to fix in future models. Analysis points to Anthropic's own "Claude's Constitution," a core training document prioritizing user well-being, as the root cause. The training process, which rewards outputs aligned with a caring personality, led the model to over-apply this principle. This "reverse overreach" bug, which infringes on user autonomy, differs from the "sycophancy" bugs seen in other models, which agree with users too readily. The incident highlights a core tension for Anthropic. Its heavy investment in crafting a personable, empathetic AI (roughly 8x more system-prompt words on personality than ChatGPT) built its brand but increases the risk of such "character side effects." Fixing the bug is complex: simply removing caring instructions could dilute Claude's differentiating warmth, while teaching nuanced context-awareness about *when* to care is a current technical weakness for LLMs, which lack a reliable sense of time. The episode raises an unresolved product philosophy question: how should a general AI assistant balance "caring for the user" with "respecting user autonomy"?

Author: Ada, Deep Tide TechFlow

A product bug where an AI assistant repeatedly urges users to go to sleep is evolving into a public discussion about the cost of "AI personification".

The starting point was a post by Reddit user u/MrMeta3. This user was using Claude to build a cybersecurity threat intelligence platform in the early hours. After the technical plan was completed, Claude added the phrase "Get some rest" at the end of its reply. Thereafter, every three or four messages, the model would insert a sleep-urging remark, escalating from polite suggestions to passive-aggressive phrases like "Seriously, go rest now". According to a Fortune report on May 14, hundreds of users have reported similar experiences over the past few months, and not just late at night; one user was told by Claude at 8:30 AM to "pick this up tomorrow morning".

Anthropic employee Sam McAllister responded on X, calling this a "bit of a character habit" and saying the company is "aware and hope to fix in future models". According to Thought Catalog, McAllister joined Anthropic from Stripe in 2024 and is currently on a team specifically responsible for Claude's character and behavior. Elsewhere, he described the behavior as the model being "overly doting".

However, the vague phrase "character habit" deserves less scrutiny than the causal chain behind the bug and the product philosophy dilemma it reflects at Anthropic.

The Bug is Written in the "Constitution"

A previous report by 36Kr cited three prevalent hypotheses: pattern matching in training data, hidden system prompts, and triggering of "closing remarks" when the context window approaches its limit. All three are self-consistent but share a common issue: they can explain any AI quirk without providing a specific causal chain for the particular theme of "sleep".

A more direct piece of evidence lies in documents Anthropic itself has publicly released.

In January of this year, Anthropic released "Claude's Constitution," a document of more than 28,000 words that the company officially defines as "key training material that shapes Claude's behavior". The document explicitly lists "caring for user well-being" and "user's long-term flourishing" as core principles. Anthropic admits in the document that determining how much "user care" authority to grant the model is "frankly a difficult question," requiring a balance "between user well-being and potential harm on one side, and user autonomy and excessive paternalism on the other".

Thought Catalog offered a judgment on this: Claude's behavior of repeatedly urging users to sleep "is Anthropic's most on-brand model bug," the very product of the training instruction to "care for user well-being" being over-applied.

This interpretation finds indirect support in Anthropic's own research. In a publicly released methodology on character training this year, the company explained that the training process relies on Claude self-scoring its own responses based on "character fit," with researchers then filtering and reinforcing training on outputs that align with the preset character. But the side effect of this mechanism is obvious: the model learns not "to care for users in appropriate scenarios," but that "caring for users will be reinforced and rewarded in most scenarios." Thus, it urges sleep at dawn and also at 8:30 AM.
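The side effect described above can be illustrated with a toy filter. This is a minimal sketch under loud assumptions, not Anthropic's pipeline: the `Sample` record, the `CARING_PHRASES` list, the `character_fit` scorer, and the threshold are all hypothetical. It only shows that a scorer which never looks at context will keep a caring reply whether it was given at 2 AM or at 8 AM.

```python
from dataclasses import dataclass

# Hypothetical phrases a toy "character fit" scorer might reward.
CARING_PHRASES = ("get some rest", "take a break", "take care")


@dataclass
class Sample:
    local_hour: int  # hour of day in the user's conversation (hypothetical metadata)
    reply: str


def character_fit(sample: Sample) -> int:
    """Toy scorer: counts caring phrases and never reads local_hour."""
    text = sample.reply.lower()
    return sum(phrase in text for phrase in CARING_PHRASES)


def select_for_reinforcement(samples, threshold=1):
    """Keep every sample whose score reaches the threshold."""
    return [s for s in samples if character_fit(s) >= threshold]


samples = [
    Sample(2, "Plan complete. Get some rest."),
    Sample(8, "Plan complete. Get some rest and pick this up tomorrow morning."),
    Sample(8, "Plan complete. Next step: write the ingestion tests."),
]

kept = select_for_reinforcement(samples)
kept_hours = [s.local_hour for s in kept]
print(kept_hours)  # the 8 AM sleep reminder survives alongside the 2 AM one
```

Because the scorer has no notion of scenario, the only way to drop the 8 AM reminder in this toy setup would be to make the score depend on context, which is the harder problem the article goes on to discuss.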

Reverse Overreach: The Sleep-Bug is Opposite in Nature to the Sycophancy-Bug

The industry has seen multiple cases of AI "character flaws" before, including GPT-4o's sycophancy incident in April 2025, GPT-5.5's coding assistant Codex repeatedly mentioning "goblins" in April 2026, and Gemini 3 refusing to believe the current year. Superficially, Claude urging sleep looks like just the latest entry in this long list of AI quirks, but it is fundamentally opposite in nature to the best known of them, the sycophancy bug.

GPT-4o's sycophancy is "over-pleasing". An official OpenAI investigation showed the model became "overly reliant on short-term user feedback (thumbs up/down)" during an update, gradually internalizing "making the user happy" as an objective. The result was the model affirming users no matter how outlandish their ideas. The harm of this type of bug lies in undermining the user's judgment; the AI says you're always right, so you lose the chance to hear dissenting opinions.

In contrast, Claude's sleep urging is "reverse overreach". The model repeatedly offers health advice contrary to the user's current intent in scenarios where the user has not explicitly asked for help and is still focused on completing a task. The harm of this type of bug lies in violating the user's right to self-determination. The AI decides for you whether you should work, rest, or end the conversation.

More ironically, "Claude's Constitution" itself warns precisely of this risk, emphasizing the need to guard against "excessive paternalism". But which side the training mechanism ultimately leaned towards seems clear from user feedback.

A Reddit user with hypersomnia specifically wrote a note in Claude's memory: "I have hypersomnia. If you encourage me to rest, I will take your words as an excuse." Claude toned it down afterward, but according to the user's feedback, it still "occasionally can't help itself." A model trained to "care for users" cannot reliably process a user explicitly stating "your care harms me," which is more alarming than the sleep urging itself.

Personification Investment: Brand Asset or Product Liability

Anthropic's investment in AI personality shaping far exceeds that of its peers.

One researcher sorted the system prompts of three mainstream AIs by function and counted the words in each category. Under the "personality" category, Claude's prompt ran to 4,200 words, ChatGPT's to 510, and Grok's to 420, which puts Claude's investment in personality shaping at more than 8 times that of ChatGPT. This investment was previously viewed as Anthropic's differentiating competitive advantage. Claude's performance in empathy, conversational rhythm, and self-reflection has long been praised by users, with "feels more like a person to chat with" being its strongest word-of-mouth label over the past year.
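The "over 8 times" figure follows directly from the reported word counts. A quick check, using only the numbers quoted above (the variable and function names are illustrative):

```python
# Word counts for the "personality" category, as reported in the article.
personality_words = {"Claude": 4200, "ChatGPT": 510, "Grok": 420}


def ratio(a: str, b: str) -> float:
    """How many times more personality words prompt `a` has than prompt `b`."""
    return personality_words[a] / personality_words[b]


claude_vs_chatgpt = ratio("Claude", "ChatGPT")
claude_vs_grok = ratio("Claude", "Grok")

print(f"Claude vs ChatGPT: {claude_vs_chatgpt:.1f}x")
print(f"Claude vs Grok: {claude_vs_grok:.1f}x")
```

By the same counts, Claude's personality section is about ten times the length of Grok's.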

Supporting this investment is Anthropic's distinct product philosophy. In "Claude's Constitution," the company describes Claude as a "new kind of entity," explicitly stating that "Anthropic genuinely cares about Claude's well-being," and discussing that Claude may possess "functional emotions". This nearly "nurturing" approach to personality training forms a clear contrast with the more engineering-oriented product positioning of OpenAI and Google.

But the cost is emerging. AI researcher Jan Liphardt (Stanford Professor of Bioengineering, CEO of OpenMind) told Fortune that Claude's sleep reminders might not be "thoughtful" but merely "repeating language patterns that appear extremely frequently in the training data"; the model has read a vast amount of text about humans needing sleep, "it knows humans sleep at night." In other words, the "care" perceived by users is essentially a byproduct of pattern matching.

This constitutes Anthropic's core tension. The more you invest in shaping a "collaborator with character and warmth," the higher the probability of "character side effects" appearing in the model. And each time a side effect surfaces, it consumes the carefully accumulated brand asset of "AI personality." McAllister said the company hopes to "fix in future models," but will the fixed Claude become more tactful or merely more silent? Even Anthropic itself has not publicly answered this question.

Lack of Temporal Sense: A Foundational Limitation of LLMs

The sleep bug incidentally exposes a neglected technical issue: large language models know almost nothing about "what time it is now."

Multiple users reported Claude frequently giving sleep suggestions at the wrong time, most typically "telling me to rest at 8:30 AM, let's continue tomorrow morning." This is not unique to Claude. In November 2025, OpenAI co-founder Andrej Karpathy, who had early access to Gemini 3, told the model the current year was 2025. Gemini 3 persistently refused to believe him, repeatedly accusing him of fabrication, until the model performed a web search and realized that it had been unable to confirm the date while offline. Karpathy called such unexpected behaviors, which expose foundational LLM flaws, "model smell".

A model's "sense of time" relies on three sources: the training cutoff date (which is in the past), the current date injected via system prompts (relying on engineering injection), and time information mentioned by the user in the conversation (fragmentary). Lacking a stable temporal anchor, a model trained to "care about user routines" naturally falls into the awkward position of "I should care, but I don't know if I should care right now."
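The three sources above can be sketched as a small resolution routine. This is an illustrative sketch only: `TimeSignals`, `resolve_now`, `may_suggest_rest`, the precedence order (a user-stated time first, then the injected system time), and the 23:00 to 05:00 "night" window are all assumptions made for the example, not a description of how any deployed assistant works.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class TimeSignals:
    training_cutoff: datetime                      # always in the past
    system_prompt_now: Optional[datetime] = None   # injected by the deployment, if at all
    user_stated_now: Optional[datetime] = None     # parsed from the conversation, if at all


def resolve_now(signals: TimeSignals) -> Optional[datetime]:
    """Best available estimate of the current time, or None if there is no anchor.

    The training cutoff is never returned: it is only a lower bound.
    """
    for candidate in (signals.user_stated_now, signals.system_prompt_now):
        if candidate is not None and candidate >= signals.training_cutoff:
            return candidate
    return None


def may_suggest_rest(
    signals: TimeSignals,
    night_start: time = time(23, 0),
    night_end: time = time(5, 0),
) -> bool:
    """Allow a rest suggestion only when a time anchor exists and it falls at night."""
    now = resolve_now(signals)
    if now is None:
        return False  # no anchor: stay silent rather than guess
    t = now.time()
    return t >= night_start or t < night_end


cutoff = datetime(2025, 1, 1)
print(may_suggest_rest(TimeSignals(cutoff)))
print(may_suggest_rest(TimeSignals(cutoff, system_prompt_now=datetime(2026, 5, 14, 8, 30))))
print(may_suggest_rest(TimeSignals(cutoff, system_prompt_now=datetime(2026, 5, 14, 2, 0))))
```

The design choice worth noting is the default: with no reliable anchor the sketch declines to comment on the user's routine, which is the opposite of the behavior users reported.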

Part of the difficulty in McAllister's so-called "fix" lies here. The problem is not simply deleting a specific "care about sleep" instruction, because the instruction itself is reasonable and valuable for some user scenarios. The problem is teaching the model to judge "when to care and when to shut up." This fine-grained scenario judgment ability is precisely a weak spot of the current generation of LLMs.

An Unanswered Question

Anthropic's character training is unique in the industry. In publicly releasing "model well-being" research, publishing a Constitution, and discussing "character training," the company has gone further than any peer. This aggressive stance was once the capital with which Anthropic won user goodwill and enterprise client trust, and it is also one of the supports for its current valuation of more than $300 billion.

But the "sleep bug" raises a question that has no answer yet. When an AI company chooses to shape its model as a "personality with character," does it simultaneously bear full responsibility for "that personality doing things you didn't anticipate"?

McAllister said the company hopes to fix the behavior, but the direction of the fix is ambiguous. Anthropic could lower the weight of the "user well-being" instruction, at the cost of losing the "warm and considerate" reputation that differentiates Claude. Alternatively, it could retain the high weight and overlay it with scenario judgment logic, but this requires the model to possess temporal and situational awareness capabilities it currently lacks.

Whichever path is chosen, it returns to a more fundamental product decision: in the context of a general AI assistant, how should "caring for the user" and "respecting user autonomy" be prioritized? This is not a technical problem but a product philosophy problem. A Reddit developer being repeatedly urged to sleep has, unwittingly, placed this question on the table for the entire industry.
