One Year to Burn Through an SSD: Codex Log Bug Slammed as 'Slopware'

marsbit · Published 2026-07-02 · Updated 2026-07-02

Article Summary

OpenAI's flagship AI coding tool, Codex, was found to have a critical bug causing its feedback logging system to silently and rapidly wear out users' SSDs. A developer reported that Codex was writing approximately 640 TB of data per year to a local SQLite database (`logs_2.sqlite`) through a constant cycle of inserting and immediately deleting log entries, primarily at the verbose TRACE level. While the database file itself remained around 1 GB, the underlying write-amplification from SQLite's WAL mechanism meant the physical SSD endured the full write load. This was enough to exceed the typical 600 TBW endurance rating of a consumer SSD within a year. The root cause was a hardcoded default logging level (`Level::TRACE`) in the configuration, which overrode any user attempts to reduce logging via environment variables. Analysis showed that over 96% of the logged data—including noisy WebSocket packet dumps and repeated system file events—was useless debug information. The issue, which had at least nine related bug reports in the Codex repository, remained latent because it didn't visibly consume disk space, only silently accumulated write cycles. After the report gained traction on Hacker News, OpenAI merged fixes estimated to reduce writes by about 85%. However, even post-fix, the tool would still write an estimated 96 TB annually. The incident sparked broader criticism of "slopware" in AI-assisted development tools, highlighting a lack of resource budgeting for disk, CPU, ...

A year 'eats' a 1TB SSD?

OpenAI's flagship programming tool, Codex, is burning through your solid-state drive with 640TB of writes per year.

A developer recently filed a GitHub issue, #28224 (now closed), under the title:

Codex's SQLite feedback log writes 640TB/year, rapidly depleting SSD lifespan.

According to the reporter's measurements, his primary SSD absorbed 37TB of writes over 21 days of continuous operation. Extrapolated linearly, that is about 640TB per year, enough to wear out a consumer-grade drive with a 600 TBW (Total Bytes Written) rating.
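The extrapolation can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the write rate stays constant over the year; the function names are illustrative and the inputs are the figures quoted in the report.

```python
DAYS_PER_YEAR = 365


def annualized_writes_tb(observed_tb: float, observed_days: float) -> float:
    """Linearly extrapolate an observed write volume to one year."""
    return observed_tb / observed_days * DAYS_PER_YEAR


def years_to_reach_rating(rated_tbw: float, tb_per_year: float) -> float:
    """Years until cumulative writes reach the drive's rated endurance."""
    return rated_tbw / tb_per_year


rate = annualized_writes_tb(37, 21)  # 37TB in 21 days, as reported
print(f"{rate:.0f} TB/year")  # 643 TB/year, which the report rounds to 640
print(f"{years_to_reach_rating(600, rate):.2f} years to reach 600 TBW")  # 0.93
```

A TBW rating is a warranty figure rather than a hard failure point, so the result is best read as "the rated endurance is used up in under a year", not as a predicted date of death.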

As evidence, he posted two tables.

In the first, the log database always appears to be only 1.2GB, as if nothing were happening; yet its auto-incrementing row ID has climbed to 5.5 billion while just over 500,000 rows are actually retained, a gap of roughly ten thousand to one.

The key is that SSD wear counts the total amount written, not what remains now: all those 5.5 billion rows were written to disk, and deleting them doesn't undo the writes already incurred. So you only ever see those 500,000 rows when checking the file, but the drive has already endured the write load of 5.5 billion rows.

The second table shows what those 5.5 billion rows contain: over 90% is debug noise that even the developers themselves would never look at again. Verbatim copies of entire WebSocket data packets alone account for about half.

The culprit is a default configuration of Level::TRACE, treating your drive's write endurance as free scratch paper.

A highly upvoted comment on Hacker News directly defined the nature of this issue:

This is one of the most egregious examples of "slopware."

The same commenter added, with some resignation:

This is a tragedy. The world needs someone to compete with Anthropic.

More awkward still, the problem had already been reported.

Sporadic feedback has existed since April this year, dragging on for over two months. Only after users calculated it themselves, wrote reports, and pushed it to the top of Hacker News did it receive serious attention. Even then, this round only cut about 85% of the log writes.

Some tried to fix it themselves but found they couldn't: the desktop versions of these tools are closed-source.

One comment in the thread became an instant classic: "How did the review process not catch such an obvious bug? Oh right... @codex, review this."

How exactly did the 640TB get written?

What does 640TB mean?

Mainstream consumer SSDs have a rated write endurance of roughly 150 to 600 TBW, enough to last the average user a decade or two.

Yet Codex's "record what I did" logging feature can write that much in a year.

The story begins when this user checked his drive usage. His machine, running continuously for 21 days, saw his primary SSD endure 37TB of writes.

At this rate, it's about 640TB per year.

What's more absurd is the write pattern.

Codex maintains a local SQLite database, logs_2.sqlite, specifically for feedback logs. This user monitored it for 15 seconds—36,211 rows were inserted, while the total retained row count remained 681,774 from start to finish, not a single one more.

For every row inserted, one was deleted. The row count stayed constant, but the disk was being rewritten tens of thousands of times.

This mechanism has a nickname: insert-and-prune.
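The pattern is easy to reproduce in miniature. The sketch below is not Codex's code: the `logs` table, its columns, and the pruning query are assumptions made for illustration. It shows how the retained row count can sit still while the auto-incrementing ID keeps climbing.

```python
import sqlite3


def insert_and_prune(conn: sqlite3.Connection, inserts: int, keep: int):
    """Insert `inserts` rows, pruning after each so only the newest `keep` remain."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    for i in range(inserts):
        conn.execute("INSERT INTO logs (body) VALUES (?)", (f"event {i}",))
        # Delete every row older than the newest `keep` rows.
        conn.execute(
            "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?",
            (keep,),
        )
    conn.commit()
    retained = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    max_id = conn.execute("SELECT MAX(id) FROM logs").fetchone()[0]
    return retained, max_id


conn = sqlite3.connect(":memory:")  # in-memory, so the demo wears out nothing
retained, max_id = insert_and_prune(conn, inserts=10_000, keep=100)
print(retained, max_id)  # 100 10000
```

Comparing `MAX(id)` with `COUNT(*)` is the same check the reporter used: the gap between the two is the number of rows that were written and then thrown away.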

Even more ridiculous is what it records: a bunch of filesystem inotify events.

ld.so.cache was logged 128,764 times, locale.alias 37,982 times, passwd 23,843 times.

The same files, touched by the same program, logged tens or even hundreds of thousands of times over.

By then the auto-incrementing ID in the log table had passed 5.5 billion, while only about 500,000 rows were retained: a gap of roughly ten thousand to one.

This isn't a bug; it's more like an AI programming tool chanting to its own hard drive.

The file is only 1GB, but writes amount to 640TB

If it writes and deletes simultaneously, how big can logs_2.sqlite remain? About 1GB.

This leads to the most counterintuitive point of the whole affair: SSD lifespan depends on the amount written, not on file size. A 1GB file rewritten 640,000 times means 640TB of writes for the drive.

SQLite uses a WAL (Write-Ahead Logging) mechanism. Changes are first written to a -wal file, then checkpointed back to the main database in batches. Codex performs over thirty thousand inserts and deletions every 15 seconds. Each one goes through WAL, index updates, and checkpoints—the same storage area, erased over and over.
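The WAL behaviour described above can be observed with Python's built-in `sqlite3` module. This is a minimal sketch using a throwaway database in a temporary directory; the table name and row count are arbitrary, and it says nothing about how Codex configures its own database.

```python
import os
import sqlite3
import tempfile


def wal_demo(rows: int = 200):
    """Return (journal mode, WAL size after inserts, WAL size after checkpoint)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite")
        # isolation_level=None puts the connection in autocommit mode,
        # so every INSERT below is its own committed transaction.
        conn = sqlite3.connect(path, isolation_level=None)
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body TEXT)")
        for i in range(rows):
            conn.execute("INSERT INTO logs (body) VALUES (?)", (f"event {i}",))
        # Committed changes sit in the -wal file until a checkpoint runs.
        wal_before = os.path.getsize(path + "-wal")
        # TRUNCATE copies WAL content into the main file, then empties the WAL.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
        wal_after = os.path.getsize(path + "-wal")
        conn.close()
        return mode, wal_before, wal_after


mode, wal_before, wal_after = wal_demo()
print(mode, wal_before > 0, wal_after)
```

Every byte that passed through the `-wal` file was physically written even though the file ends up empty, which is why file sizes on disk say little about cumulative writes.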

An analogy: a 1GB notebook, where you erase and rewrite it 1,750 times a day for a year. The notebook is the same, but the paper is worn through.
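The notebook figure follows directly from the totals. A quick sketch, assuming decimal units (1TB = 1,000GB) and treating the roughly 1GB file as being rewritten in full each time, which is a simplification of what the database actually does:

```python
def full_rewrites(total_written_tb: float, file_size_gb: float) -> float:
    """How many times a file of this size fits into the total write volume."""
    return total_written_tb * 1000 / file_size_gb


per_year = full_rewrites(640, 1)
per_day = per_year / 365
print(f"{per_year:,.0f} rewrites per year, about {per_day:,.0f} per day")
# 640,000 rewrites per year, about 1,753 per day
```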

This is also why this bug could remain hidden for so long: it doesn't take up space, only burns through lifespan.

Checking available disk space shows nothing unusual; the file size stays quiet. Only by reading the drive's own SMART health counters can you see the write amount silently accumulating.
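On NVMe drives, the relevant SMART field is "Data Units Written". The NVMe specification defines one data unit as 1,000 blocks of 512 bytes, so the raw counter has to be converted before it can be compared with a TBW rating. The sketch below performs that conversion on two readings; the readings themselves are made-up values chosen to match the 37TB in the report, not numbers taken from it.

```python
BYTES_PER_DATA_UNIT = 512 * 1000  # NVMe: one data unit = 1,000 x 512 bytes


def data_units_to_tb(units: int) -> float:
    """Convert an NVMe 'Data Units Written' count to decimal terabytes."""
    return units * BYTES_PER_DATA_UNIT / 1e12


def written_between(start_units: int, end_units: int) -> float:
    """Terabytes written between two readings of the counter."""
    return data_units_to_tb(end_units - start_units)


# Hypothetical readings taken 21 days apart.
start, end = 100_000_000, 172_265_625
print(f"{written_between(start, end):.1f} TB written")  # 37.0 TB written
```

Taking the reading twice, some days apart, is what turns a lifetime total into a rate, and a rate is what exposes a background writer like this one.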

Root cause: an ignored RUST_LOG line

Why were so many logs recorded?

The answer lies in a single line of configuration in the Codex source code: the SQLite feedback log sink is initialized with Targets::new().with_default(Level::TRACE).

In short, the log sink defaults to TRACE, the most verbose level, which records everything.

Codex's logging framework is Rust's `tracing` ecosystem. The standard practice is to read the RUST_LOG environment variable. Users certainly tried setting RUST_LOG to info, warn, or even turning it off entirely.

No use.

with_default(Level::TRACE) hard-locks the global default to TRACE. RUST_LOG simply doesn't take effect on this path. You think you turned off logging, but it writes regardless.
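The shape of the problem can be shown with a small analogy in Python's standard `logging` module. This is not Codex's Rust code: the `APP_LOG` variable, both function names, and the fallback level are invented for illustration, and `DEBUG` stands in for TRACE because Python has no TRACE level.

```python
import logging

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def hardcoded_level(env: dict) -> int:
    """The reported pattern: the sink's level is fixed, so `env` is never read."""
    return logging.DEBUG


def env_respecting_level(env: dict, var: str = "APP_LOG",
                         fallback: int = logging.WARNING) -> int:
    """A sink level that honours the user's setting, with a quiet fallback."""
    return LEVELS.get(env.get(var, "").strip().upper(), fallback)


env = {"APP_LOG": "error"}  # the user asks for errors only
print(hardcoded_level(env) == logging.DEBUG)       # True: the request is ignored
print(env_respecting_level(env) == logging.ERROR)  # True: the request is honoured
```

The design point is the fallback: when the user says nothing, a local sink that runs around the clock is safer defaulting to a quiet level than to the noisiest one.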

The most deceptive part of this bug isn't that "you forgot to configure," but that "you configured, and it pretended not to hear."

The breakdown is even more glaring.

Breaking down the retained logs by category, TRACE accounts for 70.7%, about 732.5 MB. Adding the two mirrored telemetry logs from codex_otel (log_only and trace_safe) takes up another 25.3%.

Roughly 70% of the retained log data is TRACE noise. Together with the mirrored telemetry, 96% is material nobody would ever read.

Only 4% is actually meaningful content.

Not the first report, but at least the ninth

The reporter checked the Codex repository and found at least 9 issues of this "unbounded log growth" type.

#17320: WAL writes wildly during streaming responses, root cause identical to this one—TRACE ignoring RUST_LOG.

#24275: Desktop version logs_2.sqlite explodes.

#22444: WAL grows infinitely and doesn't free up space.

#26374: Writes 0.75GB per day, no rotation.

#27911: A 4KB goals_1.sqlite gets written at 11MB/s.

#20563: Disk writes wildly even when process is idle.

#27020: Disk at 100% activity on Windows.

The earliest trace leads back to #12969, the very PR that connected the SQLite feedback log sink at the TRACE level.

A 4KB database being written at 11MB/s is enough for a standalone article. Yet, both it and the 640TB issue are symptoms of the same product, the same telemetry system.

This shows that from the beginning, Codex's logging and telemetry systems lacked the concept of a "resource budget."

The entire field is competing over token budgets, context lengths, and model capabilities.

But almost no one asks: who manages the disk, memory, and CPU budget for an Agent that resides on a user's machine, running 24/7?

Fixed, but in a very OpenAI way

The issue was filed on GitHub on June 14th. On June 23rd, the reporter posted an update: three PRs had been merged. By his own account, they reduced log volume by about 85%, so he closed the issue.

First, about that 85%: it is not 100%, and it is not fully deployed yet.

Of the three fixes, #29432 and #29457 were released with version 0.142.0, cutting out per-WebSocket logging and noisy targets. The third, #29599, stops another type of bridged redundant log and will be in version 0.143.0.

Even with all three in place, the remaining ~15% still amounts to ~96TB of writes per year, merely reducing it from "burning out a drive in a year" to "burning it out in six years."

Some defended it: trace logs are stored by design for debugging, not a bug, and indeed help OpenAI track down edge cases.

But that's precisely the issue: using paying users' SSD endurance as free storage for the vendor's debugging—did the users ever agree to this?

In the programming battlefield, more than just SSDs are being burned through

Interestingly, Codex wasn't the only one singled out.

Comments quickly added: Claude Code also writes debug logs heavily to local storage. Some had to symlink the log directory to a RAM disk (tmpfs) to extend their SSD's life.
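The symlink workaround mentioned in the comments might look like the following on a Unix-like system. This is a sketch under stated assumptions: the function name is invented, the article does not give the real log directory, and the demo uses stand-in paths inside a temporary directory rather than an actual tmpfs mount such as `/dev/shm`.

```python
import os
import shutil
import tempfile


def redirect_log_dir(log_dir: str, ram_dir: str) -> None:
    """Replace `log_dir` with a symlink to `ram_dir`, moving existing files over."""
    os.makedirs(ram_dir, exist_ok=True)
    if os.path.islink(log_dir):
        os.unlink(log_dir)  # drop a stale link from an earlier run
    elif os.path.isdir(log_dir):
        for name in os.listdir(log_dir):
            shutil.move(os.path.join(log_dir, name), os.path.join(ram_dir, name))
        os.rmdir(log_dir)
    os.symlink(ram_dir, log_dir, target_is_directory=True)


with tempfile.TemporaryDirectory() as base:
    log_dir = os.path.join(base, "logs")     # stand-in for the tool's log directory
    ram_dir = os.path.join(base, "ramdisk")  # stand-in for a tmpfs location
    os.makedirs(log_dir)
    with open(os.path.join(log_dir, "old.log"), "w") as f:
        f.write("kept\n")
    redirect_log_dir(log_dir, ram_dir)
    print(os.path.islink(log_dir))                           # True
    print(os.path.exists(os.path.join(ram_dir, "old.log")))  # True
```

The trade-off is that anything on a RAM disk disappears at reboot and occupies memory while it exists, so this only makes sense for logs nobody intends to keep.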

Both flagship products suffer from the same type of flaw.

Community discussion soon expanded from one bug to the broader quality issues of AI programming tools.

Some complained these agents keep the GPU maxed out and memory usage at 70GB+. Others simply coined a name for this generation of software: slopware.

The original reporter's suggestion was simple: give the app a hard limit and never exceed 3GB. It took nine issues and several months before Codex even considered drawing that one line.
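One reading of that suggestion is a cumulative write budget: once the logger has written a fixed number of bytes, further records are dropped instead of reaching the disk. The class below is a hypothetical sketch of that idea, not anything shipped in Codex, and the article does not say whether the proposed 3GB cap was meant for file size or for total writes.

```python
class WriteBudget:
    """Drop log records once a cumulative byte budget has been spent."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.written = 0
        self.dropped = 0

    def try_write(self, record: bytes, sink) -> bool:
        """Pass `record` to `sink` if it fits in the remaining budget."""
        if self.written + len(record) > self.limit_bytes:
            self.dropped += 1
            return False
        sink(record)
        self.written += len(record)
        return True


# Tiny budget for the demo; the figure suggested in the thread was 3GB.
budget = WriteBudget(limit_bytes=1000)
stored = []
for _ in range(15):
    budget.try_write(b"x" * 100, stored.append)
print(len(stored), budget.dropped)  # 10 5
```

A real implementation would probably reset the budget per day or per session and record that records were dropped, but even this crude version puts a ceiling on how much a log sink can write.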

The question is: why would a company that constantly talks about "AGI" stumble on a problem even an intern could spot?

Why could this flaw hide for so long? One comment also hit the nail on the head.

A decade ago, setting logs to TRACE would have frozen the program on the spot, and the mistake would have been fixed the same day. Today, CPUs are fast enough, memory large enough, and disks robust enough that such flaws are quietly absorbed by the hardware. The program runs, the interface works, and users notice nothing until the SSD dies prematurely one day.

In recent years, software has been stuffed with AI-generated code. Features pile up, abstraction layers thicken, and resource consumption skyrockets, all barely propped up by hardware makers releasing faster chips every year.

Thus, an absurd cycle emerged: software gets worse, hardware gets more powerful. Users pay for new machines under the illusion that "it doesn't seem slower," when in reality, the new hardware is just barely supporting worse software.

A single small bug can't crush OpenAI, of course. But competition between Codex and Claude Code has already spread from model capabilities to the entry point of developers' workflows.

On this front, making quick changes and responding to developer needs was never a bonus—it's just the entry ticket.

References:

https://github.com/openai/codex/issues/28224

https://news.ycombinator.com/item?id=48626930

This article is from the WeChat public account "New Zhiyuan," author: ASI Revelation
