One Year to Wear Out an SSD: Codex Log Bug Slammed as 'Slopware'

marsbit | Published 2026-07-02 | Updated 2026-07-02

Summary

OpenAI's flagship AI coding tool, Codex, was found to have a critical bug causing its feedback logging system to silently and rapidly wear out users' SSDs. A developer reported that Codex was writing approximately 640 TB of data per year to a local SQLite database (`logs_2.sqlite`) through a constant cycle of inserting and immediately deleting log entries, primarily at the verbose TRACE level. While the database file itself remained around 1 GB, the underlying write-amplification from SQLite's WAL mechanism meant the physical SSD endured the full write load. This was enough to exceed the typical 600 TBW endurance rating of a consumer SSD within a year. The root cause was a hardcoded default logging level (`Level::TRACE`) in the configuration, which overrode any user attempts to reduce logging via environment variables. Analysis showed that over 96% of the logged data—including noisy WebSocket packet dumps and repeated system file events—was useless debug information. The issue, which had at least nine related bug reports in the Codex repository, remained latent because it didn't visibly consume disk space, only silently accumulated write cycles. After the report gained traction on Hacker News, OpenAI merged fixes estimated to reduce writes by about 85%. However, even post-fix, the tool would still write an estimated 96 TB annually. The incident sparked broader criticism of "slopware" in AI-assisted development tools, highlighting a lack of resource budgeting for disk, CPU, ...

Can one year 'eat' a 1TB SSD?

OpenAI's flagship programming tool, Codex, is burning through your solid-state drive with 640TB of writes per year.

Not long ago, a developer filed an issue on GitHub. The issue, #28224, has since been closed; its title reads:

Codex's SQLite feedback log writes 640TB/year, rapidly depleting SSD lifespan.

According to the reporter's own measurements, his primary SSD absorbed 37TB of writes over 21 days of continuous operation. At that rate, the total comes to about 640TB per year, enough to wear out a consumer-grade drive with a 600 TBW (Terabytes Written) rating.
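The extrapolation is simple linear arithmetic and can be checked in a few lines. This is a minimal sketch using only the figures quoted above; the 600 TBW rating is the article's comparison point, not a property of the reporter's actual drive.

```python
# Extrapolate the reporter's 21-day measurement to a full year.
TB_WRITTEN = 37        # TB written during the observation window (from the report)
DAYS_OBSERVED = 21     # length of the observation window in days (from the report)
RATED_TBW = 600        # endurance rating the article uses for comparison


def yearly_writes_tb(tb_written: float, days: float) -> float:
    """Linear extrapolation of observed writes to 365 days."""
    return tb_written / days * 365


per_year = yearly_writes_tb(TB_WRITTEN, DAYS_OBSERVED)
years_to_rating = RATED_TBW / per_year

print(f"{per_year:.0f} TB/year")
print(f"{years_to_rating:.2f} years to reach {RATED_TBW} TBW")
```

The result lands slightly above 640TB per year, which is why a 600 TBW rating would be used up in a little under twelve months.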

As evidence, he posted two tables.

Evidence 1: the log database always appears to be only about 1.2GB, as if nothing were happening. Yet its auto-incrementing row ID has climbed to 5.5 billion while only just over 500,000 rows are actually retained, a ratio of roughly ten thousand to one.

The key is that SSD wear counts the total amount written, not what remains now: all those 5.5 billion rows were written to disk, and deleting them doesn't undo the writes already incurred. So you only ever see those 500,000 rows when checking the file, but the drive has already endured the write load of 5.5 billion rows.

Evidence 2 reveals the distribution of these 5.5 billion rows: over 90% are debug noise that even the developers themselves would never look back at. Verbatim dumps of entire WebSocket packets alone accounted for half of it.

The culprit is a default configuration of Level::TRACE, treating your drive's write endurance as free scratch paper.

A highly upvoted comment on Hacker News directly defined the nature of this issue:

This is one of the most egregious examples of "slopware."

The same commenter added, with some resignation:

This is a tragedy. The world needs someone to compete with Anthropic.

What's even more awkward is that the problem had already been reported.

Sporadic feedback has existed since April this year, dragging on for over two months. Only after users calculated it themselves, wrote reports, and pushed it to the top of Hacker News did it receive serious attention. Even then, this round only cut about 85% of the log writes.

Some tried to fix it themselves but found they couldn't: the desktop versions of these tools are closed-source.

There was also a classic comment in the thread: How did the review process not catch such an obvious bug? Oh right... @codex, review this.

How exactly did the 640TB get written?

What does 640TB mean?

Mainstream consumer SSDs have a rated write endurance of roughly 150 to 600 TBW, enough to last the average user a decade or two.

Yet Codex's "record what I did" logging feature can write that much in a year.

The story begins when this user checked his drive usage. His machine, running continuously for 21 days, saw his primary SSD endure 37TB of writes.

At this speed, it's about 640TB per year.

What's more absurd is the write pattern.

Codex maintains a local SQLite database, logs_2.sqlite, specifically for feedback logs. This user monitored it for 15 seconds—36,211 rows were inserted, while the total retained row count remained 681,774 from start to finish, not a single one more.

For every row inserted, one was deleted. The row count stayed constant, but the disk was being rewritten tens of thousands of times.
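The 15-second sample can be turned into a rate. The sketch below uses the row count from the report; the final figure, bytes hitting the disk per inserted row, is an inference that assumes the whole 640TB/year estimate is attributable to these inserts, which the source does not state.

```python
# Turn the 15-second sample into a sustained insert rate.
ROWS_INSERTED = 36_211      # rows inserted during the sample (from the report)
SAMPLE_SECONDS = 15         # length of the sample (from the report)
YEARLY_WRITES_TB = 640      # yearly write estimate (from the report)

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

rows_per_second = ROWS_INSERTED / SAMPLE_SECONDS
rows_per_year = rows_per_second * SECONDS_PER_YEAR

# Assumption: all of the estimated writes are caused by these inserts.
bytes_per_row = YEARLY_WRITES_TB * 1e12 / rows_per_year

print(f"{rows_per_second:.0f} rows/s")
print(f"{rows_per_year:.2e} rows/year")
print(f"{bytes_per_row:.0f} bytes written per row (implied)")
```

Under that assumption each logged row costs on the order of several kilobytes of disk writes, far more than a single log line would suggest.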

This mechanism has a nickname: insert-and-prune.
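The pattern is easy to reproduce at toy scale. The following is a minimal sketch in Python with the standard `sqlite3` module, not Codex's actual code: the table name, the retention cap `MAX_ROWS`, and the pruning query are all hypothetical. It shows how the retained row count stays flat while the auto-incrementing ID keeps climbing.

```python
import sqlite3

MAX_ROWS = 1_000  # hypothetical retention cap, chosen for illustration

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
)


def insert_and_prune(conn: sqlite3.Connection, body: str) -> None:
    """Insert one log row, then delete everything older than the newest MAX_ROWS."""
    conn.execute("INSERT INTO logs (body) VALUES (?)", (body,))
    conn.execute(
        "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?",
        (MAX_ROWS,),
    )


for i in range(50_000):
    insert_and_prune(conn, f"event {i}")
conn.commit()

retained = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
highest_id = conn.execute("SELECT MAX(id) FROM logs").fetchone()[0]
print(retained, highest_id)  # prints: 1000 50000
```

Every one of the 50,000 rows was written, but only the last 1,000 remain visible. Scaled up, that is the gap between 5.5 billion IDs and a few hundred thousand retained rows.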

Even more ridiculous is what it records: a bunch of filesystem inotify events.

ld.so.cache was logged 128,764 times, locale.alias 37,982 times, passwd 23,843 times.

The same files, touched by the same program, logged tens or even hundreds of thousands of times each.

The auto-incrementing ID in the logs had exceeded 5.5 billion, while only about 500,000 rows were retained.

A difference of ten thousand times.

This isn't a bug; it's more like an AI programming tool chanting to its own hard drive.

The file is only 1GB, but writes amount to 640TB

If it writes and deletes simultaneously, how big can logs_2.sqlite remain? About 1GB.

This leads to the most counterintuitive point of the whole affair: SSD lifespan depends on the amount written, not on file size. A 1GB file rewritten 640,000 times means 640TB of writes for the drive.

Here SQLite runs in WAL (write-ahead logging) mode. Changes are first appended to a -wal file, then checkpointed back into the main database in batches. Codex performs over thirty thousand inserts and deletions every 15 seconds. Each one goes through the WAL, index updates, and checkpoints: the same small file, rewritten over and over.
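How much a small committed row costs in WAL mode can be observed directly. The sketch below is an illustration with Python's `sqlite3`, not a reproduction of Codex's workload: the 100-byte row and the one-commit-per-row pattern are assumptions, and it measures bytes appended to the `-wal` file, not physical NAND writes.

```python
import os
import sqlite3
import tempfile

ROW = "x" * 100   # an illustrative 100-byte log line
COMMITS = 100     # one row per commit, so every row forces a WAL frame

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.sqlite")
    conn = sqlite3.connect(db_path)

    # Switch to write-ahead logging; the pragma returns the active mode.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.commit()

    for _ in range(COMMITS):
        conn.execute("INSERT INTO logs (body) VALUES (?)", (ROW,))
        conn.commit()

    payload_bytes = len(ROW) * COMMITS
    # Measure before closing: the last connection to close cleans up the WAL.
    wal_bytes = os.path.getsize(db_path + "-wal")
    conn.close()

print(mode, payload_bytes, wal_bytes)
```

Because the WAL records whole database pages rather than individual rows, the `-wal` file ends up many times larger than the payload. The checkpoint that later copies those pages into the main database adds further writes on top.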

An analogy: a 1GB notebook, where you erase and rewrite it 1,750 times a day for a year. The notebook is the same, but the paper is worn through.
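The notebook analogy holds up as arithmetic. A minimal check, using decimal units (1TB = 1,000GB):

```python
# Check the notebook analogy: a 1GB file rewritten 1,750 times a day for a year.
FILE_GB = 1
REWRITES_PER_DAY = 1_750
DAYS = 365

total_rewrites = REWRITES_PER_DAY * DAYS
total_tb = FILE_GB * total_rewrites / 1_000

print(total_rewrites, total_tb)  # prints: 638750 638.75
```

That is roughly 640,000 full rewrites of the file in a year, which matches the 640TB estimate.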

This is also why this bug could remain hidden for so long: it doesn't take up space, only burns through lifespan.

Checking available disk space shows nothing unusual; the file size stays quiet. Only by reading the drive's own SMART health counters can you see the write amount silently accumulating.
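For NVMe drives, the relevant health counter is usually reported as "Data Units Written", where one unit is 1,000 blocks of 512 bytes. The sketch below converts two such readings into terabytes written. It assumes an NVMe drive (SATA drives expose vendor-specific attributes instead), and the sample readings are invented for illustration, not taken from the report.

```python
# One NVMe "data unit" is 1,000 blocks of 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_to_tb(units: int) -> float:
    """Convert an NVMe Data Units Written counter to decimal terabytes."""
    return units * BYTES_PER_DATA_UNIT / 1e12


def tb_written_between(units_before: int, units_after: int) -> float:
    """Terabytes written between two readings of the counter."""
    return data_units_to_tb(units_after - units_before)


# Hypothetical readings taken three weeks apart (made-up numbers).
reading_start = 100_000_000
reading_end = 172_265_625

print(tb_written_between(reading_start, reading_end))  # prints: 37.0
```

Comparing two readings over a known interval is what turns an invisible problem into a measurable one.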

Root cause: one line that ignores RUST_LOG

Why were so many logs recorded?

The answer lies in a single line of configuration in the Codex source code: the SQLite feedback log sink is initialized with Targets::new().with_default(Level::TRACE).

In short, the log is set to TRACE by default, the most verbose, record-everything level.

Codex's logging framework is Rust's `tracing` ecosystem. The standard practice is to read the RUST_LOG environment variable. Users certainly tried setting RUST_LOG to info, warn, or even turning it off entirely.

No use.

with_default(Level::TRACE) hard-codes the default level for this sink to TRACE. RUST_LOG simply doesn't take effect on this path. You think you turned off logging, but it writes regardless.

The most deceptive part of this bug isn't that "you forgot to configure," but that "you configured, and it pretended not to hear."
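The difference between a hardcoded level and one that honors the user's setting can be shown with an analogy in Python's standard `logging` module. This is not Codex's Rust code: the environment variable name `APP_LOG` and both function names are hypothetical, and `DEBUG` stands in for TRACE, which Python's `logging` does not define.

```python
import logging
import os
from typing import Mapping

LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def level_hardcoded() -> int:
    """Mirrors the reported behavior: fixed in code, environment never consulted."""
    return logging.DEBUG


def level_from_env(env: Mapping[str, str] = os.environ) -> int:
    """Honors the user's setting, falling back to a quiet default."""
    name = env.get("APP_LOG", "WARNING").upper()
    return LEVELS.get(name, logging.WARNING)


user_env = {"APP_LOG": "error"}  # the user asked for errors only

print(logging.getLevelName(level_hardcoded()))         # prints: DEBUG
print(logging.getLevelName(level_from_env(user_env)))  # prints: ERROR
```

In the first function the user's configuration is never read, so no setting can quiet the sink. The second treats the environment as the source of truth and defaults to a conservative level when nothing is set.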

Even more glaring is a proportion.

Breaking down the retained logs by category, TRACE accounts for 70.7%, about 732.5 MB. Adding the two mirrored telemetry logs from codex_otel (log_only and trace_safe) takes up another 25.3%.

About 70% of the retained data is TRACE noise. Combined with the mirrored telemetry, 96% is pure noise nobody would read.

Only 4% is actually meaningful content.

Not the first report: at least the ninth

The reporter checked the Codex repository and found at least 9 issues of this "unbounded log growth" type.

#17320: WAL writes wildly during streaming responses, root cause identical to this one—TRACE ignoring RUST_LOG.

#24275: Desktop version logs_2.sqlite explodes.

#22444: WAL grows infinitely and doesn't free up space.

#26374: Writes 0.75GB per day, no rotation.

#27911: A 4KB goals_1.sqlite gets written at 11MB/s.

#20563: Disk writes wildly even when process is idle.

#27020: Disk at 100% activity on Windows.

The earliest trace leads back to #12969, the very PR that connected the SQLite feedback log sink at the TRACE level.

A 4KB database being written at 11MB/s is enough for a standalone article. Yet, both it and the 640TB issue are symptoms of the same product, the same telemetry system.

This shows that from the beginning, Codex's logging and telemetry systems lacked the concept of a "resource budget."

The entire field is competing over token budgets, context lengths, and model capabilities.

But almost no one asks: who manages the disk, memory, and CPU budget for an Agent that resides on a user's machine, running 24/7?

Fixed, but in a very OpenAI way

The issue was reported on GitHub on June 14. On June 23, the reporter posted an update: three PRs had been merged. According to his own Codex feedback, they reduced log writes by about 85%, so he closed the issue.

First, about that 85%—it's not 100%, and it's not fully deployed yet.

Of the three fixes, #29432 and #29457 were released with version 0.142.0, cutting out per-WebSocket logging and noisy targets. The third, #29599, stops another type of bridged redundant log and will be in version 0.143.0.

Even with all three in place, the remaining ~15% still amounts to ~96TB of writes per year, merely reducing it from "burning out a drive in a year" to "burning it out in six years."
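The post-fix figures follow from the same arithmetic. A minimal sketch using the article's numbers; the 85% reduction is the reporter's estimate, not a measured result:

```python
# Apply the estimated 85% reduction to the original yearly write volume.
WRITES_BEFORE_TB_PER_YEAR = 640   # pre-fix estimate (from the report)
REDUCTION = 0.85                  # estimated share of log writes removed
RATED_TBW = 600                   # endurance rating used for comparison

writes_after = WRITES_BEFORE_TB_PER_YEAR * (1 - REDUCTION)
years_after = RATED_TBW / writes_after

print(f"{writes_after:.0f} TB/year after the fixes")
print(f"{years_after:.2f} years to reach {RATED_TBW} TBW")
```

The remaining load works out to about 96TB per year, or a little over six years to exhaust a 600 TBW rating from logging alone.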

Some defended it: trace logs are stored by design for debugging, not a bug, and indeed help OpenAI track down edge cases.

But that's precisely the issue: using paying users' SSD endurance as free storage for the vendor's debugging—did the users ever agree to this?

In the programming battlefield, more than just SSDs are being burned through

Interestingly, Codex wasn't the only one singled out.

Comments quickly added: Claude Code also writes debug logs heavily to local storage. Some had to symlink the log directory to a RAM disk (tmpfs) to extend their SSD's life.

Both flagship products suffer from the same type of flaw.

Community discussion soon expanded from one bug to the broader quality issues of AI programming tools.

Some complained these agents keep the GPU maxed out and memory usage at 70GB+. Others simply coined a name for this generation of software: slopware.

The original reporter's suggestion was simple: give the app a limit and do not let it exceed 3GB. It took Codex nine issues and several months to even consider drawing that one line.
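What such a limit might look like in code is sketched below. The source does not say whether the 3GB refers to file size, daily writes, or something else; this sketch treats it as a cap on bytes written, and the class and method names are hypothetical.

```python
class WriteBudget:
    """Tracks bytes written against a fixed cap and refuses writes beyond it."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def try_spend(self, n_bytes: int) -> bool:
        """Record the write and return True if it fits, otherwise return False."""
        if self.used_bytes + n_bytes > self.limit_bytes:
            return False
        self.used_bytes += n_bytes
        return True


budget = WriteBudget(limit_bytes=3 * 10**9)  # 3GB, the figure quoted above

accepted = budget.try_spend(2 * 10**9)  # fits within the cap
rejected = budget.try_spend(2 * 10**9)  # would exceed the cap, so it is refused

print(accepted, rejected, budget.used_bytes)  # prints: True False 2000000000
```

A log sink that checks a budget like this before each write degrades by dropping low-value entries instead of silently consuming the drive.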

The question is: why would a company that constantly talks about "AGI" stumble on a problem even an intern could spot?

Why could this flaw hide for so long? One comment also hit the nail on the head.

A decade ago, setting logs to TRACE would have frozen the program instantly, and the flaw would have been fixed the same day. Today, CPUs are fast enough, memory large enough, and disks robust enough that such flaws are quietly absorbed by the hardware. The program runs, the interface works, and users feel nothing, until the SSD dies prematurely one day.

In recent years, software has been stuffed with AI-generated code. Features pile up, abstraction layers thicken, and resource consumption skyrockets, all barely propped up by hardware makers releasing faster chips every year.

Thus, an absurd cycle emerged: software gets worse, hardware gets more powerful. Users pay for new machines under the illusion that "it doesn't seem slower," when in reality, the new hardware is just barely supporting worse software.

A single small bug can't crush OpenAI, of course. But competition between Codex and Claude Code has already spread from model capabilities to the entry point of developers' workflows.

On this front, making quick changes and responding to developer needs was never a bonus—it's just the entry ticket.

References:

https://github.com/openai/codex/issues/28224

https://news.ycombinator.com/item?id=48626930

This article is from the WeChat public account "New Zhiyuan," author: ASI Revelation

Related Questions

Q: What is the core issue reported regarding OpenAI's Codex programming tool in the article?

A: The core issue is that Codex's SQLite feedback logging mechanism is set to TRACE level by default, causing it to perform an excessive number of database insert and delete operations. This results in an estimated 640TB of write operations to a user's solid-state drive (SSD) per year, which can quickly deplete or exceed the drive's total write endurance (TBW), potentially leading to premature drive failure.

Q: How does the article describe the 'insert-and-prune' mechanism causing the excessive disk writes?

A: The 'insert-and-prune' mechanism described in the article involves Codex continuously inserting new log entries into a SQLite database (logs_2.sqlite) and immediately deleting old ones to keep the file size constant (around 1GB). However, each insert and delete operation contributes to the total data written to the SSD. Even though the file size remains small, the constant churn of data within the file leads to massive cumulative write volumes on the physical disk over time.

Q: According to the article, why was this logging bug difficult for users to detect initially?

A: The bug was difficult to detect initially because it did not consume significant disk space. The logs_2.sqlite file maintained a relatively constant size (~1GB). The massive write volume was only observable by monitoring the drive's SMART health data (which tracks total bytes written), not by checking typical metrics like available disk space or file size.

Q: What was the community's reaction on platforms like Hacker News, as mentioned in the article?

A: The community reaction on platforms like Hacker News was highly critical. A high-rated comment labeled the issue as one of the most notorious examples of 'slopware' (slop software). There was frustration over the delayed response from OpenAI and criticism that such a fundamental resource management flaw could persist in a flagship product, with some comments calling for more competition in the AI coding assistant space.

Q: What broader software development trend does the article suggest this Codex bug exemplifies?

A: The article suggests this bug exemplifies a broader trend of declining software efficiency, often referred to as 'slopware.' Modern software, increasingly bloated with AI-generated code and complex abstractions, consumes excessive system resources (CPU, memory, disk I/O). This inefficiency is often masked by the rapid performance improvements in hardware, creating a cycle where users upgrade their hardware to compensate for poorly optimized software rather than demanding better code quality.
