Apple Re-invented Image Compression with AI: Same Quality, One-Third the File Size

marsbit | Published on 2026-05-30 | Last updated on 2026-05-30

Abstract

Apple’s PICO: An AI-Powered Image Codec That Cuts File Size by Two-Thirds at Equal Perceived Quality

In 2025, JPEG AI became the first international standard for learned image compression. However, it, like most codecs, still prioritizes mathematical metrics like PSNR over true perceptual quality, that is, what the human eye finds pleasing. Apple researchers have introduced PICO (Perceptual Image Codec), a neural codec designed to optimize for human perception. It tackles key practical challenges:

1) Speed: A novel "one-shot context model" accelerates entropy encoding without sacrificing compression efficiency.
2) Artifacts: A dedicated TextFidelity loss preserves text clarity, and a TilingArtifact loss eliminates color seams between image tiles processed in parallel.
3) Control: It avoids the "hallucinations" common in GAN-based perceptual models.

In a large-scale human evaluation (74,925 comparisons), PICO achieved the same perceived quality as standards like AV1, VVC, and JPEG AI while using only 30-43% of the bitrate. It also outperforms other learned perceptual codecs by 20-40%. Remarkably, it runs in 230ms (encode) and 150ms (decode) on an iPhone 17 Pro Max. While less efficient on synthetic graphics, PICO represents a significant shift from optimizing mathematical scores to directly targeting human visual experience, making high-quality perceptual compression practical for consumer devices. The work builds on expertise from WaveOne, whose team joined Apple and previously adv...

How small can an image be compressed?

In February 2025, the Joint Photographic Experts Group (JPEG) quietly announced a milestone celebrated within the industry: the official release of JPEG AI, the first end-to-end learned image coding international standard, which had been years in the making and was highly anticipated.

The news spread, with many researchers reposting on social media, adding comments like 'AI has finally entered the standards.'

The JPEG standard was born in 1992 and has been a fundamental language for digital images for over three decades. Now, artificial intelligence is starting to rewrite the grammar of this language.

However, behind the celebration lies a subtle reality: even JPEG AI still has considerable distance from true 'perceptual compression.'

Engineers know that traditional metrics like Peak Signal-to-Noise Ratio (PSNR) have little to do with what the human eye perceives as 'good-looking.' An image scoring high on PSNR might look mediocre to a person, while another image with lower PSNR might appear detailed and realistic. Optimizing mathematical metrics and optimizing for human perception are two entirely different things.
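
To make the gap between PSNR and perception concrete, here is a minimal sketch in plain Python. The toy 4x4 image and both degraded versions are made-up illustrations, not data from the paper. PSNR depends only on the mean squared error, so spreading a tiny error over every pixel and concentrating a larger error on one pixel of a "stroke" yield exactly the same score, even though a viewer may judge them very differently.

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for r, x in zip(ref_row, rec_row):
            squared_error += (r - x) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy 4x4 image (hypothetical): flat background with a bright "stroke" in row 1.
reference = [
    [100, 100, 100, 100],
    [100, 200, 200, 100],
    [100, 100, 100, 100],
    [100, 100, 100, 100],
]

# Version A: every pixel is off by 2 (mild, evenly spread error).
spread = [[p + 2 for p in row] for row in reference]

# Version B: one stroke pixel is off by 8, everything else is exact.
concentrated = [row[:] for row in reference]
concentrated[1][1] -= 8

psnr_spread = psnr(reference, spread)
psnr_concentrated = psnr(reference, concentrated)
# Both versions have a mean squared error of 4, so the two PSNR values are equal.
```

The metric cannot tell where the error sits, which is the core of the argument in this section.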

For decades, from JPEG to VVC and now JPEG AI, the design logic of almost all codecs has revolved around mathematical metrics. Perceptual compression, meaning direct optimization for the human visual experience, has remained a distant goal confined to academic papers rather than an engineering reality that could fit into a phone.

At this critical juncture, a team of engineers at Apple quietly published a paper with their answer, codenamed: PICO.

Paper Title: What Matters in Practical Learned Image Compression

Paper Address: https://arxiv.org/pdf/2605.05148

Why is 'Looking Better' Much Harder Than 'Scoring Higher'?

To understand PICO, one must first understand what image compression is actually doing.

Saving a photo as a file is essentially a problem of 'choosing what to forget and what to remember.' With limited storage space, some information must be discarded while making it as unnoticeable as possible to the viewer. Different codecs follow different 'discarding' rules.

Traditional codecs like JPEG, AV1, and VVC are manually designed rule-based systems. They divide the image into blocks, transform, quantize, and entropy code—each step based on decades of accumulated human expertise. These systems can perform excellently on mathematical metrics like PSNR, but their design is inherently oriented toward 'reducing pixel error,' not 'reducing visual discomfort for the human eye.'

The problem is that the human eye is not a pixel error meter. The human eye's sensitivity to texture, text, and detail is far more complex than mathematical formulas. When you compress a street scene photo heavily, the PSNR might still be respectable, but you might see blurred building edges or distorted text on street signs—precisely what the human eye detects first.

The emergence of learned codecs theoretically opened a new door: neural networks could be trained end-to-end directly for human perception, rather than for mathematical formulas. But before PICO, existing perceptual learned codecs were either too slow for practical use, lacked cross-device compatibility, or couldn't flexibly control bitrate, making them impossible to integrate into a consumer-grade product.

Three Core Problems, Three Solutions

The full name of PICO is Perceptual Image Codec. This name directly states its goal: to satisfy the human eye.

The research team systematically explored millions of model configurations and introduced several key technological innovations.

First Problem: Entropy Coding is Slow. What to Do?

A major challenge in image compression: to compress further, a codec needs an 'entropy model' to accurately estimate the information content of each pixel. The most accurate method is autoregressive coding: compressing each pixel requires looking at the surrounding already-compressed pixels for sequential prediction. It's like a chef checking the pot's state after adding each ingredient before deciding the next step. Accurate, but extremely slow.
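
The role of the entropy model can be sketched with a toy example. The ideal code length of a symbol is -log2 of the probability the model assigns to it, so a model that predicts well spends fewer bits. The symbol stream and both probability models below are hypothetical and chosen only for illustration; they are not PICO's models.

```python
import math

def ideal_bits(symbols, probability_of):
    """Total ideal code length: sum of -log2 p(symbol | previous symbol)."""
    total = 0.0
    previous = None
    for s in symbols:
        total += -math.log2(probability_of(s, previous))
        previous = s  # the next prediction depends on this symbol
    return total

# Hypothetical symbol stream with long runs, as smooth image regions tend to have.
stream = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

def static_model(symbol, previous):
    # Ignores context: both symbols are always considered equally likely.
    return 0.5

def context_model(symbol, previous):
    # Uses the previous symbol as context: repeats are assumed likely.
    if previous is None:
        return 0.5
    return 0.875 if symbol == previous else 0.125

static_bits = ideal_bits(stream, static_model)    # 1 bit per symbol, 16 bits total
context_bits = ideal_bits(stream, context_model)  # fewer bits on this stream
```

The context model wins on bits, but each prediction needs the previous symbol to be available first. That sequential dependency is the speed bottleneck of autoregressive coding described above.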

PICO's solution is the 'One-shot Context Model': it decouples the crucial 'scale parameter' of the entropy model and computes it all in a single forward pass, so the codec no longer has to wait on step-by-step sequential prediction; the other parameters can be computed in parallel. This retains the precision of autoregressive methods while circumventing their speed bottleneck. The result: removing this module degrades model performance by 10.28%, while including it leaves speed almost unaffected.
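
Why a scale parameter matters so much can be illustrated with a discretized Gaussian, a common choice of entropy model in learned codecs. This is an assumption made for illustration: the article does not say which distribution PICO uses, and the function names and numbers below are hypothetical. The sketch shows that the same symbol costs very different numbers of bits depending on the predicted scale, so predicting the scale well directly lowers the bitrate.

```python
import math

def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian with the given mean and scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def bits_for_symbol(value, mean, scale):
    """Ideal bits for an integer symbol under a Gaussian discretized to unit-width bins."""
    p = gaussian_cdf(value + 0.5, mean, scale) - gaussian_cdf(value - 0.5, mean, scale)
    return -math.log2(max(p, 1e-12))  # clamp to avoid log2(0)

# The symbol 0 is cheap when the model confidently predicts a narrow spread ...
bits_tight_scale = bits_for_symbol(0, mean=0.0, scale=0.5)
# ... and several times more expensive when the predicted spread is wide.
bits_loose_scale = bits_for_symbol(0, mean=0.0, scale=4.0)
```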

Second Problem: Perceptual Training Can Cause Hallucinations. What to Do?

Images trained with GANs (Generative Adversarial Networks) often 'look realistic,' but it might be a fabricated realism—hair strands turning into non-existent patterns, smooth surfaces gaining false textures. More troublesome, the human eye is extremely sensitive to text; even a slight distortion of a single letter is immediately noticeable.

The PICO team designed a dedicated TextFidelityLoss for text: an off-the-shelf text detector automatically finds text regions in the image, and the loss then applies strict pixel-fidelity constraints in those regions while suppressing the GAN's 'creative freedom' there. Experiments showed that adding this loss function halved the absolute error in text regions.
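
The idea of a text-restricted fidelity term can be sketched as a masked L1 error. This is a simplified illustration under stated assumptions, not the paper's loss: the exact formulation is not given in this article, the boxes stand in for the output of a text detector, and the function name `text_fidelity_l1` is hypothetical.

```python
def text_fidelity_l1(reference, reconstructed, text_boxes):
    """Mean absolute error over pixels that fall inside any text box.

    text_boxes holds (top, left, bottom, right) tuples with exclusive
    bottom/right edges, standing in for the output of a text detector.
    """
    total = 0.0
    count = 0
    for y, (ref_row, rec_row) in enumerate(zip(reference, reconstructed)):
        for x, (r, v) in enumerate(zip(ref_row, rec_row)):
            inside = any(
                top <= y < bottom and left <= x < right
                for top, left, bottom, right in text_boxes
            )
            if inside:
                total += abs(r - v)
                count += 1
    return total / count if count else 0.0

# Hypothetical 4x4 image with one detected text box covering row 1, columns 1-2.
reference = [[100] * 4 for _ in range(4)]
reconstructed = [row[:] for row in reference]
reconstructed[1][1] += 10  # error inside the text box: counted
reconstructed[3][3] += 50  # larger error outside the box: ignored by this term
text_boxes = [(1, 1, 2, 3)]

text_error = text_fidelity_l1(reference, reconstructed, text_boxes)  # 5.0
```

In training, a term like this would be added to the other losses so that text regions are pulled toward exact pixel fidelity while the rest of the image keeps its perceptual objective.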

Third Problem: Processing Images in Blocks Leaves Color Block Boundaries. What to Do?

To run fast on mobile phone chips, PICO divides images into 504×504 pixel tiles, processes them separately, and then stitches them back together. GAN training, however, tends to neglect low-frequency color, which often causes visible color discrepancies between adjacent tiles, much like a visibly bad stitch in photo editing. The research team introduced TilingArtifactLoss, a multi-resolution L1 loss that forces the model to maintain color consistency across multiple spatial frequencies. This measure reduced errors at tile boundaries by more than half.
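
A multi-resolution L1 loss can be sketched as an L1 error summed over an image pyramid. The version below is a minimal illustration, assuming 2x2 average pooling and three levels; neither detail is confirmed by the article. It shows why such a loss targets color drift: zero-mean high-frequency noise vanishes at coarser levels, while a uniform color shift is penalized again at every level.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized 2D lists."""
    n = sum(len(row) for row in a)
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def downsample2(img):
    """Halve the resolution by averaging each 2x2 block (even dimensions assumed)."""
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, len(img[0]), 2)]
        for y in range(0, len(img), 2)
    ]

def multi_resolution_l1(reference, reconstructed, levels=3):
    """Sum of L1 errors at full resolution and at successively halved resolutions."""
    total = 0.0
    for level in range(levels):
        total += l1(reference, reconstructed)
        if level < levels - 1:  # no need to downsample after the last level
            reference = downsample2(reference)
            reconstructed = downsample2(reconstructed)
    return total

reference = [[100] * 4 for _ in range(4)]
# High-frequency error: +4 / -4 checkerboard that averages out in every 2x2 block.
checkerboard = [[100 + (4 if (x + y) % 2 == 0 else -4) for x in range(4)]
                for y in range(4)]
# Low-frequency error: the whole tile is 4 levels too bright.
shifted = [[104] * 4 for _ in range(4)]

loss_checkerboard = multi_resolution_l1(reference, checkerboard)  # 4.0
loss_shifted = multi_resolution_l1(reference, shifted)            # 12.0
```

Both errors have the same full-resolution L1 of 4, but the uniform shift costs three times as much once the coarser levels are included.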

Experimental Results

The Apple team didn't rely solely on benchmark metrics. They commissioned a third-party platform, Mabyduck, to organize a large-scale human subjective evaluation.

The evaluation used a blind pairwise-comparison method: 610 screened evaluators (each required to pass color-blindness and compression-artifact detection tests) compared reconstructions of the same image produced by different codecs, and the results were aggregated into a Bayesian Elo score. A total of 74,925 pairwise comparisons were collected.
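
How pairwise votes turn into a rating can be sketched with the classic Elo model. Note the swap: this is a plain sequential Elo update, not the Bayesian Elo fit used in the study, and the codec names, votes, starting ratings, and K-factor are all hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(ratings, winner, loser, k=16.0):
    """Apply the result of one pairwise comparison in place."""
    expected = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - expected)  # an upset win moves the ratings further
    ratings[winner] += delta
    ratings[loser] -= delta

ratings = {"codec_a": 1500.0, "codec_b": 1500.0}

# Hypothetical votes as (preferred, other): codec_a wins 3 of 4 comparisons.
votes = [
    ("codec_a", "codec_b"),
    ("codec_a", "codec_b"),
    ("codec_b", "codec_a"),
    ("codec_a", "codec_b"),
]
for preferred, other in votes:
    update(ratings, preferred, other)
# After these votes, codec_a is rated above codec_b.
```

A Bayesian variant estimates all ratings jointly from the full set of comparisons and attaches uncertainty to each rating, rather than processing votes one at a time.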

The final numbers tell the story: at the same visual quality, PICO's file size is only about one-third to less than one-half that of AV1, AV2, VVC, ECM, and JPEG AI. In other words, to store the same image, it requires only 30%-43% of the bits needed by these standards. Compared to the strongest existing perceptual learned codecs (HiFiC, MRIC, etc.), PICO also saves 20%-40% in file size.
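
The relationship between "30%-43% of the bits" and the size reduction is simple arithmetic, sketched below. The 3000 KB baseline file is a hypothetical example, not a figure from the paper.

```python
def savings_percent(relative_size_percent):
    """Percent of bits saved when a file is relative_size_percent of the baseline."""
    return 100 - relative_size_percent

best_case_savings = savings_percent(30)   # 70
worst_case_savings = savings_percent(43)  # 57

# Hypothetical baseline: a 3000 KB file from a conventional codec.
baseline_kb = 3000
smallest_kb = baseline_kb * 30 // 100  # 900
largest_kb = baseline_kb * 43 // 100   # 1290
```

So needing 30%-43% of the bits corresponds to a 57%-70% reduction in file size at equal perceived quality.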

In terms of speed, on an iPhone 17 Pro Max, PICO encodes a 12MP photo in just 230 milliseconds and decodes in 150 milliseconds. Most top-tier ML codecs running on NVIDIA V100 server GPUs are slower than this.

Notably, the paper also specifically recorded a 'counterexample': on the traditional PSNR metric, PICO performed only averagely, falling behind even DCVC-RT and VVC. This perfectly illustrates the team's fundamental judgment: optimizing perceptual quality and optimizing mathematical metrics are inherently two different directions; you cannot have your cake and eat it too.

A Milestone, Not the Finish Line

PICO certainly has limitations. The paper acknowledges that for highly regular synthetic images like cartoons or schematic diagrams, PICO's compression efficiency is inferior to traditional codecs, as such content is inherently more suitable for rule-driven autoregressive modeling than perceptual generation.

But these limitations do not diminish the significance of this work.

For the past thirty years, technological progress in image compression has almost exclusively occurred on the track of 'making the numbers look better.' From JPEG to HEVC to VVC, engineers optimized metrics like PSNR and SSIM generation after generation. Human visual perception remained a 'difficult problem' that was circumvented.

PICO is the first time someone has systematically and directly tackled this difficult problem: from architecture search and loss function design to large-scale human subjective evaluation, culminating in a codec that can run in real-time on a mobile phone.

The next time you share a photo from an Apple device, you might not notice anything different. But perhaps within that quiet compression process, an algorithm tailored for human perception is deciding which information is worth keeping and which can be quietly forgotten.

The Team: From WaveOne to Apple

The corresponding author of this paper is Oren Rippel, an Apple researcher and a familiar face in the compression field.

His name first gained widespread attention in 2017. At that time, he was at the startup WaveOne, publishing a paper titled 'Real-Time Adaptive Image Compression,' using neural networks to outperform all mainstream codecs while maintaining real-time speeds. That paper caused significant waves in academia and established Rippel's standing in the field of learned compression.

Afterwards, the same core personnel continued their work at WaveOne, introducing ELF-VC for video compression, achieving a 44% bitrate saving compared to H.264 on the UVG video test set while running over five times faster than similar ML codecs.

This team from WaveOne later joined Apple as a group. PICO is their first systematic answer on perceptual image compression, backed by Apple's computing power and platform resources.

This article is from the WeChat public account "Almost Human" (ID: almosthuman2014), author: Compression is Intelligence
