Investment Philosophy of Gavin Baker, an Early Nvidia Investor: Long AI Infrastructure Bottlenecks, Short Overall Market Risk

marsbitPublished on 2026-05-30Last updated on 2026-05-30

Abstract

Gavin Baker, an early investor in Nvidia and founder of Atreides Management, outlines his investment philosophy: going long on AI infrastructure bottlenecks while hedging against broader market risk. He argues AI is not a bubble but a supercycle driven by constraints in power, wafers (semiconductors), and compute efficiency (tokens per watt). True alpha, he believes, lies not in application-layer companies like OpenAI but in "picks and shovels" providers—companies solving physical bottlenecks in GPU connectivity (e.g., Astera Labs), memory (Micron), inference chips (Cerebras, Positron), advanced manufacturing (TSMC, ASML), and energy supply. His portfolio reflects this barbell strategy: concentrated bets on key infrastructure players alongside a significant put position on the QQQ ETF to hedge overall market downside. Baker contends this cycle differs from the dot-com bubble because demand is fueled by the strong balance sheets of hyperscalers (Google, Meta, Amazon, Microsoft), not debt, and physical supply constraints (e.g., chip manufacturing capacity) prevent runaway overinvestment. He highlights the growing importance of inference (vs. pre-training), vertical/small language models, sovereign infrastructure deployment speed, and the convergence of energy and space (e.g., orbital compute). His long-term view is that performance-per-watt and token cost reduction will dictate winners as AI scaling hits fundamental physical limits.

This episode of the podcast mainly discusses the investment philosophy of Gavin Baker, founder of Atreides Management and a long-term investor in Nvidia and Cerebras.

His core judgment is that AI is not a bubble, but a supercycle in infrastructure driven by electricity, wafers, and computing power; the real alpha isn't in large models or chatbots, but in the "pick-and-shovel" areas like GPU interconnects, memory, inference chips, advanced process nodes, and power supply.

Gavin Baker hedges against overall market pullbacks with QQQ puts while concentrating his bets on AI physical bottleneck assets like Astera Labs, Unity, Micron, Nvidia, Cerebras, and Positron.

He reframes the "AI bubble" debate from sentiment to supply-demand constraints, arguing that as long as TSMC, ASML, high-bandwidth memory, and the power grid cannot quickly become oversupplied, AI capital expenditure might not replay the 2000 internet bubble.

Key Quotes

AI Bubble or Supercycle

· "AI is not in a bubble; on the contrary, it is in a supercycle."

· "The biggest returns are not in SaaS, not in chatbots like OpenAI or Anthropic, but in electricity, compute, and silicon manufacturing."

· "This is not the internet bubble because the buyers are the world's smartest, cash-strongest companies; they are not buying compute on debt leverage."

· "If the entire market cannot be oversupplied, it's hard for it to suddenly crash like a traditional bubble."

The Real Bottlenecks: Electricity, Wafers, Tokens

· "Gavin's theory is simple: just look at the bottlenecks in the AI infrastructure layer. Whoever improves performance per watt and lowers token cost has value."

· "AI labs are increasingly concerned about one thing: how many tokens can be generated per watt of electricity."

· "Electricity and wafers are two brick walls, and also the two key constraints limiting AI's acceleration."

Shift from Pre-training to Inference and Post-training

· "Finishing model pre-training doesn't mean it's a genius for life; it still needs to absorb new information in the post-training phase."

· "Inference inherently requires massive computation, which is why inference chips and infrastructure will be the focus of the next phase."

· "The cost or revenue opportunity from inference alone could be 5 to 10 times that of pre-training compute investment."

Vertical Small Models, On-device Models, and Sovereign Infrastructure

· "In the future, you might not interact with Claude daily; what you truly need might be a personalized AI agent trained on your own data."

· "The speed of infrastructure deployment itself is a moat. The iteration speed in the digital world is far faster than the construction speed of physical infrastructure."

"Whoever can compress physical deployments that take months or years into weeks can command a high price in AI infrastructure."

Gavin's Investment Style: Long Bottlenecks, Short Overall Market Risk

· "He strongly believes AI winners will emerge, but that doesn't mean he's optimistic about the whole market; QQQ puts are his hedge against overall downside risk."

· "TSMC actually limits the speed at which the bubble can accelerate; as long as chip capacity cannot expand instantly, capital expenditure is less likely to spiral out of control."

· "Gavin is like an older, more stable, more cycle-tested Leopold: the former's success is measured in decades, the latter's more in quarters for now."

Assets Worth Betting on in the AI Supercycle

EJ:Gavin Baker is an incredibly prolific but largely unheard-of AI investor. Over the past 20 years, he has invested in companies that later became household names in AI before they went mainstream. He made early bets on Nvidia (the core supplier of AI GPUs and accelerated computing) and Cerebras (an AI chip company), and holds a very clear view: AI is not a bubble, but rather a supercycle.

He believes that by observing watts, wafers, and tokens—the underlying infrastructure of AI—one can identify key bottlenecks and constraints. His conclusion is simple: the biggest returns in AI come from electricity, energy, and silicon manufacturing, not SaaS, and not from chatbots like Anthropic or OpenAI.

The entire industry ultimately funnels down to semiconductors, the "pick-and-shovel" assets supporting the whole AI sector.

While many claim the AI industry is already a bubble, he sees it as a generational buying opportunity, especially for AI infrastructure. He expresses this thesis with a fund size of around $4.1 billion.

If you listen to his discussion of these constraints, especially AI infrastructure, it sounds familiar. We've covered an investor named Leopold Aschenbrenner multiple times on this show, who has also positioned his portfolio around similar themes. The difference is Leopold has done it for about 3 years, while Gavin has done it for over 20.

Leopold's assets under management are roughly three times Gavin's, but the show's producer Luke once made a great point: you might beat Warren Buffett in a single year, but can you beat him for decades? Gavin Baker's track record suggests he might have a different perspective on this investment thesis.

For those unfamiliar with Gavin Baker: he is the founder of Atreides Management and has been investing in Nvidia for 20 years. Simply holding Nvidia for 20 years and still working is incredible in itself, as that should have generated phenomenal returns.

His recent wins include Cerebras and Astera Labs (an AI data center interconnect chip company). Cerebras is an AI chip company; the episode mentions its post-IPO valuation is staggeringly high. There are also some lesser-known companies we'll explore in this episode as we follow his portfolio and judgments to see where he thinks the real AI investment opportunities lie.

So the question becomes, what exactly does he invest in, and why? Looking at Atreides Management's recent 13F filing, the fund has about $4 billion AUM. Breaking down its largest holdings reveals companies all pointing to the AI bottlenecks Gavin often mentions.

He has significant positions in some unsexy companies many haven't even heard of. For instance, Astera Labs constitutes nearly 9-10% of the fund. You can think of Astera Labs as the connective layer between GPUs.

If you imagine a data center as a system, GPUs are the engines, handling model pre-training, post-training, and inference. But for GPUs to work, they need to transfer massive amounts of data between each other and access memory chips storing data.

This requires a "plumbing system." I'm explaining at a high level here, as I don't pretend to understand all the low-level details. Astera Labs solves exactly this problem. When AI clusters scale to hundreds of thousands of chips, the bottleneck is no longer just the GPU itself, but the data transfer window—how to send the right data, access the right data, at the right time. Astera Labs builds this plumbing system.

I hadn't heard of Astera Labs before researching for this episode. But I recall it was similar with Cerebras. Gavin talked about Cerebras about six months ago, which is a long time in AI terms. It later IPO'd, and the episode mentions its valuation at around $60 billion, rising another 40% post-IPO. This suggests Astera Labs could be another important name in this trend.

Josh:Cerebras was a very early investment for him. He got into Cerebras very early in its life cycle, meaning he's been betting on this thesis for years. Several other companies are also his long-term bets, with the flagship being Nvidia, of course.

Being involved with Nvidia for over 20 years and maintaining conviction is impressive. I recently listened to two podcasts featuring Gavin. When discussing his Nvidia position, he clearly expressed a belief that Nvidia can maintain its current margins and demand. This implies he thinks Nvidia has a path toward a nearly $10 trillion market cap; it's currently only about halfway there.

Another worth mentioning is Micron. We covered the AI investment stack and these companies' positions in the last episode; I highly recommend revisiting it. Micron is one of the largest memory makers.

The episode mentions a staggering number: its market cap was under $100 billion a year ago, and at the time of recording, it had surpassed $1 trillion—a 10x increase in a year. That shows how important the memory problem is.

There are also less obvious but interesting companies. EJ, one I particularly want to mention: Unity Software. Gamers know Unity as a game engine used to create many popular games.

Why would an AI investor bet on Unity, this "video game thing"? The answer is 3D game engines. Unity is a world model builder with deep understanding of physics, how the world works, materials, and lighting.

When AI companies build AGI and humanoid robots, a crucial step is simulating virtual environments and datasets for robot training. Unity happens to be one of the strongest tools for this.

So as a world model maxi, you should appreciate this example: a company famous as a game engine has a clear path to becoming a significant player in the AI world.

Gavin's Investment Thesis and Strategy

EJ:The theory of world models is simple: current AI models or LLMs understand the world primarily through text and books, like a student sitting in a library without real-world experience.

World models aim to unlock precisely that: putting a game character in a simulated environment to understand how physical reality works.

For example, what happens if I drop my phone or kick a ball? What are the next steps? What should you do? World models solve this problem.

There aren't many players capable of doing this at scale currently. The leader might be Google with projects like Genie 3. The episode also mentions Google's recent release of Gemini Omni, but such models haven't had their true ChatGPT moment yet.

What I like about Gavin is that his portfolio resembles a barbell strategy. One side is traditional: people need GPUs, memory, so he bets on the biggest players like Micron and Nvidia. The other side is avant-garde: he thinks the puck is going there, so he invests in Cerebras (believing inference will be crucial), and Unity (believing world models will be the future for training robots and next-gen LLMs).

His portfolio also includes Positron, which makes inference chips. If that sounds similar to Cerebras, yes, they both focus on inference. Gavin has repeatedly discussed a trend in recent interviews: the AI model infrastructure stack, especially the training stack, is shifting from pre-training to placing greater emphasis on post-training.

If you're in the AI space, you know this shift is already happening. Gavin is intensely focused on this. A model still needs to understand new information, new data, and update itself. Just because it completed pre-training on a dataset doesn't mean it's a genius for life. It needs to learn new information, which happens in the post-training layer, and that requires significant computation.

Secondly, if you need an AI model to truly think through problems—like we do when receiving new information, wondering if an angle holds or if another theory explains it—that's reasoning. Reasoning also requires massive computation. Current estimates suggest the cost or revenue opportunity from inference alone could be 5 to 10 times that of pre-training compute investment.

So a major shift is underway in AI labs and chip makers. You've already seen Nvidia launch many inference-focused GPUs to support agentic applications. Gavin is also expressing his bet on inference through a series of investments.

A final interesting point is Gavin's take on China. In the AI race, the narrative has often been China vs. the US. China has a unique configuration: relatively abundant energy and the capacity to expand chip manufacturing. The US currently struggles here, which is why many parts are outsourced to Taiwan's TSMC.

His explanation is that China has a unique opportunity to create a type of AI infrastructure or chip very different from the US, as they will be intensely focused on inference. You could say Gavin is leading the charge in betting on US inference infrastructure buildout through his US investments. I think this could be a huge opportunity in the future.

Josh:It's worth noting that this bet isn't only on the upside. He also holds a large position in QQQ puts. QQQ is an ETF tracking the Nasdaq 100, a basket of stocks and the second-largest ETF by trading volume in the US. It has performed exceptionally well: up 55% in 2023, 25% in 2024, 20% in 2025, and 17% year-to-date in 2026.

In other words, QQQ as an index fund has performed very well; it's easy to buy as a basket of the top 100 stocks. And Gavin is hedging against it. He's not saying AI won't win, but rather: he wants to invest in the critical makers solving bottlenecks, while not appearing overly optimistic about overall market sentiment.

QQQ puts are downside protection: if the overall market crashes adversely, even if AI wins long-term, he has this hedge.

Four Types of Investment Areas

Josh:We can categorize the investment bottlenecks he deems most important. The first category is verticalized small language models.

General LLMs like Claude and ChatGPT are generalized LLMs with broad world understanding, answering specific questions. But training a model around a specific vertical or problem is different.

These specific problems often exist within enterprises, especially those deeply focused on a particular issue or companies with a niche in a specific sector. Verticalized SLMs address this: they are frontier models but highly optimized to run efficiently on specific enterprise data or run locally on devices.

We've talked about on-device or locally run models before. The reason is your phone or other devices contain a lot of highly personal data you might not want to hand over, and companies might not have access to—like medical records or financial details.

I saw OpenAI released a financial AI agent that can access your bank account but can't actually act on your behalf due to personally identifiable information like social security numbers, bank details, etc.

Local models or SLMs can solve such problems. Gavin heavily bets these will become important in the future. One company he's very bullish on: Apple. While he might not have expressed explicit investment interest, he believes Apple will be a major device maker enabling local models to run on devices.

If that's the future, we might no longer think Claude must be the model you interact with daily. You might need a personalized AI agent trained on your own data—that's what an SLM could ultimately become.

A general version could run on your phone, and many enterprises will run highly optimized, specialized models trained on their proprietary data to better sell or market products.

EJ:Apple is perfectly positioned for this. I'm looking forward to WWDC, which is coming soon.

Josh:Yes.

EJ:Apple's developer conference is only weeks away. They will release new AI software and how it integrates with hardware. This will be very important; we'll continue covering it. I'm eager to discuss it.

Josh:The second pillar is sovereign infrastructure. We often say bits move much faster than atoms. Looking at AI infrastructure makes this clear: model quality improves almost exponentially; the intelligence per watt, per token, only goes up.

But the speed of physical deployment hasn't increased at a similar pace, and that itself is a moat. Hardware is incredibly complex, transistor precision approaches atomic levels; deploying at scale in a world where existing infrastructure is already stressed isn't easy. After EV acceleration, the power grid is under greater pressure, nearing capacity in many places. Now AI brings the energy problem and chip problem.

Gavin strongly bets on the fact that infrastructure is hard; building takes many days, months, even years. He's betting on those who can compress this cycle to weeks. So, the speed of physical deployment itself is a moat. He's narrowing his focus to find companies that can deploy quickly.

The first example that comes to mind is SpaceX and the speed at which they built Colossus and leased it to Anthropic, potentially to other companies in the future. This infrastructure pillar is a key focus for Gavin.

Looking at Leopold's portfolio, this is also a core part. The reality is: building things is very hard, and those who can build can charge a high price. The episode mentions SpaceX's biggest revenue stream now is leasing data centers, not rockets. That shows how important this pillar is.

EJ:He cares about speed, but also cost. He repeatedly mentions a metric: performance per watt. What he's really saying is that AI labs are increasingly concerned about how many tokens can be generated per watt.

If you consider that this year alone, about five companies are spending tens of billions or even trillions on GPUs, compute, and the electricity to power them, you definitely want a high bang for buck. Especially as hyperscalers expand to this scale, cost becomes central.

A hypothetical: If asking Claude a question costs 2 cents and ChatGPT costs $1, even if Claude is only 95% as intelligent, I'd probably use Claude. I can ask more times and eventually get an answer at a lower cost.

So the cost of accessing this intelligence is crucial. Just this week, Microsoft and Uber announced they're actually reducing usage of Claude Code because their annual budgets were exhausted in about four months.

You can see this in Gavin's portfolio: Cerebras, Positron, Astera Labs. He identifies very specific infrastructure bottlenecks and makes a simple bet: if this company solves this bottleneck, achieving a certain performance per watt and lowering token cost to a certain level, then AI labs will buy more GPUs, more of this product.

So his theory is actually simple, despite the complex technology: I only look at bottlenecks in the AI infrastructure layer. If I can find a company improving performance per watt and making tokens cheaper, I bet it will be valuable in the future, either via IPO or high-priced acquisition.

Josh:In this part, if someone wants to replicate Gavin's trades, they should know a few names: Astera Labs, Cerebras, SiFive, and Positron. These four companies are key in this segment.

The fourth and final direction is the combination of energy and space. As mentioned earlier, the terrestrial grid largely limits energy supply, and building new energy is very hard. The episode mentions a statistic: about 40% of new data centers face strong opposition, with people lobbying, protesting, not wanting them built.

There are two types of solutions. One is creating out-of-the-box energy, portable energy. You can bring the data center and power it with a small energy unit. Blue Marble, favored by Leopold, falls into this category.

The other is orbital compute, which Gavin is now very focused on. The biggest, most central company here is, of course, SpaceX. It's the only company capable of being the highway to space, sending payloads into orbit, racks and data centers into low Earth orbit, and generating enough intelligence and power to transmit back.

I think SpaceX's significance extends beyond SpaceX itself. I'm somewhat surprised Gavin's portfolio doesn't have more space stock exposure, given he sees it as a huge industry. Perhaps the reality is it's still too early, and SpaceX is the linchpin unlocking the industry.

Next, closely watch the Starship V3 launch. We just saw a Starship launch last week, performing well. If Starship doesn't truly become operational, there's no space energy, no racks to orbit. It's a necessary condition because the payloads needing launch are massive. So SpaceX is a must-watch company, though many second-order companies will be affected.

Why Isn't This Another Internet Bubble?

Josh:Next, people will inevitably ask: why isn't this just another dot-com bubble? Gavin has been asked this many times and gives a very strong answer; I largely believe him, as his argument is compelling.

His logic roughly goes: the 2000 internet bubble was debt-fueled. Many people borrowed heavily to invest in unproven theories and products nobody truly used or cared about.

Comparing it to what Gavin calls the AI supercycle: just OpenAI and Anthropic are projected to reach $200 billion in ARR this year. This isn't made-up money; it's money already contracted, with a large portion—reportedly 40-60% in the episode—prepaid by enterprise and retail customers.

In other words, real money is flowing.

Looking at GPU compute, not the model labs but who's buying from Nvidia: Google, Microsoft, Amazon, and Meta are paying from their cash reserves, not borrowing. Amazon just reached the end of its free cash flow; if they start borrowing, we can worry. But the key point is they aren't leveraging up currently.

Moreover, these are among the world's top five companies, arguably some of the smartest given their market cap, scale, and position. Compared to the internet bubble, where countless unknown companies raised tons of money and burned it unreasonably, this cycle sees the world's smartest companies spending non-leveraged money.

Recent quarterly reports we discussed on the show also show profits are being optimized around these moves, and models are still improving, getting smarter. So Gavin's core argument is: this isn't the internet bubble because it's not driven by levered money; and the bottlenecks we discuss are constrained by physical atoms.

Buying a bunch of memory chips and GPUs is one thing, but Nvidia can't oversell GPUs, and Micron can't oversell AI memory chips because they don't have enough chip-making facilities. So his simple point is: if you can't oversupply the entire market, it's not a bubble. We're limited by not having enough picks and shovels to do it, and he's investing in precisely those things.

Another great point: Gavin believes if TSMC could supply, Nvidia could have sold $2-3 trillion worth of GPUs this year and next. In other words, TSMC is a key element within the bubble boundary.

The reason is, if TSMC could meet these companies' demands, providing that many chips, it would consume enormous capital. Currently, looking at charts, there isn't a large divergence between CapEx and operating cash; companies still generate enough cash to support building.

But if TSMC told Nvidia tomorrow they could triple capacity overnight, Nvidia wouldn't refuse; it would start spending huge sums on chips. Other companies would be forced to borrow to buy these chips, and then the CapEx bubble would start growing, diverging from operating cash flow.

But because there are supply constraints at every step—memory constraints, chip manufacturing constraints, energy constraints, especially TSMC's constraints on advanced chips—we simply can't accelerate building that fast. Therefore, TSMC is blocking bubble acceleration.

As long as TSMC's chip capacity remains limited, as long as Samsung and other chip makers don't surpass their market share, growth remains relatively sustainable. It looks fast, but there's still massive unmet demand because we simply can't build fast enough. As long as this dynamic holds, I think we're okay for now.

EJ:Another point: you can't assume demand stays static because it won't. AI-related demand is growing exponentially, faster than the production supply of these chips.

I can think of only two ways to disprove this thesis. First, someone miraculously replicates ASML, suddenly creating many ASML competitors. For those unfamiliar, ASML makes machines worth about $400 million that TSMC and all major chip fabs need.

The episode says ASML has only one team in Norway building these, with very long cycles, and order backlog extends about five years.

Second, we create a completely different type of LLM that doesn't require so many GPUs or memory. But we see no signs of that currently.

I saw news about SK Hynix today. It's the top memory manufacturer and supplier for Nvidia GPUs, almost the top dog in AI memory.

It's reportedly receiving offers of around $50-100 billion from Google and Microsoft to lock in future three-year supply, to pay for equipment needed for expansion.

This shows how hungry these giants are for memory, and this is just one sub-sector within AI components. SK Hynix reportedly said: I don't want to give you supply guarantees; I'll just raise prices. Its operating margin is about 70%, almost unbelievable in semiconductors.

So Gavin going all-in makes sense. It doesn't look like a bubble, though the market might react that way short-term. Opening our stock portfolios before recording today, they were almost all down, but that's more a reactionary move.

The directional target is clear: we will only need more GPUs, more semiconductor chips, and supply isn't enough, nor are manufacturers.

Gavin's Investment Portfolio

Josh:The conclusion is: electricity and wafers. Just those two. They are two brick walls, two limiting factors preventing us from accelerating too fast. As long as electricity and wafers remain valuable, in strong demand, and supply-constrained, there are good days ahead.

If you want the TLDR of Gavin's portfolio, I can read his largest holdings. Again, this is not investment advice. This is what Gavin holds, not what we hold. I don't know if these stocks will go up, down, or sideways.

His largest position is somewhat counterintuitive: the QQQ put position. Overall, he is bearish on the market, which is noteworthy. Second largest is Astera Labs, about 7.4% of the portfolio, ticker ALAB. Third is Unity, the 3D software company.

There are many more: Ciena, Micron, Nvidia, Amazon, Lumentum, Alphabet, Coherent, Roblox, EchoStar, Twilio, Wayfair. This guy invests in everything.

If you're interested, check out his 13F. That's Gavin's view: bottlenecks are electricity and wafers. As long as these constraints exist, it's basically a one-way ride up. EJ, how are you absorbing this info? What would you do with it?

EJ:Since Leopold's 13F came out, the market has been turbulent. Recording this episode, I increasingly realize Gavin is like an older, wiser Leopold. He's been in this industry a long time. Maybe he doesn't have $13 billion AUM, but I feel he'll still be here in 10 years.

If you're listening and thinking, "I don't want to chase AI progress every minute, hour, day; I just want to put money somewhere and watch it grow over the coming months or years," then Gavin's portfolio might be a good reference. Of course, this is not investment advice.

He adopts a more cautious, long-term, forward-looking approach. If his trend judgments materialize, as with his early bets on Nvidia and Cerebras, there could be exponential returns in the coming years. But it all hinges on his core view: we are not in a bubble.

I'm curious if listeners agree. Obviously, most people aren't as technical or deep into the weeds as Gavin. But after listening, do you think we're in a bubble or not? What are the reasons for and against? Did we miss anything? Josh, before we end, do you think we're in a bubble now?

Josh:I think we are certainly in a bubble. The question is, which stage of the bubble are we in? That's debatable. It looks more like an early stage now, so hopefully it remains that way. According to Gavin, as long as TSMC continues limiting chip capacity, we're okay.

That's the overall outlook. We've covered Leopold, whose success is measured in quarters for now; now we cover Gavin, whose success is measured in decades. Many people's own answers might lie somewhere in between.

Related Questions

QWhat is the core of Gavin Baker's investment philosophy regarding AI, according to the article?

AThe core of his philosophy is to be long on the bottlenecks of AI infrastructure (the 'picks and shovels' like power, wafers, and connectivity) and to hedge against overall market risk by being short via instruments like QQQ puts.

QWhich two physical constraints does Gavin Baker identify as key bottlenecks limiting the acceleration of AI development?

AHe identifies power (electricity) and wafers (silicon/semiconductor manufacturing) as the two key physical constraints or 'brick walls' limiting AI's acceleration.

QWhy does Gavin Baker believe the current AI boom is not a bubble like the dot-com bubble?

AHe argues it's not a debt-fueled bubble. The primary buyers (like Google, Microsoft, Amazon, Meta) are using their own cash reserves, not leverage. Furthermore, physical supply constraints (in chips, power, memory) prevent the market from being oversupplied, which is a hallmark of traditional bubbles.

QName one specific company in Gavin Baker's portfolio that addresses the AI infrastructure 'bottleneck' of data connectivity between GPUs.

AAstera Labs (ALAB) is a company in his portfolio that builds the 'piping system' or connectivity layer for data transmission between GPUs in large AI clusters.

QWhat major shift in AI model infrastructure does Gavin Baker emphasize, and what type of computing does this shift favor?

AHe emphasizes the shift from a primary focus on pre-training models to a greater focus on post-training and inference. This shift favors inference chips and infrastructure, as the cost/opportunity from inference could be 5 to 10 times that of pre-training compute.

Related Reads

Apple Re-invented Image Compression with AI: Same Quality, One-Third the File Size

Apple’s PICO: An AI-Powered Image Codec That Cuts File Size by Two-Thirds at Equal Perceived Quality In 2025, JPEG AI became the first international standard for learned image compression. However, it, like most codecs, still prioritizes mathematical metrics like PSNR over true perceptual quality—what the human eye finds pleasing. Apple researchers have introduced PICO (Perceptual Image Codec), a neural codec designed to optimize for human perception. It tackles key practical challenges: 1) Speed: A novel "one-shot context model" accelerates entropy encoding without sacrificing compression efficiency. 2) Artifacts: A dedicated TextFidelity loss preserves text clarity, and a TilingArtifact loss eliminates color seams between image tiles processed in parallel. 3) Control: It avoids the "hallucinations" common in GAN-based perceptual models. In a large-scale human evaluation (74,925 comparisons), PICO achieved the same perceived quality as standards like AV1, VVC, and JPEG AI while using only 30-43% of the bitrate. It also outperforms other learned perceptual codecs by 20-40%. Remarkably, it runs in 230ms (encode) and 150ms (decode) on an iPhone 17 Pro Max. While less efficient on synthetic graphics, PICO represents a significant shift from optimizing mathematical scores to directly targeting human visual experience, making high-quality perceptual compression practical for consumer devices. The work builds on expertise from WaveOne, whose team joined Apple and previously advanced neural video compression.

marsbit1h ago

Apple Re-invented Image Compression with AI: Same Quality, One-Third the File Size

marsbit1h ago

Shanghai's Leading Large Model Company Initiates A-Share Listing

Shanghai-based AI large language model leader MiniMax has initiated the process for an A-share listing in China, having filed a pre-IPO tutoring report with the Shanghai Securities Regulatory Bureau on May 29. This move positions it to compete with Zhipu AI for the title of the first major domestic LLM company to list on the A-share market. Having already completed an IPO in Hong Kong in January 2026, MiniMax's stock price has surged approximately 409% since its debut, with its market capitalization reaching around HK$263.45 billion (approximately RMB 227.55 billion) as of May 29. The company's rapid growth is supported by strong business performance. Its Annual Recurring Revenue (ARR) has grown over 100% in the past two months and now exceeds $300 million. It serves over one million global enterprise and developer clients and has around 300 million users worldwide. For the full year 2025, MiniMax reported revenue of $79.038 million, with a gross margin of 25.4%. While it reported an adjusted net loss of $250 million, the loss rate has narrowed significantly year-over-year. On the product front, MiniMax has released several flagship models this year, including MiniMax-M2.5, M2.6, and M2.7, with the first and last being open-sourced. Its models gained significant traction earlier in the year, briefly becoming the top model provider by usage share on the OpenRouter platform in February. The company has also upgraded its AI agent product, now named Mavis, and is preparing to launch its next-generation MiniMax-M3 model. Technical previews indicate M3 will feature a novel "MiniMax Sparse Attention" mechanism, promising substantial improvements in inference speed. MiniMax's push for an A-share listing reflects a broader trend among China's leading AI firms, including Zhipu AI, Moonshot AI, StepFun, and 01.AI, to seek public listings. This strategy aims to secure broader financing channels to support the immense computational costs and ongoing commercialization efforts inherent in developing advanced large language models.

marsbit1h ago

Shanghai's Leading Large Model Company Initiates A-Share Listing

marsbit1h ago

Trading

Spot
Futures
活动图片