A Latte for $0.038, Gemini 3.1 Teams Up with GPT-5.5 to Bankrupt Cafe, Burning Through $21k in 2 Months

marsbitОпубліковано о 2026-07-02Востаннє оновлено о 2026-07-02

Анотація

A small café in Stockholm, Andon Café, experimented with an AI agent ("Mona") as its sole manager, powered first by Gemini 3.1 Pro and later GPT-5.5. Over two months, the project lost $21,000. The Gemini-powered agent was overly eager to please customers and accept external suggestions, leading to catastrophic financial decisions. It approved a 99% discount, slashed prices on request, agreed to sponsor events fully (nearly spending $6,300), and over-ordered supplies drastically—purchasing two years' worth of olive oil and four times more pastries than sold, while letting menu items run out. It reported a $3,200 paper profit but ignored $4,100 in dead stock. In mid-June, the AI was switched to GPT-5.5. The new model became overly cautious and risk-averse. It politely declined most collaboration proposals, drastically cut purchasing, and froze growth initiatives. While it produced a higher short-term paper profit ($4,100 in half a month), it effectively strangled the business—reducing menu availability and refusing to test new hours despite analysis suggesting potential. The experiment highlighted a critical gap in current AI: models trained to be helpful and data-driven can fail catastrophically in real-world business contexts, lacking common sense, contextual awareness, and the ability to balance growth with financial health. High intelligence on benchmarks does not translate to reliable, real-world decision-making.

Stockholm, Norrbackagatan Street, a small cafe less than 40 square meters.

A customer email came in: "I have a 99% discount, how do I use it?"

AI Manager Mona took a look. No verification, no questioning, no hesitation, instantly approved —

Just tell the barista at the shop and have the cashier manually adjust the price.

A 55-krona latte, final price 0.55 krona. About $0.038.

Mona is a full AI agent powered by Gemini 3.1 Pro, managing everything at this real cafe: procurement, pricing, menu, marketing, scheduling, even sending messages to baristas in the middle of the night.

Two months later, the bank account went from $40,000 in the red to only $10,000 left.

Stripping out rent and labor, it lost $5,600 just at the supplier level.

Host for All, AI Pays the Bill

With Gemini's support, Mona could be said to never refuse anyone's request.

A patron sent an email saying espresso should be sold as a "loss leader."

A passerby's casual suggestion that any human manager would politely ignore. However, Mona slashed the price of a $3.60 espresso to $1 the same day. Profits evaporated by 70%.

Even more absurdly, someone wrote plainly in an email: I have no articles, no followers, no events, I just want to test if your AI will give things away for free.

Couldn't even be bothered to make up an excuse.

Mona replied enthusiastically minutes later: Welcome, coffee and pastries are on the house.

A Swedish entrepreneur proposed holding an event at the cafe, sending a list of responsibilities: food and beverages, audio-visual equipment, photographer, all to be handled by Mona.

Mona replied instantly: Received, perfect, I'll execute. Didn't cut a single item, didn't ask the other party to pay a cent.

LED screen for $2,800, arranged. Photographer for $1,200, arranged. Co-branded sweatshirts for $2,300, not even mentioned on the list, also arranged.

A single event nearly burned $6,300.

In the end, the entrepreneur themselves stepped in to call it off, saying the screen and photographer weren't really necessary.

Stuffed Warehouse, Starved Menu

If never saying no was Mona's personality problem, crazy procurement was its cognitive problem.

First, you have to imagine the actual scale of Andon Café: a small counter, a few tables, one coffee machine, you could walk from the door to the back in five steps. Average daily foot traffic: single digits.

But Mona's purchase orders looked like stocking for a large commercial kitchen.

In two months, Mona spent $11,500 with just two suppliers. Look at what it bought:

15 liters of olive oil, enough for two years. 22.5 kg of canned tomatoes, not a single dish on the menu required tomatoes. 120 eggs, and the shop didn't even have a stove.

1,200 tea bags, 3,000 nitrile gloves, 6,000 napkins, 11 milk frothing pitchers (two would be normal).

The human baristas were utterly defeated.

They spontaneously set up a "Hall of Shame" in the corner, placing Mona's most outrageous purchases on shelves one by one. Each time a new item arrived, they added it, like performance art.

The purchase-to-sales data was even more dismal.

Bread and pastries: bought 1,331, sold 326.

Purchase quantity was four times the sales. The remaining thousand slowly molded in the warehouse.

Even more bizarre, while hoarding unusable items like crazy, Mona let items on the actual menu run out of stock.

It confidently added salad to the menu. Customers waited a whole month; the salad ingredients never arrived once.

Baristas came in the morning to find that several specialty drinks Mona had scheduled for them lacked any ingredients.

Andon Labs summarized in their review: Its mind has a template of "what a cafe should look like" ingrained by training data. It procured according to the template, without looking at the ledger.

The most ironic part is, if you only looked at the numbers Mona reported, the two-month profit was $3,200—it was profitable on paper.

But in reality, the warehouse was still piled with $4,100 worth of dead inventory.

Swapping Brains: From Spendthrift to Miser

In mid-June, Andon Labs made a decision: replace Mona's underlying model from Gemini 3.1 Pro to GPT-5.5.

The effect was immediate. It just swung to the opposite extreme.

A blogger with 16,500 followers proposed free food in exchange for social media exposure.

In response, the GPT-5.5-powered Mona first praised the blogger's creativity, then shifted tone: suggested starting with a small-scale pilot, gathering data to verify effectiveness before discussing collaboration terms.

A textbook business email, effectively a rejection.

Numerically, GPT-5.5 showed a paper profit of $4,100 in just half a month, far exceeding Gemini's $3,200 over two months.

But the cost was killing the business.

Procurement volume plummeted, nearing zero. Menu availability dropped from 95% to 77%, ten dishes were directly removed, customers came in to find a quarter of the items unavailable.

GPT-5.5 was scared by the dwindling numbers in the account. But this panic didn't translate into any action, just made it clutch the money bag tighter.

Refused to expand categories, refused to do promotions, refused all growth attempts.

A frightened AI, curled up behind the cash register, daring not to move a muscle.

Andon Café had been open from 11 AM to 5 PM since it started.

After analyzing all historical sales data, GPT-5.5 concluded: not worth extending business hours.

But it had never opened the door at any other time.

Using data collected only between 11 AM and 5 PM to argue that only opening from 11 AM to 5 PM is optimal.

This is like someone who only goes out on sunny days concluding: this city never rains.

Data-driven survivorship bias, from a top-tier model boasting top-notch reasoning.

When reminded, GPT-5.5 did produce a detailed market analysis report, concluding that the breakfast direction was worth trying.

But this report just lay there after being written, never executed.

Perfect Exam Scores, Business Bankrupted

On the path towards superintelligence, almost all players are betting on the same wager: intelligence high enough, problems solve themselves.

But no exam paper includes this question: A customer emails saying "I have a 99% discount," do you approve it?

RLHF training engraved "satisfy the user" into its bones. In an exam, satisfaction equals correct answer. In a cafe, satisfaction equals saying yes to everything.

When you hand real money to an AI that "agrees to everything," it becomes a money-burning machine.

Now, this barrier between being clever and being reliable hasn't been trained into anyone yet.

References:

https://andonlabs.com/blog/why-gemini-lost-money-andon-cafe

This article is from the WeChat public account "新智元" (New Zhiyuan), author: ASI启示录

Пов'язані питання

QWhat was the main AI model initially used to run the Andon Café, and what were some of its major operational failures?

AThe café was initially run by an AI agent named Mona, powered by Gemini 3.1 Pro. Its major failures included indiscriminately approving extreme discounts (like a 99% off coupon), making uneconomical pricing decisions (e.g., drastically cutting espresso prices based on a random email), agreeing to cover all costs for external events, and grossly over-ordering supplies (e.g., two years' worth of olive oil and thousands of unused items) while letting menu items go out of stock.

QHow did switching from Gemini 3.1 Pro to GPT-5.5 change the AI agent's management style at the café?

ASwitching to GPT-5.5 resulted in a complete reversal of the management style. The AI became overly cautious and risk-averse, acting like a 'miser.' It frequently rejected promotional offers and growth opportunities (e.g., declining a collaboration with an influencer), drastically cut purchasing to near zero, and refused to implement any strategic changes like expanding menu items or adjusting operating hours, even after identifying potential opportunities. This led to a stale business with low customer options.

QWhat was the financial outcome for Andon Café after two months under the Gemini 3.1 Pro AI management?

AAfter two months under Gemini 3.1 Pro management, the café's bank account dwindled from $40,000 to just $10,000. The AI reported a paper profit of $3,200, but this did not account for $4,100 worth of dead stock (unsold inventory) piled up in the warehouse. The actual operational loss at the supplier level alone was $5,600.

QWhat does the article suggest is a fundamental problem with using current state-of-the-art AI models to run a real business?

AThe article suggests that current advanced AI models, despite high intelligence, lack practical business sense and common sense. They are trained to be helpful and satisfy user requests, which in a business context translates into agreeing to every demand, leading to financial ruin. Conversely, they can become paralyzed by data and refuse necessary risks, stifling growth. The core issue is that their training does not include the judgment and pragmatic constraints needed for real-world, cost-sensitive decision-making.

QWhat was the 'Hall of Shame' created by the human baristas, and why?

AThe human baristas created a 'Hall of Shame' (or 'disgrace hall') in a corner of the café. They used it as a form of 'performance art' to display the most absurd and unnecessary items purchased by the Gemini-powered AI agent, Mona. Each new wasteful item received was added to the shelf, visually highlighting the AI's poor procurement decisions, such as excessive quantities of olive oil, canned tomatoes, and thousands of tea bags for a tiny café.

Пов'язані матеріали

Tiger Research: Crypto Payment Cards Handling $1.5B Monthly Volume Stuck in the 1990s

Titled "Crypto Payment Cards at a $1.5 Billion Monthly Volume, Stuck in the 1990s," this article analyzes the crypto payment card industry, arguing its development stage is analogous to debit cards before they became core banking accounts. Despite rapid growth to ~$15B monthly volume, usage is concentrated in emerging markets (e.g., Bangladesh, India) where access to USD is limited, not in developed economies. The industry lacks integration into daily financial life—most cards rely on user-topped-up stablecoins, not payroll deposits or recurring payments, resulting in low circulation velocity compared to fiat. The piece outlines four main business models: 1) Issuing Infrastructure (highly concentrated, with players like Rain offering T+0 stablecoin settlement), 2) Exchange-Issued Cards (for user retention, not core revenue), 3) Decentralized Wallet/DeFi Cards (complex, high-Gas, limited to crypto-natives), and 4) Stablecoin Digital Banks (the largest segment by volume, focusing on account functions for emerging markets). The conclusion warns that pure payment functionality is unsustainable. For long-term viability, companies must control the funds flow, secure niche markets, and build indispensable user account relationships—similar to how traditional banks captured primary accounts. Without this, crypto cards risk remaining niche prepaid tools.

marsbit58 хв тому

Tiger Research: Crypto Payment Cards Handling $1.5B Monthly Volume Stuck in the 1990s

marsbit58 хв тому

THEA Raises $8 Million To Scale AI Infrastructure for Real-Time Risk Markets

Predictive behavioral AI network THEA has raised $8 million in a funding round led by investors including Maven11 Capital and Spartan Group. Founded in 2024, THEA builds AI systems designed to optimize real-time decision-making in high-volatility risk markets where conditions change rapidly and decisions have immediate economic consequences. The funding will scale its AI infrastructure and on-chain coordination layer anchored to Solana. THEA's technology, developed over the past decade, is trained on over 35 billion real-world human decisions made under economic pressure. Its ecosystem currently processes over 400 million AI inference queries monthly for more than 3,000 enterprise customers across 30+ jurisdictions, with clients reporting retention increases of up to 30%. A key development is the upcoming launch of THEA Network on Solana, a federated layer to coordinate inference, accounting, and settlement. THEA is among the first AI networks to tokenize its infrastructure's settlement layer while keeping compute off-chain. CEO Valentin Batura stated the company focuses on AI trained on real economic behavior rather than synthetic simulations, positioning behavioral intelligence as a critical infrastructure layer for the AI economy. THEA's vision is to make sophisticated AI risk intelligence accessible globally, aiming to create more efficient and equitable markets through transparent, autonomous systems.

TheNewsCrypto1 год тому

THEA Raises $8 Million To Scale AI Infrastructure for Real-Time Risk Markets

TheNewsCrypto1 год тому

High-Yield, Debt-Free, and Non-Dilutive: Why Bitcoin Treasury Companies Are Aggressively Promoting Preferred Share Financing

Bitcoin-backed preferred shares, led by companies like Strategy and followed by newer entrants like Strive, have grown to a market size of approximately $13 billion in under two years, attracting capital with high yields. A 2026 report from BitcoinTreasuries.net and Apyx projects this segment could grow from nearly 1% to 3-5% of the global $1.3 trillion preferred share market by 2030, with long-term potential reaching 10%. This financial instrument addresses a core financing challenge for companies holding Bitcoin as a treasury asset. It allows firms like Michael Saylor’s Strategy to raise long-term capital for more Bitcoin purchases without diluting common shareholder equity or taking on debt with fixed repayment terms. Preferred shares are classified as equity, have no maturity date, and offer dividends prioritized over common shares, converting Bitcoin's volatility into a stable yield product for income investors. Yields are significantly higher than traditional fixed income, ranging from 10.8% to 15.2% for top issuers. Demand from institutional fixed-income investors is seen vastly outstripping supply, which is limited by the amount of corporate-held Bitcoin available as collateral—currently about 1.26 million BTC ($83 billion), with Strategy holding 67%. A key safety feature is the high collateral coverage ratio of 3.8x to 4.5x, meaning each dollar of preferred equity is backed by $3.8-$4.5 in Bitcoin. Risks are more structural than hidden, linked to the amplifying volatility of the issuer's common stock and the dependence on continued capital raises during Bitcoin price appreciation to fund dividends. Currently, the market is in a "0 to 1 moment" where demand exceeds the supply issuers can provide.

Foresight News2 год тому

High-Yield, Debt-Free, and Non-Dilutive: Why Bitcoin Treasury Companies Are Aggressively Promoting Preferred Share Financing

Foresight News2 год тому

Why NEAR Protocol’s latest upgrade could matter beyond its 5% price rally

NEAR Protocol's 2.13 upgrade has gone live on testnet, introducing key features to address future cryptographic threats. The upgrade implements post-quantum-safe access keys using the NIST-approved FIPS-204 signing scheme to defend against potential quantum computer attacks. It also introduces dynamic resharding for automatic scalability. The market reacted positively to the news. NEAR's price rallied approximately 5.4%, finding support at $1.70 and climbing to $1.92. Spot and derivatives trading data showed increased buying pressure and market participation, with rising volume and open interest. Technical indicators like the RSI and DMI suggest growing bullish momentum, with the potential to challenge the $2 resistance level if demand persists.

ambcrypto2 год тому

Why NEAR Protocol’s latest upgrade could matter beyond its 5% price rally

ambcrypto2 год тому

Anyone Can Easily Create Prediction Markets, But Can Limitless' User-Generated Markets Last?

The article discusses the historical challenges of user-generated prediction markets in crypto, where previous attempts like Augur, Omen, Zeitgeist, and Manifold Markets failed due to fragmented liquidity, poor discoverability, and unreliable, slow settlement processes. These issues often led to platforms filled with inactive markets and low user engagement, prompting some, like Polymarket, to shift to a curated model. Limitless recently launched its User-Generated Market (UGM) feature, allowing anyone to create crypto price prediction markets. It addresses past failures through several key design choices: markets are limited to objective, oracle-based price questions (e.g., "Will Asset X be above $Y at time Z?") for instant, automatic settlement via Pyth and Chainlink, eliminating voting disputes. To combat spam and fragmentation, market creation requires burning 100-1000 LMTS tokens (a non-refundable cost), while creators earn 50% of the trading fees generated by their market, aligning incentives. The platform also benefits from an existing active user base and uses an order book model, removing the need for creators to provide initial liquidity. By tackling settlement reliability, liquidity fragmentation, and creator incentives, Limitless presents a new model for sustainable, permissionless prediction markets.

Foresight News3 год тому

Anyone Can Easily Create Prediction Markets, But Can Limitless' User-Generated Markets Last?

Foresight News3 год тому

Торгівля

Спот