The Image Generation Model That's Hotter Than Nano Banana Has Leaked, Screenshots Are No Longer Evidence | Includes Prompts

marsbit | Published 2026-04-19 | Last updated 2026-04-19

Summary

A new AI image generation model, widely referred to as "GPT Image 2," has been leaked and is demonstrating significant advancements over predecessors like DALL-E 3 and even Google's Nano Banana Pro. It excels in four key areas: text rendering, prompt adherence, photorealism, and world knowledge. The model can generate highly accurate text in multiple languages, including complex Chinese characters, making it capable of producing convincing fake documents, UI screenshots, and product labels. This capability also raises concerns about the reliability of using screenshots as evidence. The model is currently in A/B testing, with a full release expected around May 2026 when DALL-E services are officially retired. It is accessible for testing on the LM Arena platform. The article includes several prompt templates optimized for the model, such as generating realistic app screenshots, product photos with detailed labels, and street scenes with accurate signage. This advancement is reshaping creative workflows but also accelerating the displacement of some traditional design roles.

Is your impression of text-to-image still stuck on Nano Banana?

But kid, times have changed again.

@johnAGI168 https://x.com/johnAGI168/status/2044781168151724067

@0115hippo https://x.com/0115hippo/status/2044722124611539160

In early April, three anonymous image models, codenamed maskingtape-alpha, packingtape-alpha, and gaffertape-alpha, appeared on the LM Arena evaluation platform. They disappeared a few hours later.

OpenAI has not officially announced this model yet, but based on the metadata returned by the API and user-side testing records, it has already gained a widely accepted name: GPT Image 2.

Screenshots Can No Longer Be Used as Evidence

Over the past few years, one of the most obvious weaknesses of AI image generation models has been text within images. In the DALL-E 3 era, if you asked it to write "Hello" in an image, it might output "Hellp" or even "Hl10", with letters tilting drunkenly. GPT Image 1 improved a lot, handling simple English labels. By GPT Image 1.5, its accuracy in rendering English text was close to 95%, but it still had significant flaws with non-Latin scripts like Chinese, Japanese, and Korean.

But the leaked sample images from GPT Image 2 have changed this impression.

@MrLarus https://x.com/MrLarus/status/2044824800909054181

@akokoi1 https://x.com/akokoi1/status/2044789531615056175

The text in the images is exactly what it should be. Chinese characters are clear, with accurate glyphs and complete strokes. Someone tested generating an ID card-style image, where the name, address, and ID number were all rendered correctly, with neat formatting, looking at first glance like a photo of a real document.

This is good news. The improvement in text rendering means generating infographics, posters, product packaging, and complex charts becomes more reliable.

But there's always another side to the coin. A model that can generate photo-realistic ID-style images and precisely render UI screenshots naturally makes "screenshots can be used as evidence" increasingly questionable.

This is also a core difference between the GPT Image series and other models. Midjourney has still made little progress in text rendering, and the Stable Diffusion series has the same long-standing problem. According to the leaked Arena test results, GPT Image 2 surpassed Midjourney in four dimensions: text rendering, instruction following, photorealism, and world knowledge. Midjourney's remaining advantages are mainly in artistic style and aesthetic control.

Does It Really Know What the World Looks Like?

A tester asked the model to generate a hypothetical GPT-8 product pricing page. The resulting image had a layout that was indeed in the style of the OpenAI website, with button placement and font choices resembling those from a real interface, and the hierarchical logic of the price table was correct.

GPT Image 2 can generate images extremely similar to real software interfaces, including browser windows, mobile app interfaces, and data visualization charts, with a level of fidelity unmatched by the previous generation.

@johnAGI168 https://x.com/johnAGI168/status/2044781168151724067

@levelsio https://x.com/levelsio/status/2040333489476681758

This will lead to some very interesting practical uses. Designers creating product prototypes don't need to open Figma first and draw a bunch of wireframes; they can describe the desired interface in text and get a reference image that can be used for team discussions. Teams preparing investor decks can show a "product screenshot" without waiting for an engineer to write code. Documentation writers can generate example interface images directly, without hunting for screenshots to fill a blank page.

@marmaduke091 https://x.com/marmaduke091/status/2040338311873515597

Image Generation Is No Longer Just "Image Generation"

OpenAI has already announced that DALL-E 2 and DALL-E 3 will officially cease service on May 12, 2026. Azure OpenAI's DALL-E 3 was retired early in February.

DALL-E was where many people first encountered AI image generation. The journey from those blurry early works to today has taken just a few short years.

Meanwhile, Google, which had just established its industry position with Nano Banana Pro in early 2026, might feel the pressure. Early test reports indicate that GPT Image 2 simultaneously surpasses Nano Banana Pro in three dimensions: realism, text rendering, and world knowledge. This kind of triple win is not common.

For creators, the feeling is complex. Illustrators, graphic designers, and photographers are not facing this topic for the first time. Since the release of GPT Image 1, the number of freelance graphic design positions has decreased by about 18%. AI has indeed replaced the decision to "hire someone to do this" in certain scenarios, but it is also creating new ways of working, allowing one person to do more.

Image generation models now evolve so quickly that there is little time to adapt. It was only a few months from GPT Image 1's launch to version 1.5, and only about half a year from 1.5 to 2. Each generation solves the core shortcomings of the previous one while opening up new possibilities.

GPT Image 2 is currently still in the A/B testing phase, with some ChatGPT users randomly gaining access. The official release window is widely predicted to be around May, coinciding with the retirement of DALL-E. If you want to experience it early, you can currently try your luck on the LM Arena evaluation platform.

Test Address: https://arena.ai

Based on community feedback and the known strengths of this model, the following prompt templates can maximize your chances of success:

UI/Screenshot Prompt: A photorealistic screenshot of a mobile banking app, clearly showing transaction history with dates, amounts, and merchant names legible. iPhone 16 screen, natural hand holding the phone, coffee shop background.

Product Label Prompt: A photographic product photo of a craft beer bottle, with clear label details showing the brewery name "Oakridge Brewing Co.", alcohol content 6.8%, a mountain logo, and an ingredient list. Studio lighting, white background.

Signage Prompt: A street scene photo of a Tokyo alley at night, showing multiple neon signs in both Japanese and English, including a ramen shop sign reading "Ichiban Ramen — Est. 1987", a karaoke bar sign, and various glowing advertisements. Wet, reflective pavement with light reflections.

Interface/World Knowledge Prompt: A photorealistic YouTube video screenshot showing a video titled "How to Assemble a Computer in 2026" with 2.3 million views, featuring realistic comments, sidebar video recommendations, and channel info. Desktop browser view.

Widescreen Trigger Prompt: A cinematic widescreen photo of an IKEA store exterior at dusk, showing the glowing IKEA sign, a parking lot with realistic cars, and shoppers entering and leaving. Golden hour lighting, 16:9 format.
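The templates above share a rough pattern: a medium ("a photorealistic screenshot", "a product photo"), a subject, the exact strings that must appear in the image in quotation marks, other visible details, and a closing note on setting or format. As a minimal sketch of that pattern, the Python below assembles a prompt from those parts. The `ImagePrompt` class, its field names, and the `join_list` helper are hypothetical illustrations, not part of any OpenAI or LM Arena API, and the result is only a string to paste into whichever interface you are testing.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def join_list(items: list[str]) -> str:
    """Join items as English prose: 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


@dataclass
class ImagePrompt:
    """Hypothetical helper mirroring the structure of the templates above."""
    medium: str                                           # e.g. "A product photo"
    subject: str                                          # what the image depicts
    exact_text: list[str] = field(default_factory=list)   # strings to render verbatim
    details: list[str] = field(default_factory=list)      # other visible elements
    setting: str = ""                                     # lighting, background, format

    def render(self) -> str:
        sentence = f"{self.medium} of {self.subject}"
        # Quote the strings that must appear literally in the image.
        parts = [f'"{text}"' for text in self.exact_text] + self.details
        if parts:
            sentence += ", showing " + join_list(parts)
        sentence += "."
        if self.setting:
            sentence += f" {self.setting.rstrip('.')}."
        return sentence


# Rebuilds a prompt close to the product label template above.
label_prompt = ImagePrompt(
    medium="A product photo",
    subject="a craft beer bottle",
    exact_text=["Oakridge Brewing Co."],
    details=["alcohol content 6.8%", "a mountain logo", "an ingredient list"],
    setting="Studio lighting, white background",
)
print(label_prompt.render())
```

Keeping the verbatim strings in their own field makes it easy to check the generated image against exactly the text you asked for, which is the capability these templates are meant to exercise.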

Unattributed image sources and references: https://miraflow.ai/blog/how-to-use-duct-tape-ai-model-arena-gpt-image-2-guide

This article is from the WeChat public account "APPSO", author: Discovering Tomorrow's Products

Related Questions

Q: What is the name of the leaked image generation model mentioned in the article, and what is its significance?

A: The leaked model is referred to as GPT Image 2. Its significance lies in its dramatic improvement in text rendering accuracy, especially for non-Latin scripts like Chinese, and its ability to generate highly realistic images, including convincing UI screenshots and document-style images, which challenges the reliability of screenshots as evidence.

Q: How does GPT Image 2's performance compare to other models like Midjourney and Google's Nano Banana Pro?

A: According to the article, GPT Image 2 outperforms Midjourney in text rendering, prompt following, photorealism, and world knowledge, with Midjourney retaining an advantage mainly in artistic style and aesthetic control. It also reportedly surpasses Google's Nano Banana Pro in realism, text rendering, and world knowledge.

Q: What are some of the potential practical applications of GPT Image 2's capabilities?

A: Potential applications include generating product prototypes and UI mockups for designers, creating realistic "screenshots" for investor decks without coding, producing example interface images for documentation, and generating accurate product labels, packaging, and infographics.

Q: What major change is OpenAI making to its image generation services in relation to this new model?

A: OpenAI has announced that DALL-E 2 and DALL-E 3 will officially stop service on May 12, 2026, with Azure's DALL-E 3 having already been retired in February. This suggests a transition to the new GPT Image model series.

Q: Where can users currently try to access or test the GPT Image 2 model, and what is a recommended strategy for getting good results?

A: The model is currently in A/B testing, with some ChatGPT users randomly gaining access. Users can also try their luck on the LM Arena evaluation platform (arena.ai). The article recommends using specific, detailed prompt templates focused on UI/screenshots, product labels, signage, interface/world knowledge, and widescreen formats to maximize success.

