AI2 Releases Fully Open-Source Web Agent MolmoWeb: Controlling Web Pages Using Only "Vision"

marsbit · Published on 2026-03-26 · Last updated on 2026-03-26

Abstract

AI2 has released MolmoWeb, a groundbreaking, fully open-source web agent that operates solely by analyzing screenshots, marking a significant leap in vision-driven web navigation. Unlike traditional agents that rely on the DOM, MolmoWeb captures and interprets visual data to decide on actions such as clicking, scrolling, or typing, making its process transparent and robust. Despite its compact size (4B and 8B parameters), MolmoWeb performs impressively: it scores 78.2% on the WebVoyager benchmark, nearing OpenAI's proprietary o3 model (79.3%), and reaches up to 94.7% success when given multiple attempts. It even surpasses Anthropic's Claude 3.7 in UI element localization. AI2 also released MolmoWebMix, a massive open dataset with 36K human browsing tasks, over 2.2M screenshot-QA pairs, and GPT-4o-verified synthetic data. The model and data are fully available on Hugging Face and GitHub under Apache 2.0, promoting transparency and collaboration in AI development. Challenges remain around complex instructions, logins, and legal compliance.

The Allen Institute for Artificial Intelligence (AI2) recently released MolmoWeb, a groundbreaking, fully open-source web agent. Unlike traditional agents that rely on a webpage's underlying code (the DOM), MolmoWeb makes decisions solely by reading screenshots, marking a significant leap forward in "vision-driven" web navigation technology.

Core Technology: "Seeing" Web Pages Like a Human

MolmoWeb's operating logic is very intuitive: it captures a screenshot of the current browser window, decides the next action (such as clicking, scrolling, or typing) through visual analysis, then executes it and repeats. This "what you see is what you get" approach makes it more robust than traditional agents, because the visual layout of a webpage is generally more stable than its underlying code, and its decision-making process is completely transparent and explainable to human users.
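To make the loop concrete, here is a minimal sketch of a screenshot-driven browsing cycle using Playwright for browser control. The `decide_next_action` call and the action vocabulary are illustrative assumptions, not AI2's actual interface.

```python
# Minimal sketch of a vision-only browsing loop (illustrative, not AI2's code).
# decide_next_action() stands in for a hypothetical call to a vision-language model.
from playwright.sync_api import sync_playwright

def decide_next_action(screenshot_png: bytes, task: str) -> dict:
    """Hypothetical model call. Returns an action such as
    {"type": "click", "x": 412, "y": 188} or {"type": "stop", "answer": "..."}."""
    raise NotImplementedError

def run_task(task: str, start_url: str, max_steps: int = 20) -> str | None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(start_url)
        answer = None
        for _ in range(max_steps):
            shot = page.screenshot()                  # 1. capture the current window
            action = decide_next_action(shot, task)   # 2. decide from pixels alone
            if action["type"] == "click":             # 3. execute, then repeat
                page.mouse.click(action["x"], action["y"])
            elif action["type"] == "scroll":
                page.mouse.wheel(0, action.get("dy", 600))
            elif action["type"] == "type":
                page.keyboard.type(action["text"])
            elif action["type"] == "stop":
                answer = action.get("answer")
                break
        browser.close()
        return answer
```

The point of the sketch is that no DOM inspection appears anywhere: every decision is made from the screenshot alone.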

Performance Leap: Small Model Outperforms Giants

Despite having parameter counts of only 4B and 8B, MolmoWeb demonstrates "small but mighty" performance:

  • Topping the Charts: In the WebVoyager test, the 8B version scored an impressive 78.2%, not only ranking among the top open-source models but also approaching the performance of OpenAI's proprietary model o3 (79.3%).

  • Huge Potential: Researchers found that running a task multiple times and selecting the best result pushes the success rate up to 94.7% (a best-of-N selection sketch follows this list).

  • Precise Localization: In UI element localization benchmarks, it even surpassed Anthropic's Claude 3.7.
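The multi-attempt figure is essentially best-of-N selection over independent runs. A minimal sketch, assuming the `run_task` loop from the earlier snippet and a hypothetical `score_trajectory` judge:

```python
# Illustrative best-of-N selection over repeated runs (not AI2's evaluation code).
def score_trajectory(task: str, answer: str) -> float:
    """Hypothetical judge, e.g. an LLM grader or a rule-based check."""
    raise NotImplementedError

def best_of_n(task: str, start_url: str, n: int = 5) -> str | None:
    best_answer, best_score = None, float("-inf")
    for _ in range(n):
        answer = run_task(task, start_url)       # independent attempt
        if answer is None:
            continue
        score = score_trajectory(task, answer)   # rank completed attempts
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer
```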

Data Support: The Largest Open Dataset to Date

AI2 has not only open-sourced the model weights but also contributed a massive dataset named MolmoWebMix. This dataset contains:

  • 36,000 real browsing tasks completed by human volunteers.

  • Over 2.2 million screenshot-question-answer pairs.

  • Automated synthetic data verified by GPT-4o. Experiments show that the synthetic data is even better than human trajectories at guiding the agent toward the "optimal path" (a data-loading sketch follows this list).
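For readers who want to inspect the data, screenshot-QA collections of this kind are typically published as Hugging Face datasets. The repository id and field names below are placeholder assumptions, not confirmed paths; check AI2's Hugging Face page for the actual dataset card.

```python
# Illustrative only: streaming a screenshot-QA dataset with the `datasets` library.
# "allenai/MolmoWebMix" is a hypothetical repo id, not a confirmed path.
from datasets import load_dataset

ds = load_dataset("allenai/MolmoWebMix", split="train", streaming=True)
for example in ds.take(3):
    # Field names ("question", "answer") are assumptions; the real schema may differ.
    print(example.get("question"), "->", example.get("answer"))
```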

Open-Source Spirit and Future Challenges

Currently, MolmoWeb is fully available under the Apache 2.0 license on Hugging Face and GitHub. Although it still faces challenges in handling complex instructions, login authentication, and legal compliance (such as terms of service), AI2 firmly believes that only through complete transparency and community collaboration can we truly counter the data monopoly of large tech companies.
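For those who want to try the weights, a typical loading pattern follows AI2's earlier Molmo releases via `transformers`; the repository id below is a guess for illustration, not the confirmed MolmoWeb path.

```python
# Illustrative weight loading; "allenai/MolmoWeb-8B" is a hypothetical placeholder id.
from transformers import AutoModelForCausalLM, AutoProcessor

repo_id = "allenai/MolmoWeb-8B"  # assumption: consult the official release for the real id
processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True, device_map="auto")
```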

Related Questions

Q: What is the name of the fully open-source web agent released by the Allen Institute for AI (AI2) that navigates using only screenshots?

A: The web agent is called MolmoWeb.

Q: How does MolmoWeb's approach to web navigation differ from traditional web agents?

A: Unlike traditional agents that rely on a webpage's underlying code (the DOM), MolmoWeb makes decisions by reading and analyzing screenshots, making it a "vision-driven" technology.

Q: What was the performance score of the 8B parameter version of MolmoWeb on the WebVoyager test, and how does it compare to OpenAI's model?

A: The 8B version scored 78.2% on the WebVoyager test, which is very close to the performance of OpenAI's proprietary model o3 at 79.3%.

Q: What is the name of the large, open dataset released alongside MolmoWeb, and what does it contain?

A: The dataset is called MolmoWebMix. It contains 36,000 real browsing tasks completed by human volunteers, over 2.2 million screenshot-QA pairs, and automated synthetic data verified by GPT-4o.

Q: On which platforms has MolmoWeb been made available, and under what license?

A: MolmoWeb has been fully released on Hugging Face and GitHub under the Apache 2.0 license.
