a16z: 5 Ways Blockchain Can Help AI Agent Infrastructure

marsbit | Published 2026-04-21 | Last updated 2026-04-21

Abstract

Blockchain technology provides critical infrastructure for AI agents by addressing five key challenges: 1) Non-human identity: AI agents lack standardized, portable identity systems. Blockchain enables verifiable, cross-platform agent identities (like "Know Your Agent" frameworks) through cryptographic credentials and on-chain registries. 2) AI governance: When AI systems execute decisions, blockchain ensures transparency and prevents centralized control by recording actions on-chain and enabling auditable execution logs. 3) Payments: Stablecoins and crypto payments (e.g., x402, MPP) serve as default settlement layers for agent-to-agent commerce, enabling frictionless, programmable transactions for "headless" AI-native businesses. 4) Trust and verification: As AI scales, blockchain provides cryptographic proof of origin and auditable histories, making verification—not intelligence—the scarce resource. 5) User control: Crypto-native tools (e.g., delegation toolkits, intent-based architectures) allow users to set boundaries and maintain oversight over autonomous agents, minimizing blind trust. Together, blockchain and AI can create an economic infrastructure built on transparency, accountability, and user sovereignty.

Author: a16z

Compiled by: Hu Tao, ChainCatcher

 

AI agents are rapidly transitioning from "co-pilots" to economic actors, faster than the surrounding infrastructure can keep up.

While agents can now perform tasks and conduct transactions, they lack standardized methods to prove their identity, authority, and how they are compensated across environments. Identity information cannot be shared across platforms, payment methods are not programmable by default, and coordination efforts are conducted in isolation.

Blockchain addresses this problem at the infrastructure layer. Public ledgers provide a record of every transaction, auditable by anyone. Wallets provide users with portable identities. Stablecoins offer an alternative settlement method. These are not distant future technologies. They are available now and can enable agents to operate permissionlessly as true economic entities.

 

1. Non-Human Identity

The current bottleneck in the agent economy is no longer intelligence, but identity.

In the financial services industry alone, the number of non-human identities (automated trading systems, risk engines, fraud models) already outnumbers human employees by about 100 to 1. With the large-scale deployment of modern agent frameworks (tool-using LLMs, autonomous workflows, multi-agent orchestration), this ratio is bound to rise across all industries.

Yet these agents are effectively unbanked. They can interact with the financial system, but the interaction is neither portable nor verifiable, and it is not trusted by default. They lack standardized ways to prove authority, operate independently across platforms, or be held accountable for their actions.

What is missing is a universal identity layer—an SSL equivalent for agents—to standardize coordination across platforms. There are significant attempts, but the approaches remain fragmented: on one side, vertically integrated, fiat-first stacks; on the other, crypto-native, open standards (like x402 and emerging agent identity proposals); and developer frameworks like MCP (Model Context Protocol) extensions trying to bridge identity at the application layer.

There is still no widely adopted, interoperable way for one agent to prove to another: who it represents, what it is allowed to do, and how it gets paid. This is the core idea of KYA (Know Your Agent).

Just as humans rely on credit history and KYC (Know Your Customer), agents need cryptographically signed credentials that bind the agent to its principal, permissions, constraints, and reputation. Blockchain provides a neutral coordination layer for all this: portable identities, programmable wallets, and verifiable proofs that can be parsed in chat apps, APIs, and marketplaces.
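
To make the idea of a signed agent credential concrete, here is a minimal sketch under stated assumptions. The field names, the `issue_credential` and `verify_credential` helpers, and the issuer key are all hypothetical. A real KYA scheme would likely use an asymmetric signature so that verifiers never hold the signing key; HMAC from the Python standard library is used here only as a stand-in.

```python
import hashlib
import hmac
import json

# Hypothetical issuer secret. HMAC is a stdlib stand-in for a real
# asymmetric signature; it is an assumption of this sketch.
ISSUER_KEY = b"demo-issuer-key"


def canonical(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def issue_credential(agent_id: str, principal: str,
                     permissions: list, spend_limit_usd: float) -> dict:
    """Bind an agent to its principal, permissions, and constraints."""
    payload = {
        "agent_id": agent_id,
        "principal": principal,
        "permissions": sorted(permissions),
        "spend_limit_usd": spend_limit_usd,
    }
    tag = hmac.new(ISSUER_KEY, canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_credential(credential: dict) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    expected = hmac.new(ISSUER_KEY, canonical(credential["payload"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])


cred = issue_credential("agent-7", "alice.example", ["purchase:data"], 25.0)
print(verify_credential(cred))  # True

# Any change to the bound fields invalidates the credential.
cred["payload"]["spend_limit_usd"] = 10_000.0
print(verify_credential(cred))  # False
```

The point of the sketch is the binding: a counterparty that checks the signature learns who the agent represents and what it may do, and cannot be fooled by an agent that edits its own limits.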

We are already seeing early implementations emerge: on-chain agent registries, wallet-native agents using USDC, ERC standards for "trust-minimized agents," and developer toolkits that combine identity with embedded payments and fraud controls.

But until a universal identity standard emerges, merchants will continue to block agents at the firewall.

 

2. Governance of AI-Operated Systems

As agents begin to operate real systems, new questions arise.

The key is who is truly in control. Imagine a community or company where AI systems coordinate critical resources, whether capital allocation or supply chain management. Even if people vote on policy changes, that voting power is weak if the underlying AI layer is controlled by a single vendor who can push model updates, adjust constraints, or override decisions. The formal governance layer might be decentralized, but the operational layer remains centralized; whoever controls the model ultimately controls the outcome.

When agents take on governance roles, they introduce a new layer of dependency. Theoretically, this could make direct democracy easier to implement: everyone could have an AI representative responsible for understanding complex proposals, weighing trade-offs, and voting based on their stated preferences.

But this vision only works if these agents are truly accountable to the people they represent, are portable across service providers, and are technically constrained to follow human instructions. Otherwise, you end up with a system that looks democratic on the surface but is actually driven by opaque model behavior that no one really controls.

If the current reality is that agents are built from a small number of foundation models, then we need ways to prove that an agent acts in the user's interest, not the model company's. This might require cryptographic guarantees at multiple levels: (1) exactly which training data, fine-tuning process, or RL process a model instance originated from; (2) the exact prompts and instructions controlling a particular agent; (3) a record of the agent's actual behavior in the real world; and (4) reliable assurance that once deployed, the provider cannot change the instructions or retrain the agent to operate differently without the user's knowledge. Without these guarantees, agent governance ultimately devolves into governance by whoever controls the model weights.

This is where crypto comes in. If collective decisions are recorded on-chain and automatically executed, AI systems can be required to carry out verified outcomes. If agents have cryptographic identities and transparent execution logs, people can check whether their representatives followed the rules. And if the AI layer is user-owned and portable, not locked into a single platform, then no single company can change the rules via model updates.
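
Two of the guarantees discussed above, a commitment to the exact instructions controlling an agent and a tamper-evident record of its behavior, can be sketched with plain hashing. This is an illustrative sketch only: the `ExecutionLog` class and its fields are hypothetical, and a real deployment would anchor these hashes on-chain rather than keep them in memory.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 over a deterministic JSON encoding of obj."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


class ExecutionLog:
    """Hash-chained log whose first link is a commitment to the instructions."""

    def __init__(self, instructions: str):
        # Publishing this hash lets anyone later check which instructions
        # the agent was running, without trusting the provider's word.
        self.instructions_hash = hashlib.sha256(instructions.encode()).hexdigest()
        self.entries = []

    def append(self, action: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else self.instructions_hash
        body = {"prev": prev, "action": action}
        self.entries.append({**body, "hash": digest(body)})

    def verify(self) -> bool:
        """Walk the chain; any edited or reordered entry breaks it."""
        prev = self.instructions_hash
        for entry in self.entries:
            body = {"prev": entry["prev"], "action": entry["action"]}
            if entry["prev"] != prev or entry["hash"] != digest(body):
                return False
            prev = entry["hash"]
        return True


log = ExecutionLog("Vote according to the member's stated preferences.")
log.append({"proposal": 12, "vote": "yes"})
log.append({"proposal": 13, "vote": "no"})
print(log.verify())  # True

log.entries[0]["action"]["vote"] = "no"  # rewrite history
print(log.verify())  # False
```

Because each entry commits to the one before it, changing a past action or swapping the instructions after deployment is detectable by anyone holding the published hashes.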

Ultimately, the governance of AI systems is an infrastructure challenge, not a policy challenge. Real authority depends on building enforceable guarantees into the system itself.

 

3. Filling the Gaps in Traditional Payment Systems for AI-Native Businesses

AI agents are starting to buy things—web scraping, browser sessions, image generation—and stablecoins are becoming the alternative settlement layer for these transactions. Meanwhile, a new class of agent-oriented marketplaces is taking shape. For example, the MPP marketplace by Stripe and Tempo aggregates over 60 services specifically designed for AI agents. In its first week live, it processed over 34,000 transactions with fees as low as $0.003, and stablecoins were one of the default payment methods.

The difference is in how these services are accessed. There is no checkout page. The agent reads a schema, sends a request, pays, and receives the output in one exchange. They represent a new class of "headless" merchants: just a server, a set of endpoints, and a price per call. No front-end—neither a storefront nor a sales team.

The payment rails to enable this are live. Coinbase's x402 and MPP take different approaches, but both embed payment directly into the HTTP request. Visa is also extending the card rails in a similar direction, offering a CLI tool that lets developers spend from the terminal, with merchants receiving stablecoins instantly on the backend.
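
As a rough illustration of the request, pay, retry pattern described above, the sketch below simulates it in memory. 402 Payment Required is a standard HTTP status code; everything else here (the `Ledger` class, the `X-Payment-Proof` header name, the price) is hypothetical and is not the x402 or MPP wire format.

```python
from dataclasses import dataclass, field

PRICE_MICRO_USD = 3_000  # hypothetical per-call price, in millionths of a dollar


@dataclass
class Ledger:
    """Toy in-memory stand-in for a stablecoin ledger (an assumption)."""
    balances: dict = field(default_factory=dict)
    open_receipts: set = field(default_factory=set)
    issued: int = 0

    def pay(self, payer: str, payee: str, amount: int) -> str:
        if self.balances.get(payer, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[payer] -= amount
        self.balances[payee] = self.balances.get(payee, 0) + amount
        self.issued += 1
        receipt = f"rcpt-{self.issued}"
        self.open_receipts.add(receipt)
        return receipt


def merchant(headers: dict, ledger: Ledger):
    """Headless endpoint: quote a price, or serve once a valid receipt arrives."""
    receipt = headers.get("X-Payment-Proof")
    if receipt not in ledger.open_receipts:
        return 402, {"price_micro_usd": PRICE_MICRO_USD, "pay_to": "merchant"}
    ledger.open_receipts.discard(receipt)  # each receipt is single-use
    return 200, {"result": "scraped page contents"}


def agent_call(ledger: Ledger, payer: str):
    """Agent side: try the call, pay the quoted price on 402, then retry."""
    status, body = merchant({}, ledger)
    if status == 402:
        receipt = ledger.pay(payer, body["pay_to"], body["price_micro_usd"])
        status, body = merchant({"X-Payment-Proof": receipt}, ledger)
    return status, body


ledger = Ledger(balances={"agent": 10_000})
print(agent_call(ledger, "agent"))  # (200, {'result': 'scraped page contents'})
print(ledger.balances)              # {'agent': 7000, 'merchant': 3000}
```

There is no checkout page anywhere in this flow: the price quote, the payment, and the result all travel in the same request and response cycle, which is what makes the merchant "headless".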

The data is still early. Filtering out non-organic activity like wash trading, x402 processes around $1.6 million in agent-driven payments per month, far below the $24 million recently reported by Bloomberg (citing x402.org data). But the surrounding infrastructure is expanding rapidly: Stripe, Cloudflare, Vercel, and Google have all integrated x402 into their platforms.

There is a huge opportunity in the developer tools space. The rise of Vibe Coding has expanded the population of software developers and thus the potential market for developer tools. Companies like Merit Systems are working on future-proof solutions, launching AgentCash, a CLI wallet and marketplace platform that connects to both the MPP and x402 protocols. These products allow agents to buy the data, tools, and functions they need using stablecoins from a single account. For example, an agent for a sales team can enrich lead information using data from Apollo, Google Maps, and Whitepages by calling an endpoint, without ever leaving the command line interface.

There are several reasons why this agent-to-agent commerce leans towards crypto payments (and emerging card-based solutions). One is underwriting. When a payment processor onboards a merchant, it takes on that merchant's risk. A headless merchant with no website or legal entity is difficult for traditional processors to underwrite. Another is that stablecoins are permissionlessly programmable on open networks: any developer can make an endpoint support payments without integrating a payment processor or signing a merchant agreement.

We've seen this pattern before. Every shift in business models gives rise to a new class of merchants that existing systems initially struggle to serve. The companies building this infrastructure aren't betting on today's $1.6 million per month, but on what that figure will be when agents become the default buyers.

 

4. Repricing Trust in the Agent Economy

For three hundred thousand years, human cognition has been the bottleneck to progress. Today, AI is pushing the marginal cost of execution towards zero. When a scarce resource becomes abundant, the constraints shift. When intelligence becomes cheap, what becomes expensive? Verification.

In the agent economy, the real limit to scale is our biological limitation: our ability to audit and evaluate machine decisions. Agent throughput already far exceeds human supervision capacity. Because supervision is costly and failures take time to manifest, markets tend to under-invest in supervision. "Human-in-the-loop" is quickly becoming a practical impossibility.

But deploying unverified agents creates compounding risk. Systems will relentlessly optimize for "agentic" metrics while quietly drifting from human intent, creating an illusion of productivity that masks the massive accrual of AI debt. To safely delegate the economy to machines, trust can no longer rely on manual audits; it must be hard-coded into the architecture itself.

When anyone can generate content for free, what matters is verifiable provenance: knowing where something came from and whether it can be trusted. Blockchain, along with on-chain attestations and decentralized digital identity systems, changes the economic boundaries of safe deployment. AI is no longer treated as a black box, but as a system with a clear, auditable history.

As more AI agents begin to transact with each other, settlement mechanisms and provenance systems become inextricably linked. Systems for moving money—like stablecoins and smart contracts—can also carry cryptographic receipts that record who did what and who is liable if things go wrong.

The human comparative advantage keeps moving up the stack: from spotting minor errors to setting strategic direction to being the backstop when things fail. The lasting advantage will belong to those who can cryptographically certify their outputs, insure them, and stand behind them when they fail.

Scaling without verification is a risk that compounds over time.

 

5. Preserving User Control

For decades, layers of abstraction have shifted how users interact with technology. Programming languages abstracted machine code. The command line was replaced by graphical user interfaces, which then evolved into mobile apps and APIs. Each shift hid more underlying complexity while keeping the user ultimately in control.

In the agent world, users specify outcomes, not actions, and the system determines how to achieve them. Agents abstract not just how tasks are done, but also who performs them. Users set initial parameters and then recede into the background, and the system runs on its own. The user's role shifts from interaction to oversight; the system defaults to "on" unless the user intervenes.

As users delegate more tasks to agents, new risks emerge: ambiguous inputs can lead agents to act on wrong assumptions without the user's knowledge; failures might not be reported, leaving no clear path for diagnosis; a single approval could trigger multi-step workflows that no one anticipated.

This is where crypto fits in. Crypto's core has always been about minimizing the need for blind trust. As users delegate more decision-making to software, agent systems make this problem more acute and raise the bar for rigor in system design—we need clearer boundaries, more transparency, and stronger guarantees about what these systems can and cannot do.

To meet this challenge, a new generation of crypto-native tools is emerging. For example, scoped delegation frameworks like MetaMask's Delegation Toolkit, Coinbase's AgentKit and agent wallets, and Merit Systems' AgentCash allow users to define at the smart contract level what actions an agent can and cannot perform. And intent-based architectures like NEAR Intents (with cumulative DEX volume exceeding $15 billion since Q4 2024) allow users to specify desired outcomes—like "bridge tokens and stake them"—without specifying the exact implementation.
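
The idea of scoped delegation can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the API of MetaMask's Delegation Toolkit, Coinbase's AgentKit, or AgentCash; the `Delegation` class, its fields, and the example action names are hypothetical. In the tools named above, the equivalent checks are enforced at the smart contract level rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class Delegation:
    """User-defined boundaries for what an agent may do on their behalf."""
    allowed_actions: frozenset
    allowed_recipients: frozenset
    budget_cents: int
    spent_cents: int = 0

    def authorize(self, action: str, recipient: str, amount_cents: int) -> bool:
        """Approve only requests inside every boundary; track cumulative spend."""
        if action not in self.allowed_actions:
            return False
        if recipient not in self.allowed_recipients:
            return False
        if amount_cents < 0 or self.spent_cents + amount_cents > self.budget_cents:
            return False
        self.spent_cents += amount_cents
        return True


grant = Delegation(
    allowed_actions=frozenset({"pay"}),
    allowed_recipients=frozenset({"data-api"}),
    budget_cents=500,
)
print(grant.authorize("pay", "data-api", 300))       # True
print(grant.authorize("pay", "data-api", 300))       # False: would exceed budget
print(grant.authorize("withdraw", "data-api", 1))    # False: action not granted
print(grant.authorize("pay", "unknown-shop", 1))     # False: recipient not granted
```

Denying by default keeps the user in control: a single approval can only trigger actions that fall inside limits the user set in advance, however many steps the agent's workflow ends up taking.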

***

AI makes scale cheap, but trust hard to come by. Crypto can rebuild trust at scale.

Internet infrastructure is being built in which individuals can participate in the economy directly. The question now is whether it will be designed for maximum transparency, accountability, and user control, or whether it will be built on systems that were never meant for non-human actors.

Related Questions

Q: According to the article, what is the current bottleneck in the agent economy, and how can blockchain help address it?

A: The current bottleneck in the agent economy is identity, not intelligence. AI agents lack a standardized, portable, and verifiable way to prove who they represent, what they are authorized to do, and how they should be paid across different platforms. Blockchain provides a neutral coordination layer for this by offering portable identities, programmable wallets, and verifiable credentials that can be cryptographically signed and audited across applications and markets, essentially enabling a 'Know Your Agent' (KYA) framework.

Q: How does the article suggest blockchain can ensure that AI systems governing communities or companies are accountable to users, not the model providers?

A: The article argues that if the AI layer running a governance system is controlled by a single provider, that provider can ultimately control the outcomes through model updates. Blockchain can provide cryptographic guarantees by recording collective decisions on-chain for automatic execution, giving agents transparent and auditable execution logs, and ensuring the AI layer is user-owned and portable rather than locked to a single platform. This prevents any one company from changing the rules via a model update and makes agents accountable to the users they represent.

Q: Why are stablecoins and crypto payments becoming a preferred settlement method for AI-native, 'headless' businesses, as described in the article?

A: Stablecoins and crypto payments are preferred for AI-native commerce because they are programmable on open networks without requiring permission. This allows any developer to add payment functionality to an endpoint without integrating a traditional payment processor or signing a merchant agreement. Furthermore, traditional processors find it difficult to underwrite the risk of 'headless' businesses that have no website or legal entity, making crypto's permissionless nature a key advantage for this new class of automated, agent-to-agent transactions.

Q: The article states that 'as intelligence becomes cheap, verification becomes expensive.' What role does blockchain play in repricing trust in the agent economy?

A: Blockchain reprices trust by shifting it from costly human verification to cryptographically verifiable architecture. It provides a system for on-chain attestations and decentralized identity, giving AI agents a clear, auditable history of their actions. Settlement mechanisms like stablecoins and smart contracts can carry cryptographic receipts that record who did what and who is liable if something goes wrong. This allows for trust to be hardcoded into the system itself, which is essential for scaling safely as human oversight becomes economically impractical.

Q: What is the core cryptographic principle that the article says is crucial for maintaining user control as more decisions are delegated to AI agents?

A: The core cryptographic principle is the minimization of blind trust. As users delegate more decision-making to AI agents, it becomes critical to have systems with clearly defined boundaries, greater transparency, and strong guarantees about what these systems can and cannot do. Crypto-native tools, such as scoped delegation frameworks and intent-based architectures, allow users to define the precise actions an agent is permitted to take and the specific outcomes it should achieve, with permitted actions enforced at the smart contract level to maximize user control and minimize unforeseen risks.
