Spending $200 to Buy Stars, Scamming VCs Out of Tens of Millions: The Entire GitHub Fake Star Industry Exposed

marsbit | Published on 2026-04-21 | Last updated on 2026-04-21

Abstract

A peer-reviewed study from Carnegie Mellon University (CMU) reveals that GitHub hosts approximately 6 million fake Stars, involving 18,600 repositories and 301,000 accounts, with AI/LLM projects being the largest non-malicious category for fake engagement. The fake Star market has exploded, with prices as low as $0.03 per Star. Research shows that venture capital firms, such as Redpoint Ventures, use GitHub Star counts as a key metric for evaluating startups, with a median of 2,850 Stars at seed-stage funding. For less than $200, a project can artificially meet this threshold, distorting the investment landscape. Over a dozen websites openly sell GitHub Stars, and fake Star activity saw explosive growth in 2024. AI-related repositories were among the most heavily affected. Despite GitHub’s policies against fake engagement, enforcement remains inconsistent: while 90% of flagged repositories were deleted, only 57% of involved accounts were suspended. The report highlights how purchased Stars can manipulate GitHub’s Trending algorithm and influence VC funding decisions, creating a cycle where artificial metrics attract real investment.

Author: Claude, Deep Tide TechFlow

Deep Tide Intro: A peer-reviewed study from Carnegie Mellon University (CMU) found approximately 6 million fake Stars on GitHub, involving 18,600 repositories and 301,000 accounts. AI/LLM projects are the largest non-malicious category for star-buying. The market price for a single star can be as low as $0.03. Redpoint data shows the median number of Stars for VC seed-stage projects is 2,850—meaning spending less than $200 can 'buy' a false level of popularity that meets the seed-round threshold.

GitHub Stars are becoming an elaborately packaged scam.

According to an investigative report published by Awesome Agents on April 13th, a mature gray market around GitHub Stars is operating in plain sight: academic papers have quantified the scale of the problem, over a dozen websites openly sell Stars, and venture capital firms directly incorporate Star counts into their project screening decisions.

The investigation team independently verified 20 repositories and found that 36% to 76% of the Stars for some projects came from accounts with zero followers, with fork-to-star ratios less than one-tenth of the baseline for organic projects.
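The two signals cited above can be computed from basic repository metadata. The following is a minimal sketch, not the investigation team's actual method: the `Stargazer` record, the `star_signals` function, and the sample numbers are all hypothetical, and the organic baseline ratio is a parameter the caller must supply, since the report's baseline value is not given here.

```python
from dataclasses import dataclass


@dataclass
class Stargazer:
    """Hypothetical record for one account that starred a repository."""
    login: str
    followers: int


def star_signals(stargazers, forks, organic_fork_to_star):
    """Return (zero-follower share, fork-to-star ratio, ratio relative to baseline).

    organic_fork_to_star is an assumed baseline supplied by the caller.
    """
    stars = len(stargazers)
    if stars == 0:
        raise ValueError("repository has no stars")
    zero_share = sum(1 for s in stargazers if s.followers == 0) / stars
    fork_to_star = forks / stars
    return zero_share, fork_to_star, fork_to_star / organic_fork_to_star


# Made-up sample: 100 stargazers, 60 of them with zero followers, and 1 fork.
sample = [Stargazer(f"user{i}", 0 if i < 60 else 5) for i in range(100)]
zero_share, ratio, relative = star_signals(sample, forks=1, organic_fork_to_star=0.15)
print(f"zero-follower share: {zero_share:.0%}")
print(f"fork-to-star ratio: {ratio:.3f} ({relative:.2f}x the assumed baseline)")
```

A repository would match the pattern described in the report when the zero-follower share is high and the relative ratio falls below 0.1, that is, under one-tenth of whatever organic baseline is assumed.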

The core academic support for this report comes from a peer-reviewed paper jointly published by CMU, North Carolina State University, and Socket at ICSE 2026 (International Conference on Software Engineering). The research team's detection tool, StarScout, analyzed 20TB of GitHub metadata (6.7 billion events, 326 million Stars, covering 2019 to 2024), ultimately flagging approximately 6 million suspicious fake Stars, 18,600 involved repositories, and about 301,000 participating accounts.

6 Million Fake Stars: Explosive Growth in 2024, AI Projects Heavily Affected

Fake Stars are not a new phenomenon, but their scale exploded in 2024. CMU paper data shows that before 2022, there were no more than 10 repositories involved in fake Star activity per month. By the peak in July 2024, this number skyrocketed to 3,216 repositories and 30,779 participating accounts. As of July 2024, 16.66% of repositories with more than 50 Stars had engaged in fake Star activity.

The detection accuracy of the research team was indirectly validated by GitHub's own actions: 90.42% of the repositories flagged by StarScout have been deleted, and 57.07% of the flagged accounts have been purged.

In the classification of fake Star usage, most are used to promote short-lived phishing/malware repositories. But among non-malicious categories, AI and LLM-related projects rank first, with a total of 177,000 fake Stars, surpassing blockchain/cryptocurrency projects. The paper notes that "many of these are academic paper repositories or products from LLM-related startups." More critically, 78 repositories detected with fake Star activity had appeared on the GitHub Trending page, proving that purchased Stars can indeed successfully manipulate the platform's recommendation algorithm.

A Star for as Low as 3 Cents: The Openly Operating Star-Buying Market

This is not a dark web transaction. The investigation confirmed that at least a dozen websites openly sell GitHub Stars, including SocialPlug.io, Buy.fans, and Boost-Like.store. There are 24 active Star-buying services on Fiverr, ranging from basic packages for $5 to "organic promotion" packages for $25 and above.

Pricing is tiered: cheap tier (disposable new accounts) $0.03 to $0.10 per star, mid-tier $0.20 to $0.50, premium tier (aged accounts with years of history) $0.80 to $0.90. Premium services promise "non-drop stars" and a 30-day refill guarantee. SocialPlug claims to have delivered 3.1 million Stars cumulatively, serving over 53,000 customers, and even offers an API interface for programmatic bulk purchasing.

Star exchange platforms like GithubStarMate.com and SafeStarExchange.com use a points-based mutual-starring model, in which users star each other's repositories to earn credits, so Stars can be exchanged without spending money. There are also at least 7 open-source tools on GitHub (e.g., fake-git-history and commit-bot) specifically designed to forge contribution history graphs. Pre-made GitHub accounts with 5 years of commit history and the Arctic Code Vault contributor badge are sold on Telegram for about $5,000.

A 2020 study from Tsinghua University documented the operations of promotion groups on QQ and WeChat in China: groups with over 1,020 members process about 20 repository star-buying tasks daily, and the study estimated the industry's annual profit at $3.4 million to $4.4 million.

VCs Use Stars for Project Screening, Spending $200 Can "Meet" Seed Round Standards

The relationship between Stars and funding is not speculation; it's something venture capital firms themselves publicly admit.

Redpoint Ventures partner Jordan Segall analyzed 80 developer tool companies and found that the median number of GitHub Stars at seed funding was 2,850, and 4,980 at Series A. He explicitly stated: "Many VCs write internal crawlers to find GitHub projects with fast Star growth. Stars are the metric they most commonly track."

These numbers essentially give startups a precise shopping list. At cheap-tier prices ($0.03 to $0.10 per Star), $85 to $285 buys the 2,850 Stars needed to reach the seed-round median; at mid-tier to premium prices ($0.20 to $0.90 per Star), roughly $990 to $4,500 buys the 4,980 Stars of the Series A median. Measured against a typical seed round of $1 million to $10 million, the $85 to $285 outlay implies a return of roughly 3,500x to 117,000x.
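The arithmetic behind these figures can be reproduced in a few lines. This is a sketch under stated assumptions: the prices, medians, and funding range are the ones quoted in the article, the function names are hypothetical, and pairing the Series A range with the mid-tier floor and premium ceiling is an inference from the numbers rather than something the report spells out.

```python
# Per-Star price ranges (USD) as quoted in the article.
CHEAP = (0.03, 0.10)
MID_TO_PREMIUM = (0.20, 0.90)  # assumed pairing for the Series A estimate

# Redpoint medians quoted in the article.
SEED_MEDIAN_STARS = 2850
SERIES_A_MEDIAN_STARS = 4980


def cost_range(stars, price_range):
    """Return (min, max) cost in USD of buying `stars` at the given per-Star prices."""
    low, high = price_range
    return stars * low, stars * high


def roi_range(funding_range, cost):
    """Return (min, max) ratio of funding raised to money spent on Stars."""
    fund_low, fund_high = funding_range
    cost_low, cost_high = cost
    # Worst case: smallest round over largest spend; best case: the reverse.
    return fund_low / cost_high, fund_high / cost_low


seed_cost = cost_range(SEED_MEDIAN_STARS, CHEAP)
series_a_cost = cost_range(SERIES_A_MEDIAN_STARS, MID_TO_PREMIUM)
seed_roi = roi_range((1_000_000, 10_000_000), seed_cost)

print(f"seed: ${seed_cost[0]:,.2f} to ${seed_cost[1]:,.2f}")
print(f"Series A: ${series_a_cost[0]:,.2f} to ${series_a_cost[1]:,.2f}")
print(f"seed ROI: {seed_roi[0]:,.0f}x to {seed_roi[1]:,.0f}x")
```

The results land at roughly $85.50 to $285 for the seed median, $996 to $4,482 for the Series A median, and a funding-to-spend ratio of about 3,500x to 117,000x, in line with the rounded figures above.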

The ROSS Index (Ranking of Open Source Startups), published quarterly by Runa Capital, further amplifies this incentive. According to TechCrunch, 68% of the companies on the ROSS Index received investment at the seed stage, with total tracked funding reaching $169 million. An independent analysis in the investigative report found that Union Labs, ranked first in the Q2 2025 ROSS Index (Star growth 54.2x, total 74,300 Stars), showed severe signs of star-buying: 32.7% of its Stars came from accounts with zero repositories, 52% from accounts with zero followers, and StarScout flagged 47.4% of its Stars as suspicious. In other words, the top project on an industry ranking widely cited by VCs had nearly half of its Stars flagged as suspected fakes.

Actual cases already corroborate the conversion chain from Stars to funding: Lovable (formerly GPT Engineer) secured a $7.5 million pre-seed round with 50,000+ Stars, with a Series A valuation of $1.8 billion; Browser-use received a $17 million seed round after gaining 50,000 Stars in three months; Pangolin entered Y Combinator with 1,000 Stars and completed a $4.7 million seed round within eight months.

GitHub's Asymmetric Enforcement: Delete Repositories but Keep Accounts

GitHub's Acceptable Use Policies explicitly prohibit "artificial engagement," "ranking manipulation," and creating a secondary market for fake Stars, even specifically banning star-buying behavior incentivized by "cryptocurrency airdrops."

But enforcement is passive and asymmetric. GitHub deleted 90.42% of the repositories flagged by StarScout but only purged 57.07% of the executing accounts. The "workforce" of the fake Star industry remains largely intact. After Dagster published an investigative report in 2023, the related fake Star accounts were deleted within 48 hours—but this was a reaction to public exposure, not the result of proactive detection.

The CMU research team suggested GitHub adopt a network centrality-based weighted popularity metric to replace the raw Star count, structurally dismantling the fake Star economy. GitHub has not implemented this to date.
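One way to read this suggestion is that a Star should count for more when it comes from an account that is well connected in the developer network. The sketch below uses normalized in-degree centrality on a follow graph as the weight. This is an assumed stand-in for illustration, not the metric proposed in the CMU paper; the graph, account names, and function names are all hypothetical.

```python
from collections import defaultdict


def in_degree_centrality(follow_edges):
    """Normalized in-degree centrality for each account in a follow graph.

    follow_edges is an iterable of (follower, followee) pairs.
    """
    nodes = set()
    followers = defaultdict(set)
    for follower, followee in follow_edges:
        nodes.update((follower, followee))
        followers[followee].add(follower)
    denom = max(len(nodes) - 1, 1)
    return {n: len(followers[n]) / denom for n in nodes}


def weighted_star_count(stargazers, centrality):
    """Sum of stargazer centralities; accounts unknown to the graph weigh 0."""
    return sum(centrality.get(login, 0.0) for login in stargazers)


# Made-up follow graph: alice and bob are followed by others; the bots have no followers.
edges = [
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
    ("alice", "bob"), ("carol", "bob"),
    ("bot1", "alice"), ("bot2", "alice"),
]
centrality = in_degree_centrality(edges)

organic_repo = ["alice", "bob"]          # 2 raw Stars
boosted_repo = ["bot1", "bot2", "bot3"]  # 3 raw Stars, all from unfollowed accounts

print(weighted_star_count(organic_repo, centrality))
print(weighted_star_count(boosted_repo, centrality))
```

In this toy graph the repository with more raw Stars scores lower, because its stargazers carry no weight. In-degree can itself be inflated by follow-farming, so a production metric would likely need a more robust notion of centrality than this one.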

This forms a self-reinforcing loop: VCs use Stars as a screening signal → Startups buy Stars → VCs see artificial hype → More VCs adopt Star tracking → More startups buy Stars. The benchmark numbers publicly released by Redpoint (seed: 2,850, Series A: 4,980) essentially gave startups a clearly priced shopping list.

As one commentator in the investigative report said: "Star counts can be faked, but saving someone a weekend of bug fixes cannot."

Related Questions

Q: What is the estimated number of fake GitHub Stars identified in the CMU study, and which category of projects had the highest number of non-malicious fake Stars?

A: The CMU study identified approximately 6 million fake GitHub Stars. Among non-malicious categories, AI and LLM-related projects had the highest number of fake Stars, totaling 177,000.

Q: How much does the cheapest fake GitHub Star cost, and what is the estimated cost to fake the median Star count for a seed-round project?

A: The cheapest fake GitHub Star costs as low as $0.03. To fake the median Star count of 2,850 for a seed-round project, it would cost less than $200 using the cheapest options.

Q: According to the article, what percentage of repositories with over 50 Stars had engaged in fake Star activities by July 2024?

A: By July 2024, 16.66% of repositories with over 50 Stars had engaged in fake Star activities.

Q: Which venture capital firm published data on the median GitHub Star counts for seed and Series A rounds, and what were those numbers?

A: Redpoint Ventures published the data. The median GitHub Star count was 2,850 for seed rounds and 4,980 for Series A rounds.

Q: What tool did the research team develop to detect fake GitHub Stars, and how was its accuracy indirectly validated?

A: The research team developed a tool called StarScout to detect fake GitHub Stars. Its accuracy was indirectly validated by GitHub's actions: 90.42% of the repositories flagged by StarScout were deleted, and 57.07% of the flagged accounts were purged.
