GitHub Announces Default Use of Copilot User Data for AI Model Training Starting April 24

marsbit | Published 2026-03-26 | Updated 2026-03-26

Summary

GitHub has announced an update to its repository policy, effective April 24, 2026, allowing the use of user interaction data to train its AI models. The data collection will include users of Copilot Free, Pro, and Pro+, covering model inputs and outputs, code snippets, contextual information, repository structures, and chat logs. According to GitHub’s Chief Product Officer Mario Rodriguez, the move aims to enhance the accuracy and security of the model’s suggestions, with internal Microsoft tests already showing improved acceptance rates. The policy follows an opt-out model, meaning affected users must manually disable data sharing in their privacy settings, a requirement that has sparked debate within the developer community over data ownership and the definition of private repositories. Copilot Business, Enterprise, and educational users are currently exempt due to contractual terms. GitHub defended the change as consistent with industry practices adopted by companies like Anthropic, JetBrains, and Microsoft. However, the inclusion of private repository code in training sets challenges conventional notions of privacy. This shift reflects a broader industry trend where leading AI providers are turning to user interaction data as high-quality public code resources diminish. It signals GitHub’s continued transition from an open-source platform to a closed-loop AI training ecosystem and highlights growing tensions between data compliance and AI model advancement.

GitHub recently announced an update to its repository policy effective April 24, 2026, planning to utilize user interaction data to train its AI models. This data collection covers Copilot Free, Pro, and Pro+ users, specifically including model inputs and outputs, code snippets, contextual information, repository structures, and chat interaction logs.

GitHub's Chief Product Officer, Mario Rodriguez, stated that the introduction of interaction data aims to improve the accuracy and security of the model's code suggestions, noting that pre-testing with Microsoft's internal data has significantly increased suggestion acceptance rates. Notably, the policy enrolls users by default (an opt-out mechanism): affected users must manually disable the relevant option in their privacy settings to exclude their data. This has sparked widespread discussion in the developer community regarding the definition of private repositories and data ownership.

Currently, Copilot Business and Enterprise users, who are bound by contract terms, and educational users are temporarily unaffected by this change. GitHub emphasized in its statement that this move aligns with industry practices commonly adopted by major players like Anthropic, JetBrains, and Microsoft. However, incorporating private repository code into training datasets challenges the traditional meaning of "private," even though GitHub says its purpose is to optimize development workflows.

From an industry perspective, as high-quality public code data becomes increasingly scarce, leading AI vendors are accelerating their shift toward mining "deep data" such as private interaction data to seek performance gains in models. This policy shift not only marks GitHub's further tilt from an open-source hosting platform toward a closed-loop AI training ecosystem but also signals that the AI developer tools sector is entering a new stage of strategic contest between data compliance and model evolution.

Related Questions

Q: What is the main change GitHub announced regarding Copilot and user data?

A: GitHub announced that starting April 24, 2026, it will update its repository policy to use user interaction data from Copilot Free, Pro, and Pro+ users to train its AI models.

Q: Which groups of users are exempt from this new data usage policy?

A: Copilot Business, Enterprise users, and educational users are currently not affected by this change due to contractual terms.

Q: What reason did GitHub's Chief Product Officer give for collecting this data?

A: Mario Rodriguez stated that introducing interaction data aims to improve the model's code suggestion accuracy and security, noting that internal testing at Microsoft has already significantly increased suggestion acceptance rates.

Q: How can users opt out of having their data used for training?

A: The policy uses an 'opt-out' mechanism, meaning affected users must manually go into their privacy settings to disable the relevant option to exclude their data.

Q: What broader industry trend does this policy change reflect according to the article?

A: It reflects a trend where top AI vendors are turning to 'deep data' like private interaction data to seek model performance gains as high-quality public code data becomes scarce, signaling a new phase of balancing data compliance with model evolution in AI developer tools.
