Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Hao He
August 27, 2024
The number of GitHub stars is probably the first metric we look at when we evaluate an open-source project or package. Our habit puts a lot of weight on GitHub stars as a metric in software supply chain security. However, such weight corrupts GitHub stars as a popularity metric, echoing the two laws argued by social scientists:
Recently, people have been buying fake GitHub stars, either to cheat the popularity contest or to spread malicious content. Prices can be as low as $0.10 per star. However, GitHub's Acceptable Use Policy prohibits “automated excessive bulk activity and coordinated inauthentic activity” and “inauthentic interactions, such as fake accounts and automated inauthentic activity”.
Although GitHub “has been aware of the presence of fake starrers for years, and actively works to remove these from the platform,” we still do not understand this phenomenon well enough, and there are no publicly available estimates of its frequency and impact. Dagster proposed an open-source detector last year, which is a good starting point, but our experiments found that it does not scale well enough to scan all of GitHub in practice. This motivated us to start a new exploration of fake stars on GitHub.
First, fake stars trick people into installing malicious software.
zigzagmoot/TapSwapAuto
zigzagklatton/APEXLEGENDSbyklatton
zigzag869/utrorrent-activation-by-gaij
zhengyanlin18/PlayDoge-Auto-Farm-and-Bot-Setup
zhengkaifor/adobe-lightroom-ai-activation
zhaowuling/Zhaowulling-Sea-Main
zhangdapao9523/Flash-USDT-Sender
zhangdapao9523/ETH-HUNTER
zhangdapao9523/DoxCoinAuto
1cyres/Albion-Radar-Main
1Xitz1/eg54yyg5e4
1Xitz1/d5y4ggy5d4
1905mali/League-Of-Legends-Hack
1842JakUCY/h7ixgmze47ykfk4
zhangdapao9523/DotcoinAuto
...
This is a snapshot of GitHub repositories with a very high percentage of fake stars (all of the repositories in this list have since been taken down by GitHub). Judging from the repository names, many of them were probably spreading malware that steals cryptocurrency, or distributing pirated software in violation of copyright (which may also contain hidden malware).
Here is another example of a malicious repository that we found, Solmonster/PhantomSniper-Solana-Sniper-Bot, which was still on GitHub at the time of writing (mid-August). It had 109 suspected fake stars at the time of detection (early July) and a fancy README. However, it secretly steals your cryptocurrency through a hidden spawn() call.
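As a rough illustration of how a reviewer might surface calls like this before running unfamiliar code, here is a minimal sketch that flags Node.js process-spawning calls in a source file. The pattern list and the function name `find_process_calls` are our own assumptions for illustration, not part of Socket's detection. A plain text match like this will miss obfuscated code and will also flag harmless calls (for example, `RegExp.exec`), so treat hits only as pointers for manual review.

```python
import re

# Hypothetical pattern list: Node.js calls that start another process.
# Chosen for illustration only; it is not an exhaustive or official list.
SUSPICIOUS_CALLS = re.compile(
    r"\b(spawn|spawnSync|exec|execSync|execFile|fork)\s*\("
)


def find_process_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, call_name) for each process-spawning call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in SUSPICIOUS_CALLS.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up sample source, not taken from the repository discussed above.
    sample = "\n".join([
        "const { spawn } = require('child_process');",
        "function snipe() { /* trading logic */ }",
        "spawn('node', ['./hidden.js'], { detached: true, stdio: 'ignore' });",
    ])
    for lineno, name in find_process_calls(sample):
        print(f"line {lineno}: {name}()")
```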
Second, fake stars are used to lure VCs. The motivation is growth hacking (“fake it until you make it”). However, our early statistical modeling shows that buying fake stars does not really help a project gain traction. It may bring in more real stars in the first month, but the presence of fake stars has a negative effect in the long term. In addition, it has no statistically significant effect on attracting downloads.
Finally, fake stars promote low-quality GitHub repositories, notably low-quality “listicles” and tutorials, creating spam and information pollution on GitHub. For example, we have detected a large number of fake star repositories with “awesome”, “template”, “demo”, “example”, etc. in their titles. These seemingly popular but low-quality listicles and tutorials add more noise to GitHub and may mislead programming newcomers.
Recognito-Vision/Face-SDK-iOS-Demo 93 stars (93 suspected fakes)
dsnbey/MVVM-Layered-Architecture-Example 237 stars (236 suspected fakes)
1321928757/Concurrent-MulThread-Demo 81 stars (80 suspected fakes)
dnbmagic/farcaster-examples 68 stars (67 suspected fakes)
dnbmagic/awesome-frames 64 stars (63 suspected fakes)
solidglue/tensorflow2_examples_jupyter 61 stars (60 suspected fakes)
andeug/code-examples 490 stars (478 suspected fakes)
Recognito-Vision/Face-SDK-Android-Demo 98 stars (95 suspected fakes)
StrawHat1Luffy/farcaster-examples 69 stars (64 suspected fakes)
solidglue/sklearn_examples_jupyter 69 stars (61 suspected fakes)
scayle/demo-add-on-vite 70 stars (61 suspected fakes)
Foblex/f-flow-example 77 stars (67 suspected fakes)
Recognito-Vision/Face-SDK-Linux-Demos 109 stars (89 suspected fakes)
52jing/wang-template-backend 126 stars (100 suspected fakes)
ai-boost/awesome-prompts 3892 stars (3015 suspected fakes)
CerberusChaos/Starknet-Dapp-Template 85 stars (59 suspected fakes)
jiawanlong/cesium-three-demos 96 stars (64 suspected fakes)
...
Our algorithm identified 3,746,538 suspected fake stars over the last five years (July 2019 to July 2024) and 10,155 repositories that appear to have run a fake star campaign. The number of suspected fake stars has grown rapidly in the last six months.
According to our estimate, ~89% of repositories with suspected fake star campaigns have been deleted. It is unclear whether GitHub is taking action on these repositories because they bought fake stars, or because they are spreading malware, or because the authors deleted them.
However, ~11% (1,136) of these repositories are still present on GitHub even though they have a suspected fake star campaign. Notably, of the 41 npm and 47 PyPI packages among them, only three (3.4%) have had their GitHub repositories deleted.
More importantly, we found that VirusTotal is reporting malware on 28 repositories that are still on GitHub at the time of writing, indicating that fake stars are highly correlated with malicious activities on GitHub.
Even among the repositories that were eventually taken down (which means that they were probably malicious!), 7.86% lived for more than one month, leaving a long window of time in which they could do damage.
Only 18.12% of the users participating in those suspected fake star campaigns have been deleted on GitHub. Most of the deletions happened recently.
This project is led by one of our summer interns, Hao He. Hao is currently a Ph.D. student in Software Engineering at Carnegie Mellon University, co-advised by Dr. Bogdan Vasilescu and Dr. Christian Kästner. The initial idea came from the observation that star counts are often used blindly by both researchers and practitioners, without careful consideration of the meaning behind them. Further exploration showed that there are multiple GitHub star black markets and that fake stars may be linked to other malicious activities in the software supply chain. This aligned with Socket’s interests and resulted in Hao’s internship project of detecting fake star activity on GitHub.
To find fake stars, Hao builds on prior research from social media fraud detection and open-source software. The detector runs on the GHArchive dataset, a mirror of all GitHub events stored in Google BigQuery and updated daily. It employs two heuristics:
Note that both heuristics generate false positives. Notably, fake star accounts may star legitimate repositories to avoid detection. To make our output more trustworthy, we added a post-processing step that only labels a repository as having probably bought fake stars if it shows a noticeable burst of fake stars.
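The post-processing idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the detector's actual implementation: it assumes events arrive as GH Archive-style JSON lines in which a star appears as a `WatchEvent` with `actor.login`, `repo.name`, and `created_at` fields; it assumes the set of suspected fake accounts has already been produced by the heuristics; and the monthly window and both thresholds are hypothetical values chosen for the example.

```python
import json
from collections import defaultdict

# Hypothetical thresholds for illustration; the study's values are not given here.
MIN_FAKE_STARS = 50
MIN_FAKE_FRACTION = 0.5


def star_events(json_lines):
    """Yield (repo, user, month) for each star in GH Archive-style JSON lines."""
    for line in json_lines:
        event = json.loads(line)
        # Assumption: a star is recorded as an event of type "WatchEvent".
        if event.get("type") != "WatchEvent":
            continue
        month = event["created_at"][:7]  # "YYYY-MM" prefix of the ISO timestamp
        yield event["repo"]["name"], event["actor"]["login"], month


def repos_with_fake_star_bursts(events, suspected_fake_users,
                                min_fake_stars=MIN_FAKE_STARS,
                                min_fake_fraction=MIN_FAKE_FRACTION):
    """Return sorted repos with at least one month dominated by suspected fakes."""
    totals = defaultdict(int)  # (repo, month) -> all stars received
    fakes = defaultdict(int)   # (repo, month) -> stars from suspected accounts
    for repo, user, month in events:
        totals[(repo, month)] += 1
        if user in suspected_fake_users:
            fakes[(repo, month)] += 1

    flagged = set()
    for (repo, month), n_fake in fakes.items():
        fraction = n_fake / totals[(repo, month)]
        if n_fake >= min_fake_stars and fraction >= min_fake_fraction:
            flagged.add(repo)
    return sorted(flagged)
```

Requiring both a minimum count and a minimum share in the same month is what keeps a legitimate repository from being labeled just because a few suspected accounts starred it as camouflage.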
Based on this research, Socket is launching a new “Suspicious Stars on GitHub” alert that utilizes the low activity and clustering heuristics to detect packages associated with repositories that have fake stars.
This alert gives users more visibility into the legitimacy of a software package’s star count, and flags packages whose star counts may have been artificially inflated by bots, crowdsourcing, or other means. It is set as a High Severity alert, due to the potential for spam, fraud, or even a supply chain attack. These packages should be carefully reviewed before installing.
First, you should look carefully at the open-source packages or projects you want to use. Don’t take stars at face value! If the star count seems fishy (e.g., the projects have lots of stars but very little actual activity, such as open issues and PRs), it could be fraudulent.
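The "lots of stars but very little activity" check can be turned into a quick sanity test. The sketch below is only an illustration: the `RepoStats` fields, the function name `looks_fishy`, and both thresholds are assumptions made for this example, and a low ratio is a prompt for a closer look rather than proof of fraud.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    """Counts you might read off a repository page (hypothetical structure)."""
    stars: int
    open_issues: int
    pull_requests: int


def looks_fishy(stats: RepoStats,
                min_stars: int = 50,
                min_activity_per_100_stars: float = 1.0) -> bool:
    """Flag repos with many stars but almost no issue or PR activity.

    Both thresholds are illustrative guesses, not calibrated values.
    """
    if stats.stars < min_stars:
        return False  # too few stars for the ratio to mean much
    activity = stats.open_issues + stats.pull_requests
    return activity * 100 / stats.stars < min_activity_per_100_stars


if __name__ == "__main__":
    # Made-up numbers for two imaginary repositories.
    print(looks_fishy(RepoStats(stars=237, open_issues=0, pull_requests=0)))
    print(looks_fishy(RepoStats(stars=3000, open_issues=120, pull_requests=300)))
```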
Second, if you are suspicious about certain packages, you can check Socket’s package pages for free. We publish all of this data on our website, so anyone can view package information along with our detections.
Finally, if you want to get proactive alerts and check your entire organization for suspicious star packages (and 70+ indicators of supply chain risk), install the free Socket for GitHub app in just 2 clicks. Whenever a new dependency is added or updated in a pull request, Socket analyzes the package's behavior and security risk, alerting you before any malicious code has the chance to land in your project.