Three days ago, Drew DeVault - founder and CEO of SourceHut - published a blog post called "Please stop externalizing your costs directly into my face", in which he complained that LLM companies were crawling data without respecting robots.txt and causing severe outages for SourceHut.
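For context, robots.txt is the conventional mechanism sites use to tell crawlers what they may fetch: a plain-text file at the site root listing per-agent rules, which well-behaved bots are expected to honor voluntarily. A minimal sketch of the kind of policy a code-hosting site might publish (the user-agent names and paths here are illustrative, not taken from SourceHut's actual configuration) looks like this:

```
# robots.txt served at https://example.org/robots.txt
# Hypothetical rules: block a specific crawler entirely,
# and keep all bots out of expensive dynamically-generated pages.

User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /search
Crawl-delay: 10
```

The complaint in the post is precisely that these rules are advisory: crawlers that ignore the file face no technical barrier, so non-compliant scrapers impose their load on the server anyway.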
[…]
Then, yesterday morning, KDE's GitLab infrastructure was overwhelmed by another AI crawler, with IPs from an Alibaba range; this made GitLab temporarily inaccessible to KDE developers.
[…]
By now, it should be pretty clear that this is no coincidence. AI scrapers are getting more and more aggressive, and - since FOSS projects depend on public collaboration and therefore keep their infrastructure openly accessible, a constraint private companies don't share - this is putting an extra burden on Open Source communities.