TikTok’s parent launched a web scraper that’s gobbling up the world’s online data 25-times faster than OpenAI (fortune.com)
Posted by Luu Tuyen@lemmy.world to Technology@lemmy.world · English · 1 year ago · 86 comments
purrtastic@lemmy.nz (1 year ago): It’s not fine. They are not archiving the internet. I had to ban their user agent after very aggressive scraping that would have taken down our servers. Fuck this shitty behaviour.
Melvin_Ferd@lemmy.world (1 year ago): Isn’t there a way to limit requests so that traffic doesn’t bring down your servers?
Mojave@lemmy.world (1 year ago): They obfuscate their traffic by randomizing user agents, so it’s either add a global rate limit, or let them ass fuck you.
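For anyone wondering what a "global rate limit" could look like, here is a minimal sketch of a token bucket in Python. It keeps one shared budget for all clients, so it does not depend on the user agent at all. The class name, the `rate` and `capacity` parameters, and the injectable `clock` are all made up for illustration; a real deployment would more likely use the rate limiting built into the web server or reverse proxy.

```python
import time


class TokenBucket:
    """Hypothetical global token bucket: one shared budget for every client."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The catch, as the comment above implies, is that a global limit also throttles legitimate visitors once the scraper has drained the bucket.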
WhyJiffie@sh.itjust.works (1 year ago): The article said all the source IPs can be traced back to ByteDance. Wouldn’t it be possible to block them? Maybe even by blocking all IPs of a specific ASN.
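A rough sketch of the block-by-network idea, using Python's standard `ipaddress` module. The prefixes below are reserved documentation ranges used as placeholders, not real ByteDance prefixes; the actual list of prefixes announced by an ASN would have to come from routing or registry data. The names `BLOCKED_PREFIXES` and `is_blocked` are made up for this example.

```python
import ipaddress

# Placeholder prefixes from reserved documentation ranges (assumption:
# in practice this list would hold the prefixes announced by the ASN).
BLOCKED_PREFIXES = ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]
BLOCKED_NETWORKS = [ipaddress.ip_network(p) for p in BLOCKED_PREFIXES]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    # Only compare addresses and networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in networks)
```

In practice this kind of check usually lives in the firewall or reverse proxy rather than in application code, but the matching logic is the same.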