Andy Reid@lemmy.world to Technology@lemmy.world · English · 9 months ago
AI companies are violating a basic social contract of the web and ignoring robots.txt (www.theverge.com)
KillingTimeItself@lemmy.dbzer0.com · 9 months ago
hmm, I thought websites just blocked crawler traffic directly? I know one site in particular has rules about it, and it will even go so far as to ban you permanently if you continually ignore them.
Bogasse@lemmy.ml · 9 months ago
Detecting crawlers can be easier said than done 🙁
KillingTimeItself@lemmy.dbzer0.com · 9 months ago
I mean yeah, but at a certain point you just have to accept that it's going to be crawled. The obviously negligent ones are easy to block.
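For anyone wondering what "respecting robots.txt" looks like in practice, here is a rough sketch of the check a well-behaved crawler would run before fetching a page, using Python's standard `urllib.robotparser`. The rules, the crawler name `ExampleAIBot`, and the `is_allowed` helper are all made up for illustration. Note that nothing enforces this: it only works if the crawler chooses to ask, which is the whole complaint in the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "ExampleAIBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "Mozilla/5.0"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

A crawler that skips this check, or sends a different user agent string, sails right past it, so actual blocking still has to happen on the server side.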