From: Tobias Geerinckx-Rice
Subject: bug#52338: Crawler bots are downloading substitutes
Date: Fri, 10 Dec 2021 23:52:51 +0100
All,

Mark H Weaver wrote:
> For what it's worth: during the years that I administered Hydra, I found that many bots disregarded the robots.txt file that was in place there. In practice, I found that I needed to periodically scan the access logs for bots and forcefully block their requests in order to keep Hydra from becoming overloaded with expensive queries from bots.
Very good point. IME (which is a few years old at this point) at least the highlighted BingBot & SemrushThing always respected my robots.txt, but it's definitely a concern. I'll leave this bug open to remind us of that in a few weeks or so…
If it does become a problem, we (I) might add some basic User-Agent sniffing to either slow down or outright block non-Guile downloaders, whitelisting any legitimate ones, of course. I think that's less hassle than dealing with dynamic IP blocks whilst being equally effective here.
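For the record, a minimal sketch of what such User-Agent sniffing could look like, assuming nginx fronts the substitute server (the "GNU Guile" User-Agent prefix and the /nar/ path are assumptions to verify against real access logs before deploying anything):

```nginx
# Sketch only: rate-limit requests from non-Guile User-Agents while
# exempting Guix/Guile clients.  An empty key opts a request out of the
# limit_req zone entirely.
map $http_user_agent $throttle_key {
    default          $binary_remote_addr;  # unknown agents: limited per IP
    "~^GNU Guile"    "";                   # assumed Guix client UA: exempt
}

# 10 requests per minute per IP for anything not whitelisted above.
limit_req_zone $throttle_key zone=bots:10m rate=10r/m;

server {
    listen 80;

    location /nar/ {                # assumed substitute download path
        limit_req zone=bots burst=5 nodelay;
        # ... proxy_pass to the substitute server as usual ...
    }
}
```

The nice property of keying the zone on a map is that blocking or slowing a newly spotted bot is a one-line change to the map, rather than a growing pile of IP deny rules.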
Thanks (again) for taking care of Hydra, Mark, and thank you Leo for keeping an eye on Cuirass :-)
T G-R