Re: [Savannah-hackers-public] Web Crawler Bots
From: Karl Berry
Subject: Re: [Savannah-hackers-public] Web Crawler Bots
Date: Sat, 7 Jan 2017 23:36:59 GMT
Bob - as you probably know, there are some existing fail2ban filters for
this -- {apache,nginx}-botsearch.conf are the most apropos I see at
first glance. fail2ban is the only scalable/maintainable way I can
imagine to deal with it.
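(Enabling one of those jails takes only a few lines in
/etc/fail2ban/jail.local. A minimal sketch, assuming the stock
apache-botsearch filter and the apache_error_log path macro from
fail2ban's paths-*.conf; the one-day bantime is an arbitrary choice,
not a recommendation:

    [apache-botsearch]
    enabled  = true
    port     = http,https
    logpath  = %(apache_error_log)s
    maxretry = 2
    bantime  = 86400

The nginx-botsearch jail is set up the same way, pointed at the nginx
logs instead.)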
A nonscalable/nonmaintainable way ... for tug.org, years ago I created a
robots.txt based on spammer user-agent strings I found at
projecthoneypot.org
(https://www.projecthoneypot.org/harvester_useragents.php nowadays, it
seems), along the lines of the sketch below. It's still somewhat
beneficial, though naturally it was surely out of date the instant I
put it up, let alone now.
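(The format is trivial; a sketch from memory, with illustrative
harvester names rather than anything actually in the tug.org file:

    User-agent: EmailCollector
    Disallow: /

    User-agent: EmailSiphon
    Disallow: /

    User-agent: *
    Disallow:

Of course this only deters bots polite enough to read robots.txt at
all, hence only "somewhat" beneficial.)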
I also threw in iptables rules by hand when the server was getting
bogged down. I hope one day I'll set up fail2ban (including recidive)
for it ... sketches of both below.
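By hand, banning a single address is a one-liner (192.0.2.13 is just a
placeholder from the documentation range):

    iptables -I INPUT -s 192.0.2.13 -j DROP

And the recidive jail, which re-bans repeat offenders it finds in
fail2ban's own log, is a few more lines in jail.local. These are the
stock values from recent fail2ban releases (older versions want the
times in seconds), so treat it as a sketch, not a tested config:

    [recidive]
    enabled   = true
    logpath   = /var/log/fail2ban.log
    banaction = %(banaction_allports)s
    bantime   = 1w
    findtime  = 1d

-k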