Hi, my VM is currently being hammered by a bot, and I wanted to block the requests by adding the bot name to fail2ban's apache-badbots.conf. It "would" probably work, but after enabling the badbots jail, this happened:

Code:
2024-04-25 10:31:23,556 fail2ban.filter [667]: WARNING [apache-badbots] Ignore line since time 1714033229 < 1714033883.556705 - 600
2024-04-25 10:31:23,556 fail2ban.filter [667]: WARNING [apache-badbots] Please check jail has possibly a timezone issue. Line with odd timestamp: my-domain.de:443 3.138.141.202 - - [25/Apr/2024:10:20:29 +0200] "GET /forum/search.php?search_id=active_topics&sid=123 HTTP/2.0" 503 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; [email protected])"
2024-04-25 10:31:25,496 fail2ban.filter [667]: WARNING [apache-badbots] Simulate NOW in operation since found time has too large deviation 1714033821 ~ 1714033885.4967306 +/- 60
2024-04-25 10:31:25,496 fail2ban.filter [667]: WARNING [apache-badbots] Please check jail has possibly a timezone issue. Line with odd timestamp: my-domain.de:443 3.133.147.87 - - [25/Apr/2024:10:30:21 +0200] "GET /forum/viewforum.php?f=57&sid=123 HTTP/2.0" 503 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; [email protected])"

So it's not blocking any IPs after all. I tried to resolve this, but I really couldn't find anything current; the mentions of this issue elsewhere aren't productive for this case. My /etc/timezone is "Europe/Berlin", so I don't understand what the cause is. My question is: how do I get around this so that the bot's IP addresses get blocked?
If we are talking about just one webspace, or only a few, it might be easier to use an .htaccess file for that:

Code:
BrowserMatchNoCase "claudebot" bad_bot
Order Deny,Allow
Deny from env=bad_bot

This should be sufficient to block the requests. This is for Apache; to do the same with nginx, look for user-agent rules via http_user_agent. You could use robots.txt as well, since this looks like a legitimate, well-behaved bot that should respect robots.txt.
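Note: the Order/Deny directives above are Apache 2.2 syntax and only keep working on Apache 2.4 if mod_access_compat is enabled. A rough 2.4-style sketch of the same idea, assuming mod_setenvif and mod_authz_core are loaded:

Code:
# Apache 2.4 style .htaccess sketch (assumes mod_setenvif + mod_authz_core)
BrowserMatchNoCase "claudebot" bad_bot
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>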
Thanks, @pyte, I went for the .htaccess approach, although I would have preferred to block them before they reach Apache. It seems they don't respect robots.txt entries, see https://www.linode.com/community/questions/24842/ddos-from-anthropic-ai
The only possibility to block them before they reach the service itself would be to block them at the network level, which requires either blocking the IPs or having a WAF in place at that level. So even with fail2ban the bot would reach the service a few times before its IPs get blocked. I think blocking them this way will be sufficient.
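For reference, dropping an individual offending address at the network level would look something like this (the IP below is just one of the addresses from the log excerpt above, not a complete list of the crawler's ranges):

Code:
# Drop one crawler IP before it reaches Apache (iptables sketch)
iptables -I INPUT -s 3.138.141.202 -j DROP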
"A few times" would have been OK, that's why I like the fail2ban approach. Didn't work out for me, though, it seems. Thx anyway.
The thing is, fail2ban checks the timestamp in the log file itself. If it deems a line to be "too old", it takes the safe approach and ignores it. Looks like you have plenty of log lines for poor fail2ban to go through. There is a setting "findtime" you can adjust; it defaults to 600 seconds.
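A minimal jail.local sketch for that, assuming the stock apache-badbots filter and a Debian-style log path (adjust logpath to your actual vhost access log):

Code:
[apache-badbots]
enabled  = true
port     = http,https
logpath  = /var/log/apache2/access.log
# findtime defaults to 600s; widen the window so slightly older log lines
# are still counted when fail2ban (re)reads the file
findtime = 3600
maxretry = 1
bantime  = 86400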
I would check that the server's time is synchronized accurately. Is it running NTP or some such? Then check the timestamps the application writes: is some application using another timezone? Although I believe timestamps should be in UTC, otherwise it is hard to compare times between servers (the other server may be in another timezone).
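A quick way to check, assuming a systemd-based system (the log path is just an example):

Code:
# Show system clock, configured timezone, and whether NTP sync is active
timedatectl status

# Compare against the timestamps Apache is actually writing
tail -n 3 /var/log/apache2/access.log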
If the system time were off, the logs would have the wrong time too, and there would've been no issue. The timestamps look close enough; most likely fail2ban was simply turned off during testing and had some older log lines to process when it came back, no? It never hurts to check, of course. Always =)
fail2ban is normally slow to process all of the Apache log activity. The .htaccess solution written by pyte could be improved by creating a custom log file in the Apache config for those blocked requests, and then configuring a new fail2ban jail that processes only that log file, like this: https://stackoverflow.com/questions/45710014/create-log-file-using-htaccess. It will surely work faster and reduce the number of attempts, I believe.
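A rough sketch of that idea; the file names (badbots.log, badbots-claude) and the failregex are only illustrative assumptions, and note that CustomLog has to live in the vhost/server config, not in .htaccess:

Code:
# In the vhost config: log only requests tagged bad_bot by mod_setenvif
BrowserMatchNoCase "claudebot" bad_bot
CustomLog /var/log/apache2/badbots.log combined env=bad_bot

# /etc/fail2ban/filter.d/badbots-claude.conf (illustrative name)
# every line in badbots.log is already a hit, so the regex just extracts the IP
[Definition]
failregex = ^<HOST> .*"(GET|POST|HEAD)

# /etc/fail2ban/jail.local
[badbots-claude]
enabled  = true
port     = http,https
filter   = badbots-claude
logpath  = /var/log/apache2/badbots.log
maxretry = 1
bantime  = 86400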