Original post: https://3xn.nl/projects/2023/09/20/crude-solution-to-ban-bots-by-their-user-agent/
I’ve very much simplified the script that instantly redirects unwanted traffic away from the server. Currently, I am using a very cheap VPS to receive all that traffic.
Here ya go:
<?php
// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl
// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// Traffic that meets the conditions is being yeeted away to any place of your choice.
//////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
//////////////////////////////////////////////////////////////
// attempt to basically just yeet all bots to another website
$targetURL = "https://DOMAIN.TLD/SUB/";
// Function to check if the user agent appears to be a bot or spider
function isBot()
{
$user_agent = $_SERVER['HTTP_USER_AGENT'];
$bot_keywords = ['bytespider', 'amazonbot', 'MJ12bot', 'YandexBot', 'SemrushBot', 'dotbot', 'AspiegelBot', 'DataForSeoBot', 'DotBot', 'Pinterestbot', 'PetalBot', 'HeadlessChrome', 'GPTBot', 'Sogou', 'ALittle Client', 'fidget-spinner-bot', 'intelx.io_bot', 'Mediatoolkitbot', 'BLEXBot', 'AhrefsBot'];
foreach ($bot_keywords as $keyword) {
if (stripos($user_agent, $keyword) !== false) {
return true;
}
}
return false;
}
// Check if the visitor is a bot or spider
if (isBot()) {
// yeet
header("Location: $targetURL");
// Exit to prevent further processing
exit;
}
end:
// If the visitor is not a bot, spider, or crawler, continue with your website code.
//////////////////////////////////////////////////////////////////////
?>