Original post: https://3xn.nl/projects/2023/09/20/crude-solution-to-ban-bots-by-their-user-agent/
I’ve very much simplified the script that instantly redirects unwanted traffic away from the server. Currently, I am using a very cheap VPS to receive all that traffic.
Here ya go:
<?php
// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl
// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// Traffic that meets the conditions is being yeeted away to any place of your choice.
//////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
//////////////////////////////////////////////////////////////
// attempt to basically just yeet all bots to another website
$targetURL = "https://DOMAIN.TLD/SUB/";
// Function to check if the user agent appears to be a bot or spider
function isBot()
{
$user_agent = $_SERVER['HTTP_USER_AGENT'];
$bot_keywords = ['bytespider', 'amazonbot', 'MJ12bot', 'YandexBot', 'SemrushBot', 'dotbot', 'AspiegelBot', 'DataForSeoBot', 'DotBot', 'Pinterestbot', 'PetalBot', 'HeadlessChrome', 'GPTBot', 'Sogou', 'ALittle Client', 'fidget-spinner-bot', 'intelx.io_bot', 'Mediatoolkitbot', 'BLEXBot', 'AhrefsBot'];
foreach ($bot_keywords as $keyword) {
if (stripos($user_agent, $keyword) !== false) {
return true;
}
}
return false;
}
// Check if the visitor is a bot or spider
if (isBot()) {
// yeet
header("Location: $targetURL");
// Exit to prevent further processing
exit;
}
end:
// If the visitor is not a bot, spider, or crawler, continue with your website code.
//////////////////////////////////////////////////////////////////////
?>
Okay, this is a very crude way to block bots, spiders and crawlers by their user-agent, but so far, this has been very, very efficient.
Even when one chooses ” yes “, the question will be repeated. This is not a problem, because no one in their right mind is going to add “bot”, “spider” or “crawler” as their user-agent.
So here’s the PHP script that I rammed into a certain website to prevent it from being DDOSsed by (malicious) bots.
<?php
// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl
// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// The form is, iirc, not even working, but that's fine if you only want human visitors.
// It can also throw a 403, but the effect is the same.
////////////////////////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
////////////////////////////////////////////////////////////////////////////////
// Function to check if the user agent appears to be a bot or spider.
// Enter the bots you would like to block in a list as shown below.
function isBot()
{
$user_agent = $_SERVER["HTTP_USER_AGENT"];
$bot_keywords = ['bytespider',
'amazonbot',
'MJ12bot',
'YandexBot',
'SemrushBot',
'dotbot',
'AspiegelBot',
'DataForSeoBot',
'DotBot',
'Pinterestbot',
'PetalBot',
'HeadlessChrome',
'AhrefsBot'];
foreach ($bot_keywords as $keyword) {
if (stripos($user_agent, $keyword) !== false) {
return true;
}
}
return false;
}
// Check if the visitor is a bot or spider
if (isBot()) {
// This visitor appears to be a bot or spider, so display a choice.
// Check if the choice form is submitted
if (isset($_POST["submit"])) {
// Check the choice made by the visitor
$choice = isset($_POST["choice"]) ? $_POST["choice"] : "";
if ($choice === "yes") {
// User selected "Yes," block access
echo "Access denied. If you believe this is an error, please contact us by writing the word [MAILBOX] before the at sign, followed by [DOMAIN.TLD]";
} elseif ($choice === "no") {
// User selected "No," proceed to end
goto end;
}
} else {
// Output the message to the user and make the choice mandatory
echo "Your user agent suggests you might be a bot, spider, or crawler. Are you one of these three?";
// Output the radio button choices within a form
echo '</p>
<form method="post" action="">';
echo ' <label><input type="radio" name="choice" value="yes" required>Yes</label>';
echo ' <label><input type="radio" name="choice" value="no">No</label>';
echo ' <button type="submit" name="submit">Proceed</button>';
echo "</form>
<p>";
}
// Exit to prevent further processing
exit();
}
end:
// Original website code starts from here.
/////////////////////////////////////////////////////////////
?>