Bot Block Party

A snippet from my .htaccess file listing the blocked bots (this assumes mod_rewrite is enabled and a RewriteEngine On line appears earlier in the file):

RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bolt\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot\@yahoo\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CazoodleBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Default\ Browser\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} discobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ecxi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GT::WWW [NC,OR]
RewriteCond %{HTTP_USER_AGENT} heritrix [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTP::Lite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IDBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search\.org [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IRLbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ISC\ Systems\ iRc\ Search\ 2\.1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Java [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LinksManager.com_bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} linkwalker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp-trivial [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Maxthon$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MFC_Tear_Sample [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^microsoft\.url [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Microsoft\ URL\ Control [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua\ Locator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*Indy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*NEWT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Nutch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} panscient.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PECL::HTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PeoplePal [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PHPCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PleaseCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Rippers\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SBIder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SeaMonkey$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Snoopy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Steeler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Toata\ dragostea\ mea\ pentru\ diavola [NC,OR]
RewriteCond %{HTTP_USER_AGENT} URI::Fetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} urllib [NC,OR]
RewriteCond %{HTTP_USER_AGENT} User-Agent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webalta [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCollage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wells\ Search\ II [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WEP\ Search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Mechanize [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zermelo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus\.*Webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC,OR]
RewriteCond %{HTTP_USER_AGENT} JSimplepieFactory [NC]
RewriteRule ^.* - [F,L]
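
To check that the block works, you can send a request with one of the listed user-agent strings; the F flag should make Apache answer with a 403. A quick sketch (example.com is just a placeholder for your own site):

# Blocked user agent: expect "403 Forbidden"
curl -I -A "Wget/1.21" https://example.com/

# Normal browser user agent: expect the usual response (e.g. 200)
curl -I -A "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0" https://example.com/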


Crude solution to ban bots by their user-agent

Okay, this is a very crude way to block bots, spiders and crawlers by their user-agent, but so far it has been very, very effective.

Even when one chooses “yes”, the question will simply be repeated. This is not a problem, because no one in their right mind is going to add “bot”, “spider” or “crawler” to their user-agent.

So here’s the PHP script that I rammed into a certain website to prevent it from being DDoSed by (malicious) bots.

<?php

// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl
// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// The form is, iirc, not even working, but that's fine if you only want human visitors.
// It can also throw a 403, but the effect is the same.

////////////////////////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
////////////////////////////////////////////////////////////////////////////////

// Function to check if the user agent appears to be a bot or spider.
// Enter the bots you would like to block in a list as shown below.
function isBot()
{
    $user_agent = $_SERVER["HTTP_USER_AGENT"] ?? ""; // header may be missing on some requests
    $bot_keywords = ['bytespider', 
                     'amazonbot', 
                     'MJ12bot', 
                     'YandexBot', 
                     'SemrushBot', 
                     'dotbot', 
                     'AspiegelBot',
                     'DataForSeoBot',
                     'DotBot',
                     'Pinterestbot',
                     'PetalBot',
                     'HeadlessChrome', 
                     'AhrefsBot'];

    foreach ($bot_keywords as $keyword) {
        if (stripos($user_agent, $keyword) !== false) {
            return true;
        }
    }

    return false;
}

// Check if the visitor is a bot or spider
if (isBot()) {
    // This visitor appears to be a bot or spider, so display a choice.
    // Check if the choice form is submitted
    if (isset($_POST["submit"])) {
        // Check the choice made by the visitor
        $choice = isset($_POST["choice"]) ? $_POST["choice"] : "";

        if ($choice === "yes") {
            // User selected "Yes," block access
            echo "Access denied. If you believe this is an error, please contact us by writing the word [MAILBOX] before the at sign, followed by [DOMAIN.TLD]";
        } elseif ($choice === "no") {
            // User selected "No," proceed to end
            goto end;
        }
    } else {
        // Output the message to the user and make the choice mandatory
        echo "Your user agent suggests you might be a bot, spider, or crawler. Are you one of these three?";

        // Output the radio button choices within a form
        echo '<form method="post" action="">';
        echo ' <label><input type="radio" name="choice" value="yes" required>Yes</label>';
        echo ' <label><input type="radio" name="choice" value="no">No</label>';
        echo ' <button type="submit" name="submit">Proceed</button>';
        echo '</form>';
    }

    // Exit to prevent further processing
    exit();
}
end:
// Original website code starts from here.
/////////////////////////////////////////////////////////////
?>
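
A quick way to see the script in action is to request the page with one of the listed keywords in your user-agent: you should get the bot question back instead of the site. A sketch, with the URL and filename as placeholders:

# Bot-like user agent: expect the "Are you one of these three?" form
curl -s -A "amazonbot" https://example.com/index.php

# Normal browser user agent: expect the regular page
curl -s -A "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0" https://example.com/index.php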


CloudPanel website causing “Too many redirects”

I have installed CloudPanel, and the new website caused a “Too many redirects” error. This is because my SSL certificates are handled by a proxy, which can cause some confusion between the two systems, especially since CloudPanel also installs its own certificates.

This application can also install a Let’s Encrypt certificate, but that only works in more conventional setups. My traffic goes through DNS to a proxy that listens on a certain IP address, and that proxy forwards the request to a virtual machine on one of my servers.

So here is my, probably unconventional, method of disabling the SSL certificates on my CloudPanel installation:

  1. Open the CloudPanel control panel.
  2. Select the website you want to edit.
  3. Choose the Vhost tab.
  4. Change the old code (first snippet) into the new code (second snippet):

Old code:

server {
listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
{{ssl_certificate_key}}
{{ssl_certificate}}
server_name subdomain.3xn.nl;
{{root}}

{{nginx_access_log}}
{{nginx_error_log}}

if ($scheme != "https") {
rewrite ^ https://$host$uri permanent;
}

New code:

server {
listen 80;
listen [::]:80;
# listen 443 ssl http2;
# listen [::]:443 ssl http2;
# {{ssl_certificate_key}}
# {{ssl_certificate}}
server_name subdomain.3xn.nl;
{{root}}

{{nginx_access_log}}
{{nginx_error_log}}

# if ($scheme != "https") {
# rewrite ^ https://$host$uri permanent;
# }

Done! Your website should now say “Hello world :-)”

You can see that I have disabled listening on port 443, the certificate keys and their paths, and the forced HTTPS redirect. I chose to switch off the forced HTTPS because my proxy already takes care of that.
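
To check from the command line that the redirect loop is really gone, something like this helps; just a sketch, assuming curl is installed and the hostname resolves to your proxy:

# Follows up to 10 redirects and prints the final status code.
# A redirect loop makes curl abort with "Maximum (10) redirects followed".
curl -sS -o /dev/null -w "%{http_code}\n" -L --max-redirs 10 https://subdomain.3xn.nl/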

This post is subject to change, but this helps you along your way!


How to update Mastodon to a new version

EXPIRED: NEW GUIDE AVAILABLE HERE

Updating Mastodon and creating backups are important steps to ensure the security and stability of your instance. Here’s a comprehensive tutorial on how to update Mastodon, including making backups of the database and assets:


Note 1: Always perform updates on a test/staging instance before applying them to your live instance. This tutorial assumes you have some basic command line and server administration knowledge.

Note 2: If you made alterations to your files and want to update to a new branch, like v4.2.0, don’t forget to stash your files first. DO THIS AT YOUR OWN RISK.

cd /home/mastodon/live
git stash

You will no longer get an error message when entering:

git checkout v4.2.0

(I currently do not know how to put the stashed files back; I will figure that out later.)
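
For what it’s worth, git can normally re-apply stashed changes afterwards with the standard stash commands; I have not yet tried this on my own Mastodon checkout:

git stash pop     # re-apply the most recent stash and remove it from the stash list
git stash apply   # or: re-apply it but keep it in the stash list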

Note 3: If you switch to a new branch and version, like 4.2.0, you might run into an error stating the following:

rbenv: version '3.2.2' is not installed (set by /home/mastodon/live/.ruby-version)

Take the following steps to solve this problem:
(Type the version you need where it says x.x.x)

rbenv install x.x.x

If that fails, try the following command:

brew update && brew upgrade ruby-build

Followed by:
(Type the version you need where it says x.x.x)

rbenv install x.x.x

OPTIONAL: If bundler or rails fails to work, try the following commands:

gem install bundler

gem install rails

Click here for the backup steps. It basically comes down to a Database dump, a settings file backup and a Redis dump. If you wish to backup your assets like images and stuff (User-uploaded files), backup the folder named “public/system”. Keep in mind that this folder can be rather large. Actually, it can become rather massive.

After a good 90 minutes, I gave up on trying to show you how large the asset folder is. So beware if you are going to make a backup of it. Perhaps you can just skip the cache folder?
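
If you do want to copy the assets while skipping the media cache (which Mastodon can re-download), an rsync along these lines should do it; a sketch only, and the destination path is just a placeholder:

# Copy user-uploaded assets, but leave out public/system/cache
rsync -a --exclude 'cache' /home/mastodon/live/public/system/ /backup/mastodon/public-system/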

You can always check the folder size by using NCDU, for which you can find the manual here. Also, installations may vary, but this is an example of my instance.


Upgrade procedure.

  1. su - mastodon
  2. cd /home/mastodon/live
  3. git fetch --tags
  4. git checkout [type the most recent version here, starting with the letter v; for example, v4.0.1]
    
    Command example: git checkout v4.0.1
  5. bundle install
  6. yarn install
  7. RAILS_ENV=production bundle exec rails db:migrate
  8. RAILS_ENV=production bundle exec rails assets:precompile
  9. exit
  10. reboot now

And that should be it!

If you don’t want to restart your server, use the following commands instead of “reboot now”:

  1. exit
  2. systemctl restart mastodon-sidekiq
  3. systemctl reload mastodon-web

    The reload operation is a zero-downtime restart, also called a “phased restart”. As such, Mastodon upgrades usually do not require any advance notice to users about planned downtime. In rare cases, you can use the restart operation instead, but there will be a (short) felt interruption of service for your users.

  4. The streaming API server is also updated and requires a restart; doing so will result in all connected clients being disconnected, which can increase the load on your server:
systemctl restart mastodon-streaming

Done!
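
For convenience, here is the whole sequence condensed into one block; treat it as a sketch, read the release notes first, and replace v4.0.1 with the version you are upgrading to:

su - mastodon
cd /home/mastodon/live
git fetch --tags
git checkout v4.0.1
bundle install
yarn install
RAILS_ENV=production bundle exec rails db:migrate
RAILS_ENV=production bundle exec rails assets:precompile
exit

# back as root: restart the services instead of rebooting
systemctl restart mastodon-sidekiq
systemctl reload mastodon-web
systemctl restart mastodon-streaming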

PS: When I updated my instance from 4.1.9 to 4.2.0, a lot of warnings flew by, and this is, in my experience, not a problem as the instance is working perfectly.


Measurements of a double parking garage for your home.

I got a question about whether it was better to have a 5x6m parking garage or a 6x6m one. Well, it depends on your budget, of course. It also comes down to how comfortable you want your parking to be and whether you want additional storage space.

One (5.7m) door is expensive, and when it breaks down, you won’t be able to get in (easily) any more. One 3.5m door requires careful parking but is the cheapest option. Personally, I’d go for two 2.6m wide doors, so both cars can be parked comfortably. It also allows for a 20x20cm centre support beam, which can make your construction lighter and more economical.

If you want more storage space, you can make it 6m deep instead of 5m, but this comes at an extra cost. It doesn’t look like much, but it is an additional 2m of wall and 6m² of floor and (flat) roof.


A working Apache2 server with PHP7.4

I was in need of a server solution that could be quickly deployed as a VM.

      1. Install Debian 11 as a VM with web- and SSH server
      2. Create a USER next to your root account during the installation
      3. Find the IP address of the new installation. The easiest way is if you have noVNC running. Log in as USER and type
        ip a
      4. Time to do the sudo thing
        su

        log in as root

        apt-get update && apt-get install -y sudo
        usermod -aG sudo USER
        exit
        exit

        log back in as USER

      5. Okay, let’s install some more stuff, but first we do an update
        sudo apt-get update && sudo apt-get upgrade -y

        Now we want some essentials

        sudo apt-get install -y dirmngr gnupg2 nano wget gpg curl fail2ban ufw software-properties-common

        Preparing the PHP install

        wget -q https://packages.sury.org/php/apt.gpg -O- | sudo apt-key add -
        echo "deb https://packages.sury.org/php/ $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/php.list
        sudo apt-get update
        sudo apt-get install -y php7.4 libapache2-mod-php7.4 php7.4-mysql php7.4-curl php7.4-gd php7.4-mbstring php7.4-xml php7.4-xmlrpc php7.4-zip

        And restart the Apache2 Webserver

        sudo systemctl restart apache2
      6. Alright, that’s done. Next step is to test things.
        sudo nano /var/www/html/test.php

        Enter this into the PHP file, then press Ctrl+X, type Y and press Enter to save and exit.

        <?php
        // Show all PHP information
        phpinfo();
        ?>
      7. Go to the IP address of the server you just created and type
        HTTP://IP ADDRESS/test.php
        

        If you see a PHP page with all sorts of data, you’re good. If not, go fix. Don’t ask me, I’m not there yet!
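
If you prefer to check from the shell of the VM itself, a couple of quick commands will confirm that PHP and the Apache module are in place; just a sketch, using the test.php from step 6:

php -v                                           # should report PHP 7.4.x
sudo apache2ctl -M | grep -i php                 # should list the PHP 7.4 Apache module
curl -s http://localhost/test.php | grep -i "PHP Version"   # should print the version banner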


Mastodon: Manual backups

This is an updated version (02-02-2024)

This manual is a transcript of how I make backups of my Mastodon instance. Please be careful and use your brain while following it.

Preparing the backup folder

  1. Log in as root
  2. cd /home/mastodon
  3. mkdir backups
  4. cd backups

Making the backups

  1. Database (three steps)
    su - mastodon
    cd /home/mastodon/backups
    pg_dump -U mastodon mastodon_production -F t > DATE_FILENAME.tar
    
    Example: pg_dump -U mastodon mastodon_production -F t > 2024-02-02-mastodon_production.tar
  2. Settings file (one step)
    cp /home/mastodon/live/.env.production /home/mastodon/backups/DATE_.env.production
    
    Example: cp /home/mastodon/live/.env.production /home/mastodon/backups/2024-02-02-.env.production
  3. Redis (two steps)
    exit
    cp /var/lib/redis/dump.rdb /home/mastodon/backups/DATE_dump.rdb
    
    Example: cp /var/lib/redis/dump.rdb /home/mastodon/backups/2024-02-02-REDIS-dump.rdb

You can now check your backups folder to see if all three files are present. This is also a good moment to copy the backup files to another, safe, location.
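
For that last step, something along these lines works; a sketch only, the remote host and path are placeholders:

ls -lh /home/mastodon/backups    # quick check that all three files are there
rsync -avz /home/mastodon/backups/ backupuser@backuphost:/srv/backups/mastodon/    # copy them off the server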

If you wish to backup your assets like images and stuff (User-uploaded files), backup the folder named “public/system”. Keep in mind that this folder can be rather large. Actually, it can become rather massive.


Garden sheds and stuff

Some stuff I get inspiration from:

https://www.lowes.com/n/how-to/she-shed
https://www.familyhandyman.com/project/how-to-build-a-portable-sauna/
http://thegardenfrog.me/my-300-rustic-picket-garden-shed/
https://www.buitenlevengevoel.nl/zelf-een-tuinhuis-bouwen/

Do Not Buy Ted’s 16,000 Woodworking Plans without Reading This

Spoiler: It’s a very terrible scam. More: https://3xn.nl/projects/2022/08/23/do-not-buy-teds-woodworking/
