Function to change ownership of the www folder

Just another snippet of code that can be implemented somewhere:

set-perms.sh
#!/bin/bash

# Function to change ownership of the www folder
change_ownership() {
    sudo chown -R "$1":www-data /var/www
    echo "Ownership of the www folder has been set to $1:www-data."
}

# Loop until a valid username is provided
while true; do
    # Prompt the user to enter the desired username
    read -p "Enter the username for permissions: " username

    # Check if the username provided exists
    if id "$username" &>/dev/null; then
        change_ownership "$username"
        break  # Exit the loop if a valid username is provided
    else
        echo "Error: User $username does not exist."
    fi
done

echo "done!"

To make the script executable:

sudo chmod +x set-perms.sh

and run it with

sudo ./set-perms.sh
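A sample run might look like this (the username “alice” is made up for illustration):

sudo ./set-perms.sh
Enter the username for permissions: alice
Ownership of the www folder has been set to alice:www-data.
done!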


[MASTODON 4.2.10] How to update Mastodon to a new version

Updated on 26-08-2024. For the old version, click here.

Contents:
1. UPDATE
2. RUBY VERSION ISSUES
3. RUBY GEM ISSUES
4. CHARLOCK_HOLMES GEM ISSUES

So a new release came out and it is important to get this update as soon as possible! This manual is a transcript of the way that I have updated my Mastodon instance. Please make sure you make proper backups and use your brain while updating things.

A guide to making a Mastodon backup can be found here.

Linux flavour: Debian
Update from: 4.2.xx

  1. Log into your server
  2. su - mastodon
  3. cd /home/mastodon/live
  4. git fetch --tags
  5. git checkout [type the most recent version here, starting with the letter v. For example: v4.2.5]
    git checkout v4.2.10
  6. bundle install
    If you get a Ruby version error, please see the bottom of this article for a fix!
  7. yarn install
  8. RAILS_ENV=production bundle exec rails db:migrate
    
    #NOTE: You might get a Ruby gem error which suggests you run the command "bundle install". Do that and then run the Rails command again.
  9. RAILS_ENV=production bundle exec rails assets:precompile
  10. exit

OPTIONAL:
After updating, you might get a notification to update the Browserslist database. You can do that with the following command:

npx update-browserslist-db@latest

Okay, you can now choose to either reboot or restart the services.

REBOOT:

  1. This command may vary, depending on your Linux flavour.
    systemctl reboot

RESTART:

  1. This command may vary, depending on your Linux flavour.
    systemctl restart mastodon-sidekiq
    systemctl reload mastodon-web
    
    

    Optional:

    systemctl restart mastodon-streaming

RUBY VERSION ISSUES

My system was unable to find the required Ruby v3.2.3, and I fixed this with the following steps:

  1. Please make sure that your path is correct.
    git -C ~/.rbenv/plugins/ruby-build pull
  2. rbenv install 3.2.3
    
    

    *WAIT TILL DONE* (it may take a little while)

  3. To check all the installed versions type:
    rbenv versions
  4. To set v3.2.3 as the global version, type:
    rbenv global 3.2.3
  5. To double-check the active, installed version, type:
    rbenv versions
  6. Done!
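As an extra sanity check (not part of the original steps), you can confirm which interpreter the shell now resolves:

rbenv version   # shows the active version and where it was set
ruby -v         # should report 3.2.3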


Sources: https://richstone.io/where-is-ruby-3-0-0-on-rbenv/

RUBY GEM ISSUES

After entering…

RAILS_ENV=production bundle exec rails db:migrate

…you might get a ruby gem error like:

Could not find rexml-3.3.5, strscan-3.1.0 in locally installed gems
Run `bundle install` to install missing gems.

Enter the command “bundle install” and then run the Rails command again.
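In other words:

bundle install
RAILS_ENV=production bundle exec rails db:migrate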

CHARLOCK_HOLMES GEM ISSUES

The charlock_holmes gem may fail to build on some systems with recent versions of gcc. If you run into such an issue, try

BUNDLE_BUILD__CHARLOCK_HOLMES="--with-cxxflags=-std=c++17" bundle install


Update in the script to block bots, spiders and indexers

Original post: https://3xn.nl/projects/2023/09/20/crude-solution-to-ban-bots-by-their-user-agent/

I’ve very much simplified the script that instantly redirects unwanted traffic away from the server. Currently, I am using a very cheap VPS to receive all that traffic.

Here ya go:

<?php

// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl

// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// Traffic that meets the conditions is being yeeted away to any place of your choice.

//////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
//////////////////////////////////////////////////////////////

// Attempt to basically just yeet all bots to another website
$targetURL = "https://DOMAIN.TLD/SUB/";

// Function to check if the user agent appears to be a bot or spider
function isBot()
{
    $user_agent = $_SERVER['HTTP_USER_AGENT'];
    $bot_keywords = ['bytespider', 'amazonbot', 'MJ12bot', 'YandexBot', 'SemrushBot', 'dotbot', 'AspiegelBot', 'DataForSeoBot', 'DotBot', 'Pinterestbot', 'PetalBot', 'HeadlessChrome', 'GPTBot', 'Sogou', 'ALittle Client', 'fidget-spinner-bot', 'intelx.io_bot', 'Mediatoolkitbot', 'BLEXBot', 'AhrefsBot'];

    foreach ($bot_keywords as $keyword) {
        if (stripos($user_agent, $keyword) !== false) {
            return true;
        }
    }

    return false;
}

// Check if the visitor is a bot or spider
if (isBot()) {
    // Yeet
    header("Location: $targetURL");

    // Exit to prevent further processing
    exit;
}

end:
// If the visitor is not a bot, spider, or crawler, continue with your website code.
//////////////////////////////////////////////////////////////////////
?>
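Not part of the original post, but an easy way to check that the redirect kicks in is to request the page with one of the listed user agents and look at the response headers (DOMAIN.TLD stands in for the site running the script):

# A listed agent should show a Location header pointing at $targetURL.
curl -sI -A "GPTBot" https://DOMAIN.TLD/ | grep -iE '^(HTTP|location)'

# A browser-like agent should simply get the normal page.
curl -sI -A "Mozilla/5.0" https://DOMAIN.TLD/ | head -n 1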


More tweaks for the .htaccess file!

Here’s a list of stuff that I have in my .htaccess files on various websites.

I want to work on my website, but any other visitor should be booted to another website so I can work in peace. Side note: it's been forever since I last used this, so it might work. Or not.

---

# YOUR IP address goes here:
RewriteCond %{REMOTE_ADDR} !^000\.000\.000\.000$
# And provides you access to:
RewriteCond %{REQUEST_URI} !^https://DOMAIN.TLD$ [NC]
# Fine, go have all the media as well
RewriteCond %{REQUEST_URI} !\.(jpg|jpeg|png|gif|svg|swf|css|ico|js)$ [NC]
# Any other visitor can go visit the following website:
RewriteRule .* https://DOMAIN.TLD/ [R=302,L]

# Hey, no viewing access to this file
<FilesMatch "^\.ht">
Order deny,allow
Deny from all
</FilesMatch>

# Disable Server Signature
ServerSignature Off

# SSL all the things!
RewriteCond %{HTTPS} !=on
RewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]

# No WWW
RewriteCond %{HTTP_HOST} ^www\.DOMAIN\.TLD$
RewriteRule ^/?$ https://DOMAIN.TLD/ [R=301,L]

# Do we like Symlinks? Yeah we do.
Options +FollowSymlinks

# No open directories or directory listings. What is this... 1998?
Options All -Indexes
IndexIgnore *

# Rewrite rules to block out some common exploits.
RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR]
RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR]
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
RewriteRule .* index.php [F]

# PHP doohickies
php_flag register_globals off 
php_flag safe_mode off 
php_flag allow_url_fopen off 
php_flag display_errors off 
php_value session.save_path '/tmp' 
php_value disable_functions "exec,passthru,shell_exec,system,curl_multi_exec,show_source,eval"

# File Injection Protection, or a code-condom. What.
RewriteCond %{REQUEST_METHOD} GET
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=http:// [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=(\.\.//?)+ [OR]
RewriteCond %{QUERY_STRING} [a-zA-Z0-9_]=/([a-z0-9_.]//?)+ [NC]
RewriteRule .* - [F]

# /proc/self/environ? Go away!
RewriteCond %{QUERY_STRING} proc/self/environ [NC]
RewriteRule .* - [F]

# Disallow Access To Sensitive Files. Enter your own file names.
RewriteRule ^(htaccess.txt|configuration.php(-dist)?|joomla.xml|README.txt|web.config.txt|CONTRIBUTING.md|phpunit.xml.dist|plugin_googlemap2_proxy.php)$ - [F]

# Don't allow any pages to be framed - Defends against CSRF
<IfModule mod_headers.c>
Header set X-Frame-Options SAMEORIGIN
</IfModule>

# Disallow Php Easter Eggs
RewriteCond %{QUERY_STRING} \=PHP[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} [NC]
RewriteRule .* index.php [F]

# Libwww-perl Access Block
RewriteCond %{HTTP_USER_AGENT} libwww-perl.* 
RewriteRule .* ? [F,L]

# Uh. I forgot.
<IfModule mod_autoindex.c>
IndexIgnore *
</IfModule>

# NO SNIFFYWIFFY OwO
<IfModule mod_headers.c>
Header always set X-Content-Type-Options "nosniff"
</IfModule>

# NEEDS TESTING
# Turn on IE8-IE9 XSS prevention tools
#Header set X-XSS-Protection "1; mode=block"

# NEEDS TESTING TOO
# Only allow JavaScript from the same domain to be run.
# Don't allow inline JavaScript to run.
#Header set X-Content-Security-Policy "allow 'self';"

# Example if you don't like Russia and Turkey (Optional A1 is to block anonymous proxies)
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(RU|TR)$
RewriteRule .* https://DOMAIN.TLD/directorywithindexdothtml/ [R=302,L]

# Caching stuff
<FilesMatch "\.(ico|jpg|jpeg|png|gif|js|css|swf)$">
Header set Cache-Control "public, max-age=3600"
</FilesMatch>

# Compress text, html, javascript, css, xml!
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/html text/xml text/css text/plain
AddOutputFilterByType DEFLATE image/svg+xml application/xhtml+xml application/xml
AddOutputFilterByType DEFLATE application/rdf+xml application/rss+xml application/atom+xml
AddOutputFilterByType DEFLATE text/javascript application/javascript application/x-javascript application/json
AddOutputFilterByType DEFLATE application/x-font-ttf application/x-font-otf
AddOutputFilterByType DEFLATE font/truetype font/opentype
</IfModule>

# Joomla! core SEF Section.
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]

# PHP TWEAKS 
php_value upload_max_filesize 200M
php_value post_max_size 200M
php_value max_input_vars 2000
php_value max_execution_time 120
php_value memory_limit 1024M

## BEGIN EXPIRES CACHING 
<IfModule mod_expires.c>
ExpiresActive on
ExpiresDefault "access plus 1 month"
ExpiresByType text/cache-manifest "access plus 0 seconds"
ExpiresByType text/html "access plus 0 seconds"
ExpiresByType text/xml "access plus 0 seconds"
ExpiresByType application/xml "access plus 0 seconds"
ExpiresByType application/json "access plus 0 seconds"
ExpiresByType application/rss+xml "access plus 1 hour"
ExpiresByType application/atom+xml "access plus 1 hour"
ExpiresByType image/x-icon "access plus 1 week"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType video/ogg "access plus 1 month"
ExpiresByType audio/ogg "access plus 1 month"
ExpiresByType video/mp4 "access plus 1 month"
ExpiresByType video/webm "access plus 1 month"
ExpiresByType text/x-component "access plus 1 month"
ExpiresByType application/x-font-ttf "access plus 1 month"
ExpiresByType font/opentype "access plus 1 month"
ExpiresByType application/x-font-woff "access plus 1 month"
ExpiresByType image/svg+xml "access plus 1 month"
ExpiresByType application/vnd.ms-fontobject "access plus 1 month"
ExpiresByType text/css "access plus 1 year"
ExpiresByType text/javascript "access plus 1 year"
ExpiresByType application/javascript "access plus 1 year"
<IfModule mod_headers.c>
Header append Cache-Control "public"
</IfModule>
</IfModule>
## END EXPIRES CACHING
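This is not part of the original list, but after deploying it you can spot-check a few of the headers and caching rules with curl (DOMAIN.TLD and style.css are made-up examples):

# Security headers on a page: expect X-Frame-Options and X-Content-Type-Options.
curl -sI https://DOMAIN.TLD/ | grep -iE 'x-frame-options|x-content-type-options'

# Caching headers on a static asset: expect Cache-Control and Expires.
curl -sI https://DOMAIN.TLD/style.css | grep -iE 'cache-control|expires'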


Bot Block Party

A snippet from my .htaccess file to list the blocked bots:

RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bolt\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot\@yahoo\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CazoodleBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Default\ Browser\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} discobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ecxi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GT::WWW [NC,OR]
RewriteCond %{HTTP_USER_AGENT} heritrix [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTP::Lite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IDBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} id-search\.org [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} IRLbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ISC\ Systems\ iRc\ Search\ 2\.1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Java [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LinksManager.com_bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} linkwalker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp-trivial [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Maxthon$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MFC_Tear_Sample [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^microsoft\.url [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Microsoft\ URL\ Control [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua\ Locator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*Indy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*NEWT [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Nutch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} panscient.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PECL::HTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^PeoplePal [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PHPCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PleaseCrawl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Rippers\ 0 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SBIder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SeaMonkey$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Snoopy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Steeler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Toata\ dragostea\ mea\ pentru\ diavola [NC,OR]
RewriteCond %{HTTP_USER_AGENT} URI::Fetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} urllib [NC,OR]
RewriteCond %{HTTP_USER_AGENT} User-Agent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webalta [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCollage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wells\ Search\ II [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WEP\ Search [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Mechanize [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zermelo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus\.*Webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC,OR]
RewriteCond %{HTTP_USER_AGENT} JSimplepieFactory [NC]
RewriteRule ^.* - [F,L]
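Not from the original snippet, but a simple way to verify the rule after deploying it (DOMAIN.TLD is a placeholder) is to request the site with one of the listed agents and expect a 403:

curl -s -o /dev/null -w "%{http_code}\n" -A "Wget/1.21" https://DOMAIN.TLD/
# Expected: 403

curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0" https://DOMAIN.TLD/
# Expected: a normal status such as 200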


Crude solution to ban bots by their user-agent

Okay, this is a very crude way to block bots, spiders and crawlers by their user-agent, but so far, this has been very, very efficient.

Even when one chooses “yes”, the question will be repeated. This is not a problem, because no one in their right mind is going to use “bot”, “spider” or “crawler” as their user agent.

So here's the PHP script that I rammed into a certain website to prevent it from being DDoSed by (malicious) bots.

<?php

// CC-BY-NC (2023)
// Author: FoxSan - fox@cytag.nl
// This is a functional but dirty hack to block bots, spiders and indexers by looking at the HTTP USER AGENT.
// The form is, iirc, not even working, but that's fine if you only want human visitors.
// It can also throw a 403, but the effect is the same.

////////////////////////////////////////////////////////////////////////////////
// Emergency bypass
// goto end;
////////////////////////////////////////////////////////////////////////////////

// Function to check if the user agent appears to be a bot or spider.
// Enter the bots you would like to block in a list as shown below.
function isBot()
{
    $user_agent = $_SERVER["HTTP_USER_AGENT"];
    $bot_keywords = ['bytespider', 
                     'amazonbot', 
                     'MJ12bot', 
                     'YandexBot', 
                     'SemrushBot', 
                     'dotbot', 
                     'AspiegelBot',
                     'DataForSeoBot',
                     'DotBot',
                     'Pinterestbot',
                     'PetalBot',
                     'HeadlessChrome', 
                     'AhrefsBot'];

    foreach ($bot_keywords as $keyword) {
        if (stripos($user_agent, $keyword) !== false) {
            return true;
        }
    }

    return false;
}

// Check if the visitor is a bot or spider
if (isBot()) {
    // This visitor appears to be a bot or spider, so display a choice.
    // Check if the choice form is submitted
    if (isset($_POST["submit"])) {
        // Check the choice made by the visitor
        $choice = isset($_POST["choice"]) ? $_POST["choice"] : "";

        if ($choice === "yes") {
            // User selected "Yes," block access
            echo "Access denied. If you believe this is an error, please contact us by writing the word [MAILBOX] before the at sign, followed by [DOMAIN.TLD]";
        } elseif ($choice === "no") {
            // User selected "No," proceed to end
            goto end;
        }
    } else {
        // Output the message to the user and make the choice mandatory
        echo "Your user agent suggests you might be a bot, spider, or crawler. Are you one of these three?";

        // Output the radio button choices within a form
        echo '<form method="post" action="">';
        echo ' <label><input type="radio" name="choice" value="yes" required>Yes</label>';
        echo ' <label><input type="radio" name="choice" value="no">No</label>';
        echo ' <button type="submit" name="submit">Proceed</button>';
        echo '</form>';
    }

    // Exit to prevent further processing
    exit();
}
end:
// Original website code starts from here.
/////////////////////////////////////////////////////////////
?>
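If you are curious what a flagged visitor actually sees, and what happens after answering “no”, something like this works from the command line; a sketch only, with DOMAIN.TLD as a placeholder and the script assumed to sit on the front page:

# A bot-like user agent should receive the question instead of the page.
curl -s -A "amazonbot" https://DOMAIN.TLD/

# Submitting the form with "no" should fall through to the normal website code.
curl -s -A "amazonbot" -d "choice=no" -d "submit=Proceed" https://DOMAIN.TLD/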


CloudPanel website causing “Too many redirects”

I installed CloudPanel, and the new website produced a “Too many redirects” error. This is because my SSL certificates are handled by a proxy, which can cause some confusion between the two systems, and also because CloudPanel installs its own certificates.

The application can install a Let’s Encrypt certificate as well, but that only works in more conventional setups. My traffic goes through DNS to a proxy that listens on a certain IP address, and that proxy forwards the request to a virtual machine on one of my servers.

So here is my (probably unconventional) method of disabling the SSL certificates on my CloudPanel installation:

  1. Open the CloudPanel control panel.
  2. Select the website you want to edit
  3. Choose the Vhost tab
  4. Change the following (old) code into the new code below it:

Old code:

server {
listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
{{ssl_certificate_key}}
{{ssl_certificate}}
server_name subdomain.3xn.nl;
{{root}}

{{nginx_access_log}}
{{nginx_error_log}}

if ($scheme != "https") {
rewrite ^ https://$host$uri permanent;
}

New code:

server {
listen 80;
listen [::]:80;
# listen 443 ssl http2;
# listen [::]:443 ssl http2;
# {{ssl_certificate_key}}
# {{ssl_certificate}}
server_name subdomain.3xn.nl;
{{root}}

{{nginx_access_log}}
{{nginx_error_log}}

# if ($scheme != "https") {
# rewrite ^ https://$host$uri permanent;
# }

Done! Your website should now say “Hello world :-)”

You can see that I have disabled listening on port 443, the certificate keys, the forced HTTPS redirect and the path to the keys. I chose to switch off the forced HTTPS because my proxy already takes care of that.
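Not part of the original post, but a quick way to confirm the redirect loop is gone is to follow redirects with a hard limit and watch the status lines (using the subdomain from the example above):

# A loop would exhaust the redirect limit; a healthy setup shows at most one redirect.
curl -sIL --max-redirs 5 https://subdomain.3xn.nl/ | grep -i '^HTTP'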

This post is subject to change, but this helps you along your way!


How to update Mastodon to a new version

EXPIRED: NEW GUIDE AVAILABLE HERE

Updating Mastodon and creating backups are important steps to ensure the security and stability of your instance. Here’s a comprehensive tutorial on how to update Mastodon, including making backups of the database and assets:


Note 1: Always perform updates on a test/staging instance before applying them to your live instance. This tutorial assumes you have some basic command line and server administration knowledge.

Note 2: If you made alterations to your files and want to update to a new branch, like v4.2.0, don’t forget to stash your files first. DO THIS AT YOUR OWN RISK.

cd /home/mastodon/live
git stash

You will no longer get an error message when entering:

git checkout v4.2.0

I currently do not know how to put the stashed files back; this will be figured out later.
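For completeness, and not from the original notes: Git itself can re-apply stashed changes later on. It can produce merge conflicts against the new version, so use it with care.

git stash list   # show what was stashed
git stash pop    # re-apply the most recent stash and drop it if it applies cleanly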

Note 3: If you switch to a new branch and version, like 4.2.0, you might run into an error stating the following:

rbenv: version '3.2.2' is not installed (set by /home/mastodon/live/.ruby-version)

Take the following steps to solve this problem:
(Type the version you need where it says x.x.x)

rbenv install x.x.x

If that fails, try the following command:

brew update && brew upgrade ruby-build

Followed by:
(Type the version you need where it says x.x.x)

rbenv install x.x.x

OPTIONAL: If bundler or rails fails to work, try the following commands:

gem install bundler

gem install rails

Click here for the backup steps. It basically comes down to a database dump, a settings file backup and a Redis dump. If you wish to back up your assets (user-uploaded files like images), back up the folder named “public/system”. Keep in mind that this folder can be rather large. Actually, it can become rather massive.
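For reference, here is a minimal sketch of those three pieces, assuming a default Debian setup where PostgreSQL and Redis run locally and Mastodon lives in /home/mastodon/live (the paths and the /backup target are assumptions; the linked backup guide remains the authoritative source):

# Database dump (run as a user that can read the mastodon_production database).
pg_dump -Fc mastodon_production -f /backup/mastodon_$(date +%F).dump

# Settings file backup.
cp /home/mastodon/live/.env.production /backup/

# Redis dump (default Debian location).
cp /var/lib/redis/dump.rdb /backup/

# Optional: user-uploaded assets; this can be very large.
rsync -a /home/mastodon/live/public/system/ /backup/public-system/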

After a good 90 minutes, I gave up on trying to show you how large the asset folder is. So beware if you are going to make a backup of it. Perhaps you can just skip the cache folder?

You can always check the folder size by using NCDU, for which you can find the manual here. Also, installations may vary, but this is an example of my instance.
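For example, assuming the default installation path:

ncdu /home/mastodon/live/public/system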


Upgrade procedure.

  1. su - mastodon
  2. cd /home/mastodon/live
  3. git fetch --tags
  4. git checkout [type the most recent version here, starting with the letter v. For example: v4.0.1]
    
    Command example: git checkout v4.0.1
  5. bundle install
  6. yarn install
  7. RAILS_ENV=production bundle exec rails db:migrate
  8. RAILS_ENV=production bundle exec rails assets:precompile
  9. exit
  10. reboot now

And that should be it!

If you don’t want to restart your server, use the following commands instead of “reboot now”:

  1. exit
  2. systemctl restart mastodon-sidekiq
  3. systemctl reload mastodon-web

    The reload operation is a zero-downtime restart, also called a “phased restart”. As such, Mastodon upgrades usually do not require any advance notice to users about planned downtime. In rare cases, you can use the restart operation instead, but there will be a (short) felt interruption of service for your users.

  4. The streaming API server is also updated and requires a restart; doing so will result in all connected clients being disconnected, which can increase the load on your server:
systemctl restart mastodon-streaming

Done!

PS: When I updated my instance from 4.1.9 to 4.2.0, a lot of warnings flew by, and this is, in my experience, not a problem as the instance is working perfectly.
