Protect your PHP website from bots with this open source tool

PHP is a widely used programming language on the web, and it's estimated that nearly 80% of all websites use it. My team at CrowdSec decided that we needed to provide server admins with a PHP bouncer to help ward off bots and bad actors who may attempt to interact with PHP files.

CrowdSec bouncers can be set up at various levels of an application stack: web server, firewall, CDN, and so on. This article looks at one more layer: setting up remediation directly at the application level.

Remediation directly in the application can be helpful for various reasons:

It provides a business-logic answer to potential security threats.
It gives you freedom in how you respond to security issues.

While CrowdSec already publishes a WordPress bouncer, this PHP library is designed to be included in any PHP application (Drupal, for example). The bouncer helps block attackers, challenging them with a CAPTCHA to let humans through while blocking bots.

Prerequisites

This tutorial assumes that you are running Drupal on a Linux server with Apache as a web server.

The first step is to install CrowdSec on your server. You can do this with an official install script. If you're on Fedora, CentOS, or similar, download the RPM version:

$ curl -s -o script.rpm.sh https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.rpm.sh

On Debian and Debian-based systems, download the DEB version:

$ curl -s -o script.deb.sh https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh

These scripts are simple, so read through the one you download to verify that it imports a GPG key and configures a new repository. Once you're comfortable with what it does, execute it, and then install CrowdSec:

$ sudo dnf install crowdsec || sudo apt install crowdsec
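Before running either installer script, it can help to pull out just the lines worth reviewing. The following is a minimal sketch rather than anything shipped with CrowdSec: `review_installer` is a hypothetical helper, and the search patterns (`gpg`, `.repo`, `sources.list`) are assumptions about what a repository setup script usually contains.

```shell
# Hypothetical helper (not part of CrowdSec): print, with line numbers, the
# lines of a saved installer script that mention GPG keys or repositories.
# The patterns are assumptions about what a repository setup script contains.
review_installer() {
    script_path="$1"
    grep -n -i -E 'gpg|\.repo|sources\.list' "$script_path"
}

# Example (the file name is an assumption; save the script locally first):
#   review_installer script.deb.sh
```

The output is only a starting point for a manual read-through; it does not replace reviewing the whole script.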

CrowdSec detects all the existing services on its own, so there should be no further configuration to get an immediately functional setup.

Test the initial setup

Now that you have CrowdSec installed, launch a web application vulnerability scanner, such as Nikto, and see how it behaves:

$ ./nikto.pl -h https://<ip_or_domain>

(Image: Philippe Humeau, CC BY-SA 4.0)

The scanner's IP address is detected and triggers several scenarios, the last one being crowdsecurity/http-crawl-non_statics.

(Image: Philippe Humeau, CC BY-SA 4.0)

However, CrowdSec only detects issues, and a bouncer is needed to apply remediation. This is where the PHP bouncer comes in.

Remediate with the PHP bouncer

Now that you can detect malicious behaviors, you need to block the IP at the website level. At this time, there is no Drupal bouncer available. However, you can use the PHP bouncer directly.

How does it work? The PHP bouncer (like any other bouncer) makes an API call to the CrowdSec API and checks whether it should ban incoming IPs, send them a CAPTCHA, or allow them to pass.
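The lookup described above can be sketched by hand with curl. This is an illustration, not the bouncer's actual code. It assumes the CrowdSec Local API listens on its default address, http://127.0.0.1:8080, and that `BOUNCER_KEY` holds an API key created with `sudo cscli bouncers add <name>`. The helper names are hypothetical, and grep stands in for a real JSON parser.

```shell
# Sketch of the lookup a bouncer performs against the CrowdSec Local API.
# Assumptions: the API listens on http://127.0.0.1:8080 and BOUNCER_KEY holds
# a key created with `sudo cscli bouncers add <name>`.
LAPI_URL="${LAPI_URL:-http://127.0.0.1:8080}"

# Fetch the active decisions for one IP address.
# The API answers "null" when there are none.
lapi_decisions_for_ip() {
    curl --silent --header "X-Api-Key: ${BOUNCER_KEY}" \
        "${LAPI_URL}/v1/decisions?ip=$1"
}

# Hypothetical helper: reduce the JSON read from stdin to one remediation.
# A ban is checked first, then a CAPTCHA; anything else is let through.
remediation_from_response() {
    response="$(cat)"
    if printf '%s' "$response" | grep -Eq '"type"[[:space:]]*:[[:space:]]*"ban"'; then
        echo "ban"
    elif printf '%s' "$response" | grep -Eq '"type"[[:space:]]*:[[:space:]]*"captcha"'; then
        echo "captcha"
    else
        echo "bypass"
    fi
}

# Example:
#   lapi_decisions_for_ip 203.0.113.7 | remediation_from_response
```

Keeping the network call and the classification in separate functions makes it possible to try the classification on a saved response without a running CrowdSec instance.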

The web server is Apache, so you can use the install script for Apache.

$ git clone https://github.com/crowdsecurity/cs-php-bouncer.git
$ cd cs-php-bouncer/
$ ./install.sh --apache

(Image: Philippe Humeau, CC BY-SA 4.0)

The install script configures the bouncer to protect the whole website. To secure only a specific part of the site, adapt the Apache configuration.

Try to access the website

The PHP bouncer is now installed and configured. Because of the earlier web vulnerability scan, your IP address is currently banned. Try to access the website anyway:

(Image: Philippe Humeau, CC BY-SA 4.0)

The bouncer successfully blocked your traffic. If you were not banned by the earlier web vulnerability scan, you can add a manual decision instead:

$ sudo cscli decisions add -i <your_ip>

For the remaining tests, remove the current decisions:

$ sudo cscli decisions delete -i <your_ip>
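Manual decisions are also a convenient way to try each remediation type without running a scan. The sketch below wraps `cscli decisions add` with its `--type` and `--duration` options. The wrapper name `add_test_decision`, the `APPLY` switch, and the ten-minute default are assumptions made for illustration. By default, it only prints the command it would run.

```shell
# Hypothetical wrapper around `cscli decisions add`. By default it only prints
# the command; set APPLY=1 to run it with sudo.
add_test_decision() {
    ip="$1"
    kind="${2:-ban}"        # remediation type: ban or captcha
    duration="${3:-10m}"    # keep test decisions short-lived
    set -- cscli decisions add --ip "$ip" --type "$kind" --duration "$duration"
    if [ "${APPLY:-0}" = "1" ]; then
        sudo "$@"
    else
        echo "$*"
    fi
}

# Example: preview a ten-minute CAPTCHA decision for a documentation IP address.
add_test_decision 203.0.113.7 captcha
# prints: cscli decisions add --ip 203.0.113.7 --type captcha --duration 10m
```

A short duration means a forgotten test decision expires on its own, even if you skip the delete step.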

Going further

So far, I've blocked an IP address trying to mess with the PHP website. That's nice, but what about IP addresses trying to scan, crawl, or DDoS it? Those kinds of detections can lead to false positives, so instead of blocking the IP address, why not return a CAPTCHA challenge to check whether the visitor is an actual user rather than a bot?

Detect crawlers and scanners

I dislike crawlers and bad user agents, and there are various scenarios available on the Hub to spot them.

Ensure the base-http-scenarios collection from the Hub is installed by checking with cscli:

$ sudo cscli collections list | grep base-http-scenarios
crowdsecurity/base-http-scenarios  ✔️ enabled  /etc/crowdsec/collections/base-http-scenarios.yaml

If it is not listed, install it and reload CrowdSec:

$ sudo cscli collections install crowdsecurity/base-http-scenarios
$ sudo systemctl reload crowdsec
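The check and the install can be combined so that CrowdSec is reloaded only when something changed. This is a sketch under one assumption: an installed collection shows the word "enabled" on its line of the listing, as in the sample output above. The helper `collection_enabled` is hypothetical.

```shell
# Hypothetical helper: succeed when the named collection appears as enabled
# in the `cscli collections list` output supplied on stdin.
collection_enabled() {
    grep -F "$1" | grep -q 'enabled'
}

# Example: install the collection and reload only when it is missing.
#   if ! sudo cscli collections list | collection_enabled crowdsecurity/base-http-scenarios; then
#       sudo cscli collections install crowdsecurity/base-http-scenarios
#       sudo systemctl reload crowdsec
#   fi
```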

Remedy with a CAPTCHA

Since detecting DDoS, crawlers, or malevolent user agents can lead to false positives, I prefer to return a CAPTCHA for any IP address triggering those scenarios to avoid blocking real users.

To achieve this, add the following YAML block at the beginning of /etc/crowdsec/profiles.yaml, so that it is evaluated before the default profile:

# /etc/crowdsec/profiles.yaml
name: crawler_captcha_remediation
filters:
  - Alert.Remediation == true && Alert.GetScenario() in ["crowdsecurity/http-crawl-non_statics", "crowdsecurity/http-bad-user-agent"]
decisions:
  - type: captcha
    duration: 4h
on_success: break
---

With this profile, a CAPTCHA is enforced (for four hours) on any IP address that triggers the scenarios crowdsecurity/http-crawl-non_statics or crowdsecurity/http-bad-user-agent.
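Profiles are evaluated from top to bottom, and `on_success: break` stops at the first profile that matches, so the order of the file matters. As a quick check before reloading, the sketch below lists the profile names in file order. The helper `profile_names` is hypothetical and assumes each profile starts its `name:` key at the beginning of a line.

```shell
# Hypothetical helper: print the profile names of a profiles.yaml file in the
# order they appear in the file, one per line.
profile_names() {
    sed -n 's/^name:[[:space:]]*//p' "$1"
}

# Example: the CAPTCHA profile should be listed before the default one.
#   profile_names /etc/crowdsec/profiles.yaml | head -n 1
```

If the first name printed is not crawler_captcha_remediation, an earlier profile may match first and apply its own remediation instead of the CAPTCHA.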

Next, reload CrowdSec:

$ sudo systemctl reload crowdsec

Try the custom remediations

Relaunching a web vulnerability scanner would trigger many scenarios, so you would ultimately be banned again. Instead, craft a request that triggers only the bad-user-agent scenario by sending a user agent from the scenario's list of known bad user agents. Note that you must trigger the scenario twice before a decision is applied.

$ curl --silent -I -H "User-Agent: Cocolyzebot" https://example.com > /dev/null
$ curl -I -H "User-Agent: Cocolyzebot" https://example.com
HTTP/1.1 200 OK
Date: Tue, 05 Oct 2021 09:35:43 GMT
Server: Apache/2.4.41 (Ubuntu)
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: no-cache, must-revalidate 
X-Content-Type-options: nosniff 
Content-Language: en 
X-Frame-Options: SAMEORIGIN 
X-Generator: Drupal 7 (https://drupal.org)
Content-Type: text/html; charset=utf-8
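The two requests above can be wrapped in a small loop that prints only the HTTP status code of each response. This is a sketch: the function name `send_bad_ua_requests` is hypothetical, and the defaults (two requests, the Cocolyzebot user agent) simply mirror the commands above.

```shell
# Hypothetical helper: send COUNT HEAD requests to URL with a known bad user
# agent and print the HTTP status code of each response.
send_bad_ua_requests() {
    url="$1"
    count="${2:-2}"               # the scenario must be triggered twice
    agent="${3:-Cocolyzebot}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # --output discards the headers; --write-out prints only the status code
        curl --silent --head --output /dev/null \
            --write-out '%{http_code}\n' --user-agent "$agent" "$url"
        i=$((i + 1))
    done
}

# Example:
#   send_bad_ua_requests https://example.com
```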

Listing the active decisions confirms that you were caught:

$ sudo cscli decisions list

(Image: Philippe Humeau, CC BY-SA 4.0)

If you try to access the website, instead of being simply blocked, you receive a CAPTCHA:

(Image: Philippe Humeau, CC BY-SA 4.0)

Once you solve it, you can access the website again.

Next, unban yourself again:

$ sudo cscli decisions delete -i <your_ip>

Launch the vulnerability scanner:

$ ./nikto.pl -h https://example.com

Unlike the last time, you can now see that you've triggered several decisions:

(Image: Philippe Humeau, CC BY-SA 4.0)

When you try to access the website, the ban decision takes priority:

(Image: Philippe Humeau, CC BY-SA 4.0)

Wrap up

This is a quick way to help block attackers from PHP websites and applications. This article contains only one example, and remediations can easily be extended to fit additional needs. To find out more about installing and using the CrowdSec agent, see the CrowdSec documentation.

To download the PHP bouncer, go to the CrowdSec Hub or GitHub.