
My small war with scammers

Since the World Health Organisation declared COVID-19 a public health emergency on 30 January 2020, and a pandemic on 11 March, there appears to have been a large increase in scams. People have spent a lot more time at their computer screens since communities across the world entered lockdown.

Never have we been more attractive to scammers.

A big part of the problem is Phishing websites. For those of you who have not yet seen it, I recently wrote an article about how a lot of scammers appear to use a company called Namecheap to host and register their illegal websites. My article focused on how slow Namecheap are to take down these websites. I also tweeted the Namecheap CEO about this issue, but I received no response.

There will inevitably be a handful of users who fall for these scams, especially when the websites and domain names are hosted with companies that take days to take them offline. This got me wondering: what can we do to help (aside from reporting the websites and waiting for the hosting company to take them offline)?

Messing with Phishers

If you ask me, the best way to annoy a scammer is to waste their time. How can we do this? In the case of Phishing websites, why don’t we fill out the web form they have solicited us to fill out? Only we will fill it out several thousand times with random data. The idea is that when they come to review the form data, there will be so much fake data in their database that they won’t be able to distinguish it from the legitimate data provided by victims.

Is this legal? I don’t know. Is it the right thing to do? Definitely.


That said, there are a few things to keep in mind:

  • Phishing websites often use shared web hosting services. It is therefore very important not to stress the server and cause a Denial of Service attack. Otherwise, this will almost certainly put you on the wrong side of the law, and inconvenience legitimate users of the hosting service.
  • The fake data generated will need to look real so that the scammers find it impossible (or at least very difficult and time-consuming) to identify as fake.
  • The data will need to be submitted from a number of different IP addresses to further reduce the risk of scammers telling the fake data apart from the real data.
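
The first point above, not overloading a shared host, can be handled by pausing between submissions. The sketch below is only an illustration and is not part of the scripts in this post: `paced_delays`, `run_paced`, and the default bounds are hypothetical names and values chosen for the example.

```python
import random
import time


def paced_delays(count, min_wait=2.0, max_wait=10.0, seed=None):
    """Return `count` random pauses (in seconds) between min_wait and max_wait.

    The default bounds are assumptions for illustration; pick values that keep
    the request rate well below anything that could strain a shared host.
    """
    rng = random.Random(seed)
    return [rng.uniform(min_wait, max_wait) for _ in range(count)]


def run_paced(action, count, **kwargs):
    """Call `action()` `count` times, sleeping for a random pause after each call."""
    for delay in paced_delays(count, **kwargs):
        action()
        time.sleep(delay)
```

Here `action` would be whatever function performs a single form submission. Random pauses also mean the entries do not arrive at perfectly regular intervals, which may help with the second point.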

Python Script 1 – Form Scraper

Before we start, I need to mention that my Python programming skills are not at all good. Please do not judge me by my terrible code. 🙂

I decided to split my code into two parts. The first script would be responsible for identifying HTML forms on a URL. I would simply need to specify the Phishing website as a parameter to my script. The form information would then be saved to a configuration file which would contain a list of all input fields detected on the HTML form. There will usually be a variety of fields such as usernames, passwords, e-mail addresses, names etc.

Usage: ./<script> "<base-url>" "/path-to-phishing-page"
# I am not responsible for what you do with this script. Run at your own risk.
import sys
from bs4 import BeautifulSoup
from requests_html import HTMLSession
import time
import os

session = HTMLSession()

def get_form_details(form):
    details = {}
    action = form.attrs.get("action")
    method = form.attrs.get("method", "get").lower()
    inputs = []
    for input_tag in form.find_all("input"):
        input_type = input_tag.attrs.get("type", "text")
        input_name = input_tag.attrs.get("name")
        input_value =input_tag.attrs.get("value", "")
        inputs.append({"type": input_type, "name": input_name, "value": input_value})

    details["action"] = action
    details["method"] = method
    details["inputs"] = inputs
    return details

def get_all_forms(url):
    res = session.get(url)
    soup = BeautifulSoup(res.html.html, "html.parser")
    return soup.find_all("form")

url = sys.argv[1] + sys.argv[2]
baseurl = sys.argv[1]
forms = get_all_forms(url)

for i, form in enumerate(forms, start=1):
    form_details = get_form_details(form)
    print("="*50, f"form #{i}", "="*50)
    print("Do you want to use this form? Y/N\n\n\n\n\n\n")
    answer = input()
    if answer == 'Y':
        method = form_details["method"]
        # Keep the action exactly as written: URL paths are case-sensitive.
        submit = form_details["action"] or ""
        if not os.path.exists(".fwp"):
            os.mkdir(".fwp")
        if url[:7] == "http://":
            c = 7
        elif url[:8] == "https://":
            c = 8
        else:
            print("Invalid URL. Please remember to include http:// or https://")
            sys.exit(1)
        # The folder name is the host part of the URL.
        pos = url.find("/", c + 1)
        folder = url[c:pos] if pos != -1 else url[c:]
        print("Okay! Saving form to config file...")
        if not os.path.exists(".fwp/" + folder):
            os.mkdir(".fwp/" + folder)
        # First line: method and submission URL. Then one line per input field.
        with open(".fwp/" + folder + "/.config", "w") as f:
            f.write(method + "," + baseurl + submit + "\n")
            for input_tag in form_details["inputs"]:
                # Inputs without a name are never submitted, so skip them.
                if input_tag["name"]:
                    f.write(input_tag["name"] + "," + input_tag["type"] + "\n")

We can test this script against this fake Halifax website:

As we can see here, the script has found a login form and asked us if this is the form we intend to target.

When we answer yes to this script, the following file is saved:


This is a list of all the input fields identified (with the POST destination URL at the top), followed by their input types. The next step is to modify this file slightly to specify the type of data that needs to be generated.


What we have essentially done here is add another value on the end of each line which determines the data which should be generated. For example, the first field called ‘ip’ should have a random IP address generated, so ‘ipaddress’ has been added to the end. The script is programmed to look out for certain data types such as ‘ipaddress’ so the correct data can be generated.
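
As a rough illustration of how such a lookup can work, the sketch below maps a data type keyword to a generator function using only the standard library. The `GENERATORS` table and the `generate_value` helper are hypothetical names for this example, not the code used by the script that follows, and the line layout `name,type,datatype` is assumed from the description above.

```python
import random
from ipaddress import IPv4Address


def random_ip():
    """Return a random IPv4 address as a dotted-quad string."""
    return str(IPv4Address(random.getrandbits(32)))


# Hypothetical lookup table: data type keyword -> function producing a value.
GENERATORS = {
    "ipaddress": random_ip,
    "blank": lambda: "",
    "checked": lambda: "1",
}


def generate_value(config_line):
    """Parse one 'name,type,datatype' line and return (name, value)."""
    name, consideration, datatype = config_line.strip().split(",")
    if consideration == "fixed":
        # A fixed field sends the third column unchanged.
        return name, datatype
    generator = GENERATORS.get(datatype)
    # Unrecognised data types fall back to a placeholder value.
    value = generator() if generator else "Unknown"
    return name, value
```

For example, `generate_value("ip,hidden,ipaddress")` returns the field name `ip` paired with a freshly generated address (the `hidden` input type here is an assumption for the example).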

This is where the second script comes into play.

Python Script 2 – Data Generator and Form Submission

The second script is responsible for two things.

A) Reading the config file output from the first script, and generating the random data.
B) Submitting that data to the target en masse.

Usage: ./<script> config-file amount-of-times-to-hit-target
# I am not responsible for what you do with this script. Run at your own risk.
import sys
import requests
import random
from faker import Faker
import time
from ipaddress import IPv4Address
faker = Faker('en_GB')

postdata = {}

config = sys.argv[1]
config = open(config, "r")

i = 0

maxreq = int(sys.argv[2])

while i < maxreq:
    n = 1
    ua = faker.user_agent()
    for lines in config:
        line = lines.split()
        if n == 1:
            # First line of the config: "<method>,<submission URL>"
            if line[0].lower()[:5] != "post,":
                print("This form does not use POST and is currently not supported. Good bye.")
                sys.exit(1)
            url = line[0][5:]
        else:
            # Remaining lines: "<field name>,<consideration>,<data type>"
            expl = line[0].split(",")
            consideration = expl[1]
            datatype = expl[2]
            name = expl[0]
            profile = faker.profile()
            if consideration == "fixed":
                # Fixed value: send the third column as-is
                val = datatype
            else:
                if datatype == "username":
                    val = profile["username"]
                elif datatype == "password":
                    # Pick a random entry from a common-passwords wordlist
                    with open("/usr/share/wordlists/passwords/10-million-password-list-top-1000000.txt") as f:
                        words = [word.rstrip('\n') for word in f]
                    val = random.choice(words)
                elif datatype == "useragent":
                    val = ua
                elif datatype == "blank":
                    val = ""
                elif datatype == "checked":
                    val = "1"
                elif datatype == "ipaddress":
                    val = str(IPv4Address(random.getrandbits(32)))
                else:
                    # Data type not recognised by this script
                    val = "Unknown"

            postdata.update({name: val})
        n += 1
    # Rewind the config file so the next iteration can read it again
    config.seek(0, 0)
    i += 1
    headers = {"User-Agent": ua}
    response = requests.post(url, data=postdata, headers=headers)

Without drawing attention to the terrible quality of my Python code, we can see that this script takes two arguments: the config file generated by the first script, followed by the number of times it should submit data to the target.

The script uses the Python library Faker, which is a really handy library that generates all sorts of fake data. Currently, the script only supports the generation of user agents, IP addresses, usernames, and passwords, though this can very easily be extended based on the target. I suppose you could even modify the script to auto-identify the data required for some of the fields.
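
One way to approach that auto-identification is a simple keyword heuristic on each field's name and input type. This is only a sketch of the idea: `guess_datatype` and its keyword rules are assumptions made for illustration, and real Phishing forms will need more rules than this.

```python
def guess_datatype(name, input_type):
    """Guess a data type keyword from an input's name and type attributes.

    Returns one of the keywords the generator script understands
    ("password", "checked", "ipaddress", "useragent", "username"),
    or None when no rule matches and the field needs manual editing.
    """
    name = (name or "").lower()
    input_type = (input_type or "text").lower()

    if input_type == "password" or "pass" in name:
        return "password"
    if input_type == "checkbox":
        return "checked"
    if name in ("ip", "ipaddress", "ip_address"):
        return "ipaddress"
    if "agent" in name:
        return "useragent"
    if "user" in name or "login" in name:
        return "username"
    return None
```

Fields that come back as `None` would still need to be filled in by hand in the config file.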

Eventually, once the script covers a number of common data items, it can quickly be used to target a number of different Phishing websites within minutes. When used against a fake Halifax website, we can see it generating the data required:

Before you know it, the scammer’s database will have thousands of entries, hopefully rendering their data completely useless. To stop scammers filtering out the fake requests by source IP address, look into using this script with Proxychains.

This was just my (very small) way of annoying a scammer. Hopefully I can make this script more powerful in the weeks to come and start targeting these scammers en masse.