Last month, 4iQ found a massive password list containing 1.4 billion usernames and passwords from previous breaches. The data is broken up into directories and files according to the first few letters of the username to allow for quicker searching using the included query.sh script. While this makes searching for specific users very easy, it is difficult to search the 41 GB data dump by domain name for users from an entire organization.
To speed up searching for users of an org I was pen testing, I decided to throw the data into AWS Athena to see if it could query the data faster. Setup was quick and easy, and well worth the time.
First, create an S3 bucket to hold the data. I’m not going to go into detail on how to do that here, but I recommend you create a new user that only has permissions for this bucket. For this guide, let’s say you name the bucket s3://breach-wordlists. You can use this same bucket to add more breach files in the future, so I’d name it something generic.
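As a rough sketch of what "only has permissions for this bucket" could look like, the snippet below builds an IAM policy document scoped to a single bucket and prints it as JSON. The function name bucket_only_policy and the exact set of actions are my assumptions, not something from the original setup; depending on how you use the user (for example, if it also runs Athena queries), you may need to allow more than this.

```python
import json

BUCKET = "breach-wordlists"  # bucket name used in this guide


def bucket_only_policy(bucket: str) -> dict:
    """Build an IAM policy document limited to one S3 bucket (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading and writing are granted on the objects inside it
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Paste the printed JSON into the IAM policy editor for the new user
    print(json.dumps(bucket_only_policy(BUCKET), indent=2))
```

The split into two statements is deliberate: s3:ListBucket applies to the bucket ARN, while the object actions apply to the bucket/* ARN.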
The magnet link for the breach data can be found easily so I’m not going to list it here. The data took about 20 minutes to download on my AWS instance.
If you don’t already have the AWS CLI tools installed, you’ll have to do that first:
pip install awscli --upgrade --user
Next, configure the access key ID and secret access key for the new user:
brkr19@kali:~$ aws configure
AWS Access Key ID [None]: **********
AWS Secret Access Key [None]: **********
Default region name [None]:
Default output format [None]:
Finally, change into your data directory and run the sync command. This took about 15 minutes for me.
brkr19@kali:/mnt/wordlists/BreachCompilation/data/$ aws s3 sync . s3://breach-wordlists
In the Athena console, create a new database, let’s call it breach_database, and a new table called breach_compilation, with a location of s3://breach-wordlists/.
Choose Text File with Custom Delimiters and set the Field Terminator to a colon (:).
Create two string columns: username and password.
Continue on without adding a partition, and click Create Table.
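If you would rather run a statement in the Athena query editor than click through the wizard, the table settings above correspond roughly to the Hive-style DDL below. This is a sketch under my own assumptions, not the exact statement the console generates; the helper create_table_ddl is a made-up name that just assembles the text, and the database must already exist.

```python
def create_table_ddl(database: str, table: str, location: str) -> str:
    """Return Hive-style DDL for a two-column, colon-delimited external table."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        "  `username` string,\n"
        "  `password` string\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "  FIELDS TERMINATED BY ':'\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Names and location are the ones used in this guide
    print("CREATE DATABASE IF NOT EXISTS breach_database")
    print()
    print(
        create_table_ddl(
            "breach_database", "breach_compilation", "s3://breach-wordlists/"
        )
    )
```

Run the two printed statements one at a time. No partitions are declared, which matches the wizard steps above.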
You can now query the flat files as if they were a database with standard ANSI SQL. The vast majority of queries I’ve used gave me full results within 10-15 seconds!
Search for a specific email address:
SELECT * FROM breach_compilation WHERE username = 'e_mail_address@example.org'
Search for every user at a domain:
SELECT * FROM breach_compilation WHERE username LIKE '%@example.org'
Search by password (might be useful for complex passwords to find other related email addresses):
SELECT * FROM breach_compilation WHERE password = 'ucsennemon'
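To get a feel for what these three query shapes return before spending Athena scan time, here is a small local sketch that runs them against an in-memory SQLite table. The rows are made up for illustration, and SQLite only stands in for Athena here. One difference worth knowing: SQLite's LIKE ignores ASCII case by default, while Athena's comparisons are case-sensitive, so in Athena you may want to compare against lower(username) if the data mixes case.

```python
import sqlite3

# Made-up rows standing in for the real table; none of this is real data.
ROWS = [
    ("alice@example.org", "hunter2"),
    ("bob@example.org", "correct-horse"),
    ("alice@example.com", "hunter2"),
]


def run_demo():
    """Run the exact-match, domain, and password queries on sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE breach_compilation (username TEXT, password TEXT)")
    conn.executemany("INSERT INTO breach_compilation VALUES (?, ?)", ROWS)

    # Exact match on one address
    by_email = conn.execute(
        "SELECT * FROM breach_compilation WHERE username = ?",
        ("alice@example.org",),
    ).fetchall()

    # Everyone at a domain: % matches any prefix before @example.org
    by_domain = conn.execute(
        "SELECT * FROM breach_compilation WHERE username LIKE ? ORDER BY username",
        ("%@example.org",),
    ).fetchall()

    # Pivot on a password to find other accounts that share it
    by_password = conn.execute(
        "SELECT * FROM breach_compilation WHERE password = ? ORDER BY username",
        ("hunter2",),
    ).fetchall()

    conn.close()
    return by_email, by_domain, by_password


if __name__ == "__main__":
    for label, rows in zip(("email", "domain", "password"), run_demo()):
        print(label, rows)
```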
Please share this post if you found it useful and reach out if you have any feedback or questions!
© 2025 FRACTURE LABS, LLC. ALL RIGHTS RESERVED