At the time of writing, livesshattack.net has undergone ~8.2M ssh attacks. However, when parsing out the passwords used in each attack, the resulting list only amounts to ~187k unique passwords. That is a surprisingly small number, which can only be explained by livesshattack.net having been hit with many repetitive, similar brute-force or dictionary attacks. Speaking of dictionary attacks, let's compare livesshattack's unique password list to other popular password lists on the internet.
Ahhh, the famous rockyou.txt password list (can be obtained here). It was popularized as one of the fairly good password lists that came with BackTrack (now called Kali Linux). It is often used in tandem with John the Ripper, to break open password-protected zip files when paired with a short Python program, to crack WPA/WPA2-protected wifi with aircrack-ng, etc.
So let's compare! Using diff wasn't going to cut it for this experiment, so I wrote my own simple Python script (github link here). The program is a quick and naive way to compare two text files that take the form of word lists with one string per line. It works like the following:
- Pass two word lists to the program
- The words in each word list are read into Python's list data structure, one for each file
- Each list is turned into a set to remove duplicate elements (words), if any
- A new set is then created with the elements common to both sets from step 3
- The length of the resulting set, as well as the original length of both files, is then used to calculate the results
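The steps above can be sketched roughly as follows. This is a minimal illustration and not the script linked above: the function names `load_words` and `compare`, the latin-1 decoding, the skipping of blank lines, and the choice to report the percentage relative to the first list are all assumptions made for the example.

```python
import sys


def load_words(path):
    """Read a word list (one string per line) and return its unique words."""
    # Assumption: latin-1 decodes any byte, so unusual characters never raise an error.
    with open(path, encoding="latin-1") as handle:
        words = [line.rstrip("\r\n") for line in handle]  # list, one entry per line
    return {word for word in words if word}  # set drops duplicates (and blank lines)


def compare(path_a, path_b):
    """Return (unique_a, unique_b, common, percent of list A found in list B)."""
    words_a = load_words(path_a)
    words_b = load_words(path_b)
    common = words_a & words_b  # elements present in both sets
    percent = 100.0 * len(common) / len(words_a) if words_a else 0.0
    return len(words_a), len(words_b), len(common), percent


if __name__ == "__main__" and len(sys.argv) == 3:
    unique_a, unique_b, common, percent = compare(sys.argv[1], sys.argv[2])
    print(f"{common} of {unique_a} unique words in the first list "
          f"also appear in the second ({percent:.1f}%)")
```

Holding both files fully in memory keeps the code short, but it also means memory use grows with the size of the larger list.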
livesshattack's unique password dump vs. rockyou.txt
So roughly 42.5%: 79560 of the 187410 passwords in livesshattack's unique password list have an exact match in rockyou.txt. This says that, while livesshattack.net hasn't been hammered with the full rockyou.txt word list directly, there are of course similarities between rockyou.txt and the other word lists that are floating around.
500-worst-passwords.txt
This is another list provided by skullsecurity. It's quite small, and upon a brief inspection 500-worst-passwords.txt looks very similar to the top 1000 passwords captured by livesshattack.net (when counted by occurrence). Let's do a comparison.
livesshattack's unique password dump vs. 500-worst-passwords.txt
The ssh_passwd.txt list may or may not be a familiar one to most. I discovered it as a result of pruning my logs and finding quite strange password strings; I first spoke of it back in this blog post. Googling a few of the odd passwords led me to this word list. Judging by its name alone, it appears to be used in ssh attacks and to contain common passwords used for ssh. The word list can be found here.
Below is the comparison screenshot.
livesshattack's unique password dump vs. ssh_passwd.txt
25% and some change. I haven't been hit with this entire list yet either.
The Moby Project Lists
The Moby Project is a set of lexical documents containing words and phrases in a number of languages, compiled by Grady Ward. It is supposedly the largest combined word list available and is also part of the public domain. Read more about the project here or here. I will be using a number of the English documents: common words, names, phrases, and a list of hyphenated words.
Surprisingly, there were no matches at all found in any of the files in the following directories of the Moby project: /mhyph. I may later come back to this project in part 2 of this article.
Compiled by Google and based on the 10000 most popular English words used as search terms, this list might be an interesting candidate. The list can be found here.
livesshattack's unique password dump vs. the 10000 most popular English words list
This means the majority, ~97%, of livesshattack.net's password list doesn't consist of common everyday English words. Here are a few entries of what the list does contain, however. Take a look.
Control Sequences and Latin characters
Finding strings over 50 characters
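Entries like these can be pulled out of a word list with a couple of simple filters. The sketch below is only an illustration of one way to do it: the function name `find_oddities`, the 50-character threshold as a parameter, and the definition of "odd" as anything outside printable ASCII (code points 32 to 126) are assumptions, not necessarily how the results above were produced.

```python
def find_oddities(words, max_len=50):
    """Split out unusually long words and words with non-printable-ASCII characters."""
    # Words longer than max_len characters.
    long_words = [w for w in words if len(w) > max_len]
    # Words holding control characters (< 32, or DEL at 127) or non-ASCII characters (> 127).
    odd_chars = [w for w in words if any(ord(c) < 32 or ord(c) > 126 for c in w)]
    return long_words, odd_chars


sample = ["root", "a" * 51, "caf\xe9", "x\x1b[0m"]
long_words, odd_chars = find_oddities(sample)
print(len(long_words), "long entries,", len(odd_chars), "entries with odd characters")
```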
crackstation.txt and crackstation-human-only.txt
Crackstation is a service that uses large pre-computed lookup tables to crack password hashes. These tables store a mapping between the hash of a password and the correct password for that hash. This service only works for hashes with no salt involved, however. Crackstation has two very large word lists, one of which exceeds one billion passwords. Find out more about the Crackstation service here and download the archives of the lists here.
Let's delve into our comparison. We'll start with the smaller of the two lists that Crackstation provides, crackstation-human-only.txt, which contains around 64M passwords.
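To make the lookup-table idea concrete, here is a toy sketch. It is not how Crackstation is implemented: the use of SHA-256, the in-memory dict, the function names, and the salt value and prepend scheme are all assumptions chosen for illustration.

```python
import hashlib


def build_lookup_table(words):
    """Map the SHA-256 hex digest of each word back to the word itself."""
    return {hashlib.sha256(w.encode("utf-8")).hexdigest(): w for w in words}


def crack(table, hex_digest):
    """Return the password for a known digest, or None if it is not in the table."""
    return table.get(hex_digest)


table = build_lookup_table(["123456", "password", "letmein"])

# An unsalted hash of a word in the table is found directly.
unsalted = hashlib.sha256(b"password").hexdigest()

# Assumption: the salt is simply prepended. The digest changes, so the lookup misses.
salted = hashlib.sha256(b"s4lt" + b"password").hexdigest()

print(crack(table, unsalted), crack(table, salted))
```

The second lookup misses because the table was built from unsalted words only, which is why a pre-computed table like this is of no use against salted hashes.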
Running out of memory
So my laptop ran out of memory after storing a ~64M element list of strings and then attempting to run the set operation on that list (and then storing the result as well). This means that memory will also be an issue for the larger of the two lists. Let's scale up to a bigger machine.
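A possible alternative to a bigger machine, and not what was used for the results in this post, is to keep only the smaller list in memory and stream the larger file one line at a time. The sketch below assumes the same one-string-per-line format; the function name `count_matches_streaming` and the latin-1 decoding are assumptions.

```python
def count_matches_streaming(small_path, large_path):
    """Count unique words of the small list that also occur in the large list."""
    # Only the small list is held in memory as a set.
    with open(small_path, encoding="latin-1") as handle:
        small = {line.rstrip("\r\n") for line in handle}
    small.discard("")  # ignore blank lines

    found = set()
    # The large file is read lazily, one line at a time, and never stored whole.
    with open(large_path, encoding="latin-1") as handle:
        for line in handle:
            word = line.rstrip("\r\n")
            if word in small:
                found.add(word)
    return len(found), len(small)
```

With this approach memory use is bounded by the size of the smaller list rather than by the larger one, at the cost of no longer knowing the number of unique entries in the larger file.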
livesshattack's unique password dump vs. crackstation-human-only.txt
So I spun up a high-memory, 128GB instance at Digital Ocean (normally I wouldn't be using root like this, but this was a very short and quick run of a single script). It seemed to do the trick. Peak RAM usage was around 10.3GB before the program finished and the results were written to the display. Upon doing less crackstation-human-only.txt, crackstation-human-only.txt seems to be a combination of names and words in the dictionary, more or less riddled with random punctuation. The contents of this word list accounted for 53.4% of livesshattack.net's unique password dump.
The second list, crackstation.txt, contains over 1 billion words and is ~15GB in size uncompressed. It contains crackstation-human-only.txt as well as all of the words on Wikipedia, multiple leaked database breaches posted on the web, as well as some that were sold and obtained from the darkweb. It took 100GB+ of RAM to compute the results.
livesshattack's unique password dump vs. crackstation.txt
~68%. Pretty decent.
In summary, this research was done to see how similar the unique passwords livesshattack.net has captured over time are to other popular word lists on the internet. It was also used to see if livesshattack.net had undergone a full dictionary attack by one of the candidate lists. As it turns out, the smaller the list (especially if it contains notoriously bad passwords), the more likely it is that all passwords in that list have been attempted, and the larger the candidate list, the easier it is to cover the breadth of passwords that livesshattack.net has captured to date.
If you want to get started comparing password lists for similarities, take my script (here) and start doing a bit of research on the similarities between the various password lists out there on the internet. It's easy to set up, and with a little list inspection it's a good way to discover passwords you shouldn't be using.
This post is the first of a two-part series. I will release livesshattack's password list when the post is completed. Let me know if I am forgetting any major word lists or what lists you would like to see compared next.