Bruteforcing SSH Known_Hosts Files


OpenSSH is a common tool for most network and system administrators. It is used daily to open remote sessions on hosts to perform administrative tasks, but it is also used to automate tasks between trusted hosts. Based on public/private key pairs, hosts can exchange data or execute commands via a safe (encrypted) pipe. When you ssh to a remote server, your ssh client records the hostname, IP address and public key of the remote server in a flat file called “known_hosts”. The next time you start an ssh session, the ssh client compares the server information with the one saved in the “known_hosts” file. If they differ, an error message is displayed. The primary goal of this mechanism is to block MITM (“Man-In-The-Middle”) attacks.

But this file (stored by default in your “$HOME/.ssh” directory) introduces security risks. If an attacker has access to your home directory, he will have access to the file, which may contain hundreds of hosts to which you also have access. Have you ever heard of the “Island Hopping” attack? Wikipedia defines this attack as follows:

In computer security, for example in intrusion detection and penetration testing, island hopping is the act of entering a secured system through a weak link and then “hopping” around on the computer nodes within the internal systems. In this field, island hopping is also known as pivoting.

A potential worm could take advantage of the information stored in this file to spread across multiple hosts. OpenSSH has included a countermeasure against this attack since version 4.0: the ssh client is able to store the host information in a hashed format. The old format was:

  , ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA0ei6KvTUHnmCjdsEwpCCaOHZWvjS \
  jytm/5/Vv1Dc6ToaxTnqJ7ocBb7NI/HUQEc23eUYjFrZQDS0JRml3RnsG0UzvtIfAPDP1x7h6HHy4ixjAP7slXgqj3c \
  fOV5ThNjYI0mEbIh1ezGWovwoy0IxRK9Lq29CacqQH8407b1jEj/zfOzUi3FgRlsKZTsc3UIoWSY0KPSSPlcSTInviG \
  oNi+9gC8eqXHURsvOWyQMH5K5isvc/Wp1DiMxXSQ+uchBl6AoqSj6FTkRAQ9oAe8p1GekxuLh2PJ+dMDIuhGeZ60fIh \

Since version 4.0, hosts can be stored in this new format:

  |1|U8gOHG/S5rH9uRH3cXgdUNF13F4=|cNimv6148Swl6QcwqBOjgRnHnKs= ssh-rsa AAAAB3NzaC1yc2EAAAABIw \
  AAAQEAvAtd04lhxzzqW57464mhkubDixZpy+qxvXBVodNmbM8culkfYtmq0Ynd+1G1s3hcBSEa8XHhNdcxTx51MbIjO \
  dCbFyx6rbvTIU/5T2z0/TMjeQyL3SZttbYWM2U0agKp/86FdaQF6V87loNcDq/26JLBSaZgViZS4gKZbflZCdD6aB2s \
  2sqEV4k7zU2OMHPy7W6ghNQzEu+Ep/44w4RCdI5OYFfids9B0JSUefR9eiumjRwyI0dCPyq9jrQZy47AI7oiQJqSjvu \

As you can see, the hostname is not readable anymore. To achieve this result, a new configuration directive was added in version 4.0: “HashKnownHosts [Yes|No]”. Note that this feature is not enabled by default in OpenSSH itself, although some Linux distributions (or other UNIX flavors) enable it in their default configuration. Check your configuration. If you switch the hashing feature on, do not forget to hash your existing known_hosts file:

  $ ssh-keygen -H -f $HOME/.ssh/known_hosts

Hashing the hostnames is definitely the right way to go, but it introduces problems. First, the good guys cannot easily manage their SSH hosts! How do you perform a cleanup? (My “known_hosts” file has 239 entries!) In case of security incident management or forensic investigations, it can be useful to know the list of hosts to which the user connected. It’s also an issue for pentesters. If you have access to a file containing hashed SSH hosts, it can be interesting to discover the hostnames or IP addresses and use the server to “jump” to another target. Remember: people are weak and re-use the same passwords on multiple servers.

By looking into the OpenSSH client source code (more precisely in “hostfile.c”), I found how the hostnames are hashed. Take the first field of the hashed entry shown above as an example:


“|1|” is the HASH_MAGIC. The first part between the “|” separators is the salt, encoded in Base64. When a new host is added, the salt is generated randomly. The second part is the HMAC (“Hash-based Message Authentication Code”) of the hostname, computed with SHA1 using the decoded salt as the key, and then encoded in Base64. Once the hashing is performed, it’s not possible to decode it. Like UNIX passwords, the only way to recover a hostname is to apply the same hash function to candidate names and compare the results.
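To make the format concrete, here is a minimal Python sketch of the hashing and comparison steps described above. It is not the Perl script discussed in this post; the function names `hash_host` and `matches` and the host name `srv001.example.org` are made up for the illustration.

```python
import base64
import hashlib
import hmac

HASH_MAGIC = "|1|"  # prefix that marks a hashed host field


def hash_host(hostname: str, salt: bytes) -> str:
    """Build a hashed host field: |1|base64(salt)|base64(HMAC-SHA1(salt, hostname))."""
    digest = hmac.new(salt, hostname.encode("ascii"), hashlib.sha1).digest()
    return (HASH_MAGIC
            + base64.b64encode(salt).decode("ascii")
            + "|"
            + base64.b64encode(digest).decode("ascii"))


def matches(hashed_field: str, candidate: str) -> bool:
    """Return True if `candidate` hashes to the value stored in `hashed_field`."""
    if not hashed_field.startswith(HASH_MAGIC):
        return False  # plain-text entry, nothing to bruteforce
    salt_b64, _, digest_b64 = hashed_field[len(HASH_MAGIC):].partition("|")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hmac.new(salt, candidate.encode("ascii"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


# Usage, with the salt taken from the hashed entry shown earlier in this post
# and a hypothetical host name:
salt = base64.b64decode("U8gOHG/S5rH9uRH3cXgdUNF13F4=")
field = hash_host("srv001.example.org", salt)
found = matches(field, "srv001.example.org")
```

Because the salt is the HMAC key and differs for every entry, a candidate has to be hashed again for each line of the file.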

I wrote a Perl script to bruteforce the “known_hosts” file. It generates hostnames or IP addresses, hashes them and compares the results with the information stored in the SSH file. The script syntax is:

  $ ./ -h
  Usage: [options]
   -d <domain>   Specify a domain name to append to hostnames (default: none)
   -f <file>     Specify the known_hosts file to bruteforce (default: /.ssh/known_hosts)
   -i            Bruteforce IP addresses (default: hostnames)
   -l <integer>  Specify the hostname maximum length (default: 8 )
   -s <string>   Specify an initial IP address or password (default: none)
   -v            Verbose output
   -h            Print this help, then exit

Without arguments, the script will bruteforce your $HOME/.ssh/known_hosts by generating hostnames with a maximum length of 8 characters. If a match is found, the hostname is displayed with the corresponding line in the file. If your hosts are FQDNs, a domain can be specified using the “-d” flag. It will be automatically appended to all generated hostnames. With the “-i” flag, the script generates IP addresses instead of hostnames. To spread the load across multiple computers, or if you know the first letters of the hostnames or the first bytes of the IP addresses, you can specify an initial value with the “-s” flag.
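As an illustration of the hostname mode, the following Python sketch enumerates candidates in "odometer" order from an optional initial value, up to a maximum length, with an optional domain appended. The alphabet, the function names and the domain `example.org` are assumptions made for this example; the Perl script may use a different character set and ordering.

```python
import itertools
import string

# Assumed character set: lowercase letters, digits and the hyphen.
ALPHABET = string.ascii_lowercase + string.digits + "-"


def successor(name: str, alphabet: str = ALPHABET) -> str:
    """Return the next name in odometer order; grow by one character on overflow."""
    chars = list(name)
    i = len(chars) - 1
    while i >= 0:
        pos = alphabet.index(chars[i])  # raises ValueError for unknown characters
        if pos + 1 < len(alphabet):
            chars[i] = alphabet[pos + 1]
            return "".join(chars)
        chars[i] = alphabet[0]  # wrap this position and carry to the left
        i -= 1
    return alphabet[0] * (len(name) + 1)


def candidates(start=None, max_len=8, domain=None):
    """Yield candidate host names from `start` up to `max_len` characters."""
    name = start if start else ALPHABET[0]
    while len(name) <= max_len:
        yield f"{name}.{domain}" if domain else name
        name = successor(name)


# Usage: the first three candidates from "srv008" in a hypothetical domain.
first_three = list(itertools.islice(candidates("srv008", 6, "example.org"), 3))
```

Splitting the work across several machines then amounts to giving each one a different initial value.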

Examples: If your server names are based on the template “srvxxx” and belong to the domain, use the following syntax:

  $ ./ -d -s srv000

If your DMZ uses IP addresses in the range, use the following syntax:

  $ ./ -i -s

When hosts are found, they are displayed as below:

  $ ./ -i -s
  *** Found host: (line 31) ***
  *** Found host: (line 165) ***
  *** Found host: (line 69) ***
  *** Found host: (line 28) ***
  *** Found host: (line 56) ***
  *** Found host: (line 51) ***
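For the IP address mode, candidates can be produced by counting upward from the initial address. The sketch below uses Python's standard `ipaddress` module; the function name `ip_candidates`, its `limit` parameter and the start address `192.0.2.254` (taken from a documentation range) are assumptions for the example and do not come from the script.

```python
import ipaddress


def ip_candidates(start="0.0.0.0", limit=None):
    """Yield IPv4 addresses as strings, counting upward from `start`.

    `limit` optionally caps the number of addresses generated.
    """
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address("255.255.255.255"))
    stop = last if limit is None else min(last, first + limit - 1)
    for value in range(first, stop + 1):
        yield str(ipaddress.IPv4Address(value))


# Usage: three addresses starting at a hypothetical initial value.
sample = list(ip_candidates("192.0.2.254", limit=3))
```

Each generated string is then hashed and compared with the stored entries in the same way as a hostname.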

My first idea was to bruteforce using a dictionary. Unfortunately, hostnames are sometimes based on templates like “svr000” or “dmzsrv-000”, which makes a dictionary unreliable. What about performance? I’m not a developer and my code could certainly be optimized. The running time is directly related to the size of your “known_hosts” file, since every generated candidate must be hashed with the salt of each entry. Be patient! The script is available here. Comments are always welcome.
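To show why the file size matters, and how some of the cost can be limited, here is a Python sketch that decodes each salt and digest only once when the file is read and then compares raw digests for every candidate, along the lines suggested in the first comment below. The names `load_hashed_entries` and `find_matches` are hypothetical; this is not the script's actual code.

```python
import base64
import hashlib
import hmac


def load_hashed_entries(lines):
    """Return (line_number, salt, digest) tuples, decoding Base64 once per entry."""
    entries = []
    for number, line in enumerate(lines, start=1):
        field = line.split(" ", 1)[0]
        if not field.startswith("|1|"):
            continue  # skip plain-text or empty lines
        salt_b64, _, digest_b64 = field[3:].partition("|")
        entries.append((number,
                        base64.b64decode(salt_b64),
                        base64.b64decode(digest_b64)))
    return entries


def find_matches(candidate, entries):
    """Return the real line numbers whose stored digest matches `candidate`."""
    data = candidate.encode("ascii")
    return [number for number, salt, digest in entries
            if hmac.new(salt, data, hashlib.sha1).digest() == digest]
```

The work is still one HMAC per candidate and per entry, so a file with 239 entries costs 239 HMAC computations for each generated name; only the repeated Base64 handling is saved.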

Usual disclaimer: this code is provided “as is” without any warranty or support. It is provided for educational or personal use only. I’ll not be held responsible for any illegal activity performed with this code.


  1. I was testing your program on a known_hosts file with only one line where I knew the correct hostname beforehand. Unfortunately your program couldn’t find anything, so I decided to write my own bruteforcer. After getting that to work, I decided to figure out why your program didn’t work…

    In searchHash() you return 0, if the $host doesn’t match any of the lines and the line number(*) if it does. Unfortunately the first line is line number 0 and the code therefore won’t show any matches for the first (and maybe only) entry in known_hosts, no matter how long you’ll let it run.

    Quick fix: Change searchHash() to return $i+1; on success and remove the “+ 1” part from the various printf("*** Found host: %s (line %d) ***\n", $tmpHostShort, $line + 1); lines.

    (*) It’s not really the line number, since you only increment $idx if ($hostHash =~ m/\|1\|/). Maybe store the real line numbers too and use those for the output? ($. might be useful.)

    BTW: fillString() does the same as “$char x $len” and getPos() does the same as “index($alphabet,$char)”. It’s probably faster to use Perl’s own (optimized) functions than your own (non-optimized) functions.

    Another optimization step would be to move the decode_base64($saltStr[$i]) step from searchHash() (i.e. ALL THE TIME) to the loop where you read the file (i.e. ONCE per salt).

    I also think it makes sense to use decode_base64() on the digest in the read loop (i.e. once) and later compare it with $hmac->digest (instead of b64digest). No need to spend time on base64 encoding all the time, if we don’t really need it.

  2. I tried the script just now on ubuntu 13.10. Tried first without arguments and then using ..
    $ ./ -f /home/me/.ssh/known_hosts
    CPU usage went to 100% for one processor each time. I killed the process after about 8 seconds.
    Is that result to be expected??

  3. Very nice idea. I was stunned the first time I saw encrypted hosts in known_hosts. Though it makes sense to encrypt the hosts, still the unreadability has always annoyed me.

  4. Yeah, that’s right – works now. Missed this point, thx. 🙂

    How about autodetection in a way:
    if -s is specified, argument is taken
    if argument matches to /^\d+\.\d+\.\d+\.\d+$/ then is treated as IP by default
    if argument matches to /^\d+\.\d+\.\d+\.\d+\/\d+$/ then is treated as IP/mask (for range)
    Then you could use -i as “force treating as IP” and so on (-r for range?).

    BTW are you planning to use some svn/git? I’d put some changes into the code directly, but don’t want to make mess.

  5. Hi rozie,
    If you are looking for IP addresses, use the “-i” flag (combined with “-s” to start at a specific IP address). Example:

    $ ./ -i -s

    Thanks for the suggestion to scan a range! It has been added to my todolist.

  6. There’s probably a bug – I can recognize some hosts (if I provide their names with -s), but for some it just doesn’t work:

    $ perl -s -v
    Reading hashes from /home/rozie/.ssh/known_hosts …
    Done. is my IP address and I can log into this machine with ssh.

    Another thing: it would be nice to add support for range of IP’s:
    use Net::Netmask;
    my $prefix = "";
    my $block = Net::Netmask->new($prefix);
    my @ips_to_check = $block->enumerate();

  7. Hi Robin,
    I understand what you mean. Indeed, it could be more efficient. I’ll update the tool asap. Thanks for your feedback!

  8. I have a set of DHCP addresses named noname0 to noname9, from your example I figured that the option -s noname0 would tell it just to modify the last character but it doesn’t. I tested your example with srv000 and that starts at srv something then moves on to srx.

    I noticed the same thing with IP addresses, if I know the subnet is then I want to be able to test just that range, at the moment specifying

    ./ -v -i

    starts checking at

    What would be really good would be to be able to specify the character(s) to replace with a symbol, so for the hostname I could say noname? or srv??? and for the ip range 192.168.0.?

    Great tool, I’ll definitely be adding it to my collection.
