Update: New 25 GPU Monster Devours Passwords In Seconds

Posted by: Paul   December 4, 2012 19:12   79 comments

Editor’s note: I’ve updated the article with some new (and in some cases clarifying) detail from Jeremi. I’ve left changes in where they were made. The biggest changes: 1) an updated link to slides, 2) clarifying that VCL refers to Virtual OpenCL, and 3) that the quote regarding 14-character passwords falling in 6 minutes was for LM-hashed, not NTLM-hashed, passwords. Long (8-character) NTLM passwords would take much longer: around 5.5 hours. 😉 – Paul

There needs to be some kind of Moore’s law analog to capture the tremendous advances in the speed of password cracking operations. Just within the last five years, there’s been an explosion in innovation in this ancient art, as researchers have realized that they can harness specialized silicon and cloud based computing pools to quickly and efficiently break passwords.

Password Cracking HPC

Gosney’s set-up uses a pool of 25 virtual AMD GPUs to brute force even very strong passwords.

A presentation at the Passwords^12 Conference in Oslo, Norway (slides available here – PDF), has moved the goalposts again. Speaking on Monday, researcher Jeremi Gosney (a.k.a. epixoip) demonstrated a rig that leveraged the Open Computing Language (OpenCL) framework and a technology known as Virtual OpenCL (VCL) to run the Hashcat password cracking program across a cluster of five 4U servers equipped with 25 AMD Radeon GPUs and communicating at 10 Gbps and 20 Gbps over InfiniBand switched fabric.

Gosney’s system elevates password cracking to the next level, and effectively renders even the strongest passwords protected with weaker encryption algorithms, like Microsoft’s LM and NTLM, obsolete.

In a test, the researcher’s system was able to churn through 348 billion NTLM password hashes per second. That renders even the most secure password vulnerable to compute-intensive brute force and wordlist (or dictionary) attacks. A 14 character Windows XP password hashed using LM (not NTLM), for example, would fall in just six minutes, said Per Thorsheim, organizer of the Passwords^12 Conference.

[Note of clarification from Jeremi: “LM is what is used on Win XP, and LM converts all lowercase chars to uppercase, is at most 14 chars long, and splits the password into two 7 char strings before hashing – so we only have to crack 69^7 combinations at most for LM. At 20 G/s we can get through that in about 6 minutes. With 348 billion NTLM per second, this means we could rip through any 8 character password (95^8 combinations) in 5.5 hours.”]
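The arithmetic behind Jeremi’s note can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the rates and character-set sizes are the ones quoted above, the helper name `seconds_to_exhaust` is made up for illustration, and it counts candidates of exactly the stated length (worst case, no shorter passwords).

```python
# Rates and character-set sizes are taken from the clarification above.
LM_RATE = 20e9      # LM hashes per second
NTLM_RATE = 348e9   # NTLM hashes per second


def seconds_to_exhaust(charset_size: int, length: int, rate: float) -> float:
    """Worst-case time to try every candidate of exactly `length` characters."""
    return charset_size ** length / rate


# LM: each 7-character half is drawn from 69 characters.
lm_minutes = seconds_to_exhaust(69, 7, LM_RATE) / 60
# NTLM: 8 characters drawn from the 95 printable ASCII characters.
ntlm_hours = seconds_to_exhaust(95, 8, NTLM_RATE) / 3600

print(f"LM, 69^7 candidates:   about {lm_minutes:.1f} minutes")
print(f"NTLM, 95^8 candidates: about {ntlm_hours:.1f} hours")
```

Both results land close to the figures quoted in the note; the small differences come down to rounding and to whether shorter candidates are counted as well.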

“Passwords on Windows XP? Not good enough anymore,” Thorsheim said.

Tools like Gosney’s GPU cluster aren’t suited for an “online” attack scenario against a live system. Rather, they’re used in “offline” attacks against collections of leaked or stolen passwords that were stored in encrypted form, Thorsheim said. In that situation, attackers aren’t limited to a set number of password attempts – hardware and software limitations are all that matter.

The clustered GPUs clocked impressive speeds against sturdier hashing algorithms as well, including MD5 (180 billion attempts per second) and SHA1 (63 billion/second), and managed 20 billion/second for passwords hashed using the LM algorithm. So-called “slow hash” algorithms fared better. bcrypt (05) and sha512crypt permitted only 71,000 and 364,000 attempts per second, respectively.
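To get a feel for the gap between fast and slow hashes, the sketch below applies the published rates to one fixed search space. The search space (every string of 8 lowercase letters) is an arbitrary assumption chosen for illustration, and `human` is a hypothetical formatting helper, not part of any tool mentioned here.

```python
# Attempts per second, as reported in the article for the 25-GPU cluster.
RATES = {
    "NTLM": 348e9,
    "MD5": 180e9,
    "SHA1": 63e9,
    "LM": 20e9,
    "sha512crypt": 364e3,
    "bcrypt (05)": 71e3,
}

# Assumption for illustration only: all strings of exactly 8 lowercase letters.
KEYSPACE = 26 ** 8


def human(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.2f} seconds"


for name, rate in RATES.items():
    print(f"{name:12s} {human(KEYSPACE / rate)}")
```

The same keyspace that a fast hash gives up in about a second takes days against the slow hashes, which is the point of the “fared better” remark.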

Benchmarks - Fast Hash Cracking

Published benchmarks against common hashing algorithms using the 25 GPU HPC cluster

In an IRC chat with Security Ledger, Gosney said he has been working on CPU clustering for about five years and GPU clustering for the last four years.

“Then we just started trying to build the biggest GPU rigs we could, packing as many GPUs into a single server as possible so that we wouldn’t have to deal with clustering or distributing load,” Gosney wrote.

He has been developing the new platform since stumbling on VCL in April, after trying his hand at pooling traditional CPUs for password cracking.

“I was extremely disappointed that setting up a clustered VMware instance wouldn’t allow me to create a VM that spanned all the hosts in the cluster. E.g. if i had five VMware ESX hosts with 8 processor cores, I wanted to be able to create a single vm with 40 cores and use all nodes in the cluster,” he wrote.

Then he came across VCL, or Virtual OpenCL, a small and heretofore little-recognized project from the scientists who manage the MOSIX distributed operating system, first released in the 1970s.

“It did just what I wanted, not with an entire OS per se, but with an entire OpenCL application. and that’s good enough for me.”

After playing around with VCL for a while, Gosney approached Prof. Amnon Barak, one of MOSIX’s creators. Gosney was interested in adding features to VCL that would allow it to run the Hashcat password cracking tool.

“Once we convinced Amnon that we did not aspire to turn the world into one giant botnet, he was very cooperative in working with (us) to resolve issues with VCL that was preventing it from working 100% with hashcat,” he said.

VCL makes load balancing across the cluster – once an arduous task that required months of custom scripting – a trivial matter. As a result, Gosney said that his team is at a point where their implementation of Hashcat on VCL could be scaled up far above the 25-GPU rig he has created – supporting “at least 128 AMD GPUs.”

“I always had these dreams of doing very simple and very manageable grid/cloud computing,” Gosney wrote. “It really is the marriage of two absolutely fantastic programs, which allows us to do unprecedented things.”

Gosney is no stranger to password cracking. After 6.4 million LinkedIn password hashes were leaked online, Gosney was one of the first researchers to crack them and analyze the findings. He and a partner were ultimately able to crack between 90% and 95% of the password values.

Gosney’s GPU cluster is just the latest leap forward in password cracking in a year that has already seen prominent encryption algorithms deemed compromised by an onslaught of cheap compute power. In June, Poul-Henning Kamp, creator of the md5crypt() function used by FreeBSD and Linux-based operating systems, was forced to acknowledge that the hashing function is no longer suitable for production use – a victim of GPU powered systems that could perform “close to 1 million checks per second on COTS (commercial off the shelf) GPU hardware,” he wrote. Gosney’s cluster cranked out more than 70 times that number – 77 million brute force attempts per second against md5crypt.

Recent years have also seen the launch of services like Moxie Marlinspike’s WPACracker and then CloudCracker, a cloud-based platform for penetration testers that can do lookups of password hashes and other encrypted content against a dictionary of hundreds of millions – or even billions – of potential matches, all for under $200. And if that price is too rich, a team of U.S.-based researchers has shown how you can do the same thing – on the cheap – by leveraging Google’s MapReduce and cloud-based browsers. In 2011, researcher Thomas Roth developed the Cloud Cracking Suite (CCS), a tool that leveraged eight Amazon EC2-based Nvidia GPU instances to crack the SHA1 hashing algorithm and dispense with tens of thousands of passwords per second.

Gosney said he plans to “make a bit of money” off his invention, either by renting out time on it or by offering it as a paid password recovery and domain auditing service. “I have way too much invested in this to not get some kind of return out of it,” he wrote.


79 Comments

  • I think you are missing two “not”s in this paragraph:

    “Tools like Gosney’s GPU cluster are suited for an “online” attack scenario against a live system. Rather, they’re used in “offline” attacks against collections of leaked or stolen passwords that were stored in encrypted form, Thorsheim said. In that situation, attackers are limited to a set number of password attempts – hardware and software limitations are all that matter.”

    • yes. thanks. made that change.

      • dude, this has to be the best article with the absolute worst editing i have ever seen in my life. i was thinking about it pretty hard and i swear I can’t think of a single article that i’ve read with not only so many errors but also an obvious complete lack of understanding of the subject matter.

        just thought id mention it. :)

  • And you never closed the parenthesis in this sentence:

    “The clustered GPUs clocked impressive speeds against more sturdy hashing algorithms as well, including MD5 (180 billion attempts per second, 63 billion/second for SHA1 and 20 billion/second for passwords hashed using the LM algorithm.”

  • “A 14 character Windows XP password hashed using NTLM (NT Lan Manager), for example, would fall in just six minutes, said Per Thorsheim.”

    This is incorrect. This speed is possible only when the password is hashed with LM (not NTLM), and when using the 62 alphanumeric characters only (no special characters). Given these circumstances, the two 7-character LM halves can be bruteforced in 2*(62^7)/20e9/60 = 5.9 minutes.

    • Yes. LM not NTLM. I corrected that and added a clarifying comment from Jeremi. Thanks for pointing this out.

    • Mrb, both 7-character halves are cracked in one pass, you don’t have to multiply by 2. The calculation that is stated in the article of ~6 minutes is based on using a space of 69 characters. If you use a space of 62 chars, the cracking will be around 3 minutes.

  • Heh, yeah, there are some inaccuracies in this article. I’ve already made Paul aware, hopefully he can get them corrected. The slides from the presentation tell the real story though, and the video of the talk will be up shortly from what Thorsheim tells me.

    mrb, you’re correct that the 6-minute time is for LM, not NTLM; however, LM is 69^7, not 62^7, and that is including alpha, digit, and special chars — essentially (95 – 26)^7
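    The two replies above rest on how LM prepares a password before hashing it. The sketch below shows only that preprocessing step (uppercase, pad or truncate to 14 bytes, split into two 7-byte halves); it does not compute an actual LM hash, and the function name `lm_halves` and the ASCII-only encoding are simplifying assumptions for illustration.

```python
from typing import Tuple


def lm_halves(password: str) -> Tuple[bytes, bytes]:
    """Return the two 7-byte chunks that LM hashes independently.

    Preprocessing only: the DES step that turns each chunk into half of
    the LM hash is deliberately left out of this sketch.
    """
    raw = password.upper().encode("ascii", errors="replace")
    raw = raw[:14].ljust(14, b"\0")  # truncate or null-pad to 14 bytes
    return raw[:7], raw[7:]


print(lm_halves("Password123!xy"))
print(lm_halves("abc"))
```

    Because lowercase letters never survive the first step, each half is drawn from 95 - 26 = 69 printable characters. And because the halves are hashed separately, a single sweep over the 69^7 candidates can be compared against both halves at once, which is why the time is not doubled.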

  • actually i just realized the link to the slides in this article is incorrect as well; the correct link is http://heim.ifi.uio.no/hennikl/passwords12/Jeremi_Gosney_Password_Cracking_HPC.pdf

    • Object not found!

      The requested URL was not found on this server. The link on the referring page seems to be wrong or outdated. Please inform the author of that page about the error.

      If you think this is a server error, please contact the webmaster.

      Error 404

      heim.ifi.uio.no
      Thu 06 Dec 2012 03:22:32 PM CET
      Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8j DAV/2 mod_fastcgi/2.4.6 mod_wsgi/3.3 Python/2.5.2 PHP/5.4.7 mod_perl/2.0.4 Perl/v5.8.8

  • If you are coming here from Slashdot or Hacker News, please just read the slides, and all will be made much clearer. It’s rare that reporters ever get things exactly right. The link to the slides above is incorrect; the actual slides can be found at:

    heim.ifi.uio.no / hennikl / passwords12 / Jeremi_Gosney_Password_Cracking_HPC.pdf

    (modified URL to bypass moderation filter)

    • Will update the link.

      • Link is still broken, missing the “h” in http.

      • So much calculating wisdom around here... Well, in the end it’s just a matter of a few minutes more or less, isn’t it?
        And could someone tell me whether this isn’t really a question of ethics rather than of what can or can’t be done?
        Let me explain: robbing a little old lady is easy (but we don’t do it because it’s cowardly), and to me everything works the same way.
        What I mean is that whether something is legitimate depends on the purpose it is done for. Perhaps a bit Machiavellian, but let me explain. Snatching an old lady’s handbag because you know it holds a bomb that will kill a lot of people can be a good thing. Likewise, I think demanding transparency and information is legitimate, and if hacking helps us uncover atrocities committed by someone, it takes away their impunity. But cracking private individuals’ passwords just because... is like the old lady again, only without the bomb. I hope I’m making sense.

  • Why not store the results of the hashing in a large service, and offer quick lookups for cracking passwords? Plus also compute and store new hashes in parallel? :)

    • UnfeignedShip

      Generating hashes is one thing. Searching them is another. Large dataset searches, no matter how you do them, indexed or not, are computationally intensive tasks. The difficulty arises in matching. Every iteration of a check has to be processed.
      Say I’ve got a GUID and I want to check it against a list of ten billion (a very tiny number in that key space, as it has 2^122 possible unique values). That’s an effort expressed in polynomial time. (Look that up on Wikipedia.)

      • Nabil Stendardo

        @UnfeignedShip If your 10 billion potential matches are in a sorted list, the search problem is expressed in logarithmic (and not polynomial) time. Just use a simple algorithm called binary search. Of course the sorting is done in O(N*Log(N)) but you only have to do that once.
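        The sorted-list lookup described in this reply can be sketched with Python’s standard `bisect` module. The helper names `build_index` and `lookup` are invented for the example, MD5 stands in for whatever hash is being looked up, and a three-word list stands in for the billions of entries under discussion.

```python
import bisect
import hashlib


def build_index(candidates):
    """Hash every candidate once and sort by digest: O(N log N), done once."""
    return sorted((hashlib.md5(c.encode()).hexdigest(), c) for c in candidates)


def lookup(index, digest):
    """Binary-search the sorted index: O(log N) comparisons per query."""
    i = bisect.bisect_left(index, (digest, ""))
    if i < len(index) and index[i][0] == digest:
        return index[i][1]
    return None


index = build_index(["letmein", "password", "hunter2"])
target = hashlib.md5(b"hunter2").hexdigest()
print(lookup(index, target))
```

        For ten billion sorted entries a lookup needs roughly 34 comparisons, since log2(10^10) is about 33.2. The expensive parts are generating and storing the table, not searching it.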

  • Instead of VMware, he should have tried ScaleMP. VMware partitions resources while ScaleMP aggregates.

    • So, this portion of the article is a bit out of context, and was just given as historical background information on how we evolved to this point.

      What I was referring to in that quote were my aspirations for virtualization solutions that aggregate resources and permit generic, unmodified applications to utilize all resources in that pool transparently.

      The VMware quote was referring to my first experience with VMware ESX back in 2007, when I was working for Intel at the time. I was disappointed to discover that when setting up a clustered instance, the hypervisors themselves could not be used to form an HPC cluster, only an HA cluster. Mind you this was during a time period when most people had not even heard of ESX, let alone were fully aware of what it was and was not capable of. I had the same hopes for Eucalyptus a year later, which were similarly dashed.

      Thus, we had to settle for MPI at the time. But MPI is not a generic solution, your applications have to be MPI-aware. We had much bigger aspirations, but bit the bullet until technology caught up to our dreams.

      And so with that said, ScaleMP did not exist at the time any of this was occurring. Well, they existed as a company, but had yet to release any products. By the time ScaleMP had begun releasing products, we had already moved on to GPUs.

      Then it became a matter of building denser GPU systems, or writing custom scripts to distribute the load across smaller systems (simple enough for brute force, not so simple when dealing with wordlists, rules, hybrid attacks, etc.)

      And finally we arrive at Virtual OpenCL, which does everything I could have ever hoped it could do.

      • Have looked everywhere for a webpage or email address where it is possible to contact Jeremi Gosney about time and price for password recovery of files. The article says he plans to “make a bit of money” off his invention, either by renting out time on it or by offering it as a paid password recovery and domain auditing service. Does anyone know if this is possible?

        • my contact details are available in the slidedeck, and i can be reached via email as well, epixoip at bindshell.nl

          • Sorry epixoip, I have tried to mail you through epixoip at bindshell.nl but get a permanent error, and I can’t find anything in the slidedeck :-(

  • I recall a lecture on this issue claiming that simply requiring 3 separate simultaneous passwords per username would push the computing power for brute force attacks beyond anything remotely possible.

    • Brute force is just one method of offline password recovery that we employ. In fact, for most algorithms, brute force is a last resort. We have dozens of tricks up our sleeves that allow us to yield a very high percentage of user-selected passwords in a very short amount of time.

  • You guys ever hear of Bitcoin mining?

    • yes, of course. the majority of our used gpus (not just those featured in this cluster) have been purchased from miners who have stopped mining. and we actually do mine for bitcoin in our spare gpu cycles.

      • That’s cool. My first thought when I saw this was ‘what a waste of a bitcoin miner…’

        • it’s unlikely that gpu mining will continue to be profitable, with both the reward being halved and asics hitting the market. most gpu miners started selling off their hardware a few months ago.

    • Lol, my thoughts exactly. Could you just do an hour to this address:
      16NsKf9kde8AFHV9viU1MX2ERkq2eSedTK

      :-)

  • I appreciate the amount of computing power this system has, but IMHO this article is sort of misleading people. I wrote an article titled “The Brute Force Misconception” earlier this year, and it explains (with some calculations) why brute forcing passwords shouldn’t be given the hype that it gets sometimes.

    http://security.nathanbowman.us/2012/04/the-brute-force-misconception.html

    • While the article itself may focus on brute force, we sure do not. For most algorithms, brute force is a last-ditch effort for us. Keep in mind there are far more methods of offline password recovery than just brute force.

      • You mean rainbow tables? Those take just as long to generate…

        • … No, I absolutely do not mean rainbow tables. There are more attacks than just brute force and rainbow tables, you know. Wordlist, wordlist + rules, hybrid wordlist + mask, combinator, combinator + rules, Markov chains…

          • I see, I sort of regard those methods as the “first thing I do that rarely works” and therefore don’t refer to them often. I don’t see much success in recovering modern complex passwords using those methods.

            • Then you’re doing it wrong :) With large lists you should be able to recover the first 50-60% using nothing but wordlists + rules. Brute force helps you identify passwords that you don’t have in your wordlists and cannot find through other means, so that you can then add those words to your wordlists, create rules, etc. And even then we rarely do an exhaustive brute force, as Markov attacks are much better.
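            A “wordlist + rules” attack of the kind mentioned in this thread can be sketched in a few lines. This is a toy illustration only: the rules in `apply_rules` are made up for the example and are not Hashcat’s rule syntax or engine, MD5 is used purely because it is a fast hash, and the two-word list is a placeholder.

```python
import hashlib


def apply_rules(word):
    """Yield a handful of hypothetical mangling rules for one base word."""
    yield word
    yield word.capitalize()
    yield word.upper()
    for suffix in ("1", "123", "!", "2012"):
        yield word + suffix
        yield word.capitalize() + suffix


def crack(target_hashes, wordlist):
    """Hash every mangled candidate and report the ones that match a target."""
    remaining = set(target_hashes)
    found = {}
    for word in wordlist:
        for candidate in apply_rules(word):
            digest = hashlib.md5(candidate.encode()).hexdigest()
            if digest in remaining:
                found[digest] = candidate
                remaining.discard(digest)
    return found


targets = {
    hashlib.md5(b"Summer2012").hexdigest(),
    hashlib.md5(b"dragon!").hexdigest(),
}
print(sorted(crack(targets, ["summer", "dragon"]).values()))
```

            Neither target appears in the wordlist verbatim, yet both fall after a few dozen guesses, which is why these attacks are tried long before an exhaustive brute force.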

    • I see that you mention cryptohaze.com in your blog post there — what you may not know is that Bitweasil, creator of Cryptohaze, is part owner of this cluster.

  • How well does it do with passwords hashed using blowfish? I’ll assume not very well..

    • With a key size of up to 448 bits, blowfish would probably take a while. I think the general consensus of the blowfish algorithm is that it has endured years of cryptanalysis, and is therefore secure.

      • I assume you mean bcrypt, not Blowfish. It is mentioned in the article: “The bcrypt (05) and sha512crypt permitted 71,000 and 364,000 per second, respectively.” bcrypt is memory hard and not GPU-friendly, and therefore cannot be accelerated much.

        Keep in mind that the key size and its cryptographic strength really have no bearing on whether or not it is a good password storage algorithm.

        • No, I meant 448 bits

          https://en.wikipedia.org/wiki/Blowfish_(cipher)#The_algorithm

          Also, you are right, the key size doesn’t need to go larger than that.

          “…given the slow initialization of the cipher with each change of key, it is granted a natural protection against brute-force attacks, which doesn’t really justify key sizes longer than 448 bits.”

          • What you are referring to is irrelevant for password storage. In fact, you are reading the wrong article entirely. You should be reading http://en.wikipedia.org/wiki/Bcrypt

            • I didn’t mean it in terms of password storage. I was referring to Cody’s question that asked about reversing blowfish. Blowfish’s resistance to known plaintext attacks makes it cost prohibitive (in relation to time due to its slow initialization of the cipher with each change of key) to brute force. Bcrypt is good to read up on also, being that it creates an irreversible hash with a blowfish encrypted salted password.

              • The question was, “How well does it do with passwords hashed using blowfish?” and not about reversing blowfish encryption.

                • I took “how well does it do” to mean how well does your setup do with cracking passwords encrypted with the blowfish cipher. My mistake, I didn’t know there was a distinction.

  • Couldn’t these attacks be even more massively parallelized by distributing the task better?

    What I’m thinking is if you had 19 of these 25 GPU clusters, and each one handles ONLY guesses starting with 5 assigned characters…ie: one cluster starts all guesses with only Q, q, W, w, E; the next does e, R, r, T, t, etc.

    You wouldn’t need any internetwork communication at all with this method I’d think…a machine simply brute forces through all of its assigned work, if it gets the password it moves on, and if it marches through every possibility without result, it moves on too.

    Then just set up a light network where machines can flag passwords that they’ve already correctly guessed, that way other machines aren’t wasting time running through their permutations and can move on to the next results.
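    The prefix-partitioning scheme proposed in this comment might look like the sketch below. It covers brute force only, the function names are hypothetical, and the 95-character set is assumed to be printable ASCII; nothing here reflects how VCL or Hashcat actually distribute work.

```python
import itertools
import string

# Assumption: the 95 printable ASCII characters, including the space.
CHARSET = string.ascii_letters + string.digits + string.punctuation + " "


def assign_prefixes(charset, nodes):
    """Deal first characters round-robin so each node owns a disjoint share."""
    return [charset[i::nodes] for i in range(nodes)]


def candidates_for(prefixes, charset, length):
    """Generate every candidate of `length` whose first character is in `prefixes`."""
    for first in prefixes:
        for rest in itertools.product(charset, repeat=length - 1):
            yield first + "".join(rest)


shares = assign_prefixes(CHARSET, 19)
print(len(shares), "nodes,", len(shares[0]), "first characters each")
print(list(candidates_for("ab", "abc", 2)))
```

    With 19 nodes each share is exactly five first characters, matching the commenter’s numbers, and no node needs to talk to another until a password is found.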

    • You’re thinking only in terms of brute force, and we do much more than just brute force. The slidedeck talks about why we chose Infiniband, and why it is a good fit for this project.

      You’re also missing the bigger picture on VCL — it is a generic OpenCL virtualization platform, capable of running any unmodified* OpenCL application. And that is the real story here. The fact that I’ve applied it to password cracking is really just a footnote.

      • Also, with this method, we only see about 1.3% overhead while doing brute force attacks with fast hashes. So no, there’s not much to be gained by doing custom distributing. This is also a far more manageable and flexible solution, that doesn’t require any dev time.

  • Er, FreeBSD is definitely not a form of Linux, so a hashing algorithm “used by FreeBSD and other Linux-based operating systems” is jarring. This is like saying “Mustang and other GMC-based cars…..” Sure, they have wheels and drive on roads, but there’s some fundamental differences, too.

    Perhaps you mean “used by FreeBSD and Linux-based operating systems” instead.

  • I bet Mass Effect would look awesome on this machine…

  • how long do you think it would take your rig to crack the wikileaks insurance file password?

    • Encryption is not the same as hashing, and the software we use (Hashcat) does not support encrypted documents, only password hashes. Some software, like John the Ripper, does support a few encrypted document formats. However, I think only a few have been ported to GPU, and I don’t think JtR has any multi-GPU support yet.

  • I wonder if it can be used to crack the Indus Valley language of Pakistan, that has defied all attempts to crack the code?

  • That being said… my question is, nowadays, if you are designing a new password storage system, what should you use?

    1) Adaptive hashing algorithms such as bcrypt/scrypt?

    or….

    2) Encryption (yeah…encryption AES, for example), storing the keys in secure HSM ?

    Thanks…

    • the former, for sure. you definitely do not want to use reversible encryption. what you want to use is scrypt, bcrypt, pbkdf2, or any of the modern crypt(3) algorithms.
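    As a minimal sketch of that advice, the example below uses PBKDF2, one of the algorithms named in the reply, through Python’s standard `hashlib.pbkdf2_hmac`. The iteration count, salt length, and function names are assumptions chosen for illustration and should be tuned to the hardware and threat model at hand.

```python
import hashlib
import hmac
import os

# Hypothetical work factor for illustration; raise it as hardware allows.
ITERATIONS = 200_000


def hash_password(password: str):
    """Derive a slow, salted hash; store the salt and digest, never the password."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

    The iteration count is what turns billions of guesses per second into thousands, and the per-password salt prevents one precomputed table from being reused across accounts.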

  • I am looking to make a single rig. Would it be possible to get a hardware breakdown? Obviously the graphics cards are listed but the information of the other hardware would be useful.

    What motherboard, PSU, case/chassis, processor, RAM etc? :)

    • If you want to build one like the one in the picture, pick up a TYAN FT77B7015 barebones. If you want something a bit smaller, grab a Chenbro RM41300-FS81 chassis and Gigabyte GA-990FXA-UD7 motherboard. CPU does not matter one bit, get the cheapest one you can find. Memory is cheap, get at least 8GB. Do not go cheap on the PSU, go 1200W – 1600W 80plus Gold or 80plus Platinum. You may need more than one depending on how many GPUs you run.

  • Yes, very interesting… but it sounds to me like it’s really been built as an excuse so Jeremi Gosney can play Crysis and something like FSX at full graphics detail levels. What would the 3Dmark score be for his system?

    • it would be like playing it with one card as none of the cards are in crossfire.

      • Correct, there is no Crossfire setup. Also, this cluster is running 64-bit Linux, not Windows.

  • This is really just old news under a new guise. Brute force on new machine cracks old password algorithm. Remember: physical access to the hardware = compromised system. Limiting actual password attempts to something like 3-10 eliminates brute force attack. Anyone who needs to get in ‘should’ know the correct passwords already.

    • The news is really about using VCL to do GPU clustering, to get around the 8-GPU limit imposed by the AMD drivers and the physical limitations of many motherboards and BIOS. This is taking the most advanced password cracking software to date, and applying it to grid computing.

      Also, we support all attack modes and all 45+ algorithms supported by both oclHashcat-plus and oclHashcat-lite, not just brute force, and not just LM & NTLM.

      You do not need physical access to get password hashes, I’m not sure how you got that impression. Especially in the cases of web applications where some vulnerability may provide limited read access, and cracking password hashes is the only way to escalate privileges.

      You’re also neglecting the legitimate side of all this as well, which is penetration testing and password auditing.

  • What’s the cost of this monster? It’s just the 25 GPUs + 1 server rack mount?

  • University of Oslo has moved the location of my presentation to http://passwords12.at.ifi.uio.no/Jeremi_Gosney_Password_Cracking_HPC_Passwords12.pdf

  • finally the nuclear plant in Dimona can be supported in nuclear analysis and nuke emulations using a cluster of GPU and not obsolete CPU, saving space and energy. Good job, Barak! :)

  • Random Banana

    There is some confusion on the part of the author and the person who did the presentation- there is no such thing as an “NTLM hash”. There are LM hashes, and there are NT hashes. There are no “NTLM” hashes.

    • You are correct, the actual name of the hash is “NThash”; however, the names of the algorithms used in the presentation are the names of the algorithms as used within Hashcat. Most password cracking tools — and most password crackers themselves — refer to the algorithm that produces the NThash as “NTLM.” This is pretty much standard throughout the community, and this is the accepted name that is used for the algorithm. However, it does create a bit of a namespace collision as the average person assumes we are talking about NetNTLM, and not NThash.

  • Wow. Imagine what you could do with a set of Quadro, Tesla, or FirePro cards! (Not to mention a larger budget.)

    • For password cracking, Quadro and Tesla cards are much slower than their GTX equivalents. Further, Nvidia GPUs are much slower than AMD GPUs. It is a similar story on the AMD side, with almost all of the Radeons being significantly faster than the Firepro, with the sole exception of the new Firepro S1000.

  • It’s too bad you need to make money on it now. There are lots of flawed computing algorithms still in use. You could test all sorts of things. E.g., instead of testing hashes for weaknesses, you could test hashes to see which have the lowest collisions. Low-collision hashes are used for hash-tables. Many hash tables are becoming huge in size. It’s likely that some hashes which are great with binary are lousy with ASCII. So, you could test hashes to see which are best, low-collision, with ASCII, EBCDIC, or Unicode. You could also test hashes used in PKE and PGP etc. Since it’s so fast, you could even use it to design new hashes simply through brute-force testing and ranking of random or semi-random algorithms.

  • My password is 24 characters long, and fully random. I don’t memorise it. I use Aladdin to type it for me. http://igg.me/aladdin-key

    Aladdin is trying to improve the current situation of people using simple or identical passwords everywhere by removing the need to memorise passwords. Aladdin works with Windows, Mac, Linux as well as Android and iPad.

    Aladdin is a USB key(board). No software needed.
