GPU Password Cracking

GPU password cracking uses the GPU’s parallelism to brute-force or dictionary-attack password hashes at high throughput. A canonical embarrassingly parallel workload: each candidate hashes independently.

Why are GPUs so good at this?

Hash functions (MD5, SHA-1, SHA-256, NTLM) are simple arithmetic that maps cleanly to GPU SIMT lanes. A consumer GPU computes tens of billions of MD5s per second, and a server GPU more. The tool of record is hashcat, which supports dozens of hash types.

Defenses, weakest to strongest:

Long hashes plus salt: prevents rainbow tables. Minimum bar; doesn’t slow a targeted attack
Slow hash functions: bcrypt (adjustable cost), scrypt, Argon2. Built to be deliberately inefficient. bcrypt with cost 12 takes ~300 ms, dropping GPU throughput from billions/sec to thousands/sec
Memory-hard functions: scrypt, Argon2id. Force large memory per hash so the GPU’s many-cores / low-mem-per-core advantage disappears
Peppering: secret key mixed into the hash, stored separately from the DB
Rate limiting plus 2FA on the auth endpoint

Real gotcha

If your password DB uses fast unsalted hashes (MD5, SHA-256), assume it is already cracked the moment it leaks. Use bcrypt or Argon2id.

From ECE459 L23

Each candidate password is a point in the GPU’s computation space, hashed simultaneously. A decent site locks accounts after N bad logins, but if the hash DB leaks, brute force is back on. UNIX passwords repeatedly applied the hash to stiffen the arms race. Today-uncrackable passwords are not 20-years-from-now-uncrackable, so rotate algorithms and passwords.

Pre-checks beat brute force:

Try “password”, “system”, etc. first
Search credential dumps: a user reusing mycrappywebsite.com creds on their bank needs no cracking at all
Quantum doesn’t wreck hashing the way it does asymmetric (RSA)
xkcd 538: the $5 wrench attack

scrypt [Per09], the hash behind Dogecoin, aims to make hashing expensive in both time and space, so brute-forcing needs not just more time but more circuitry.

A memory-hard algorithm on a RAM uses $S (n) \in Ω (T (n)^{1 - ε})$ space, where:

$S (n)$ is space used
$T (n)$ is operation count
$ε$ is a small constant

A sequential memory-hard function is one where the fastest sequential algorithm is memory-hard, and no parallel algorithm can asymptotically beat it. scrypt exhibits ReMix as an example, and BlockMix in a cache-aware model.

Rainbow tables [Kul09]

Rainbow tables compromise between brute force and a full hash→plaintext lookup. Build chains:

Start at plaintext P₀, compute hash H₀
Reduction function R maps H₀ to a new plaintext P₁ (e.g. take the first 6 digits of the hex). R is not the inverse of the hash, just a mapping back into the plaintext space
Hash P₁ → H₁, reduce → P₂, repeat n times
Store only (P₀, Hₙ) per chain. Chains compress millions of hashes into one row

To crack hash H’:

Look for H’ in the final-hash column. If present, re-walk that chain from its start to locate H’ and return the preceding plaintext
Otherwise reduce H’ to a new plaintext, hash, check again, up to n iterations
Handle collisions and loops carefully

Both generation and lookup are GPU-friendly since reductions parallelize. A GTX295 generated ~430M links/sec for MD5 and cracked K#n&r4Z in ~1m52s [cryptohaze]. Rainbow tables for common hash+charset combos are downloadable in the 25–900 GB range.

🛠️ Steven Gong

Table of Contents

GPU Password Cracking

From ECE459 L23

Rainbow tables [Kul09]

Graph View

Backlinks

🛠️ Steven Gong

Table of Contents

GPU Password Cracking

From ECE459 L23

Rainbow tables [Kul09]

Related

Graph View

Backlinks