Understanding Password Entropy: The Math Behind Strength
Every time a password manager tells you your new password is "very strong," it's doing some arithmetic behind the scenes. That arithmetic has a name — entropy — and once you understand it, you'll never look at password advice the same way again. You'll also immediately see why "P@ssw0rd!" is an embarrassing choice despite ticking every box on most strength meters.
Let's dig in properly. No fluff. Actual math.
What Entropy Actually Measures
In information theory, entropy quantifies unpredictability. Claude Shannon introduced it in 1948 not to talk about passwords, but to measure how much information a signal carries. The core insight: the harder something is to predict, the more information it conveys when you finally observe it.
For passwords, we care about how hard it is for an attacker to guess. Entropy gives us a single number, expressed in bits, that captures this difficulty. Specifically, if a password has H bits of entropy, an attacker needs up to 2^H guesses in the worst case, and about half that on average, to crack it by brute force.
The formula is deceptively simple:
H = L × log₂(N)
Where:
- H = entropy in bits
- L = length of the password (in characters)
- N = size of the character set (the "pool" of possible characters)
Let's put some real numbers in. A password using only lowercase letters has a pool of 26 characters. A 10-character lowercase-only password gives you:
H = 10 × log₂(26) = 10 × 4.7 ≈ 47 bits
Add uppercase, and your pool jumps to 52. Add digits: 62. Add 32 common symbols: 94. A 10-character password from a 94-character pool gives you roughly 65 bits. That's a significant jump: about 18 bits more, meaning the search space is more than 2^18 ≈ 262,000 times larger.
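The arithmetic above is easy to check in a few lines. This is a minimal sketch, assuming uniformly random selection; the function name `entropy_bits` and the `POOLS` table are illustrative, not part of any particular tool.

```python
import math

def entropy_bits(length: int, pool_size: int) -> float:
    """Theoretical entropy H = L * log2(N) of a uniformly random password."""
    if length < 0 or pool_size < 1:
        raise ValueError("length must be >= 0 and pool_size >= 1")
    return length * math.log2(pool_size)

# Pool sizes used in the text: lowercase, + uppercase, + digits, + 32 symbols.
POOLS = {"lowercase": 26, "mixed case": 52, "alphanumeric": 62, "printable": 94}

for name, size in POOLS.items():
    print(f"{name:13s} N={size:2d}  10 chars -> {entropy_bits(10, size):.1f} bits")
```

Each row is the same 10-character length; only the pool changes.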
The Dirty Secret About "Character Variety" Rules
Here's where most corporate password policies go wrong. They mandate complexity (uppercase + digit + symbol) but allow short passwords. An 8-character mixed password from a 94-character pool yields only 52.4 bits. Meanwhile, a 12-character all-lowercase password gives 56.4 bits. The longer password is harder to brute-force, despite "failing" the complexity check.
NIST figured this out. Their 2017 SP 800-63B guidelines explicitly moved away from mandatory complexity rules toward length requirements. Entropy grows linearly with length. Every additional character from a 94-character set adds ~6.55 bits. Every additional character from a 26-character set still adds 4.7 bits, which adds up fast: 94 bits at length 20 versus 37.6 bits at length 8.
The formula reveals something else: log₂(N) grows slowly. Going from 26 characters (lowercase) to 94 characters (full printable ASCII) multiplies your pool by 3.6x, but only adds 1.85 bits per character. Doubling your password length from 8 to 16, by contrast, doubles your bit count. Length wins, almost always.
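To make the length-versus-pool trade-off concrete, here is a small sketch that compares the two levers directly. The variable names are illustrative, and the numbers assume uniformly random characters.

```python
import math

def bits_per_char(pool_size: int) -> float:
    """Entropy contributed by one uniformly random character."""
    return math.log2(pool_size)

# Lever 1: widen the pool from lowercase (26) to full printable ASCII (94).
gain_from_wider_pool = bits_per_char(94) - bits_per_char(26)

# Lever 2: add length. How many lowercase characters match the entropy
# of an 8-character password drawn from the 94-character pool?
target_bits = 8 * bits_per_char(94)
lowercase_needed = math.ceil(target_bits / bits_per_char(26))

print(f"extra bits per character from the wider pool: {gain_from_wider_pool:.2f}")
print(f"lowercase length matching 8 mixed characters: {lowercase_needed}")
```

Four extra lowercase characters are enough to overtake the 8-character mixed password, which is the same comparison the complexity-rule example makes.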
How Many Bits Is "Enough"?
This depends entirely on the attacker's capabilities. Let's be concrete.
A high-end consumer GPU in 2024, say an RTX 4090, can crack MD5 hashes at roughly 164 billion attempts per second. Against bcrypt, that same GPU manages only around 184,000 attempts per second at cost factor 5, the setting Hashcat's benchmark uses; each increment of the cost factor halves that rate, so cost 10 is about 32 times slower still. The hashing algorithm matters enormously.
Worst-case time to search the whole space against MD5, on one GPU:
- 40 bits of entropy: about 7 seconds
- 50 bits: about 2 hours
- 60 bits: about 81 days
- 80 bits: roughly 234,000 years
Against bcrypt (cost 5):
- 40 bits: about 69 days on one GPU
- 50 bits: 194 years
- 60 bits: 200,000 years
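These estimates are just 2^H divided by the guess rate. The sketch below reproduces that calculation using the two throughput figures quoted above. The helper names are made up for illustration, and it reports worst-case time on a single GPU (the average case is half).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Single-GPU throughput figures quoted in the text.
MD5_RATE = 164e9       # guesses per second
BCRYPT_RATE = 184_000  # guesses per second

def worst_case_seconds(bits: float, guesses_per_second: float) -> float:
    """Time to try every one of the 2**bits candidates."""
    return 2.0 ** bits / guesses_per_second

def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that fits."""
    units = (("years", SECONDS_PER_YEAR), ("days", 86400.0),
             ("hours", 3600.0), ("minutes", 60.0))
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:,.1f} {name}"
    return f"{seconds:.1f} seconds"

for bits in (40, 50, 60, 80):
    md5 = humanize(worst_case_seconds(bits, MD5_RATE))
    bcrypt = humanize(worst_case_seconds(bits, BCRYPT_RATE))
    print(f"{bits} bits  MD5: {md5:>22s}   bcrypt: {bcrypt:>26s}")
```

An attacker with a rack of GPUs divides these times by the number of cards, so treat them as upper bounds.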
This is why the underlying hash function is arguably more important than the password's entropy alone. But you can't always control how a service stores your password. Defense in depth means using strong passwords and trusting services that use proper key-derivation functions (bcrypt, scrypt, Argon2).
For most practical purposes, 80+ bits of entropy from a random source is the threshold worth targeting. That corresponds to a randomly generated 13-character password from a 94-character set (about 85 bits). A 6-word Diceware passphrase comes close at about 77.5 bits, and a seventh word clears the bar (more on that shortly).
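Inverting the entropy formula gives the shortest length that reaches a target. A minimal sketch, with `min_length` as a hypothetical helper name; a "symbol" here is either a character or a whole word, depending on the pool.

```python
import math

def min_length(target_bits: float, pool_size: int) -> int:
    """Smallest number of uniformly random symbols reaching target_bits."""
    if pool_size < 2:
        raise ValueError("pool_size must be at least 2")
    return math.ceil(target_bits / math.log2(pool_size))

for label, pool in (("printable ASCII", 94), ("lowercase", 26), ("7,776-word list", 7776)):
    print(f"{label:16s} -> {min_length(80, pool)} symbols for 80 bits")
```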
The Word List Method: Diceware and the BIG Surprise
There's a completely different way to build high-entropy passwords, and it produces things humans can actually remember. Diceware uses physical dice (or their digital equivalent) to select words from a numbered list. The EFF long wordlist contains 7,776 words, which is 6^5: five dice rolls per word.
Each word contributes:
log₂(7776) ≈ 12.9 bits
A six-word Diceware passphrase therefore has:
6 × 12.9 ≈ 77.5 bits
That's on par with a 12-character fully random password (78.7 bits), and far easier to remember. XKCD's famous example, "correct horse battery staple", uses 4 words from a 2,048-word list, giving only 44 bits (the full EFF list would give 51.7 bits). XKCD was right about the concept but conservative on word count. Use six words from the EFF list and you get something memorable and genuinely resistant to brute force.
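The dice procedure translates to code directly. This is a sketch under stated assumptions: `STAND_IN_WORDS` is a placeholder so the example runs, not the EFF list, and the function names are illustrative. Python's `secrets` module supplies the randomness.

```python
import math
import secrets
from typing import Sequence

DICE_PER_WORD = 5
LIST_SIZE = 6 ** DICE_PER_WORD  # 7,776 entries

def roll_index(dice: int = DICE_PER_WORD) -> int:
    """Simulate `dice` fair die rolls and read them as a base-6 number."""
    index = 0
    for _ in range(dice):
        index = index * 6 + secrets.randbelow(6)  # one die, valued 0..5
    return index  # uniform over range(6 ** dice)

def passphrase(wordlist: Sequence[str], n_words: int = 6, sep: str = " ") -> str:
    """Pick n_words entries independently and uniformly from the list."""
    if len(wordlist) != LIST_SIZE:
        raise ValueError(f"expected a {LIST_SIZE}-word list")
    return sep.join(wordlist[roll_index()] for _ in range(n_words))

def passphrase_bits(list_size: int, n_words: int) -> float:
    return n_words * math.log2(list_size)

# Placeholder entries only; load the real EFF long wordlist in practice.
STAND_IN_WORDS = [f"word{i:04d}" for i in range(LIST_SIZE)]

print(passphrase(STAND_IN_WORDS))
print(f"{passphrase_bits(LIST_SIZE, 6):.1f} bits")
```

Words are drawn with replacement, which is what the per-word entropy figure assumes.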
Where Entropy Breaks Down: Pattern Recognition
The entropy formula assumes truly random selection from the pool. This is where human-generated passwords fall apart catastrophically.
"P@ssw0rd!" mixes uppercase, lowercase, a digit, and symbols, so a naive meter assumes the full 94-character pool. Its theoretical entropy is 9 × log₂(94) ≈ 59 bits, which sounds decent. But password crackers don't work by trying random combinations. Tools like Hashcat use rule-based attacks: take a dictionary word, try leet substitutions (a→@, o→0, e→3), append common endings (!, 1, ?, 123, #), and mix in capitalization patterns. The actual entropy of "P@ssw0rd!" against a rule-based attack is closer to 12-15 bits, so it falls in seconds.
This is the gap between theoretical entropy (calculated from the character pool) and practical entropy (resistance to real-world attacks). The open-source password strength estimator zxcvbn, developed at Dropbox and used by many others, attempts to measure practical entropy by matching the same patterns attackers use, then reporting how many guesses your password would take. It's far more honest than "characters × log₂(pool)" when applied to human-chosen passwords.
The lesson: entropy math is exact only when the password was chosen uniformly at random. For anything a human constructed, assume the effective entropy is much lower.
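A toy version of a rule-based attack shows why the formula overestimates human-built passwords. This is not how Hashcat is implemented; the word list, substitution table, and suffixes are small illustrative assumptions, and real rule sets are far larger.

```python
import itertools
from typing import Iterable, Iterator, Optional

LEET = {"a": "@", "o": "0", "e": "3"}      # toy substitution table
SUFFIXES = ["", "!", "1", "?", "123", "#"]  # toy list of common endings

def leet_variants(word: str) -> Iterator[str]:
    """Yield every way of keeping or swapping each substitutable letter."""
    choices = [(ch, LEET[ch]) if ch in LEET else (ch,) for ch in word]
    for combo in itertools.product(*choices):
        yield "".join(combo)

def candidates(words: Iterable[str]) -> Iterator[str]:
    """Dictionary word -> capitalization -> leet swaps -> suffix."""
    for word in words:
        for base in (word, word.capitalize()):
            for variant in leet_variants(base):
                for suffix in SUFFIXES:
                    yield variant + suffix

def guess_number(target: str, words: Iterable[str]) -> Optional[int]:
    """1-based position of target in the candidate stream, or None."""
    for n, guess in enumerate(candidates(words), start=1):
        if guess == target:
            return n
    return None

TOY_WORDS = ["letmein", "password", "dragon"]
print(guess_number("P@ssw0rd!", TOY_WORDS))
print(guess_number("k7#Qz!mW2", TOY_WORDS))  # random-looking: never generated
```

Against this toy list the "complex" password turns up in under a hundred guesses, while a random string of the same length is never produced at all. A real dictionary adds more words, not more safety for anything built from one.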
Breach Checks: A Different Threat Model Entirely
Entropy protects against brute-force guessing. Breach databases represent a completely different attack: the attacker already has your exact password, because you reused it on a site that got hacked.
This is where Have I Been Pwned (HIBP) and its Pwned Passwords API enter. The API uses k-anonymity: your password manager hashes your candidate password with SHA-1, sends only the first 5 hexadecimal characters of that hash to the API, and receives back the suffixes of every breached hash that shares that prefix, each with an occurrence count. Your client checks whether the rest of your hash is in the returned set, so the server never sees your actual password or full hash. It's an elegant privacy-preserving design.
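The client side of that exchange fits in a few lines. The sketch below assumes the public range endpoint at `https://api.pwnedpasswords.com/range/` and its `SUFFIX:COUNT` line format. The function names and the User-Agent string are illustrative, and the `fetch` parameter exists so the lookup can be swapped out or tested offline.

```python
import hashlib
import urllib.request
from typing import Callable

RANGE_URL = "https://api.pwnedpasswords.com/range/"

def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 digest of the UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

def fetch_range(prefix: str) -> str:
    """Download all hash suffixes for a 5-character prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "entropy-article-sketch"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")

def breach_count(password: str, fetch: Callable[[str], str] = fetch_range) -> int:
    """Times the password appears in the corpus; 0 if it is not listed."""
    digest = sha1_hex(password)
    prefix, suffix = digest[:5], digest[5:]  # only the prefix leaves the machine
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0
```

Calling `breach_count("password")` performs one HTTPS request. The comparison against the remaining 35 hex characters happens locally.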
The Pwned Passwords database holds well over 800 million unique compromised passwords, drawn from billions of breached records. A password with 80 bits of theoretical entropy that also appears in this database has effectively zero real-world entropy: it'll be tried in the first wave of credential-stuffing attacks regardless of how mathematically unguessable it "should" be.
This is why modern password generators do both: they generate cryptographically random passwords (maximizing entropy from first principles) and check them against breach lists before suggesting them. If a generated password somehow appears in a breach dump — rare, but worth checking — they generate a new one.
What a Good Password Generator Actually Does
When you click "generate" in a quality tool, several things happen:
- CSPRNG sampling — a Cryptographically Secure Pseudo-Random Number Generator (not Math.random(), which is predictable) samples characters from your chosen pool with uniform probability. No bias, no patterns.
- Entropy calculation — the tool computes theoretical bits and displays it.
- Breach check — a k-anonymous API call confirms the password hasn't appeared in known dumps.
- Optional readability pass — some generators exclude visually ambiguous characters (0/O, 1/l/I) for passwords you might need to type manually.
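The sampling, entropy, and readability steps can be sketched together. This is an assumption-laden illustration rather than any specific product's implementation: the names are hypothetical, the ambiguous set is just the characters listed above, and the breach check is left out to keep the example offline.

```python
import math
import secrets
import string
from typing import Optional, Tuple

AMBIGUOUS = set("0O1lI")  # the look-alike characters named in the text

def build_pool(symbols: bool = True, readable: bool = False) -> str:
    """Assemble the character pool; optionally drop look-alike characters."""
    pool = string.ascii_letters + string.digits
    if symbols:
        pool += string.punctuation  # 32 ASCII symbols
    if readable:
        pool = "".join(ch for ch in pool if ch not in AMBIGUOUS)
    return pool

def generate(length: int = 16, pool: Optional[str] = None) -> Tuple[str, float]:
    """Return (password, theoretical entropy in bits)."""
    pool = pool or build_pool()
    # secrets.choice draws from the OS CSPRNG with uniform probability.
    password = "".join(secrets.choice(pool) for _ in range(length))
    return password, length * math.log2(len(pool))

password, bits = generate(13)
print(f"{password}  ({bits:.1f} bits)")

typed, typed_bits = generate(16, build_pool(readable=True))
print(f"{typed}  ({typed_bits:.1f} bits)")
```

Dropping the five ambiguous characters shrinks the pool from 94 to 89, which costs less than a tenth of a bit per character.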
What the formula cannot do is protect you from shoulder surfing, phishing, clipboard hijacking, or a keylogger. Entropy is purely about resistance to computational guessing attacks. The whole security stack matters.
Practical Takeaways
The math converges on simple conclusions:
- Target at least 80 bits of entropy for sensitive accounts. A 13+ character fully random password gets you there; a 6-word Diceware phrase lands just under at about 77.5 bits, and a seventh word takes it to about 90.
- Length matters more than complexity, but mixing character types still helps expand the pool modestly.
- Human-constructed passwords — regardless of how clever they feel — have dramatically lower practical entropy than the formula suggests. Use a CSPRNG.
- Run breach checks, not as a replacement for entropy, but as a parallel defense layer against credential stuffing.
- The hash algorithm at the service you're logging into is outside your control, but matters hugely. Where you have a choice (self-hosted apps, anything you build), use Argon2id.
Entropy isn't a guarantee of security — it's a measure of how hard guessing will be for someone who doesn't already have your password. Combine it with non-reuse, proper storage, breach monitoring, and MFA, and the math works in your favor.