What Is a Hash Function

A hash function takes arbitrary-length input data and produces a fixed-length output called a hash value (or digest). The same input always yields the same hash, but recovering the original data from the hash is computationally infeasible, a property known as preimage resistance (one-wayness).

For example, SHA-256 converts the string "hello" into 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824, a 64-character hexadecimal string (256 bits). Changing even one character in the input produces a completely different hash (the avalanche effect), so the output gives no clue about the original input.
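The digest and the avalanche effect can be checked with Python's standard hashlib module. This is a minimal sketch: the helper names sha256_hex and differing_bits are illustrative, and "hellp" is simply an arbitrary one-character variation of the input.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


digest_hello = sha256_hex("hello")
digest_hellp = sha256_hex("hellp")  # last character changed

print(digest_hello)
print(digest_hellp)
print(differing_bits(digest_hello, digest_hellp), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip when the input changes slightly.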

Hashing differs from encryption. Encryption is reversible with a key; hashing is not. This irreversibility is precisely why hash functions are chosen for password storage and data integrity verification.

Algorithm Comparison

MD5: Designed in 1991. Fast, but its collision resistance is broken. Unsuitable for security purposes; useful only as a checksum against accidental file corruption.

SHA-1: Theoretical collision attacks were demonstrated in 2005, and Google produced an actual collision in 2017 (SHAttered). Still found in legacy systems but deprecated for new use.

SHA-256: The flagship of the SHA-2 family. No practical collision attacks are known. Widely used in TLS certificates, blockchain, and file verification. The standard choice for general-purpose hashing.

bcrypt: Designed specifically for password storage. Features a configurable cost factor that intentionally slows computation, resisting GPU-accelerated brute-force attacks. Automatically generates and embeds a salt.

Use SHA-256 for general data verification and bcrypt (or Argon2) for password storage. Never use MD5 or SHA-1 for security purposes.

Password Storage

Web services must never store passwords in plaintext. If the database leaks, every user's password is exposed. The correct approach is to hash passwords before storage and compare hashes during login.

However, simple hashing is insufficient. Identical passwords produce identical hashes, enabling attackers to use precomputed lookup tables (rainbow tables) to reverse hashes back to passwords.

The defense is salting: appending a random string unique to each password before hashing. Even identical passwords produce different hashes with different salts, rendering rainbow tables useless. bcrypt and Argon2 handle salt generation automatically. Combined with a password manager generating long random passwords, the risk of rainbow table or brute-force attacks becomes negligible.
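The effect of a salt can be seen in a short sketch. It uses plain SHA-256 only to make the salt's effect visible; it is not a recommended storage scheme, since real systems should rely on bcrypt or Argon2. The function names and the 16-byte salt length are assumptions made for illustration.

```python
import hashlib
import secrets


def unsalted_hash(password: str) -> str:
    """Hash the password alone: identical passwords give identical hashes."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: bytes) -> str:
    """Append a per-password random salt before hashing (illustration only)."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


password = "correct horse"

# Two users who happen to choose the same password.
alice_salt = secrets.token_bytes(16)
bob_salt = secrets.token_bytes(16)

print(unsalted_hash(password) == unsalted_hash(password))  # True: one table entry cracks both
print(salted_hash(password, alice_salt) == salted_hash(password, bob_salt))  # False
```

The salt is stored next to the hash so the same computation can be repeated at login; it does not need to be secret.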

Blockchain Applications

Blockchain leverages hash function properties by including the previous block's hash in each new block, forming a chain. Altering any block changes its hash, which no longer matches the value recorded in the next block, so the chain breaks from that point forward. To hide the change, an attacker would have to recompute every subsequent block faster than the rest of the network extends the chain, which means out-computing the entire network.
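The chaining idea can be sketched as a toy model. This is not how any real blockchain serializes blocks: the dictionary layout, the JSON encoding, and the function names are assumptions chosen to keep the example short.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Build a list of blocks, each storing the hash of the one before it."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first inconsistent block, or None if all are valid."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return block["index"]  # link to the previous block is broken
        if block_hash(block["index"], block["data"], block["prev_hash"]) != block["hash"]:
            return block["index"]  # contents no longer match the stored hash
        prev_hash = block["hash"]
    return None


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
print(first_invalid_block(chain))  # None

# Tamper with block 1: its stored hash no longer matches its contents.
chain[1]["data"] = "bob pays carol 200"
print(first_invalid_block(chain))  # 1

# Recomputing block 1's hash only moves the break to block 2.
chain[1]["hash"] = block_hash(1, chain[1]["data"], chain[1]["prev_hash"])
print(first_invalid_block(chain))  # 2
```

The last step shows why a forger cannot stop at the altered block: every later block has to be rebuilt as well.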

Bitcoin mining (Proof of Work) is a competition to find an input whose SHA-256 hash meets specific criteria: the hash must fall below a target value, which is often described as requiring a certain number of leading zeros. Because of the one-way property, there is no shortcut to such an input; the only approach is brute-force trial, and this computational cost secures the blockchain.
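A toy version of this search can be written in a few lines. It is a simplification under stated assumptions: the difficulty is expressed as leading zero hex digits, the "block" is an arbitrary string, and a single SHA-256 pass is used, whereas Bitcoin hashes a binary block header with double SHA-256 and compares the result against a numeric target.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Try nonces 0, 1, 2, ... until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}|{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("example block", difficulty=4)
print("nonce:", nonce)
print("digest:", digest)
```

Each additional zero hex digit multiplies the expected number of attempts by 16, while checking a claimed solution still takes a single hash.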

Rainbow Table Attacks and Salting

Rainbow table attacks use massive precomputed tables mapping common passwords to their hash values. Leaked hashes are matched against these tables to recover original passwords.

Salt: A random value appended to each password before hashing. Different salts produce different hashes even for identical passwords, invalidating precomputed tables.
Pepper: An application-wide secret value added alongside the salt. Even if the database leaks, the pepper stored separately makes hash reversal significantly harder.
Stretching: Repeating the hash computation thousands of times to increase the time per hash. bcrypt's cost factor and PBKDF2's iteration count implement this approach.
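The three measures can be combined in a short sketch using PBKDF2 from Python's standard library. Several details are assumptions rather than recommendations from this article: the pepper value and the way it is mixed in (an HMAC over the password), the iteration count, and the function names. In practice the pepper would be loaded from somewhere outside the database, and the iteration count tuned to the hardware.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumption: hypothetical application-wide secret, kept outside the database.
PEPPER = b"hypothetical-app-wide-secret"
# Assumption: illustrative stretching factor, not a tuned value.
ITERATIONS = 200_000


def _derive(password: str, salt: bytes) -> bytes:
    """Mix in the pepper, then stretch with PBKDF2-HMAC-SHA256 using the salt."""
    peppered = hmac.new(PEPPER, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); both are stored, the pepper is not."""
    salt = secrets.token_bytes(16)
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the derived key and compare in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

A leaked database row then contains only the salt and the derived key; without the pepper and without repeating the full iteration count per guess, an attacker cannot test candidate passwords against it.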

Common Misconceptions

Hashing and encryption are the same thing: Encryption is reversible with a key; hashing is a one-way operation. Passwords should be hashed (not encrypted) for storage, while communication should be encrypted for transit protection.
SHA-256 hashing is sufficient for password storage: SHA-256 is too fast for password hashing, since GPUs can compute billions of SHA-256 hashes per second. Password storage requires intentionally slow algorithms like bcrypt or Argon2.
