SHA256 Algorithm Overview
SHA256 Algorithm Overview
Previous  Top  Next 

SHA256 Algorithm Overview  WebAssembly Does Not Have A “raw binary” Data Type 
SHA256: What Is It?
The SHA256 hash algorithm is one of the Secure Hash Algorithm2 family of cryptographic functions published by the United States National Security Agency in 2001.
The purpose of this family of algorithms is to generate an output called a hash that, for all practical purposes, can be considered unique for the given input. In this sense, a hash acts like a message’s unique digital fingerprint.
In the same way that the probability of finding two human beings with identical fingerprints is unfeasibly low, so the probability that any two input messages will generate the same SHA256 hash value is also unfeasibly low. Putting this another way, “hash collisions” are so unlikely that for all practical purposes, a SHA256 hash can be considered entirely unique.
In more technical language, for any secure hash value of length n
bits, the probability that a brute force attack can discover the input value that generated it is one chance in 2^{n}
.
In our case, we are creating a hash value 256 bits long, so that’s 1 chance in 2^{256} or 1.15792089237 * 10^{77} — and herein lies the strength of the SHA2 family of algorithms; namely, that the chances of being able to use a forged hash value are so astronomically small that it’s not even worth starting.
Small Input Changes Create Large Output Changes
All the algorithms in the SHA2 family start by generating a digest (also known as a “schedule”) of a particular length (512 bytes in our case). Then, using a oneway compression^{1} algorithm, they generate a unique output value whose bit pattern is highly susceptible to change.
This susceptibility to change is based on the fact that the algorithms exhibit a behaviour known as the avalance effect; that is, if a single input bit changes, then there is a 50% probability that each output bit will change.
The SHA2 family of algorithms perform an initial preparation phase, then repeat a 2phase compression process:
Phase 0: Preparation
Seed value preparation
 Define 16, 32bit values
s[0..15]
where each value is the fractional part of the square root of the first 16 prime numbers  Define 64, 32bit values
k[0..63]
where each value is the fractional part of the cube root of the first 64 prime numbers  Create 8, 32bit hash values
h[0..7]
and initialise such thath[n] = s[n]
Message Preparation
 Append a single bit
1
to the message (I.E. for data obtained from a file, append0x80
)  Calculate the message’s total bit length (which will always be ≥ 1)
 Append sufficient bit
0
s to bring the message length up to the next 512bit boundary, minus 64 bits  Write the bit length as a bigendian, 64bit integer into the last 64 bits of the message
The message now occupies an integer number of 512bit blocks.
Phase 1: Build The Digest
The message digest is a 512byte block viewed as 64, 32bit words (md[0..63]
)
 Copy the next 64byte message chunk to words
0..15
of the message digest 
Populate the remaining 48 message digest words as follows:
for n in 16..63 σ0 = rotr(md[n15], 7) XOR rotr(md[n15], 18) XOR shr(md[n15], 3) σ1 = rotr(md[n2], 17) XOR rotr(md[n2], 19) XOR shr(md[n2], 10) md[n] = md[n16] + σ0 + md[n7] + σ1 end
Phase 2: Compress The Digest

Initialise 8 working variables
a..h
from the corresponding hash values:a = h[0] b = h[1] c = h[2] d = h[3] e = h[4] f = h[5] g = h[6] h = h[7]

For each word in the message digest, calculate a set of intermediate values and use them to update the working variables as follows:
for n in 0..63 // Calculate intermediate values Σ0 = rotr(a, 2) XOR rotr(a, 13) XOR rotr(a, 22) Σ1 = rotr(e, 6) XOR rotr(e, 11) XOR rotr(e, 25) choice = (e AND f) XOR (NOT(e) AND g) majority = (a AND b) XOR (a AND c) XOR (b AND c) temp1 = h + Σ1 + choice + k[n] + md[n] temp2 = Σ0 + majority // Shunt working variables h = g g = f f = e e = d + temp1 d = c c = b b = a a = temp1 + temp2 end

After the above loop has processed all 64 words in the message digest, phase 2 ends by adding the working variables
a..h
to the corresponding hash valuesh[0..7]
. Any arithmetic overflows are simply ignored.h[0] += a h[1] += b h[2] += c h[3] += d h[4] += e h[5] += f h[6] += g h[7] += h
Final Output
Phases 1 and 2 are repeated until the input message has been consumed, then the final digest is simply the concatenation of the eight hash values h[0..7]
.

Be careful not to confuse the “oneway compression” used by the SHA2 algorithms with the more familiar “data” or “twoway compression” performed by programs such as
zip
.
Programs such aszip
are only useful because they specifically create a twoway mapping between the compressed data and the original data. Without this, you’d never be able tounzip
your files.
However, in cryptography, this twoway mapping is precisely what we must avoid creating!
Consequently, the SHA2 family of algorithms have been specifically designed to exclude any practical possibilty of recovering the original data from its compressed form; yet at the same time, the compressed form must be constructed in such a way that it could only have come from the source data. ↩