Integrity
Integrity of a message means that the correctness and completeness of the transmitted information can be verified.
Message Digests And Cryptographic Hash Functions
- A message is a piece of data processed by a cryptographic algorithm.
- A Cryptographic Hash Function (CHF) is an algorithm.
- A CHF maps a message of arbitrary size to a relatively short, fixed-size array of bits.
- This fixed-size array of bits is called a Message Digest or a Cryptographic Hash.
- A message digest is the output of running a CHF on a message.
- Requirements for a good CHF:
  - Deterministic: the same message must always produce the same message digest.
  - Irreversible: it must be impossible or extremely difficult to recover the original message from the digest.
  - Collision resistant: it must be computationally infeasible to find two distinct messages that produce the same message digest. Two messages with the same digest are known as a hash collision.
  - Avalanche Effect: any change to the message, big or small, must result in an extensive change to the digest, so extensive that the two digests appear unrelated.
- The goal of all this is to verify the correctness and completeness of information.
- REMEMBER: The set of all possible inputs to a CHF is much larger than the set of all possible hash values, because an input can be of any length while the hash value always has the same fixed length. Hash collisions (two different inputs that produce the same hash) must therefore exist, and true uniqueness is impossible to achieve.
- In practice, one therefore only requires that it be difficult, not impossible, to find different inputs that map to the same hash value.
- A hash function with this property is called a collision-resistant hash function and can be used for integrity-related purposes.
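The deterministic and avalanche requirements above can be observed directly. The following is a minimal sketch using Python's standard hashlib module with SHA-256; the helper names `digest` and `differing_bits` and the two sample messages are illustrative assumptions, not part of the original notes.

```python
import hashlib

def digest(message: bytes) -> bytes:
    """Return the raw SHA-256 digest of a message."""
    return hashlib.sha256(message).digest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"hello world"
altered = b"hello worle"  # 'd' (0x64) -> 'e' (0x65): exactly one input bit flipped

# Deterministic: hashing the same message twice gives the same digest.
same_again = digest(original) == digest(original)

# Avalanche effect: one flipped input bit should change about half of the
# 256 output bits, so the two digests look unrelated.
changed = differing_bits(digest(original), digest(altered))

print("deterministic:", same_again)
print("input bits changed:", differing_bits(original, altered))
print("digest bits changed:", changed, "of", len(digest(original)) * 8)
```

The exact number of changed digest bits depends on the messages chosen, but for a good CHF it should stay close to half of the digest size.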
Applications of A Message Digest
- You can perform Data Integrity Verification: verify that the data you have downloaded is the same data that was meant to be downloaded.
- It forms the basis for HMAC, which combines a secret key and a CHF for data authentication.
- It is also an integral part of creating Digital Signatures, such as those on the X.509 certificates used in the TLS protocol.
- Used in network protocols like TLS, SSH etc.
- This also finds use in Password Verification.
- It is used in Source Control Management as a content identifier. Git and Mercurial use hashes to uniquely identify stored objects such as files, commits and tags.
- Blockchain and Cryptocurrency
  - Proof-of-work systems.
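To make the HMAC item above concrete, here is a minimal sketch using Python's standard hmac module with SHA-256 as the underlying CHF. The function names `sign_message` and `verify_message`, the key and the message are hypothetical values chosen for illustration.

```python
import hashlib
import hmac

def sign_message(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message using the secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_message(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_message(key, message), tag)

secret_key = b"shared-secret"             # hypothetical key known to both parties
message = b"transfer 100 coins to alice"  # hypothetical message

tag = sign_message(secret_key, message)

print(verify_message(secret_key, message, tag))                         # True
print(verify_message(secret_key, b"transfer 900 coins to alice", tag))  # False
print(verify_message(b"wrong-key", message, tag))                       # False
```

Unlike a plain digest, the tag can only be recomputed by someone who holds the key, so a matching tag indicates both that the message is unchanged and that it came from a key holder.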
The security level of a CHF depends on the size of the message digest. If the message digest is n bits, the maximum attack complexity is 2^(n/2) for the collision attack and 2^n for the preimage attack. It is impossible to have a higher complexity than 2^(n/2) for the collision attack because the birthday attack, based on the birthday paradox, can always find collisions in about 2^(n/2) time. For example, SHA-256 has a collision attack complexity of 2^128, hence its security level is 128 bits.
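The relationship between digest size and security level can be sketched in a few lines. The code below derives the collision and preimage levels from the digest size reported by Python's hashlib, and then runs a toy birthday search against a digest truncated to 16 bits. The truncation is an assumption made purely so that the search finishes instantly; the function names are illustrative.

```python
import hashlib

def security_levels(algorithm: str):
    """Return (collision, preimage) security levels in bits for an n-bit digest."""
    n = hashlib.new(algorithm).digest_size * 8
    return n // 2, n

def birthday_collision(n_bits: int = 16):
    """Find two messages whose SHA-256 digests agree on the first n_bits bits.

    n_bits must be a multiple of 8. Messages are the decimal strings
    b"0", b"1", b"2", ... hashed in order until a truncated digest repeats.
    """
    n_bytes = n_bits // 8
    seen = {}
    attempts = 0
    while True:
        message = str(attempts).encode()
        attempts += 1
        prefix = hashlib.sha256(message).digest()[:n_bytes]
        if prefix in seen:
            return seen[prefix], message, attempts
        seen[prefix] = message

for name in ("sha224", "sha256", "sha512"):
    print(name, security_levels(name))
# sha224 (112, 224)
# sha256 (128, 256)
# sha512 (256, 512)

first, second, attempts = birthday_collision(16)
print(first, second, attempts)
```

For a 16-bit digest the birthday bound suggests a collision after roughly 2^8 messages, far fewer than the 2^16 possible digest values; the same reasoning scaled up to 256 bits gives the 2^128 figure quoted above.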
Reviewing Popular Hash Functions
- SHA-2 Family of Hashes
  - Contains the most widely used hashing algorithm: SHA-256.
  - SHA-256 outputs a 256-bit digest and has a collision resistance level of 128 bits.
  - SHA-256 is the default hash function in the TLS protocol.
  - Default hash function for signing X.509 certificates and SSH keys.
  - Bitcoin uses SHA-256 to verify transactions and for proof-of-work.
  - Git is migrating to SHA-256 hashes for identifying objects and linking commits into its history.
  - Used in SSH, IPSec, DNSSEC, PGP etc.
  - Other SHA-2 hash functions:
    - SHA-224: a modification of SHA-256. Security level: 112 bits.
    - SHA-512: the algorithm is similar to SHA-256 but works on 64-bit words. Security level: 256 bits.
  - Developed by the NSA and published by NIST in 2001 as a federal standard.
  - The algorithm is patented but available under a royalty-free license.
- SHA-3 Family of Hashes
  - Chosen through an algorithm competition, similar to how the AES algorithm was chosen.
  - NIST organized the competition because of successful attacks on the predecessors of SHA-2, namely SHA-1, SHA-0, MD5 etc.
  - SHA-3 is based on the Keccak algorithm from a team of Belgian cryptographers. One of the team members, Joan Daemen, also co-designed AES.
  - The SHA-3 family consists of:
    - SHA3-224
    - SHA3-256
    - SHA3-384
    - SHA3-512
    - SHAKE128
    - SHAKE256
  - The SHA-3 Keccak algorithm is slower than SHA-2 in software because of more sequential operations.
  - As a result, SHAKE128 and SHAKE256 were developed.
    - Strictly speaking they are not hash functions but Extendable Output Functions (XOFs).
  - The same authors also introduced the KangarooTwelve extendable output function, which is about 13 times faster than SHA3-256 and also has a security level of 128 bits.
  - Keccak-256, the pre-standardization variant of SHA3-256, is used in the Ethereum blockchain, where it was part of proof-of-work checking before Ethereum moved to proof-of-stake.
- Other Notable Hash Functions
  - SHA-1
    - Designed by the NSA and published in 1995.
    - It is not secure anymore.
    - Broken in 2017 by Google and the Centrum Wiskunde & Informatica research center.
    - Used for X.509, PGP, S/MIME, DSA, Git and Mercurial SCM.
    - NIST deprecated SHA-1 in 2011.
    - Web browsers stopped supporting it in 2017.
  - MD Family
    - MD2, MD4, MD5, MD6
    - MD1 was proprietary.
    - MD3 was experimental.
    - Designed by Ronald Rivest, who also invented the RC family of symmetric ciphers, like RC2/4/5 etc.
    - MD4 was used for hashing passwords in Windows NT, 2000 and XP.
    - You can still enable MD password hashing.
  - BLAKE2
    - BLAKE2s and BLAKE2b
    - BLAKE2s produces a 256-bit message digest.
    - BLAKE2b produces a 512-bit message digest.
    - Offers security similar to SHA-3.
    - Faster than MD5, SHA-2 and SHA-3 on more modern CPUs.
    - Based on the ChaCha stream cipher.
    - Popular in WhatsApp, 7-Zip, WinRAR, Rsync, Chef, WireGuard.
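Several of the functions reviewed above are exposed by Python's standard hashlib module, which makes it easy to check the digest sizes listed and to see what sets an extendable output function apart. This is only a sketch; the sample message is an arbitrary assumption.

```python
import hashlib

message = b"integrity"  # arbitrary sample input

# Fixed-length hash functions: the digest size is determined by the algorithm.
sizes = {
    "sha3_256": hashlib.sha3_256(message).digest_size * 8,
    "blake2s": hashlib.blake2s(message).digest_size * 8,
    "blake2b": hashlib.blake2b(message).digest_size * 8,
}
print(sizes)  # {'sha3_256': 256, 'blake2s': 256, 'blake2b': 512}

# Extendable output function: the caller chooses the output length in bytes.
short_output = hashlib.shake_128(message).digest(16)
long_output = hashlib.shake_128(message).digest(64)

# A longer SHAKE output simply continues the shorter one.
print(len(short_output), len(long_output))   # 16 64
print(long_output.startswith(short_output))  # True
```

Because the longer SHAKE output extends the shorter one, requesting more bytes from a XOF never changes the bytes already produced, which is why XOFs are described separately from fixed-size hash functions.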
Calculating Message Digest using OpenSSL
- Check which message digest algorithms are supported:

    openssl dgst -list
    Supported digests:
    -blake2b512    -blake2s256    -md4
    -md5           -md5-sha1      -mdc2
    -ripemd        -ripemd160     -rmd160
    -sha1          -sha224        -sha256
    -sha3-224      -sha3-256      -sha3-384
    -sha3-512      -sha384        -sha512
    -sha512-224    -sha512-256    -shake128
    -shake256      -sm3           -ssl3-md5
    -ssl3-sha1     -whirlpool

- Let’s calculate a SHA3-256 digest:

    seq 2000 > message.txt
    openssl dgst -sha3-256 message.txt
    SHA3-256(message.txt)= 6cea69b64fbbcb58732abb54a1f02557886b9935ddcd89aa9d2f6211443a1732
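The same calculation can be sketched without the OpenSSL command line, using Python's hashlib. The code assumes that `seq 2000` writes the numbers 1 to 2000 in decimal, one per line, each followed by a newline; if that holds on your system, the printed digest should match the one OpenSSL reports for message.txt.

```python
import hashlib

# Recreate what `seq 2000 > message.txt` is assumed to write: "1\n2\n...\n2000\n".
data = "".join(f"{i}\n" for i in range(1, 2001)).encode("ascii")

# Algorithms every hashlib build provides, loosely analogous to `openssl dgst -list`.
print(sorted(hashlib.algorithms_guaranteed))

# Same output layout as the openssl dgst command.
sha3_hex = hashlib.sha3_256(data).hexdigest()
print(f"SHA3-256(message.txt)= {sha3_hex}")
```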
Examples of using other Hash Functions
First let’s see the release section of the openssl.org website:
You can see that each of the downloads has SHA256, PGP Signature and SHA1 checksum links attached to it. So let’s take openssl-1.1.1w.tar.gz as an example. Copy the link of the file and download the file:

    wget https://www.openssl.org/source/openssl-1.1.1w.tar.gz --no-check-certificate
Let’s get the SHA1 and the SHA256 checksums as well. Copy the links and download the files:

    >wget https://www.openssl.org/source/openssl-1.1.1w.tar.gz.sha256 --no-check-certificate
    >wget https://www.openssl.org/source/openssl-1.1.1w.tar.gz.sha1 --no-check-certificate
Now the idea is that we calculate the SHA1 checksum of the downloaded file ourselves and check it against this value to verify the integrity of the downloaded file. If the values match, the file was not tampered with on its way. If the values don’t match, the file was corrupted, either intentionally or unintentionally.

Let’s calculate the SHA1 hash of the downloaded file. If you are using:
- macOS: use the shasum command
- Linux: use the sha1sum command

I am using macOS, where the shasum command by default calculates the SHA1 checksum of a file:

    cat openssl-1.1.1w.tar.gz.sha1
    shasum openssl-1.1.1w.tar.gz

You will see that both checksum values match, which shows that the integrity of the file is verified.
Let’s do another one, SHA256. Again:
- macOS: use the shasum command with the option -a 256
- Linux: use the sha256sum command

Compare the result with the downloaded SHA256 checksum file:

    cat openssl-1.1.1w.tar.gz.sha256
    shasum -a 256 openssl-1.1.1w.tar.gz
We can also use OpenSSL directly to calculate the hash of a file. Just use the openssl sha256 <filename> or openssl sha1 <filename> command:

    cat openssl-1.1.1w.tar.gz.sha256
    openssl sha256 openssl-1.1.1w.tar.gz
    cat openssl-1.1.1w.tar.gz.sha1
    openssl sha1 openssl-1.1.1w.tar.gz
You can also use OpenSSL to create a checksum file. As an exercise, let’s create a checksum file for the openssl-1.1.1w.tar.gz file and match it against the downloaded checksum file:

    >openssl sha256 -hex -out openssl.sha256 openssl-1.1.1w.tar.gz
    >cat openssl.sha256
    SHA2-256(openssl-1.1.1w.tar.gz)= cf3098950cb4d853ad95c0841f1f9c6d3dc102dccfcacd521d93925208b76ac8
    >cat openssl-1.1.1w.tar.gz.sha256
    cf3098950cb4d853ad95c0841f1f9c6d3dc102dccfcacd521d93925208b76ac8

The command format is:

    openssl sha256 -hex -out <output filename> <filename of which we need to calculate the checksum>
Let’s create one for a sample file of our own:

    >echo "hello" > hello.txt
    >cat hello.txt
    hello
    >openssl sha256 -hex -out hello.txt.sha256 hello.txt
    >cat hello.txt.sha256
    SHA2-256(hello.txt)= 5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03

Now, if you want to, you can share hello.txt along with its checksum file hello.txt.sha256 with someone else, and they can use the same method as above to verify that the integrity of the file was maintained during the transfer.
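The receiving side of that exchange can also be automated. The sketch below, written with Python's standard library, hashes a file in chunks and compares the result with a checksum file. The helper names `file_digest`, `parse_checksum` and `verify_file` are illustrative, and the parser assumes the checksum file is either in the OpenSSL layout shown above or starts with the bare hex digest.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Hash a file in chunks so large downloads do not have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def parse_checksum(text: str) -> str:
    """Extract the hex digest from a checksum file's contents."""
    text = text.strip()
    if "= " in text:
        # OpenSSL layout: SHA2-256(hello.txt)= <hex>
        return text.rsplit("= ", 1)[1].strip().lower()
    # Bare hex digest, or "<hex>  <filename>" as written by sha256sum
    return text.split()[0].lower()

def verify_file(path: str, checksum_path: str, algorithm: str = "sha256") -> bool:
    """Return True when the file's digest matches the one in the checksum file."""
    with open(checksum_path) as f:
        expected = parse_checksum(f.read())
    return file_digest(path, algorithm) == expected

# Usage: recreate the hello.txt example in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "hello.txt")
    checksum_path = data_path + ".sha256"
    with open(data_path, "wb") as f:
        f.write(b"hello\n")  # same bytes that echo "hello" writes
    with open(checksum_path, "w") as f:
        f.write("SHA2-256(hello.txt)= "
                "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03\n")
    intact = verify_file(data_path, checksum_path)

    # Simulate corruption in transit and verify again.
    with open(data_path, "ab") as f:
        f.write(b"!")
    corrupted = verify_file(data_path, checksum_path)

print(intact, corrupted)  # True False
```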