modulo 7 hashing

Tirocinante Consorzi

Created on September 14, 2023

Transcript

Preface (1/2)

hashing

In this module, we will explore the fundamental concepts of hashing, its purpose, and its functionality in information security. Additionally, we will delve into the realm of attacking hashes, secure hashing algorithms, and strategies for enhancing security.

Preface (2/2)

At the end of this module, you will have reached the following goals:

  • You have a good understanding of hashing and how it can contribute to information security
  • You can give practical examples in which hashing is used
  • You know what the principles of a safe hashing algorithm are
  • You can give an overview of different hashing algorithms
  • You know how hash functions can be attacked
  • You know methods to enhance the security of a hash value

What is hashing ? (1/3)

Hashing is

  • the process of converting an input of any length or size into a fixed-size output using a mathematical function.
  • Any text, no matter how long it is, can be converted by such a function into a fixed number of bytes.
  • The message to be hashed is called the input.
  • The algorithm used to do so is the hash function.
  • The output is called the hash value.
INPUT + HASH FUNCTION = HASH VALUE
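As a minimal sketch of INPUT + HASH FUNCTION = HASH VALUE, the snippet below uses SHA-256 from Python's standard hashlib module. The choice of SHA-256, the helper name hash_value and the sample inputs are assumptions made for illustration only. It shows that inputs of very different lengths give hash values of the same fixed size.

```python
import hashlib

def hash_value(message: str) -> str:
    """Return the SHA-256 hash value of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

short_input = "a"
long_input = "a" * 10_000

# Both hash values are 256 bits long, i.e. 64 hex characters,
# although the inputs differ greatly in length.
print(len(hash_value(short_input)), len(hash_value(long_input)))
```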

What is hashing ? (2/3)

  • HASHING IS DIFFERENT FROM ENCRYPTION
  • Hashing is IRREVERSIBLE
  • Given only a hash value, it is infeasible to recover the original message
  • Encryption is REVERSIBLE

What is hashing ? (3/3)

The result of a hash is a number. Hash values are mostly encoded in HEX or Base64 form.

  • In HEX form - example:
e8d2a77c51b599c10651608a5d8c286f
  • In Base64 form - example (the same value):
6NKnfFG1mcEGUWCKXYwobw==
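The 128-bit value e8d2a77c51b599c10651608a5d8c286f used in the example above can be written in either encoding. The following sketch, using only Python's standard library, converts the HEX form to raw bytes and then to Base64; the variable names are illustrative.

```python
import base64

hex_form = "e8d2a77c51b599c10651608a5d8c286f"

# Each pair of hex characters is one byte: 32 characters -> 16 bytes -> 128 bits.
raw = bytes.fromhex(hex_form)

# Base64 represents the same 16 bytes with 24 characters (including padding).
base64_form = base64.b64encode(raw).decode("ascii")

print(len(raw) * 8, hex_form, base64_form)
```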

General

The main purpose of hashing is

  • to verify the integrity of a piece of data
  • it acts as a unique “fingerprint” of the input data

Example of file integrity (1/2)

  • A hash function can be used to verify file integrity.
  • Integrity means that the content of a file hasn’t changed while it was being transported.
  • You don’t want a hacker to modify the software you’re downloading, do you?

Example of file integrity (2/2)

For example, technology vendors with publicly available downloads provide checksums. Such a checksum is a hash value calculated on the original file. For example, see the SHA256SUMS on the download server of Belnet. After downloading, you can calculate the hash value of the download and compare it with the hash value on the server.
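The comparison described above could be sketched as follows. The function names and the chunk size are assumptions for illustration; the published checksum is expected as a hex string, as it typically appears in a SHA256SUMS file.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare the locally calculated hash with the one published by the vendor."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If the function returns False, the download differs from the file the vendor hashed and should not be trusted.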

Hashing and track reputation of files (1/2)

  • Hashing can also be used to track the reputation of files.
  • When a malicious file is known, the hash value of the file is calculated.
  • These checksums of malicious files are stored in security databases, creating a library of known bad files.

Hashing and track reputation of files (2/2)

  • The MD5 hash value of the EICAR test file is 44d88612fea8a8f36de82e1278abb02f.
  • The EICAR test file is a computer file developed by the European Institute for Computer Antivirus Research (EICAR) and the Computer Antivirus Research Organization (CARO) to test the response of computer antivirus (AV) programs.
  • If we check the hash against a reputation service such as VirusTotal, it is recognised as malicious.

Hashing and password validation

  • Passwords must not be stored in plain text in databases.
  • Otherwise, if there is a data breach, all passwords are known.
  • Most systems therefore store hashed values of passwords in their databases, so that when you authenticate, the system can validate your identity against an unreadable hashed version of your password.
  • If there is a data breach, only the hash values are exposed.
  • Because a hash value is irreversible, the passwords should remain safe.

HMAC (1/4)

HMAC (Hash-based Message Authentication Code)

  • is a type of message authentication code (MAC) that is obtained by executing a hash function on the data to be authenticated together with a shared secret key
  • is used for both data integrity and authentication
  • can be used to ensure that the person who created the HMAC is who they say they are (authenticity), and that the message hasn’t been modified or corrupted (integrity)
An example follows.

HMAC (2/4)

Bob sends out a message to Alice. The message is “Hello Alice, meet me at the Cybership!”. Bob will also send a HMAC together with the message, namely “734ac3f12a5ce5f116906443e1b9a172”. This HMAC was calculated with the following formula:

HMAC (3/4)

When Alice receives the message, she is not sure whether it was really Bob who sent it. But she and Bob share a common secret key. To be sure the message came from Bob and was not tampered with in transit, she uses the same formula:

HMAC (4/4)

Now Alice is sure the message came from Bob, because the result of her calculation is the same value that Bob sent together with the message. This can only happen when the same secret key, known only to Bob and Alice, was used.
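The exchange between Bob and Alice can be sketched with Python's standard hmac module. The formula shown on the slides is not part of this transcript, so the key below is hypothetical and HMAC-SHA256 is an assumption; the output will therefore not match the HMAC value quoted in the example.

```python
import hashlib
import hmac

# Hypothetical shared secret, known only to Bob and Alice.
SHARED_KEY = b"hypothetical-secret-known-to-alice-and-bob"

def compute_hmac(message: bytes, key: bytes = SHARED_KEY) -> str:
    """Bob's side: calculate the HMAC over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, received_hmac: str, key: bytes = SHARED_KEY) -> bool:
    """Alice's side: recalculate the HMAC and compare it in constant time."""
    expected = compute_hmac(message, key)
    return hmac.compare_digest(expected, received_hmac)

message = b"Hello Alice, meet me at the Cybership!"
tag = compute_hmac(message)

print(verify(message, tag))                                  # True
print(verify(b"Hello Alice, meet me at the harbour!", tag))  # False
```

Someone who does not know the key cannot produce a matching value for a modified message, which is what gives Alice both integrity and authenticity.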

Deterministic

A hash function should be deterministic: for a given input value it must always generate the same hash value. A hash function therefore cannot depend on random or changing values. It cannot, for example, include the time of day, the temperature, memory allocation and so on.

Collision resistant (1/2)

Collisions can occur due to the pigeonhole principle

  • As there are more inputs than outputs, some of the inputs must give the same output.
  • If you have 128 pigeons and 96 pigeonholes, some of the pigeons are going to have to share a hole.
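The pigeonhole argument can be tried out with a deliberately tiny hash. In this sketch the toy function tiny_hash (an assumption for illustration, not a real algorithm) keeps only the first byte of SHA-256, so there are just 256 possible outputs; feeding it 257 different inputs must produce at least one collision.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Toy hash: the first byte of SHA-256, so only 256 possible outputs."""
    return hashlib.sha256(data).digest()[0]

def find_collision(num_inputs: int = 257):
    """Return two different inputs with the same tiny hash, if any are found."""
    seen = {}
    for i in range(num_inputs):
        data = str(i).encode("ascii")
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    return None

print(find_collision())
```

Real hash functions have far more outputs (2^128 or more), so collisions still exist but are much harder to find.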

Collision resistant (2/2)

A good hash function is designed so that collisions occur as rarely as possible, and so that they cannot be created intentionally by exploiting the structure of the algorithm. The harder collisions are to find, the more secure the hash function is.

Uniformity

  • A hash function should be designed in a way that the inputs are mapped “as evenly as possible” over the output.
  • No particular collision should be more likely to occur than any other.

Computationally efficient (1/2)

  • During a normal day at your computer, different applications frequently ask you for your password.
  • The password itself is not checked against the database; instead, its hash value is calculated and compared with the hash stored in the database.
  • A slow hash function would therefore have a negative impact on these applications.
  • Hash functions should be computationally efficient.

One way

Once a hash value has been generated, it must be computationally infeasible to convert it back into the original data.

CRC (1/2)

CRCs are specifically designed to protect against common types of errors on communication channels, where they can provide quick and reasonable assurance of the integrity of messages delivered. For example, a frame (PDU in OSI layer 2) contains a frame trailer with a frame check sequence which is a CRC to check the integrity of the frame.

CRC (2/2)

A CRC is not suitable for protecting against intentional alteration of data.

  • if someone purposely changes a data frame along the way, calculates a new CRC and adds it to the frame, the CRC check will not produce any errors.
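A small sketch with CRC-32 from Python's zlib module illustrates the difference between accidental and intentional changes. The frame layout (payload plus checksum) and the sample payloads are simplified assumptions, not a real layer 2 frame format.

```python
import zlib

def make_frame(payload: bytes) -> tuple[bytes, int]:
    """Build a simplified frame: the payload plus its CRC-32 as a trailer."""
    return payload, zlib.crc32(payload)

def crc_ok(frame: tuple[bytes, int]) -> bool:
    """Receiver side: recalculate the CRC and compare it with the trailer."""
    payload, checksum = frame
    return zlib.crc32(payload) == checksum

original = make_frame(b"pay 10 EUR to Bob")
payload, checksum = original

# Accidental transmission error: payload changed, old CRC still attached.
corrupted = (b"pay 10 EUR to Bab", checksum)

# Intentional change: the attacker simply recalculates the CRC.
forged = make_frame(b"pay 9000 EUR to Eve")

print(crc_ok(original), crc_ok(corrupted), crc_ok(forged))  # True False True
```

The forged frame passes the check because no secret is involved in calculating a CRC.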

MD5 (1/2)

  • MD5 stands for Message-Digest algorithm 5.
  • It was developed in 1991 by Ronald Rivest.
  • It is based on the Merkle-Damgård construction.
  • It is a hash function that takes a message of any length as input and turns it into a fixed-length digest of 16 bytes (128 bits).
In the screenshot you can see that the MD5 digests of different inputs always consist of 128 bits encoded in Base64.

MD5 (2/2)

  • The most common applications of the MD5 algorithm are:
    • checking file integrity after a transfer
    • storing passwords in databases
Just like CRC, MD5 is not suitable for protecting against intentional alteration of data. Despite still being commonly used, MD5 should no longer be considered safe.

SHA

  • Stands for Secure Hash Algorithm.
  • Developed by the National Institute of Standards and Technology (NIST) together with the NSA.
  • First published in 1993 as a Federal Information Processing Standard (FIPS); the revised version published in 1995 is known as SHA-1.
MD5 has long been known to have weaknesses (see the next topic), which the design of the SHA family set out to address. SHA exists in several versions, each later version improving on the earlier ones.

SHA-0

SHA-0 is a retronym applied to the original version of the 160-bit (20-byte) hash function, which was published in 1993 under the name SHA. It was withdrawn shortly after publication due to a major flaw, and was replaced by the revised version, SHA-1.

SHA-1

SHA-1 has a digest length of 160 bits. SHA-1 should not be considered as secure anymore.

SHA-2

SHA-2 comes with digest lengths of 224, 256, 384 and 512 bits. It can be considered a safe hashing algorithm.

SHA-3

SHA-3 also comes with digest lengths of 224, 256, 384 and 512 bits. Why would you use SHA-3 rather than SHA-2?

  • SHA-3 has a performance benefit, which is that it's cheap to implement in specialized hardware. It's very fast on a dedicated hardware circuit.

Functions used in hashing

  • The different hash functions use the same kind of logic in the background.
  • They are all built from bitwise operations that can be calculated quickly, such as:
    • AND
    • OR
    • NOT
    • Shifting bits

Brute force attack (1/2)

When you have a hash value, you can calculate the hashes of every possible input. When you have a match, you know what the original input was…
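As a minimal sketch of this idea, the function below tries every combination of lowercase letters up to a small maximum length against an MD5 hash value. The alphabet, the length limit and the sample password are assumptions chosen so that the search finishes in well under a second.

```python
import hashlib
import itertools
import string

def brute_force_md5(target_hex, alphabet=string.ascii_lowercase, max_length=3):
    """Hash every candidate input until one matches the target hash value."""
    for length in range(1, max_length + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            text = "".join(candidate)
            if hashlib.md5(text.encode("ascii")).hexdigest() == target_hex:
                return text
    return None  # not found within the search space

target = hashlib.md5(b"cat").hexdigest()
print(brute_force_md5(target))  # cat
```

The hash is never reversed; the attacker only guesses inputs and compares outputs.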

Brute force attack (2/2)

Imagine

  • The hash value you’ve found is an MD5 hash value.
  • You know that passwords in the system the hash value came from must consist of exactly 8 characters.
  • The system allows 95 different characters.
This means there are 95^8 possible passwords, that is 6,634,204,312,890,625 possibilities! Suppose you have a GPU that can calculate approximately 200,000,000 MD5 hashes per second. Then the password is cracked in at most 6,634,204,312,890,625 / 200,000,000 ≈ 33,171,022 seconds, or about 384 days! Some people use the same password for a long time, so if you’re lucky the password hasn’t changed yet.
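The calculation above can be checked with a few lines of Python. The constants are taken from the example; the variable names are illustrative.

```python
ALPHABET_SIZE = 95               # allowed characters
PASSWORD_LENGTH = 8              # characters per password
HASHES_PER_SECOND = 200_000_000  # assumed GPU speed from the example

candidates = ALPHABET_SIZE ** PASSWORD_LENGTH
worst_case_seconds = candidates / HASHES_PER_SECOND
worst_case_days = worst_case_seconds / 86_400  # seconds per day

print(candidates, round(worst_case_seconds), round(worst_case_days))
```

Changing PASSWORD_LENGTH to 9 multiplies the worst-case time by 95, which shows why longer passwords help so much.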

Dictionary attack

A dictionary is a huge collection of possible inputs and their corresponding hash values: it stores a mapping from the hash of a password to the password that produces it. New pairs are calculated every day for different hashing algorithms, so the dictionaries keep growing. Example of an online dictionary: https://crackstation.net/
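A very small dictionary can be sketched as follows. The word list is a made-up assumption; real dictionaries contain billions of entries and cover several hashing algorithms.

```python
import hashlib

def build_dictionary(candidate_passwords):
    """Precompute a mapping from MD5 hash value to the password producing it."""
    return {
        hashlib.md5(p.encode("utf-8")).hexdigest(): p
        for p in candidate_passwords
    }

wordlist = ["123456", "password", "lola123", "qwerty"]
dictionary = build_dictionary(wordlist)

leaked_hash = hashlib.md5(b"lola123").hexdigest()
print(dictionary.get(leaked_hash))  # lola123

unknown_hash = hashlib.md5(b"correct horse battery staple").hexdigest()
print(dictionary.get(unknown_hash))  # None
```

The lookup itself is instant; all the work was done in advance when the dictionary was built.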

Rainbow table attack

Dictionaries use a lot of space. Why not use rainbow tables? Rainbow tables use a reduction function to need less storage than dictionaries. The reduction function turns a hash value into a new candidate plaintext (not the original input). A new hash value is then generated from this text. In a rainbow table this happens not just once but many times, resulting in a chain, of which only the start and end points need to be stored.
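The chain idea can be sketched as below. This is a simplified illustration, not a complete rainbow table: the reduction function, the 4-letter lowercase password space and the use of MD5 are all assumptions, and the lookup step that walks the chains to recover a password is left out.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase
PLAINTEXT_LENGTH = 4

def hash_fn(plaintext: str) -> str:
    return hashlib.md5(plaintext.encode("ascii")).hexdigest()

def reduce_fn(hash_hex: str, step: int) -> str:
    """Map a hash value to some candidate plaintext (not the original one).

    The step number is mixed in so each position in the chain reduces
    differently, which is what distinguishes a rainbow table.
    """
    number = int(hash_hex, 16) + step
    chars = []
    for _ in range(PLAINTEXT_LENGTH):
        number, index = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)

def build_chain(start: str, length: int) -> tuple[str, str]:
    """Alternate hashing and reducing; keep only the start and end points."""
    plaintext = start
    for step in range(length):
        plaintext = reduce_fn(hash_fn(plaintext), step)
    return start, plaintext

print(build_chain("abcd", 100))
```

A chain of 100 steps covers up to 100 candidate passwords while storing only two strings, which is where the storage saving comes from.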

Collision attack

A good hash function should be collision resistant. When we’re looking at the MD5 algorithm, a collision attack exists that can find collisions within seconds on an average PC. That is why MD5 is not considered as secure. The picture shows the collision attack.

Combining or nesting hash methods

You can choose to combine or nest different hashing functions. An example: sha256(md5($pass)). The advantage? Let’s test our function on CrackStation with an easy password:

  • $pass = "lola123" -> EASY PASSWORD 🙄
  • md5($pass) -> FOUND 😲
  • sha256($pass) -> FOUND 😲
  • sha256(md5($pass)) -> NOT FOUND 😃
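In Python, the nested construction could be sketched like this. It assumes the PHP-style behaviour in which the inner md5 returns a hex string that is then fed to sha256; other conventions (for example hashing the raw MD5 bytes) would give a different result.

```python
import hashlib

def nested_hash(password: str) -> str:
    """sha256(md5(password)), with the inner MD5 passed on as a hex string."""
    inner = hashlib.md5(password.encode("utf-8")).hexdigest()
    return hashlib.sha256(inner.encode("ascii")).hexdigest()

print(nested_hash("lola123"))
```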

Salt

Salting means that you add a unique, randomly generated string (i.e. a salt) to each password as part of the hashing process. The salt is typically unique for each user, making it significantly harder for the attacker to find the original input.
(INPUT + SALT) + HASH FUNCTION = HASH VALUE
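A minimal sketch of salting follows. The 16-byte salt length, the function names and the use of a single SHA-256 pass are assumptions made to keep the example short; the salt is stored next to the hash value so that the password can be checked later.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, str]:
    """Return (salt, hash value) for (INPUT + SALT) + HASH FUNCTION."""
    if salt is None:
        salt = secrets.token_bytes(16)  # new random salt for each user
    digest = hashlib.sha256(password.encode("utf-8") + salt).hexdigest()
    return salt, digest

def check_password(password: str, salt: bytes, stored_digest: str) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

salt_a, hash_a = hash_password("lola123")
salt_b, hash_b = hash_password("lola123")

# Same password, different salts: the stored hash values differ.
print(hash_a == hash_b)
print(check_password("lola123", salt_a, hash_a))
```

Because every user has a different salt, a precomputed dictionary or rainbow table built without the salt no longer matches.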

Pepper

  • A pepper is also, like a salt, random data that is added to the password before the hash value is calculated.
  • The difference is that the pepper is kept secret by storing it in a separate secure location or not storing it at all.
  • In addition, while a salt must be long enough to be unique, a pepper must be at least 112 bits long in order to be considered secure, according to NIST guidelines from 2017.

Key stretching

Key stretching techniques are used to make a possibly weak key, typically a password or passphrase, more secure against a brute-force attack by increasing the resources (time and possibly space) it takes to test each possible key. But there is a downside to this. Remember that one principle of hashing algorithms is that they should be computationally efficient.

Algorithms with built in security enhancements (1/2)

There are hashing algorithms which have built-in security enhancements.
Bcrypt
  • is a password-hashing function
  • incorporates a salt to protect against dictionary attacks
  • is an adaptive function: the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computational power
PBKDF2
  • also has a built-in sliding computational cost, used to reduce vulnerability to brute-force attacks
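The sliding computational cost of PBKDF2 can be observed with hashlib.pbkdf2_hmac from Python's standard library. The iteration counts, the salt length and the helper name stretch are assumptions for illustration, not recommended settings.

```python
import hashlib
import secrets
import time

def stretch(password: str, salt: bytes, iterations: int) -> str:
    """Derive a hash value with PBKDF2-HMAC-SHA256 and the given cost."""
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return derived.hex()

salt = secrets.token_bytes(16)

# A higher iteration count makes every single guess more expensive.
for iterations in (1_000, 100_000):
    start = time.perf_counter()
    stretch("lola123", salt, iterations)
    print(iterations, f"{time.perf_counter() - start:.4f} s")
```

The legitimate user pays this cost once per login, while an attacker pays it for every candidate password.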

Algorithms with built in security enhancements (2/2)

Argon2 is another hashing algorithm with built-in security enhancements. It has six input parameters:

  • password
  • salt
  • memory cost (the memory usage of the algorithm)
  • time cost (the execution time of the algorithm and the number of iterations)
  • parallelism factor (the number of parallel threads)
  • hash length.

conclusion

In conclusion, the importance of hashing in application and information security cannot be overstated. Throughout this document, we have explored the fundamental role that hashing plays in ensuring data integrity and authentication.