Module 6: Encoding

Tirocinante Consorzi

Created on September 14, 2023

preface (1/2)

Encoding in IT

Welcome to the module on Encoding in IT, a guide to understanding the concept of encoding and its various applications. This module provides an introduction to encoding, covering ASCII encoding, URL encoding, HTML encoding, base64 encoding, hexadecimal encoding, and the use of encoding in cyber attacks.

preface (2/2)

By the end of this module, you will have reached the following goals:

You know the meaning and the purpose of encoding
You know the ASCII encoding scheme and you can use it
You know URL encoding and you can interpret it
You know the different kinds of HTML encoding and you can use them
You know how base64 encoding works and you can decode it
You know how hexadecimal encoding works and what it is used for
You know where encoding is used in cyber attacks

What is encoding? (1/2)

Around the world, the quantity five is written in different ways:

decimal numeral system: 5
Roman numeral system: V
hieroglyphic writing: |||||
hand gesture: 🖐️
….

What is encoding? (2/2)

Encoding is the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage. Decoding is the opposite process: the conversion of an encoded format back into the original sequence of characters.

The need for encoding

When we send data, we cannot be sure that the receiver will interpret it in the format we intended. So we send the data in an agreed format that both parties understand. Encoding schemes must be accurate: decoding the encoded data should give back exactly the original content. Encoding itself is NOT A SECURITY solution: anyone who knows the scheme can decode the data without a key.
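The two points above (a lossless round trip, and no security) can be illustrated with a minimal Python sketch. The sample text `password123` is a made-up value, and base64 is used here only as a convenient example scheme.

```python
import base64

# Made-up sample value, for illustration only.
secret = "password123"

# Encoding: turn the text into bytes, then into a base64 string.
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

# Decoding needs no key or password, only knowledge of the scheme.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == secret)  # the round trip gives back the original content
```

Because no secret is needed to reverse the process, encoded data should be treated as readable by anyone.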

History of encoding: Morse code (1/2)

Throughout history, many encoding schemes have been used. An example is Morse code:

sequences of two signal durations, called dots (dits) and dashes (dahs)
used in telegraphy
international Morse code encodes the 26 basic Latin letters A through Z
there is no distinction between upper and lower case letters

Example: SOS in Morse: . . . - - - . . .
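As a small illustration, the letter table of international Morse code can be written as a lookup table. The function name `to_morse` is a hypothetical helper, and this sketch covers only the 26 letters mentioned above.

```python
# International Morse code for the 26 basic Latin letters.
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def to_morse(text: str) -> str:
    """Encode letters as Morse, separating letters with a space.

    Upper and lower case map to the same code; other characters are skipped.
    """
    return " ".join(MORSE[ch] for ch in text.upper() if ch in MORSE)


print(to_morse("SOS"))  # ... --- ...
```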

History of encoding: Morse code (2/2)

Where is encoding used? (1/2)

Encoding can be used:

to convert information to the appropriate form for transmission.

in data storage and data processing

in data compression and decompression

Where is encoding used? (2/2)

In the picture you see two different representations of the same data. The first one has no parsing characters, the second one has parsing characters. Which representation is better to use depends on the technology.

What is ASCII encoding?

ASCII encoding

stands for American Standard Code for Information Interchange
is a character code that maps each character to a number
is used in computers, telecommunications equipment and other devices

Before ASCII

each computer manufacturer represented alphabets, numerals, and other characters in its own way.
different models of computers could not communicate with each other

Original ASCII Table (1/2)

The original ASCII code

based on the (modern) English alphabet.
encodes 128 specified characters as seven-bit integers

95 of them are printable
33 non-printing control codes which originated with Teletype machines
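The numbers above can be checked with a short Python sketch. It relies on the fact that the first 128 Unicode code points are the ASCII characters; the sample text `Hi!` is arbitrary.

```python
# Each ASCII character corresponds to a number between 0 and 127.
codes = [ord(ch) for ch in "Hi!"]
print(codes)                             # [72, 105, 33]
print([format(c, "07b") for c in codes])  # the same values as seven-bit integers

# Split the 128 code values into printable characters and control codes.
printable = [c for c in range(128) if chr(c).isprintable()]
control = [c for c in range(128) if not chr(c).isprintable()]

print(len(printable), len(control))      # 95 33
```

The control codes are the values 0 to 31 plus 127 (DEL); the space character (32) counts as printable.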

Original ASCII Table (2/2)

Extended ASCII tables and codepages

ASCII was created for the English alphabet

What about other alphabets?

With the arrival of the computer age, systems processed data in bytes

ASCII was extended from 128 to 256 characters by using the eighth bit of the byte
Different regions of the world chose to use this extra space differently
Codepages were born
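A minimal sketch of why codepages cause trouble: the same byte above 127 stands for a different character in each codepage. The three codepages below are arbitrary examples chosen from those shipped with Python.

```python
# One byte with a value above 127, i.e. outside original ASCII.
raw = bytes([0xE9])

# The same byte is a different character in each codepage.
for codepage in ("latin-1", "cp1251", "cp437"):
    print(codepage, "->", raw.decode(codepage))

# Original 7-bit ASCII has no character for this value at all.
try:
    raw.decode("ascii")
except UnicodeDecodeError as error:
    print("ascii ->", error.reason)
```

In Latin-1 (Western European) the byte is `é`, while in Windows-1251 (Cyrillic) it is `й`, so sender and receiver must agree on the codepage.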

Unicode (1/2)

Different codepages? But every device should be able to display the same information! So Unicode was born.

Unicode

is an effort to include all characters from all currently and historically used human languages in a single character enumeration
is effectively one large single codepage

Unicode (2/2)

These days, the Unicode standard defines code points for over 128,000 characters and is published by the Unicode Consortium. It has several character encoding forms:

UTF-8: widely used in email systems and on the internet
UTF-16: used by systems such as the Microsoft Windows API, the Java programming language and JavaScript
UTF-32: represents every Unicode character as one fixed-width 32-bit number; it takes a lot of space and is rarely used
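To make the difference between the three encoding forms concrete, the sketch below counts how many bytes each form needs for a few sample characters. The helper name `encoded_sizes` is hypothetical; the little-endian variants are used so that no byte order mark is added to the count.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes needed for one character in each form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# "A", "é", "€" and an emoji (U+1F600), written with escapes to avoid font issues.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-8 uses 1 to 4 bytes per character, UTF-16 uses 2 or 4, and UTF-32 always uses 4, which is why UTF-32 is the least compact for ordinary text.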

URI and URL

URI

identifies a resource and differentiates it from others by using a name, location, or both

URL

identifies the web address or location of a unique resource.

URI (1/7)

The URI generic syntax consists of components organized hierarchically in order of decreasing significance from left to right.

URI (2/7)

The first component, the scheme, is mandatory; it defines the addressing system. It starts with a letter, followed by any combination of letters, digits, plus signs, periods, or hyphens, and is terminated by a colon. The most common URI schemes include HTTP, HTTPS, FTP, mailto, and file.

URI (3/7)

The authority component is optional. It is preceded by a double slash and terminated by a slash, a question mark, a hash symbol, or the end of the URI. It consists of three sub-components:

Userinfo
Host
Port

authority = [userinfo "@"] host [":" port]

URI (4/7)

The path component contains a sequence of segments, separated by slashes, that describes the location of a resource in a directory structure. When an authority is present, the path must either be empty or begin with a slash. For example, telnet://192.0.2.16:80/ is a valid URI whose path consists only of the root slash, since it does not point to a specific resource.

URI (5/7)

The query component is an optional component that contains a query string of non-hierarchical data. It is often a string of key=value pairs. This component is preceded by a question mark. For example, if the URI is https://example.org/test/test1?search=test-question#part2, the query component is search=test-question.

URI (6/7)

The fragment component includes a fragment identifier that provides the direction to a secondary resource. It refers to a different section of the primary resource. A fragment is preceded by a hash symbol and terminated by the end of a URI. For instance, the fragment component from https://example.org/test/test1?search=test-question#part2 is part2.
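The components described on the previous slides can be inspected with Python's standard `urllib.parse` module. This is a minimal sketch using the two example URIs from this module.

```python
from urllib.parse import urlsplit

# Split the example URI into its five generic components.
parts = urlsplit("https://example.org/test/test1?search=test-question#part2")

print(parts.scheme)    # https
print(parts.netloc)    # example.org  (the authority)
print(parts.path)      # /test/test1
print(parts.query)     # search=test-question
print(parts.fragment)  # part2

# The authority can itself be split into host and port.
telnet = urlsplit("telnet://192.0.2.16:80/")
print(telnet.hostname, telnet.port)  # 192.0.2.16 80
```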

URI examples

mailto://mailboxX.com:6267/complaints/there?name=help#nose. This URI contains a scheme name, an authority with host and port, a path, a query and a fragment.

telnet://192.0.2.16:80/. In this example, “telnet” is the scheme name and the IP address and port after the double slash make up the authority. The path contains only the slash; no further characters follow it.

URL (1/2)

URL

abbreviation of Uniform Resource Locator
is a specific type of URI
not only identifies the resource but also tells you how to access it or where it is located

URL (2/2)

Every URL follows the generic URI syntax and therefore has the same structure as a URI. Below is an example of URL syntax: https://www.example.com/forum/questions/?tag=networking&order=newest#top
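As a sketch, the query string of the example URL can be turned into key=value pairs with `urllib.parse.parse_qs` from the Python standard library. Each value is returned as a list, because a key may occur more than once in a query.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://www.example.com/forum/questions/?tag=networking&order=newest#top"
parts = urlsplit(url)

# The query is a string of key=value pairs separated by "&".
params = parse_qs(parts.query)

print(parts.hostname)  # www.example.com
print(parts.path)      # /forum/questions/
print(params)          # {'tag': ['networking'], 'order': ['newest']}
print(parts.fragment)  # top
```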

The need for URL encoding

A URL is composed of ASCII characters
Some ASCII characters, such as control characters (backspace, tab, ...), are not allowed to be placed directly within URLs
Some characters have a special meaning within URLs (?, /, #, ...)
Unsafe characters are also not allowed to be placed directly within URLs (“”, <>, ...)

URL encoding (1/2)

URL encoding converts reserved, unsafe, and non-ASCII characters in URLs to a format that is universally accepted and understood by all web browsers and servers.

It first converts the character to one or more bytes.
Then each byte is represented by two hexadecimal digits preceded by a percent sign (%).
The percent sign is used as an escape character.

URL encoding is also called percent encoding since it uses percent sign (%) as an escape character.
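The steps above can be sketched by hand in Python. The function name `percent_encode` is hypothetical, and the sketch assumes UTF-8 for the character-to-byte step and leaves only letters, digits and `-._~` unencoded.

```python
import string

# Characters that never need encoding (letters, digits and - . _ ~).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def percent_encode(text: str) -> str:
    """Percent-encode every character that is not in UNRESERVED."""
    out = []
    # Step 1: convert the characters to bytes (UTF-8 is assumed here).
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Step 2: two hexadecimal digits preceded by a percent sign.
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("new pricing.htm"))  # new%20pricing.htm
print(percent_encode("\u00e9"))           # %C3%A9 (one character, two bytes)
```

A non-ASCII character such as `é` becomes more than one byte in UTF-8, so it produces more than one `%XX` group.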

URL encoding (2/2)

Example

ASCII value of space character in decimal is 32
converted to hex comes out to be 20
we just precede the hexadecimal representation with a percent sign (%)
this gives us the URL encoded value - %20

Below you can see an example: http://www.example.com/new%20pricing.htm
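In practice the Python standard library already provides this conversion in both directions. A minimal sketch with `urllib.parse.quote` and `unquote`, using the example above:

```python
from urllib.parse import quote, unquote

# Encode a file name that contains a space.
print(quote("new pricing.htm"))  # new%20pricing.htm

# Characters with a special meaning in URLs are encoded when they are
# meant as data; safe="" tells quote() not to leave "/" unencoded.
print(quote("?/#", safe=""))     # %3F%2F%23

# Decoding turns every %XX group back into the original character.
print(unquote("http://www.example.com/new%20pricing.htm"))
```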

HTML encoding

The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser.
There are various characters that are part of the HTML markup itself (such as < and >).
To use these within the document as content you need to HTML encode them by using HTML character codes.

HTML numeric character reference

A first way for character encoding in HTML is to make use of numeric character references.
A numeric character reference in HTML refers to a character by its Unicode code point, and uses the format &#nnnn; or &#xhhhh; where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form.
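A small Python sketch of numeric character references. The helper `to_numeric_refs` is hypothetical; `html.unescape` from the standard library is used for the decoding direction.

```python
import html


def to_numeric_refs(text: str, hexadecimal: bool = False) -> str:
    """Replace every character by its numeric character reference."""
    template = "&#x{:X};" if hexadecimal else "&#{};"
    return "".join(template.format(ord(ch)) for ch in text)


print(to_numeric_refs("<"))                    # &#60;
print(to_numeric_refs("<", hexadecimal=True))  # &#x3C;

# A browser, or html.unescape(), converts the references back.
print(html.unescape("&#60;&#x3E;"))            # <>
```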

HTML character entity references

A second way to use encoding in HTML is by referring to a character by the name of an entity which has the desired character as its replacement text.
It has the format &name; where name is a case-sensitive alphanumeric string.
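Python's standard `html` module can produce and resolve named entity references. The sketch below uses an arbitrary sample string and also shows that entity names are case-sensitive.

```python
import html

# html.escape() replaces the characters that are part of HTML markup.
print(html.escape("<p>Tom & Jerry</p>"))
# &lt;p&gt;Tom &amp; Jerry&lt;/p&gt;

# html.unescape() resolves named references back to characters.
print(html.unescape("&lt;&gt;&amp;&quot;"))  # <>&"

# Entity names are case-sensitive: these are two different characters.
print(html.unescape("&Eacute;"), html.unescape("&eacute;"))
```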

Base 64 encoding (1/2)

Base64 is used to encode binary data as printable text.
This allows you to transport binary over protocols or mediums that cannot handle binary data formats and require simple text.
Base64 is a group of binary-to-text encoding schemes. They take binary data (a sequence of 8-bit bytes) in groups of 24 bits and represent each group as four 6-bit base64 digits.

Base 64 encoding (2/2)

The encoding process follows these steps:

The base64 encoding algorithm receives an input stream of bytes (8 bits each)
It processes the input from left to right and divides the input into 24-bit groups by concatenating three 8-bit bytes.
These 24-bit groups are then treated as 4 concatenated 6-bit groups.
Finally, each 6-bit group is converted to a single character using the base64 table.
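The steps above can be sketched for a single 24-bit group as follows. The function name `encode_group` is hypothetical, and the table used is the standard base64 alphabet (A-Z, a-z, 0-9, + and /).

```python
import string

# Standard base64 table: index 0-63 maps to one printable character.
B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes (24 bits) as four base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles full 24-bit groups only")
    # Concatenate the three 8-bit bytes into one 24-bit number.
    bits = int.from_bytes(three_bytes, "big")
    # Treat the 24 bits as four 6-bit groups, from left to right.
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Look up each 6-bit value in the base64 table.
    return "".join(B64_TABLE[s] for s in sextets)


print(encode_group(b"Man"))  # TWFu
```

For `Man` the four 6-bit values are 19, 22, 5 and 46, which the table maps to `T`, `W`, `F` and `u`.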

Example of base64 table

Base64 padding

What is padding? During base64 encoding, the last 24-bit group may not have enough input bits. There are two cases:

If the group has only 8 bits of input data, we pad it with 16 zero bits. The last 2 characters are replaced with 2 equal signs (==)
If the group has only 16 bits of input data, we pad it with 8 zero bits. The last character is replaced with 1 equal sign (=)
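The two padding cases can be sketched in Python by following the description above literally. The function name `b64encode_manual` is hypothetical; its output is compared with the standard library's `base64.b64encode`.

```python
import base64
import string

B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64encode_manual(data: bytes) -> str:
    """Base64-encode data, padding the last group as described above."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 missing bytes
        # Pad the group with zero bits up to 24 bits.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [B64_TABLE[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Replace the characters that hold only padding with "=".
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


for sample in (b"M", b"Ma", b"Man"):
    print(sample, b64encode_manual(sample), base64.b64encode(sample).decode("ascii"))
```

The inputs `M`, `Ma` and `Man` give `TQ==`, `TWE=` and `TWFu`, so the number of equal signs shows how many bytes were missing from the last group.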

Hexadecimal encoding

Hexadecimal encoding is also called base16 encoding.
It uses 16 distinct symbols.
The hexadecimal symbols 0 to 9 are used to represent the decimal values from 0 to 9
The hexadecimal symbols A to F (case insensitive) are used to represent the decimal values from 10 to 15
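A short Python sketch of hexadecimal encoding with built-in functions only. The sample text `Hi` is arbitrary.

```python
# Each byte (8 bits) is written as two hexadecimal digits (4 bits each).
data = "Hi".encode("ascii")
hex_text = data.hex()
print(hex_text)                 # 4869

# Decoding accepts upper and lower case symbols alike.
print(bytes.fromhex("4869"))    # b'Hi'
print(int("ff", 16), int("FF", 16))  # 255 255

# A single value between 10 and 15 uses the symbols A to F.
print(format(10, "X"), format(15, "X"))  # A F
```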

Hexadecimal notation table

Misuse of URL encoding - Path Traversal (1/3)

Path traversal is an attack

that exploits weak access control implementations on the server side, particularly for file access

an attacker would try to access restricted files by injecting invalid input into the website.

Misuse of URL encoding - Path Traversal (2/3)

For example, we have a public website which is accessible via http://mysecuresite.com/public. If path traversal is possible, the attacker can try to reach other files on the server. If, for example, the website is hosted on a Linux server, the attacker can try to read the local user file via http://mysecuresite.com/public/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd which is the same as http://mysecuresite.com/public/../../../../../etc/passwd
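For analysis purposes, the encoded request path from the example can be decoded and normalized in Python to see which file it really targets. This is a minimal sketch; `posixpath` is used so that the result follows Linux path rules on any platform.

```python
import posixpath
from urllib.parse import unquote

# The path part of the request from the example above.
encoded_path = "/public/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd"

# %2e is the URL encoding of ".", so %2e%2e decodes to "..".
decoded_path = unquote(encoded_path)
print(decoded_path)     # /public/../../../../../etc/passwd

# Resolving the ".." segments shows the file the attacker is after.
normalized_path = posixpath.normpath(decoded_path)
print(normalized_path)  # /etc/passwd
```

A filter that only looks for the literal text `../` in the raw request would miss the encoded variant, which is why the encoding is used.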

Misuse of URL encoding - Path Traversal (3/3)

How to protect yourself against path traversal?

sanitize the user input. For example, to mitigate the attack mentioned above, we must validate the user input (after decoding it) and ensure that it does not contain invalid characters.
restrict the access to other files on the system
use safelisting: create a list of the paths that can be accessed safely and reject everything else
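The measures above could be combined roughly as in the sketch below. Everything named here is an assumption made for illustration: the root directory `/var/www/public`, the safelist contents and the helper `resolve_public_path` are hypothetical, and a real server would also need to deal with symbolic links and operating system permissions.

```python
import posixpath
from typing import Optional
from urllib.parse import unquote

# Hypothetical values, for illustration only.
PUBLIC_ROOT = "/var/www/public"
SAFELIST = {"index.html", "contact.html"}


def resolve_public_path(raw_path: str) -> Optional[str]:
    """Return the file path to serve, or None if the request is rejected."""
    # Decode first, so that %2e%2e is seen as ".." by the checks below.
    decoded = unquote(raw_path)
    candidate = posixpath.normpath(posixpath.join(PUBLIC_ROOT, decoded.lstrip("/")))
    # Restrict access: the result must stay inside the public directory.
    if not candidate.startswith(PUBLIC_ROOT + "/"):
        return None
    # Safelisting: only known paths are served.
    if posixpath.relpath(candidate, PUBLIC_ROOT) not in SAFELIST:
        return None
    return candidate


print(resolve_public_path("index.html"))
print(resolve_public_path("%2e%2e/%2e%2e/%2e%2e/etc/passwd"))  # None
```

The order matters in this sketch: decoding happens before validation, otherwise the encoded dots would pass the check unnoticed.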

Encoding as evasion technique (1/2)

Encoding can be used

to evade malware detection
commands are encoded and cannot be read in plain text
to make evasion stronger, the malware is encoded several times
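From an analyst's point of view, layered encoding can be illustrated with a harmless command. In the sketch below the command `echo hello`, the layer order and the helper names `wrap` and `unwrap` are all made up for the example; real samples may use other schemes and an unknown number of layers.

```python
import base64
import binascii

ENCODERS = {"base64": base64.b64encode, "hex": binascii.hexlify}
DECODERS = {"base64": base64.b64decode, "hex": binascii.unhexlify}


def wrap(data: bytes, layers: list) -> bytes:
    """Apply the encodings one after another."""
    for name in layers:
        data = ENCODERS[name](data)
    return data


def unwrap(data: bytes, layers: list) -> bytes:
    """Undo the encodings in reverse order."""
    for name in reversed(layers):
        data = DECODERS[name](data)
    return data


command = b"echo hello"  # harmless sample command
layers = ["base64", "hex", "base64"]

hidden = wrap(command, layers)
print(hidden)                  # the command is no longer readable
print(unwrap(hidden, layers))  # b'echo hello'
```

A scanner that searches for the plain command text will not find it in the encoded form, yet every layer can be removed without any key once the scheme is recognized.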

An example can be seen on the next slide

Encoding as evasion technique (2/2)

Conclusion

Encoding is essential in IT: it is needed to transmit, process and store data properly. Unfortunately, these techniques are also misused by people with bad intentions and can be part of a cyber attack. When investigating cyber attacks, it is therefore very important that you know and recognize these encoding techniques, so that you can fully understand what exactly happened during the attack.
