Want to create interactive content? It’s easy in Genially!

Get started free

Understanding - Chapitre 1 Lesson 2

Samy

Created on November 14, 2024

Start designing with a free template

Discover more than 1500 professional designs like these:

Essential Course

Practical Course

Course 3D Style

Neodigital CPD Course

Laws and Regulations Course

Customer Service Course

Dynamic Visual Course

Transcript

Lesson 2: Description Complexity of Integers

Problem: Can numbers be simpler than the length of their binary representation?
Integers are central to an algorithmic information approach to artificial intelligence.

Start

Beyond standard binary coding

Integer complexity

Beyond standard binary coding

  • Integers are used to retrieve objects by their rank in lists. Even seemingly complex objects such as convoluted emojis are no more complex that their address in the set of emojis. And this address is nothing but an integer.
  • Integers are much more important than that. Integers are involved in the formal description of most objects AI is dealing with.
  • Integers are objects of interest in themselves as well, because of their role in mathematics (note that "real" numbers with limited precision are, in fact, pairs of integers).
So let’s find a way to measure the complexity of integers.Mathematics (since Leibniz and, before him, ancient Chinese scientists) and Computer science like to represent numbers with bits. A first estimate of complexity consists of counting the number of bits in the standard binary representation (see table).

Beyond standard binary coding

One can see that with this method, the complexity of n, as measured by the length of its binary representation, is (for n > 0): ⌈log2(1+n)⌉ where ⌈x⌉ (with upper brackets) designates the smallest integer larger than x. This method, however, totally ignores the existence of "round" numbers. This is unfortunate. A round number like one billion (1 000 000 000 in base 10) looks much simpler than the 30 bits of its binary representation (111011100110101100101000000000).

Beyond standard binary coding

Integer complexity - video

Beyond standard binary coding

Integer complexity in nutshell

The video explains how integers can be coded in a compact way.
  • Since the standard binary code of n always starts with 1 (except for n = 0), let’s code n as the standard binary code of n+2 stripped of its heading 1. In python: bin(n+2) [3:]
  • To distinguish round numbers (in base 10), we reintroduce a bit at the front: 1 for "normal" integers, 0 for round numbers.
  • A round number n is coded as 0 followed by the compact code of n stripped of its trailing zeros, followed by the compact code of the number of zeros.
  • Space separators are used to make the code unambiguous.

Beyond standard binary coding

Integer complexity in nutshell

Question: Why are we ignoring space separators when counting the length of round numbers compact codes? Answer: Separators are not symbols. You can’t have two separators in a row. They may result from physical signal processing, as for the Morse code, or from low-level word segmentation (e.g. lists of strings in Python: '0010 11010 111' ↔ ['0010','11010','111']). (Note: In Chapter 3, we will consider the possibility of eliminating all space separators altogether. Efficient space-free codings of integers can be designed.)

Beyond standard binary coding

Integer complexity in nutshell

The above video proposed a compact coding method for integers:compact code for integers

Beyond standard binary coding

Coding Integers (1)

Beyond standard binary coding

Coding Integers (2)

Beyond standard binary coding

Our coding system acknowledges the fact that round numbers are simple. What about their immediate neighbours? Intuitively, 5000000011 isn’t much more complex than 5000000000, as it can be expressed as 5000000000 + 11. The coding method illustrated in the figures below captures this idea. We prefix the code by 00 for a positive shift from the round number, and by 01 for a negative shift. Then we indicate the two-part code of the round number, and finally the compact code of the shift.

Beyond standard binary coding

Caveat: These codes use space delimiters, as in the Morse code.

Beyond standard binary coding

Complexity of 1000008
Question ouverte premium

Complexity is "continuous"

Problem: Does the complexity of integers vary erratically or smoothly?
Thanks to the previous coding method, round numbers and their neighbours receive shorter codes than other numbers of the same magnitude. The two plots below show the complexity of each number n on the x-axis. They have been drawn using the above compact integer coding method, implemented in a Python program . The purple curve indicates log2(n) for reference. On a large scale (first image), we can see that the compact code length roughly evolves like log(n). Note that the complexity cannot exceed log2(n) by more than one bit. We can observe the round-number effect: complexity drops for multiples of 10, and even more so for multiples of 100 and of 1000. A zoom around 50000 (second image) shows that C(n) is roughly "continuous".

Complexity is "continuous"

To capture the fact that the complexity of integers is a "quasi-continuous" function, we may write that for any integers n and h :|C(n+h) – C(n)| < f(|h|) + O(1)where f is an increasing function of |h| such that f(0) = 0 and O(1) is a constant.

Complexity is "continuous"

Continuity of complexity

Which function f among the following is the most appropriate to capture the "quasi-continuity" of complexity?

Complexity and Structure

Problem: Can we define the complexity of new objects, based on their structure?

Complexity and Structure

For now, we are able to estimate the complexity of integers and of objects that are stored in memory (through the complexity of their address). What about new objects?Consider, for instance, the case of passwords. When looking for a good password, we try to outsmart machines. This situation offers a nice illustration of the importance of complexity.
Question: Complex passwords are impossible to remember. Can we define what a good password should be? Answer: A good password should be easy to remember, but hard to guess. In AIT’s terms, it should be complex to anyone but you.

Complexity and Structure

Password complexity

How would you rank the complexity of the following passwords (from 1=simplest to 6=most complex)?

Complexity and Structure

Password complexity

Passwords are character strings. Some strings are obviously simpler than others. How can we quantify this? A first step consists of finding a compact formal representation of the string, such as repetition(a, 8) , and then translate this formal representation into bits using some appropriate code, and finally counts the bits.

Complexity and Structure

String Structures in a nutshell

Alphabetic strings can be represented using operators: repetition, increment, mapping.These operators can be combined. For instance:mapping(increment(1, a, 4), repetition(X, 2))first generates abcd, and then maps the repetition operation on it to produce aabbccdd.

Complexity and Structure

Simple Structure 1

Using the coding scheme presented in the video, determine which string is represented by: mapping(increment(1,a,4), increment(1,X,2))).

Complexity and Structure

Simple Structure 2

Provide an expression that represents abcdefgh.
question premium

Complexity and Structure

Complexity and Structure

Suppose we want to measure the complexity of a password such as abcdefgh. We can code for the letters one by one. The codes that appear in the video assume standard binary code. We can get a slightly more compact code using our compact integer coding scheme (see table).If we add up all the bits used in codes, we get a total of 16 bits for abcdefgh (not counting space delimiters). But we can do better than that, if we consider that abcdefgh can be represented as increment(1, a, 8). To do so, suppose that we limit ourselves to the following list of available operators, with the corresponding codes.

Complexity and Structure

Suppose now that we consider three kinds of entities, coded accordingly:
Using this coding scheme, an object like letter e will be coded as 00 10, where the first two bits designate letters and the last two bits designate the rank of e among letters.

Complexity and Structure

Using the code 1

What would be the correct code to designate number 6 in our small language?

Complexity and Structure

Using the code 2

What would be the correct code to designate operator mapping in our small language?

Complexity and Structure

Using the code 3

What would be the correct code to designate letter b in our small language?

Complexity and Structure

Since we are able to represent abcdefgh as increment(1, a, 8), we can now translate it into a binary representation. We still use our compact integer coding system. Note that when the type of the object is known, we can avoid using a complete code and thus get an overall shorter representation.

Complexity and Structure

We end up with the following code: 0 1 1 00 0 010 to represent abcdefgh. It is 9 bits long (ignoring space delimiters), which is more compact that the 16 bits of the letter-by-letter code (note: the code is decodable only if we keep the spaces between code words). Our coding scheme is sufficient to provide an estimate of the description complexity of alphabetic strings.

Description length of abcdabcd

Qst premium

Description length of abcdabcd

Note that the resulting description length of abcdabcd is larger than the description length of abcdefgh. This might explain why abcdabcd is perceived as more complex than abcdefgh.

Description length of abcdabcd

Lesson 2 completed !

Score :

Chapter 1: Describing Data :

Lesson 1: Description Length

Lesson 2: Description Complexity of Integers

Lesson 3: Defining Complexity

Lesson 4: Conditional Complexity

Conclusion (4 Questions)

Back to home