Want to create interactive content? It’s easy in Genially!
Get started free
Understanding - Chapitre 1 Lesson 2
Samy
Created on November 14, 2024
Start designing with a free template
Discover more than 1500 professional designs like these:
View
Essential Course
View
Practical Course
View
Course 3D Style
View
Neodigital CPD Course
View
Laws and Regulations Course
View
Customer Service Course
View
Dynamic Visual Course
Transcript
Lesson 2: Description Complexity of Integers
Problem: Can numbers be simpler than the length of their binary representation?
Integers are central to an algorithmic information approach to artificial intelligence.
Start
Beyond standard binary coding
Integer complexity
Beyond standard binary coding
- Integers are used to retrieve objects by their rank in lists. Even seemingly complex objects such as convoluted emojis are no more complex that their address in the set of emojis. And this address is nothing but an integer.
- Integers are much more important than that. Integers are involved in the formal description of most objects AI is dealing with.
- Integers are objects of interest in themselves as well, because of their role in mathematics (note that "real" numbers with limited precision are, in fact, pairs of integers).
Beyond standard binary coding
One can see that with this method, the complexity of n, as measured by the length of its binary representation, is (for n > 0): ⌈log2(1+n)⌉ where ⌈x⌉ (with upper brackets) designates the smallest integer larger than x. This method, however, totally ignores the existence of "round" numbers. This is unfortunate. A round number like one billion (1 000 000 000 in base 10) looks much simpler than the 30 bits of its binary representation (111011100110101100101000000000).
Beyond standard binary coding
Integer complexity - video
Beyond standard binary coding
Integer complexity in nutshell
The video explains how integers can be coded in a compact way.- Since the standard binary code of n always starts with 1 (except for n = 0), let’s code n as the standard binary code of n+2 stripped of its heading 1. In python: bin(n+2) [3:]
- To distinguish round numbers (in base 10), we reintroduce a bit at the front: 1 for "normal" integers, 0 for round numbers.
- A round number n is coded as 0 followed by the compact code of n stripped of its trailing zeros, followed by the compact code of the number of zeros.
- Space separators are used to make the code unambiguous.
Beyond standard binary coding
Integer complexity in nutshell
Question: Why are we ignoring space separators when counting the length of round numbers compact codes? Answer: Separators are not symbols. You can’t have two separators in a row. They may result from physical signal processing, as for the Morse code, or from low-level word segmentation (e.g. lists of strings in Python: '0010 11010 111' ↔ ['0010','11010','111']). (Note: In Chapter 3, we will consider the possibility of eliminating all space separators altogether. Efficient space-free codings of integers can be designed.)
Beyond standard binary coding
Integer complexity in nutshell
The above video proposed a compact coding method for integers:compact code for integers
Beyond standard binary coding
Coding Integers (1)
Beyond standard binary coding
Coding Integers (2)
Beyond standard binary coding
Our coding system acknowledges the fact that round numbers are simple. What about their immediate neighbours? Intuitively, 5000000011 isn’t much more complex than 5000000000, as it can be expressed as 5000000000 + 11. The coding method illustrated in the figures below captures this idea. We prefix the code by 00 for a positive shift from the round number, and by 01 for a negative shift. Then we indicate the two-part code of the round number, and finally the compact code of the shift.
Beyond standard binary coding
Caveat: These codes use space delimiters, as in the Morse code.
Beyond standard binary coding
Complexity of 1000008
Question ouverte premium
Complexity is "continuous"
Problem: Does the complexity of integers vary erratically or smoothly?
Thanks to the previous coding method, round numbers and their neighbours receive shorter codes than other numbers of the same magnitude. The two plots below show the complexity of each number n on the x-axis. They have been drawn using the above compact integer coding method, implemented in a Python program . The purple curve indicates log2(n) for reference. On a large scale (first image), we can see that the compact code length roughly evolves like log(n). Note that the complexity cannot exceed log2(n) by more than one bit. We can observe the round-number effect: complexity drops for multiples of 10, and even more so for multiples of 100 and of 1000. A zoom around 50000 (second image) shows that C(n) is roughly "continuous".
Complexity is "continuous"
To capture the fact that the complexity of integers is a "quasi-continuous" function, we may write that for any integers n and h :|C(n+h) – C(n)| < f(|h|) + O(1)where f is an increasing function of |h| such that f(0) = 0 and O(1) is a constant.
Complexity is "continuous"
Continuity of complexity
Which function f among the following is the most appropriate to capture the "quasi-continuity" of complexity?
Complexity and Structure
Problem: Can we define the complexity of new objects, based on their structure?
Complexity and Structure
For now, we are able to estimate the complexity of integers and of objects that are stored in memory (through the complexity of their address). What about new objects?Consider, for instance, the case of passwords. When looking for a good password, we try to outsmart machines. This situation offers a nice illustration of the importance of complexity.
Question: Complex passwords are impossible to remember. Can we define what a good password should be? Answer: A good password should be easy to remember, but hard to guess. In AIT’s terms, it should be complex to anyone but you.
Complexity and Structure
Password complexity
How would you rank the complexity of the following passwords (from 1=simplest to 6=most complex)?
Complexity and Structure
Password complexity
Passwords are character strings. Some strings are obviously simpler than others. How can we quantify this? A first step consists of finding a compact formal representation of the string, such as repetition(a, 8) , and then translate this formal representation into bits using some appropriate code, and finally counts the bits.
Complexity and Structure
String Structures in a nutshell
Alphabetic strings can be represented using operators: repetition, increment, mapping.These operators can be combined. For instance:mapping(increment(1, a, 4), repetition(X, 2))first generates abcd, and then maps the repetition operation on it to produce aabbccdd.
Complexity and Structure
Simple Structure 1
Using the coding scheme presented in the video, determine which string is represented by: mapping(increment(1,a,4), increment(1,X,2))).
Complexity and Structure
Simple Structure 2
Provide an expression that represents abcdefgh.
question premium
Complexity and Structure
Complexity and Structure
Suppose we want to measure the complexity of a password such as abcdefgh. We can code for the letters one by one. The codes that appear in the video assume standard binary code. We can get a slightly more compact code using our compact integer coding scheme (see table).If we add up all the bits used in codes, we get a total of 16 bits for abcdefgh (not counting space delimiters). But we can do better than that, if we consider that abcdefgh can be represented as increment(1, a, 8). To do so, suppose that we limit ourselves to the following list of available operators, with the corresponding codes.
Complexity and Structure
Suppose now that we consider three kinds of entities, coded accordingly:
Using this coding scheme, an object like letter e will be coded as 00 10, where the first two bits designate letters and the last two bits designate the rank of e among letters.
Complexity and Structure
Using the code 1
What would be the correct code to designate number 6 in our small language?
Complexity and Structure
Using the code 2
What would be the correct code to designate operator mapping in our small language?
Complexity and Structure
Using the code 3
What would be the correct code to designate letter b in our small language?
Complexity and Structure
Since we are able to represent abcdefgh as increment(1, a, 8), we can now translate it into a binary representation. We still use our compact integer coding system. Note that when the type of the object is known, we can avoid using a complete code and thus get an overall shorter representation.
Complexity and Structure
We end up with the following code: 0 1 1 00 0 010 to represent abcdefgh. It is 9 bits long (ignoring space delimiters), which is more compact that the 16 bits of the letter-by-letter code (note: the code is decodable only if we keep the spaces between code words). Our coding scheme is sufficient to provide an estimate of the description complexity of alphabetic strings.
Description length of abcdabcd
Qst premium
Description length of abcdabcd
Note that the resulting description length of abcdabcd is larger than the description length of abcdefgh. This might explain why abcdabcd is perceived as more complex than abcdefgh.
Description length of abcdabcd
Lesson 2 completed !
Score :
Chapter 1: Describing Data :
Lesson 1: Description Length
Lesson 2: Description Complexity of Integers
Lesson 3: Defining Complexity
Lesson 4: Conditional Complexity
Conclusion (4 Questions)
Back to home