Coding theory. Types of coding. The birth of coding theory Classification of coding methods

  • 06.11.2021
"The purpose of this course is to prepare you for your technical future."

Hi, Habr. Remember the awesome article "You and Your Job" (+219, 2,442 bookmarks, 394k reads)?

So Hamming (yes, yes, the self-checking and self-correcting Hamming codes) has a whole book based on his lectures. We are translating it, because the man talks business.

This book is not just about IT; it is a book about the thinking style of incredibly cool people. "It's not just a charge of positive thinking; it describes the conditions that increase the chances of doing great work."

We have already translated 28 (out of 30) chapters. And we are working on the publication "on paper".

Coding theory - I

Having considered computers and how they work, we now look at the question of information representation: how computers represent the information we want to process. The meaning of any symbol depends on how it is processed; the machine attaches no specific meaning to the bits it uses. When discussing the history of software in Chapter 4, we considered a synthetic programming language in which the code of the stop instruction coincided with the code of other instructions. This situation is typical for most languages: the meaning of an instruction is determined by the corresponding program.

To simplify the problem of presenting information, let us consider the problem of transferring information from point to point. This question is related to the issue of preserving information. The problems of transmitting information in time and space are identical. Figure 10.1 shows a typical communication model.

Figure 10.1

On the left in Figure 10.1 is the source of information. When considering the model, we do not care about the nature of the source. It can be a set of alphabet symbols, numbers, mathematical formulas, musical notes, symbols representing dance movements; the nature of the source and the meaning of the symbols stored in it are not part of the transmission model. We consider only the source of information, and with this restriction we obtain a powerful, general theory that can be extended to many areas: it is an abstraction of many applications.

When Shannon created information theory in the late 1940s, it was believed that it should be called communication theory, but he insisted on the term information. This term has become a constant source of both heightened interest and persistent disappointment in the theory. Researchers wanted to build entire "theories of information", but these degenerated into theories about sets of symbols. Returning to the transmission model, we have a data source that needs to be encoded for transmission.

The encoder consists of two parts, the first part is called the source encoder, the exact name depends on the type of source. Sources of different data types correspond to different types of encoders.

The second part of the coding process is called channel coding, and it is matched to the type of data transmission channel. Thus, when standard interfaces are used, the data from the source is first encoded according to the requirements of the interface, and then according to the requirements of the data transmission channel being used.

According to the model in Figure 10.1, the data transmission channel is subject to "additive random noise". All the noise in the system is lumped together at this point. It is assumed that the encoder receives all symbols without distortion and that the decoder performs its function without errors. This is an idealization, but for many practical purposes it is close to reality.

The decoding phase also consists of two stages: channel - standard, standard - data receiver. At the end of the transfer, the data is transferred to the consumer. Again, we do not consider the question of how the consumer interprets this data.

As noted earlier, a data transmission system, whether for telephone messages, radio or TV programs, represents data as a set of numbers in the registers of a computer. Again, transmission in space is no different from transmission in time, that is, from storage of information. If you have information that will be needed after a while, it must be encoded and stored in some storage medium, and decoded when it is needed. If the encoding and decoding systems are identical, we pass the data through the transmission channel unchanged.

The fundamental difference between the presented theory and conventional theory in physics is the assumption that there is no noise at the source and receiver. In fact, errors occur in any piece of hardware. In quantum mechanics, noise occurs at any stage according to the uncertainty principle, and not as an initial condition; in any case, the concept of noise in information theory is not equivalent to that in quantum mechanics.
For definiteness, we will further consider the binary form of data representation in the system. Other forms are handled in a similar way, for the sake of simplicity, we will not consider them.

Let's start by looking at systems with encoded symbols of variable length, as in the classic Morse code of dots and dashes, in which common symbols are short and rare symbols are long. This approach achieves high code efficiency, but it is worth noting that Morse code is ternary, not binary, since it contains a space symbol between dots and dashes. If all codewords have the same length, the code is called a block code.

The first obvious necessary property of the code is the ability to unambiguously decode a message in the absence of noise, at least this seems to be a desirable property, although in some situations this requirement can be neglected. The data from the transmission channel looks like a stream of ones and zeros to the receiver.

Let's call two adjacent symbols the second extension of the code and three adjacent symbols the third extension; in general, if we send n symbols, the receiver sees the n-th extension of the base code. The receiver, not knowing n, must split the stream into the transmitted blocks; in other words, the receiver must be able to decompose the stream in a unique way in order to recover the original message.

We will consider alphabets with a small number of symbols; real alphabets are usually much larger. Natural-language alphabets run from 16 to 36 characters, counting upper- and lower-case letters, digits and punctuation marks; the ASCII table, for example, has 128 = 2^7 characters.
Consider a special code consisting of 4 characters s1, s2, s3, s4

s1 = 0; s2 = 00; s3 = 01; s4 = 11.

How should the receiver interpret the received string 0011:

as s1 s1 s4 or as s2 s4?

You cannot give an unambiguous answer to this question: the code is not uniquely decodable and is therefore unsatisfactory. On the other hand, the code

s1 = 0; s2 = 10; s3 = 110; s4 = 111

decodes messages in a unique way. Let's take an arbitrary string and see how the receiver will decode it. To do so, build the decoding tree shown in Figure 10.II. The string

1101000010011011100010100110

can be broken into blocks of symbols

110, 10, 0, 0, 0, 10, 0, 110, 111, 0, 0, 0, 10, 10, 0, 110

According to the following rule for constructing a decoding tree:

Start at the root of the tree and read the incoming bits one at a time, following the corresponding branches. When you reach a leaf of the tree, output the corresponding symbol and return to the root.

The reason such a tree exists is that no character is a prefix of another, so you always know when to go back to the beginning of the decode tree.

It is necessary to pay attention to the following. First, decoding is a strictly streaming process in which each bit is examined only once. Second, the protocols usually include characters that mark the end of the decoding process and are necessary to indicate the end of the message.

Forgetting to provide a trailing (end-of-message) symbol is a common mistake in code design. Of course, a continuous decoding mode can be provided, in which case the trailing symbol is not needed.
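To make the decoding-tree rule concrete, here is a minimal Python sketch (not from Hamming's text; the symbol names follow the example above) that walks the code table exactly as the tree does: accumulate bits, emit a symbol at a leaf, return to the root.

```python
# A minimal streaming decoder for the prefix code s1=0, s2=10, s3=110, s4=111.
# Illustrative sketch only; the symbol names follow the example in the text.
CODE = {"0": "s1", "10": "s2", "110": "s3", "111": "s4"}

def decode(bits: str) -> list[str]:
    symbols, buf = [], ""
    for b in bits:
        buf += b                     # follow one edge of the decoding tree
        if buf in CODE:              # reached a leaf
            symbols.append(CODE[buf])
            buf = ""                 # return to the root
    if buf:
        raise ValueError("stream ended in the middle of a symbol")
    return symbols

print(decode("1101000010011011100010100110"))
# ['s3', 's2', 's1', 's1', 's1', 's2', 's1', 's3', 's4',
#  's1', 's1', 's1', 's2', 's2', 's1', 's3']
```

Each bit is examined exactly once, which is what makes the decoding a strictly streaming process.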

Figure 10.II

The next question is codes for streaming (instant) decoding. Consider the code that is obtained from the previous one by displaying characters

s1 = 0; s2 = 01; s3 = 011; s4 = 111.

Suppose we receive the sequence 011111...111. The only way to decode the message is to start from the end, group the bits into threes, and see how many ones are left over to go with the leading zero; only then can the first symbol be identified. Such a code is uniquely decodable, but not instantaneously: to decode, you must wait until the end of the transmission! McMillan's theorem (below) shows that nothing is gained in codeword lengths by allowing such codes, so we look for instantaneously decodable ones.

Consider two ways of encoding the same set of symbols si. The first:

s1 = 0; s2 = 10; s3 = 110; s4 = 1110, s5 = 1111,

The decoding tree for this method is shown in Figure 10.III.

Figure 10.III

Second way

s1 = 00; s2 = 01; s3 = 100; s4 = 110, s5 = 111,

The decoding tree of this method is shown in Figure 10.IV.

The most obvious way to measure code quality is the average length of the code over a set of messages. To compute it, multiply the length of each symbol's codeword by the probability pi of that symbol's occurrence and sum. For an alphabet of q symbols the average code length L is

L = p1*l1 + p2*l2 + ... + pq*lq,

where pi is the probability of occurrence of the symbol si and li is the length of its codeword. For an efficient code, L should be as small as possible. If p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/16 and p5 = 1/16, then for code #1 the average length is

L = 1*(1/2) + 2*(1/4) + 3*(1/8) + 4*(1/16) + 4*(1/16) = 1 7/8,

and for code #2

L = 2*(1/2) + 2*(1/4) + 3*(1/8) + 3*(1/16) + 3*(1/16) = 2 1/4.

The obtained values indicate that the first code is preferable.
If all symbols of the alphabet have the same probability of occurrence, the second code is preferable. For example, with pi = 1/5 the length of code #1 is

L = (1 + 2 + 3 + 4 + 4)/5 = 2.8,

and the length of code #2 is

L = (2 + 2 + 3 + 3 + 3)/5 = 2.6.

This result shows that code #2 is preferable. Thus, when designing a "good" code, the probabilities of symbol occurrence must be taken into account.
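The comparison is easy to reproduce; a short sketch (the function name and layout are ours, using the lengths and probabilities from the text):

```python
# Average code length L = p1*l1 + ... + pq*lq for the two codes in the text.
def avg_length(probs, lengths):
    return sum(p * l for p, l in zip(probs, lengths))

code1 = [1, 2, 3, 4, 4]            # lengths of 0, 10, 110, 1110, 1111
code2 = [2, 2, 3, 3, 3]            # lengths of 00, 01, 100, 110, 111
skewed = [1/2, 1/4, 1/8, 1/16, 1/16]
uniform = [1/5] * 5

print(avg_length(skewed, code1))   # 1.875 -> code #1 wins for skewed probabilities
print(avg_length(skewed, code2))   # 2.25
print(avg_length(uniform, code1))  # ~2.8  -> code #2 wins for equal probabilities
print(avg_length(uniform, code2))  # ~2.6
```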

Figure 10.IV

Figure 10.V

Consider Kraft's inequality, which constrains the lengths li of the codewords. For base 2 the inequality takes the form

K = 2^(-l1) + 2^(-l2) + ... + 2^(-lq) <= 1.

This inequality means that there cannot be too many short symbols in the alphabet, otherwise the sum would become too large.

To prove Kraft's inequality for any instantaneously decodable code, we construct the decoding tree and apply mathematical induction. If the tree has one or two leaves, as shown in Figure 10.V, the inequality is certainly true. Further, if the tree has more than two leaves, we split a tree of height m into two subtrees. By the induction hypothesis, the inequality holds for each subtree of height m - 1 or less; denote their Kraft sums by K' and K''. When the two subtrees are joined under a common root, the length of every codeword increases by 1, so every term is halved, and the Kraft sum of the whole tree is K'/2 + K''/2 <= 1.

The theorem is proved.

Now consider the proof of McMillan's theorem: Kraft's inequality also holds for uniquely decodable codes that are not instantaneous. The proof is based on the fact that for any number K > 1 the n-th power K^n eventually exceeds any linear function of n. Raise the Kraft sum to the n-th power and write it out as a sum:

(2^(-l1) + ... + 2^(-lq))^n = sum over k of N_k * 2^(-k),

where N_k is the number of sequences of n codewords of total length k; the summation runs from the smallest possible total length up to nl, where l is the length of the longest codeword. It follows from the requirement of unique decodability that N_k <= 2^k, so the whole sum is at most nl. If K > 1, then K^n exceeds nl for sufficiently large n, which is a contradiction. Therefore K <= 1; McMillan's theorem is proved.

Consider several examples of the application of Kraft's inequality. Can there be a uniquely decodable code with lengths 1, 3, 3, 3? Yes, since

1/2 + 1/8 + 1/8 + 1/8 = 7/8 <= 1.

What about lengths 1, 2, 2, 3? Calculating by the formula,

1/2 + 1/4 + 1/4 + 1/8 = 9/8 > 1.

The inequality is violated! There are too many short codewords in this code.

Comma codes are codes in which each symbol is a run of 1s terminated by a 0, except for the last symbol, which consists entirely of 1s. One special case is the code

s1 = 0; s2 = 10; s3 = 110; s4 = 1110; s5 = 1111.

For this code the Kraft sum is

1/2 + 1/4 + 1/8 + 1/16 + 1/16 = 1.

In this case we achieve equality; it is easy to see that for comma codes Kraft's inequality degenerates into an equality.

When constructing a code, you need to keep an eye on the Kraft sum. If the Kraft sum begins to exceed 1, this is a signal that codewords of greater length must be used so that the code remains uniquely decodable.

It should be noted that Kraft's inequality does not say that a given code is uniquely decodable; it says that a uniquely decodable code with codewords of those lengths exists. To construct an instantaneously decodable code, you can assign to each length li a binary number of li bits, counting upward. For example, for the lengths 2, 2, 3, 3, 4, 4, 4, 4 the Kraft sum is

1/4 + 1/4 + 1/8 + 1/8 + 1/16 + 1/16 + 1/16 + 1/16 = 1,

so an instantaneously decodable code with these lengths exists:

s1 = 00; s2 = 01; s3 = 100; s4 = 101;

s5 = 1100; s6 = 1101; s7 = 1110; s8 = 1111.
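One natural way to realize this construction is sketched below (an illustrative implementation of our own, not taken from the source): check the Kraft sum for a list of lengths and, if it does not exceed 1, assign consecutive binary numbers to the lengths taken in nondecreasing order.

```python
# Kraft sum and a simple prefix-code construction from a list of lengths.
def kraft_sum(lengths):
    return sum(2 ** -l for l in lengths)

def build_prefix_code(lengths):
    """Assign codewords to the sorted lengths by counting upward in binary."""
    assert kraft_sum(lengths) <= 1, "no uniquely decodable code with these lengths"
    code, value, prev_len = [], 0, 0
    for l in sorted(lengths):
        value <<= (l - prev_len)           # pad the current number to length l
        code.append(format(value, f"0{l}b"))
        value += 1                         # next codeword
        prev_len = l
    return code

print(kraft_sum([1, 3, 3, 3]))   # 0.875 <= 1, such a code exists
print(kraft_sum([1, 2, 2, 3]))   # 1.125 >  1, impossible
print(build_prefix_code([2, 2, 3, 3, 4, 4, 4, 4]))
# ['00', '01', '100', '101', '1100', '1101', '1110', '1111']
```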

I want to draw your attention to what actually happens when we exchange ideas. For example, at this moment in time, I want to transfer an idea from my head to yours. I am pronouncing some words through which I believe you will be able to understand (get) this idea.

But when you later want to convey this idea to a friend, you will almost certainly use completely different words. In reality, the meaning is not contained in any specific words. I have used certain words, and you may use completely different words to convey the same idea: different words can carry the same information. As soon as you tell your interlocutor that you do not understand the message, the interlocutor will, as a rule, choose another set of words, a second or even a third, to convey the meaning. Thus, information is not contained in a particular set of words; having received some words, you do a great deal of work translating them into the idea the interlocutor wants to convey to you.

We learn to choose words in order to adapt to the interlocutor. In a sense, we choose words by matching our thoughts and the noise level in the channel, although this comparison does not accurately reflect the model that I use to represent noise in the decoding process. In large organizations, the inability to hear what the other person is saying is a major problem. In senior positions, employees hear “what they want to hear”. In some cases, you need to keep this in mind as you move up the career ladder. The presentation of information in a formal form is a partial reflection of the processes from our lives and has found wide application far beyond the boundaries of formal rules in computer applications.

To be continued...

Who wants to help with translation, layout and publication of the book - write in a personal or email [email protected]

By the way, we have also launched the translation of another coolest book -

Coding. Basic concepts.

Various coding methods have been widely used in human practice since time immemorial. For example, decimal positional notation is a way of encoding natural numbers. Another way of encoding natural numbers is Roman numerals, and this method is more visual and natural: indeed, a finger is I, a hand of five is V, two fives are X. However, with this method of encoding it is harder to perform arithmetic operations on large numbers, so it was supplanted by coding based on positional decimal notation. From this example we can conclude that different coding methods have specific features inherent only to them, which, depending on the goals of coding, can be either an advantage of a particular coding method or its disadvantage.

Methods of numerical coding of geometric objects and their position in space are widely known: Cartesian coordinates and polar coordinates. And these coding methods differ in their inherent specific features.

Until the 20th century, coding methods and tools played an auxiliary role, but with the advent of computers, the situation changed radically. Coding is widely used in information technology and is often a central issue in solving a variety of problems such as:

- presentation of data of arbitrary nature (numbers, text, graphics) in the computer memory;

- optimal data transmission over communication channels;

- protection of information (messages) from unauthorized access;

- ensuring noise immunity during data transmission via communication channels;

- compression of information.

From the point of view of information theory, coding is the process of mapping, according to a definite rule, the alphabet of a message source onto a certain set of conventional symbols; a code (code alphabet) is the complete set of distinct conventional symbols (code symbols) that can be used for encoding the original message and that are possible under the given encoding rule. The number of distinct code symbols making up the code alphabet is called the volume of the code, or the volume of the code alphabet. Obviously, the volume of the code alphabet cannot be less than the volume of the alphabet of the original message being encoded. Thus, encoding is the transformation of the original message into a collection or sequence of code symbols that represent the message transmitted over the communication channel.

Encoding can be numeric (digital) and non-numeric, depending on the type in which the code symbols are represented: numbers in any number system or some other objects or signs, respectively.

In most cases, code symbols are a collection or sequence of some of the simplest components, for example, a sequence of numbers in code symbols of a numerical code, which are called elements of a code symbol. The location or serial number of an element in a codeword is determined by its position.

The number of elements of a code symbol used to represent one character of the alphabet of the original message source is called the significance of the code. If the significance is the same for all symbols of the alphabet of the original message, the code is called uniform, otherwise non-uniform. The number of elements making up a code symbol is also sometimes called the length of the code symbol.

From the point of view of redundancy, all codes can be divided into non-redundant codes and redundant codes. In redundant codes, the number of elements of code symbols can be reduced due to more efficient use of the remaining elements, in non-redundant codes, reduction of the number of elements in code symbols is impossible.

The problems of coding in the absence of interference and in its presence are significantly different. Accordingly, a distinction is made between efficient (entropy) coding and corrective (error-correcting) coding. In efficient coding, the task is to represent the symbols of the message source's alphabet with the minimum average number of code symbol elements per character by reducing the redundancy of the code, which increases the message transmission rate. In corrective (error-correcting) coding, the task is to reduce the probability of errors in the transmission of symbols of the original alphabet by detecting and correcting errors through the introduction of additional redundancy into the code.
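As an illustration of the error-correcting side, here is a sketch of the classic Hamming (7,4) code, which we chose as the simplest example of adding redundancy to detect and correct a single error; it is not tied to any particular system mentioned in the text.

```python
# Hamming (7,4): 4 data bits protected by 3 parity bits; any single-bit error
# can be located and corrected. Positions are numbered 1..7, parity bits sit
# at positions 1, 2 and 4.
def hamming74_encode(d):                      # d = [d1, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4                         # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4                         # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4                         # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(word):
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s3           # 0 = no error, else error position
    if syndrome:
        w[syndrome - 1] ^= 1
    return w, syndrome

sent = hamming74_encode([1, 0, 1, 1])
received = sent[:]
received[5] ^= 1                              # the channel corrupts one bit
fixed, pos = hamming74_correct(received)
print(pos, fixed == sent)                     # 6 True
```

Three check bits per four data bits is exactly the kind of redundancy that efficient coding tries to remove and corrective coding deliberately adds.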

A separate coding task is to protect messages from unauthorized access, distortion and destruction. In this type of encoding, messages are encoded in such a way that even after receiving them an attacker would not be able to decode them. The process of this type of message encoding is called enciphering (encryption), and the decoding process is called deciphering (decryption). The encoded message itself is called a ciphertext, and the encoding method used is called a cipher.

Quite often, a separate class is made of encoding methods that allow one to construct (without loss of information) message codes that are shorter than the original message. Such encoding techniques are called compression or data packing techniques. The quality of compression is determined by the compression ratio, which is usually measured as a percentage and shows by how much the encoded message is shorter than the original.

In automatic processing of information using a computer, numerical (digital) coding is used as a rule, and the question naturally arises of justifying the number system employed. Indeed, as the base of the number system decreases, the alphabet of the elements of the code symbols becomes simpler, but the code symbols become longer. On the other hand, the larger the base of the number system, the smaller the number of digits required to represent one code symbol and, consequently, the less time needed to transmit it; but as the base grows, the requirements on the communication channels and on the technical means for recognizing the elementary signals corresponding to the different elements of the code symbols increase significantly. In particular, the code of a number written in the binary system is on average approximately 3.3 times longer than its decimal code (log2 10 ≈ 3.32). Since all information processing systems must store large arrays of numerical information, one of the essential criteria for choosing the alphabet of the elements of the numerical code symbols (i.e., the base of the number system used) is minimizing the number of electronic elements in storage devices, as well as their simplicity and reliability.

When determining the number of electronic elements required to store each element of a code symbol, we proceed from the practically justified assumption that this requires a number of elementary electronic elements (for example, transistors) equal to the base a of the number system. Then storing n elements of a code symbol in some device requires M electronic elements:

M = a * n.  (2.1)

The largest number of distinct numbers N that can be recorded in such a device is

N = a^n.

Taking the logarithm of this expression and expressing n from it, we get

n = ln N / ln a.

Transforming expression (2.1) to the form

M = a * ln N / ln a,  (2.2)

we can determine for which base a the number of elements M will be minimal for a given N. Differentiating the function M = f(a) and setting its derivative to zero, we get

dM/da = ln N * (ln a - 1) / (ln a)^2 = 0.

Obviously, for any finite a

ln N / (ln a)^2 ≠ 0,

and therefore

ln a - 1 = 0,

whence a = e ≈ 2.7.

Since the radix can only be an integer, a is chosen equal to 2 or 3. For example, let us set the maximum capacity of the storage device at N = 10^6 numbers. Then for different bases a of the number system, the number of elements M in such a storage device will be, in accordance with expression (2.2), as follows (Table 2.1):

Table 2.1.

a: 2, 3, 4, 5, 10
M: 39.2, 38.2, 39.2, 42.9, 91.2

Therefore, if we proceed from the minimization of the amount of equipment, then the most advantageous will be the binary, ternary and quaternary number systems, which are close in this parameter. But since the technical implementation of devices operating in the binary number system is much simpler, the most widespread in numerical coding are codes based on the base 2 number system.
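The figures in Table 2.1 can be checked against expression (2.2); the sketch below recomputes M(a) for N = 10^6 (the listed bases are our assumption, and values obtained this way may differ somewhat from those quoted in the table).

```python
import math

N = 10 ** 6   # required capacity of the storage device (distinct numbers)

def elements_needed(a, N=N):
    """M = a * ln N / ln a, expression (2.2); a continuous approximation."""
    return a * math.log(N) / math.log(a)

for a in (2, 3, 4, 5, 10):
    print(a, round(elements_needed(a), 1))
# 2 -> 39.9, 3 -> 37.7, 4 -> 39.9, 5 -> 42.9, 10 -> 60.0;
# the minimum of the continuous function lies at a = e ~ 2.718,
# and the integer bases 2, 3 and 4 are all close to it.
```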

Coding theory is a branch of information theory that studies ways of identifying messages with the signals that represent them. The task of coding theory is to match the source of information with the communication channel.

The object of coding is both discrete and continuous information that comes from the information source to the consumer. The concept of coding means converting information into a form that is convenient for transmission over a specific communication channel.

The reverse operation - decoding - consists in restoring the received message from the encoded form to the generally accepted one available to the consumer.

There are a number of directions in coding theory:

  • statistical, or efficient, coding;
  • anti-jamming coding;
  • corrective codes;
  • cyclic codes;
  • arithmetic codes.

With the advent of control systems, in particular computers, the role of coding has significantly increased and changed, since the transmission of information is impossible without coding. Recently, in connection with the development of telecommunication systems and the widespread use of computer technology for processing and storing information, a new area of ​​knowledge has arisen - information security.

Coding is a universal way of displaying information during its storage, processing and transmission in the form of a system of correspondences between signals and message elements, with the help of which these elements can be fixed.

A code is a rule for unambiguously converting a message from one symbolic form of a message to another, usually without any loss of information.

If all codewords have the same length, then the code is called uniform, or block.

By an abstract alphabet we mean an ordered discrete set of symbols.

Alphabetic coding. Alphabetic, i.e. letter-by-letter, encoding can be specified by a code table. The encoding function is in fact a substitution that assigns to each letter of the alphabet A a word composed in the alphabet B; the set of these letter codes is called the set of elementary codes. Alphabetic coding can be used for any set of messages.

Computer data processing is based on the use of binary code. This universal encoding method is suitable for any data, regardless of its origin and content.

Text encoding

Texts are sequences of characters included in a certain alphabet. Text encoding is reduced to the binary encoding of the alphabet on the basis of which it is built. The most commonly used byte coding of the alphabet. In this case, the maximum cardinality of the alphabet is 256 characters. Such an alphabet can contain two sets of alphabetic characters (for example, Russian and Latin), numbers, punctuation and mathematical signs, a space and a small number of additional characters. An example of such an alphabet is the ASCII code.

However, the limited set of 256 character codes no longer meets the increased needs of international communication. The universal 16-bit character coding system UNICODE is becoming more widespread.

The power of the alphabet in the UNICODE coding system is 2^16 = 65,536 different codes, of which 63,488 codes correspond to the characters of most alphabets, and the remaining 2,048 codes are divided in half and form a table of 1024 columns by 1024 rows. This table has over a million cells, which can accommodate over a million additional characters: symbols of "dead" languages, as well as symbols that have no lexical content, pointers, signs, etc. Encoding these additional characters requires a pair of 16-bit words (16 bits for the row number and 16 bits for the column number).

Thus, the UNICODE system is a universal coding system for all symbols of national writing systems and has the ability to significantly expand.
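A small illustration of the difference between 8-bit and 16-bit character codes, using standard Python behaviour (the specific characters are chosen arbitrarily):

```python
# Code points and encoded sizes of a few characters.
for ch in ("A", "Я", "€"):
    print(ch, ord(ch), len(ch.encode("utf-16-le")), len(ch.encode("utf-8")))
# 'A' fits into a single byte of an 8-bit code page, while 'Я' and '€'
# need a 16-bit Unicode code unit (or a multi-byte UTF-8 sequence).
```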

Image encoding

Drawings, pictures and photos are encoded in raster format. In this representation, each image is a rectangular table of colored dots; the color and brightness of each individual dot are expressed numerically, which allows binary code to be used to represent graphical data.

It is customary to represent black and white images in grayscale; for this, the GreyScale model is used. If the luminance of a dot is encoded in one byte, 256 different gray tones can be used. This accuracy is consistent with the sensitivity of the human eye and the capabilities of printing technology.

When coding color images, the principle of color decomposition into components is used; for this, the RGB model is used. The color image on the screen is obtained by mixing three basic colors: red (Red, R), blue (Blue, B) and green (Green, G).

Each pixel on the screen is made up of three closely spaced elements that glow with these colors.

Color displays using this principle are called RGB monitors.

A pixel color code contains information about the proportion of each base color.

color formation scheme

If all three components have the same intensity (brightness), then 8 different colors (2^3) can be obtained from their combinations.


Formation of colors at 24-bit color depth:

The greater the color depth, the wider the range of available colors and the more accurate their representation in the digitized image. A pixel with a bit depth of one has only 2^1 = 2 possible states, that is, two colors: black or white. A pixel with a bit depth of 8 has 2^8 = 256 possible color values. A pixel with a bit depth of 24 has 2^24, or 16.7 million, possible values. It is believed that 24-bit images, containing 16.7 million colors, accurately reproduce the colors of the world around us. Typically, the bit resolution is specified in the range from 1 to 48 bits per pixel.

When printing on paper, a slightly different color model is used: whereas a monitor emits light and a shade is obtained by adding colors, inks absorb light and colors are subtracted. Therefore cyan (Cyan, C), magenta (Magenta, M) and yellow (Yellow, Y) inks are used as the primary ones. In addition, because of the imperfection of dyes, a fourth is usually added to them: black (blacK, K). Here too, one byte is most often used to store the value of each ink. This coding system is called CMYK.

A coarser color representation uses fewer bits. For example, encoding color graphics with 16-bit numbers is called High Color; in this case, five bits are allotted to each color component.
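How such codes can be packed into integers is sketched below; this is only an illustration of the general idea, with five bits per component for High Color as described above (real formats often use a 5-6-5 layout instead), not a specification of any particular graphics format.

```python
# Packing color components into pixel codes.
def pack_rgb24(r, g, b):
    """True Color: one byte (0..255) per component, 24 bits per pixel."""
    return (r << 16) | (g << 8) | b

def pack_rgb15(r, g, b):
    """A High Color variant with five bits (0..31) per component."""
    return (r << 10) | (g << 5) | b

print(hex(pack_rgb24(255, 128, 0)))   # 0xff8000
print(hex(pack_rgb15(31, 16, 0)))     # 0x7e00
```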

Audio and video encoding

Techniques for working with sound information came to computer technology most recently. A coding method applicable to any audio signal is based on analog-to-digital conversion: the original analog signal is represented as a sequence of digital samples recorded in binary code. The bit depth of the conversion determines the amount of data corresponding to a single sample. When the sound is played back, the inverse digital-to-analog conversion is performed.

This coding method introduces a quantization error, so the reproduced signal differs slightly from the original.

The table-synthesis coding method is applicable only to music. Samples of the sounds of various musical instruments are stored in pre-prepared tables; numeric codes define the instrument, the note, and the duration.

When encoding a video signal, you need to record a sequence of images (frames) and sound (soundtrack). The video recording format allows both data streams to be included in one digital sequence.
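A minimal sketch of the analog-to-digital idea described above; the sampling rate, bit depth and test tone are arbitrary assumptions made for the illustration.

```python
import math

SAMPLE_RATE = 8000        # samples per second (assumed for the sketch)
BIT_DEPTH = 8             # bits per sample -> 256 quantization levels

def quantize(x, bits=BIT_DEPTH):
    """Map a sample in [-1.0, 1.0] to an unsigned integer code."""
    levels = 2 ** bits
    return min(levels - 1, int((x + 1.0) / 2.0 * levels))

# The first eight samples of a 440 Hz tone, encoded as 8-bit codes.
samples = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(8)]
codes = [quantize(s) for s in samples]
print(codes)   # eight integers in 0..255, starting at 128 (the zero level)
```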

04.04.2006 Leonid Chernyak Category: Technology

"Open Systems" The creation of computers would have been impossible if the theory of signal coding had not been created simultaneously with "their appearance. The theory of coding" is one of those areas of mathematics that significantly influenced the development of computing.

"Open Systems"

The creation of computers would have been impossible if the theory of signal coding had not been created simultaneously with their appearance.

Coding theory is one of those areas of mathematics that markedly influenced the development of computing. Its scope extends to the transmission of data over real (that is, noisy) channels, and its subject is ensuring the correctness of the transmitted information. In other words, it studies how best to package data so that, after a signal has been transmitted, the useful information can be extracted from the data reliably and simply. Coding theory is sometimes confused with encryption, but this is wrong: cryptography solves the opposite problem, its purpose being to make it difficult to extract information from data.

The need to encode data was first encountered more than 150 years ago, shortly after the invention of the telegraph. The channels were expensive and unreliable, which made the task of minimizing the cost and increasing the reliability of telegram transmission urgent. The problem was further exacerbated by the laying of transatlantic cables. Since 1845, special code books have been in use; telegraph operators used them to manually "compress" messages, replacing common sequences of words with shorter codes. At the same time, parity checking began to be used to verify the correctness of transmission, a method that was also used to check the correctness of reading punched cards in computers of the first and second generations. To do this, a specially prepared card with a checksum was placed in the input deck. If the input device was not very reliable (or the deck was too large), an error could occur; to correct it, the input procedure was repeated until the calculated checksum matched the checksum stored on the card. Not only is this scheme awkward, it also misses double errors. With the development of communication channels, a more effective control mechanism was required.

The first theoretical solution to the problem of data transmission over noisy channels was proposed by Claude Shannon, the founder of the statistical theory of information. Shannon was a star of his time, one of the US academic elite. As a graduate student of Vannevar Bush, in 1940 he received the Alfred Noble Prize (not to be confused with the Nobel Prize!), awarded to scientists under 30 years of age. While at Bell Labs, Shannon wrote "A Mathematical Theory of Communication" (1948), where he showed that if the bandwidth of the channel is higher than the entropy of the message source, then the message can be encoded so that it will be transmitted without unnecessary delays. This conclusion is contained in one of the theorems he proved; its meaning boils down to the fact that if there is a channel with sufficient bandwidth, a message can be transmitted with some time delays. In addition, he showed the theoretical possibility of reliable transmission in the presence of noise in the channel. The formula C = W log((P + N)/N), carved on a modest monument to Shannon erected in his hometown in Michigan, is compared in significance with Albert Einstein's formula E = mc^2.

Shannon's works gave food for a great deal of further research in the field of information theory, but they had no practical engineering application. The transition from theory to practice was made possible by the efforts of Richard Hamming, a colleague of Shannon's at Bell Labs, who became famous for discovering a class of codes that came to be called "Hamming codes". There is a legend that the invention of Hamming codes was prompted by the inconvenience of working with punched cards on the Bell Model V relay calculating machine in the mid-1940s. He was given machine time on weekends when there were no operators, and he had to fiddle with the input himself. Be that as it may, Hamming proposed codes capable of correcting errors in communication channels, including data transmission lines in computers, primarily between the processor and memory. Hamming codes demonstrate how the possibilities pointed out by Shannon's theorems can be realized in practice.

Hamming published his article in 1950, although his coding theory dates back to 1947 in internal reports. Therefore, some believe that Hamming, not Shannon, should be considered the father of coding theory. However, looking for who was "first" in the history of technology is pointless.

What is certain is that it was Hamming who first proposed error-correcting codes (ECC). Modern modifications of these codes are used in all data storage systems and in exchanges between the processor and main memory. One of their variants, Reed-Solomon codes, is used in CDs, allowing recordings to be played without the clicks and noise that scratches and dust particles would otherwise cause. There are many versions of codes based on Hamming's; they differ in their coding algorithms and in the number of check bits. Such codes have acquired particular importance in connection with the development of deep-space communication with interplanetary stations; for example, there are Reed-Muller codes with 32 check bits for seven information bits, or 26 for six.

Among the newest ECC codes are LDPC (Low-Density Parity-Check) codes. In fact, they have been known for thirty years, but special interest in them arose precisely in recent years, when high-definition television began to develop. LDPC codes are not 100% error-free, but the error rate can be brought to the desired level while the channel bandwidth is used almost to the full. Close to them are turbo codes, which are effective when working with objects in deep space and with limited channel bandwidth.

The name of Vladimir Aleksandrovich Kotelnikov is firmly inscribed in the history of coding theory. In 1933, in the "Materials on radio communications for the First All-Union Congress on the technical reconstruction of communications", he published the work "On the transmission capacity of the 'ether' and of wire in electric communications". Kotelnikov's name is included, as an equal, in the title of one of the most important theorems of coding theory. This theorem defines the conditions under which a transmitted signal can be restored without loss of information.

This theorem goes by various names, including the "WKS theorem" (the abbreviation WKS stands for Whittaker, Kotelnikov, Shannon). Some sources call it the Nyquist-Shannon sampling theorem, others the Whittaker-Shannon sampling theorem, and in Russian university textbooks it is most often simply the "Kotelnikov theorem". In fact, the theorem has a longer history. Its first part was proved in 1897 by the French mathematician Emile Borel. Edmund Whittaker made his contribution in 1915. In 1920, the Japanese mathematician Kinnosuke Ogura published corrections to Whittaker's research, and in 1928 the American Harry Nyquist clarified the principles of digitizing and reconstructing an analog signal.

Claude Shannon (1916-2001) showed equal interest in mathematics and electrical engineering from his school years. In 1932 he entered the University of Michigan, and in 1936 the Massachusetts Institute of Technology, from which he graduated in 1940 with two degrees: a master's in electrical engineering and a doctorate in mathematics. In 1941 Shannon joined Bell Laboratories. Here he began to develop the ideas that later grew into information theory. In 1948 Shannon published the article "A Mathematical Theory of Communication", where his basic ideas were formulated, in particular the determination of the amount of information through entropy; he also proposed the unit of information corresponding to the choice between two equally probable options, that is, what was later called a bit. In 1957-1961 Shannon published works in which he proved the theorem on the capacity of noisy communication channels, which now bears his name. In 1957 Shannon became a professor at the Massachusetts Institute of Technology, from which he retired 21 years later. In his "well-deserved retirement" Shannon gave himself over completely to his long-standing passion for juggling: he built several juggling machines and even developed a general theory of juggling.

Richard Hamming (1915-1998) began his education at the University of Chicago, where he received a bachelor's degree in 1937. In 1939 he received his master's degree from the University of Nebraska, and then a Ph.D. in mathematics from the University of Illinois. In 1945 Hamming began work on the Manhattan Project, the large-scale government research effort to create the atomic bomb. In 1946 Hamming went to work at Bell Telephone Laboratories, where he worked with Claude Shannon. In 1976 Hamming received a chair at the Naval Postgraduate School in Monterey, California.

The work that made him famous, the fundamental study of error-detecting and error-correcting codes, was published by Hamming in 1950. In 1956 he worked on the IBM 650, one of the early mainframes; his work there laid the foundations of a programming language that later evolved into high-level programming languages. In recognition of Hamming's achievements in computer science, the IEEE established a medal for exceptional contributions to information sciences and systems, named after him.

Vladimir Kotelnikov (1908-2005) entered the Electrical Engineering Faculty of the Bauman Moscow Higher Technical School (MVTU) in 1926, but graduated from the Moscow Power Engineering Institute (MPEI), which had separated from the MVTU as an independent institute. During his postgraduate studies (1931-1933) Kotelnikov mathematically formulated and proved the "sampling theorem", which was later named after him. After finishing graduate school in 1933, Kotelnikov, while continuing to teach at MPEI, went to work at the Central Scientific Research Institute of Communications (TsNIIS). In 1941 Kotelnikov formulated a clear statement of the requirements a mathematically undecipherable cipher system must satisfy, and gave a proof of the impossibility of its decryption. In 1944 Kotelnikov took the position of professor and dean of the radio engineering faculty of MPEI, where he worked until 1980. In 1953, at the age of 45, Kotelnikov was elected a full member of the USSR Academy of Sciences at once. From 1968 to 1990 he was also a professor and head of a department at the Moscow Institute of Physics and Technology.


The birth of coding theory


Coding theory. Types of coding. Basic concepts of coding theory

Previously, coding tools played an auxiliary role and were not considered a separate subject of mathematical study, but with the advent of computers the situation changed radically. Coding literally permeates information technology and is a central issue in the solution of a great variety of (practically all) programming problems:

۞ presentation of data of arbitrary nature (for example, numbers, text, graphics) in computer memory;
۞ protection of information from unauthorized access;
۞ ensuring noise immunity during data transmission over communication channels;
۞ compression of information in databases.

Coding theory is a branch of information theory that studies ways of matching messages with the signals that represent them.

Objective: to match the source of information with the communication channel.

Object: discrete or continuous information that comes from the information source to the consumer.

Encoding is the transformation of information into a form convenient for transmission over a specific communication channel. An example of coding in mathematics is the coordinate method introduced by Descartes, which makes it possible to study geometric objects through their analytical expression in the form of numbers, letters and their combinations - formulas.

Decoding is the restoration of a received message from its coded form into a form accessible to the consumer.

Topic 5.2. Alphabetic coding

In general, the coding problem can be posed as follows. Let two alphabets A and B be given, each consisting of a finite number of symbols; the elements of an alphabet are called letters. An ordered sequence of letters of the alphabet A is called a word; the number n of letters in a word is called its length, written n = l(w) = |w|, and the empty word is denoted separately. For a word, its first letter is called the beginning, or prefix, of the word, and its last letter the ending, or postfix, of the word. Words can be concatenated: the prefix of the second word immediately follows the postfix of the first, and in the new word they naturally lose that status, unless one of the words was empty. The concatenation of n identical words is the n-th power of that word. The set of all non-empty words over the alphabet A is denoted by A*. The set A is called the alphabet of messages and the set B the coding alphabet; the set of words composed over the alphabet B is denoted by B*.

Let F denote a mapping that takes a word over the alphabet A to a word over the alphabet B; the word F(w) is then called the code of the word w. Coding is a universal way of representing information for storage, transmission and processing, in the form of a system of correspondences between message elements and signals by means of which these elements can be recorded. Thus, a code is a rule for the unambiguous transformation (i.e., a function) of a message from one symbolic form of representation (the source alphabet A) into another (the object alphabet B), usually without any loss of information. The process F: A* → B* of converting words over the source alphabet A into words over the alphabet B is called information encoding; the reverse conversion is called decoding, so decoding is the inverse function F^(-1). Since the decoding operation must be possible for any encoding, the mapping must be invertible (a bijection). If |B| = m, then F is called an m-ary coding; the most common case, B = (0, 1), is binary coding, and it is this case that is considered in what follows. If all codewords have the same length, the code is called uniform, or block. Alphabetic (letter-by-letter) encoding can be specified by a code table, in which the encoding function is a substitution assigning to each letter of the alphabet A an elementary code, a word over the alphabet B; the set of these elementary codes defines the scheme. Alphabetic coding can be used for any set of messages; it is the simplest kind of coding and can always be introduced on non-empty alphabets.

EXAMPLE. Let the alphabets A = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) and B = (0, 1) be given, and let each digit be encoded by its binary representation. This is a binary coding of decimal digits; each digit receives its own code, so letter by letter it can be decoded. However, the scheme as a whole is not one-to-one: for example, the string of six ones 111111 can correspond to the word 333, or 77, or 111111, or 137, or 3311, or 7111, plus any permutation of these. An alphabetic coding scheme is called a prefix scheme if the elementary code of one letter is not a prefix of the elementary code of another letter. An alphabetic coding scheme is called separable if any word composed of elementary codes can be decomposed into elementary codes in only one way. Alphabetic coding with a separable scheme allows decoding, and it can be proved that every prefix scheme is separable. For an alphabetic coding scheme to be separable, the lengths of the elementary codes must satisfy a relation known as Macmillan's inequality.

Macmillan's inequality. If an alphabetic coding scheme is separable, then the lengths li of the elementary codes satisfy

2^(-l1) + 2^(-l2) + ... + 2^(-ln) <= 1.

EXAMPLE. The alphabetic coding scheme with A = (a, b), B = (0, 1) and elementary codes a → 0, b → 01 is separable, since 1/2 + 1/4 = 3/4 <= 1 and Macmillan's inequality holds, even though the elementary code of the letter a is a prefix of the elementary code of the letter b (the scheme is separable but not prefix).

Topic 5.3. Minimum Redundancy Coding

In practice, it is important that message codes be as short as possible. Alphabetic coding is suitable for any messages, but if nothing is known about the set of all words over the alphabet A, it is difficult to state the optimization problem precisely. In practice, however, additional information is often available: for example, for messages written in a natural language, such additional information may be the probability distribution of the letters in the message. Then the problem of constructing an optimal code acquires an exact mathematical formulation and a rigorous solution.

Let some separable alphabetic coding scheme be given. Then any scheme in which the set of elementary codes is a permutation of the same elementary codes will also be separable. If the lengths of the elementary codes are all equal, their permutation in the scheme does not affect the length of the encoded message. If the lengths of the elementary codes differ, then the length of the message code depends directly on which elementary codes are assigned to which letters and on the composition of the letters in the message. If a specific message and a specific coding scheme are given, one can choose the permutation of codes for which the length of the message code is minimal. Algorithm for assigning elementary codes so that the length of the code of a fixed message S is minimal for a fixed scheme:

۞ sort the letters in descending order of the number of occurrences;
۞ sort the elementary codes in ascending order of length;
۞ match the codes to the letters in this order.

Let the alphabet and the probabilities of occurrence of the letters in the message be given:

where pi is the probability of occurrence of the letter ai; letters with zero probability of occurrence are excluded, and the letters are ordered in descending order of their probability. The cost of coding is the average length of the code per letter of the message, L = p1*l1 + ... + pn*ln. EXAMPLE. For a separable alphabetic coding scheme A = (a, b), B = (0, 1), the cost of coding depends on the probability distribution of the letters: different distributions give different costs for the same elementary codes.
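The assignment rule above can be expressed directly; the following sketch (with illustrative letter probabilities and elementary codes of our choosing) sorts the letters by decreasing probability, sorts the elementary codes by increasing length, pairs them up and computes the resulting cost of coding.

```python
# Assign the shortest elementary codes to the most frequent letters.
def assign_codes(letter_probs, elementary_codes):
    letters = sorted(letter_probs, key=letter_probs.get, reverse=True)
    codes = sorted(elementary_codes, key=len)
    return dict(zip(letters, codes))

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
scheme = assign_codes(probs, ["110", "0", "111", "10"])
print(scheme)   # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}

# Cost of coding = average codeword length under the given probabilities.
cost = sum(probs[ch] * len(code) for ch, code in scheme.items())
print(cost)     # 0.5*1 + 0.3*2 + 0.15*3 + 0.05*3 = 1.7
```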

Topic 5.4. Huffman Coding. This algorithm was invented in 1952 by David Huffman.

Topic 5.5. Arithmetic Coding. As with the Huffman algorithm, everything starts with a table of symbols and their probabilities. Suppose the input alphabet consists of only three symbols a1, a2, a3, with P(a1) = 1/2, P(a2) = 1/3, P(a3) = 1/6, and suppose we need to encode the sequence a1, a1, a2, a3. The interval [0, 1) is split into subintervals whose lengths are proportional to these probabilities, and the current interval is narrowed step by step as the symbols of the message are read.

For codes correcting [pn] errors, where p is some fixed number, 0 < p < (r-1)/2r, the "capacity" bound, in which Tr(p) = -p*log_r(p/(r-1)) - (1-p)*log_r(1-p), has been substantially improved. There is a conjecture that the upper bound obtained by the method of random code choice is asymptotically exact, i.e. Ir(n, [pn]) ~ n*Tr(2p); proving or disproving this conjecture is one of the central problems of coding theory. Most constructions of error-correcting codes are effective when the code length n is sufficiently large. In this connection, questions related to the complexity of the devices that perform encoding and decoding (the encoder and the decoder) acquire special importance. Restrictions on the admissible type of decoder or on its complexity can lead to an increase in the redundancy needed to ensure a given noise immunity. For example, the minimum redundancy of a code in B^n_2 for which there exists a decoder consisting of a shift register and a single majority element, and which corrects one error, has order ... (compare with (2)). As mathematical models of the encoder and the decoder one usually considers circuits of functional elements, and complexity is understood as the number of elements in the circuit. For the known classes of error-correcting codes, the possible encoding and decoding algorithms are investigated and upper bounds on the complexity of the encoder and decoder are obtained. Some relationships have also been found between the transmission rate of a code, its noise immunity and the complexity of the decoder (see the references below). Another direction of research in coding theory is connected with the fact that many results (for example, Shannon's theorem and bound (3)) are not "constructive" but are theorems on the existence of infinite sequences (K_n) of codes; efforts are therefore made to prove these results for the class of sequences (K_n) of codes for which there is a Turing machine that recognizes whether an arbitrary word of length l belongs to the set in a time that has a slow order of growth with respect to l (for example, l*log l). Certain new constructions and methods for obtaining bounds developed in coding theory have led to significant progress in questions that, at first glance, are very far from its traditional problems: the use of a maximal single-error-correcting code in an asymptotically optimal method of realizing functions of the algebra of logic by contact circuits; the fundamental improvement of the upper bound for the density of packing an n-dimensional Euclidean space by equal balls; the use of inequality (1) in estimating the complexity of realizing one class of functions of the algebra of logic by formulas. The ideas and results of coding theory find further development in the problems of synthesizing self-correcting circuits and reliable circuits from unreliable elements.

Lit.: Shannon K., Works on Information Theory and Cybernetics, trans. from English, Moscow, 1963; Berlekamp E., Algebraic Coding Theory, trans. from English, Moscow, 1971; Peterson W., Weldon E., Error-Correcting Codes, trans. from English, 2nd ed., Moscow, 1976; Discrete Mathematics and Mathematical Problems of Cybernetics, vol. 1, Moscow, 1974, section 5; Bassalygo L. A., Zyablov V. V., Pinsker M. S., "Probl. Peredachi Informatsii", 1977, vol. 13, no. 3, pp. 5-17; Sidelnikov V. M., "Mat. Sb.", 1974, vol. 95, no. 1, pp. 148-158. V. I. Levenshtein.


Optimal coding. The same message can be encoded in different ways. An optimally encoded message is one on whose transmission the minimum time is spent. If the transmission of each elementary symbol (0 or 1) takes the same time, then the optimal code is the one with the minimum possible length. Example 1. Let a random variable X(x1, x2, x3, x4, x5, x6, x7, x8) with eight states and a given probability distribution be encoded by the three-digit binary words 000, 001, 010, 011, 100, 101, 110, 111. To answer whether this code is good or not, we must compare it with the optimal value, that is, determine the entropy.

Having determined the redundancy by the formula L = 1 - H/H0 = 1 - 2.75/3 ≈ 0.084, we see that the code length can be reduced by 8.4%. The question arises: is it possible to construct a code in which, on average, fewer elementary symbols are needed per letter? Such codes exist: the Shannon-Fano and Huffman codes. The principles of constructing optimal codes are: 1. Each elementary symbol must carry the maximum amount of information; for this, the elementary symbols (0 and 1) must occur in the encoded text on average equally often, in which case the entropy is maximal. 2. The letters of the primary alphabet that have higher probability must be assigned the shorter codewords of the secondary alphabet.
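Since the section points to Shannon-Fano and Huffman codes as the way to remove this redundancy, here is a compact sketch of the standard Huffman construction; the eight-state distribution used below is illustrative only, because the article's own probability values were not preserved.

```python
import heapq

def huffman_code(probs):
    """Build a binary Huffman code for a dict {symbol: probability}."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # the two least probable nodes
        p2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"x1": 0.25, "x2": 0.25, "x3": 0.15, "x4": 0.15,
         "x5": 0.05, "x6": 0.05, "x7": 0.05, "x8": 0.05}
code = huffman_code(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(avg)   # about 2.7 bits per letter vs. 3 bits for the uniform block code
```

The more probable symbols receive the shorter codewords, exactly as principle 2 above requires.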