Dynamic range: compressed or natural? Reverse mastering: can the dynamic range of compressed recordings be increased? Using dynamic compression

  • 27.10.2021

This group of methods is based on subjecting the transmitted signals to nonlinear amplitude transformations, with the nonlinearities in the transmitting and receiving parts being mutually inverse. For example, if the transmitter applies the nonlinear function √u, the receiver applies u². Sequential application of mutually inverse functions leaves the overall transformation linear.

The idea of non-linear data compression methods is that the transmitter can convey a larger range of variation of the transmitted parameter (that is, a larger dynamic range) with the same output signal amplitude. Dynamic range is the ratio of the largest admissible signal amplitude to the smallest, expressed in relative units or in decibels:

D = U_max / U_min ; (2.17)

D = 20 lg (U_max / U_min), dB . (2.18)

The natural desire to increase the dynamic range by decreasing U_min is limited by the sensitivity of the equipment and by the growing influence of interference and intrinsic noise.

Most often, dynamic range compression is performed using a pair of mutually inverse functions: the logarithm and the exponential. The first operation (amplitude reduction) is called compression, the second expansion; together the pair is known as companding. These functions are chosen because of their high compression capability.

At the same time, these methods also have disadvantages. The first is that the logarithm of a small number is large in magnitude and negative:

log u → −∞ as u → 0,

that is, the sensitivity of the transformation is highly non-linear for small signals.

To mitigate these disadvantages, both functions are modified by an offset and a piecewise approximation. For telephone channels, for example, the approximated characteristic (the standard A-law) has the form

F(u) = A·|u| / (1 + ln A),            for |u| < 1/A;
F(u) = (1 + ln(A·|u|)) / (1 + ln A),  for 1/A ≤ |u| ≤ 1,

where A = 87.6. The gain from compression is 24 dB.
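A sketch of this companding pair in code. The branch constants follow the standard A-law characteristic; the function names are illustrative, not from any telephony library:

```python
import math

A = 87.6  # compression parameter of the telephone A-law

def alaw_compress(u: float) -> float:
    """A-law compression of a normalized sample u in [-1, 1]."""
    s = math.copysign(1.0, u)
    x = abs(u)
    if x < 1.0 / A:
        y = A * x / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * x)) / (1.0 + math.log(A))
    return s * y

def alaw_expand(y: float) -> float:
    """The inverse (expanding) transformation."""
    s = math.copysign(1.0, y)
    x = abs(y)
    if x < 1.0 / (1.0 + math.log(A)):
        u = x * (1.0 + math.log(A)) / A
    else:
        u = math.exp(x * (1.0 + math.log(A)) - 1.0) / A
    return s * u
```

Applying compression and then expansion returns the original sample, so the channel as a whole remains linear, exactly as described above.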

Data compression by non-linear procedures implemented with analog circuits suffers from large errors. Digital implementation can significantly improve the accuracy or the speed of conversion. At the same time, the direct use of computing (that is, the direct calculation of logarithms and exponentials) does not give the best result, because of low performance and accumulating rounding error.

Because of these accuracy limitations, companding is used in non-critical cases, for example for voice transmission over telephone and radio channels.

Efficient coding

Efficient codes were proposed by Shannon, Fano, and Huffman. Their essence is that they are non-uniform, i.e. use an unequal number of bits per symbol, and the length of a code is inversely related to the probability of the symbol's occurrence. Another remarkable feature of efficient codes is that they require no separators, i.e. special characters delimiting adjacent code combinations. This is achieved by a simple rule: no shorter code is the beginning (prefix) of a longer one. In this case the continuous bit stream is decoded unambiguously, since the decoder always recognizes a complete codeword before a longer one could begin. For a long time efficient codes were of purely academic interest, but more recently they have been successfully applied in databases, in the compression of information in modern modems, and in software archivers.

Because the code is non-uniform, the concept of average code length is introduced. The average length is the mathematical expectation of the codeword length:

l_av = Σ p_i · l_i ,

where l_i is the length of the code of the i-th symbol; moreover, l_av tends to H(x) from above (that is, l_av ≥ H(x)).

Condition (2.23) is approached more closely as N increases.

There are two classical types of efficient codes: Shannon-Fano and Huffman. Let us see how they are obtained using an example. Suppose the probabilities of the symbols in a sequence have the values shown in Table 2.1.

Table 2.1.

Symbol probabilities

N      1     2     3     4     5     6     7     8     9
p_i    0.1   0.2   0.1   0.3   0.05  0.15  0.03  0.02  0.05

The symbols are first ranked, i.e. arranged in a series in descending order of probability. After that, by the Shannon-Fano method, the following procedure is repeated: the whole group of symbols is divided into two subgroups with equal (or approximately equal) total probabilities. The procedure continues within each subgroup until a subgroup contains a single element, at which point that element receives its code and the actions continue with the rest. This goes on until the last two subgroups each hold one element. Let us continue our example, summarized in Table 2.2.
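The partition procedure just described can be sketched as follows. This is a minimal illustration: when the two halves can be balanced in more than one way, the tie-breaking choice may give code lengths slightly different from Table 2.2:

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, probability) pairs, ranked by descending probability."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        acc, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):        # split point with most nearly equal halves
            acc += group[i - 1][1]
            diff = abs((total - acc) - acc)
            if diff < best_diff:
                best_diff, best_i = diff, i
        upper, lower = group[:best_i], group[best_i:]
        for s, _ in upper:
            codes[s] += "1"                   # group I -> bit 1, as in the text
        for s, _ in lower:
            codes[s] += "0"                   # group II -> bit 0
        split(upper)
        split(lower)

    split(symbols)
    return codes
```

For the probabilities of Table 2.1 this sketch assigns, for example, 11 to the symbol with p = 0.3 and 10 to the symbol with p = 0.2, in agreement with the worked example.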

Table 2.2.

Shannon-Fano coding

N    p_i    Code
4    0.30   11
2    0.20   10
6    0.15   011
3    0.10   010
1    0.10   0011
9    0.05   0010
5    0.05   00001
7    0.03   000001
8    0.02   000000

As can be seen from Table 2.2, the first symbol, with probability p_4 = 0.3, participated in two partition procedures and both times fell into group I. Accordingly, it is coded with the two-digit code 11. The second element belonged to group I at the first partition and to group II at the second; therefore its code is 10. The codes of the other symbols need no additional comment.

Non-uniform codes are usually represented as code trees. A code tree is a graph showing the admissible code combinations. The directions of the edges of this graph are fixed in advance, as shown in Fig. 2.11 (the choice of directions is arbitrary).

The graph is used as follows: a route is traced to the chosen symbol; the number of code digits equals the number of edges in the route, and the value of each digit equals the direction of the corresponding edge. The route starts from the source vertex (marked A in the figure). For example, the route to vertex 5 consists of five edges, all but the last of which have direction 0; this gives the code 00001.

Let's calculate the entropy and average word length for this example.

H(x) = −(0.3 log 0.3 + 0.2 log 0.2 + 0.15 log 0.15 + 2·0.1 log 0.1 + 2·0.05 log 0.05 + 0.03 log 0.03 + 0.02 log 0.02) ≈ 2.76 bit

l_av = 0.3·2 + 0.2·2 + 0.15·3 + 0.1·3 + 0.1·4 + 0.05·5 + 0.05·4 + 0.03·6 + 0.02·6 = 2.9 .

As you can see, the average word length is close to entropy.
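The entropy and the average length are easy to recompute numerically (the code lengths below are taken from the worked example):

```python
import math

p = [0.3, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05, 0.03, 0.02]   # Table 2.1, ranked
lengths = [2, 2, 3, 3, 4, 4, 5, 6, 6]                     # Shannon-Fano code lengths

H = -sum(pi * math.log2(pi) for pi in p)                  # entropy, bits
l_av = sum(pi * li for pi, li in zip(p, lengths))         # average code length
print(f"H(x) = {H:.2f} bit, l_av = {l_av:.2f}")           # H(x) = 2.76 bit, l_av = 2.90
```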

Huffman codes are constructed by a different algorithm. The coding procedure consists of two stages. In the first stage, the alphabet is repeatedly compressed one step at a time: the two symbols with the lowest probabilities are replaced by a single symbol with their total probability. Compression continues until only two symbols remain. At the same time a coding table is filled in, recording the resulting probabilities and the routes along which the new symbols pass to the next stage.

In the second stage the actual coding takes place, starting from the last stage: of the two remaining symbols, the first is assigned the code 1 and the second the code 0. Then one moves back to the previous stage. Symbols that did not participate in the merge at that stage simply inherit their codes from the following stage; the two symbols that were merged both receive the code of the merged symbol, with a 1 appended for the upper one and a 0 for the lower one. If a symbol takes no further part in merging, its code remains unchanged. The procedure continues back to the first stage.
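The two-stage procedure (merge the two least probable symbols until two remain, then walk back assigning bits) is what a heap-based implementation does in one pass; a compact sketch:

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> code string."""
    tick = count()                      # tie-breaker so heapq never compares dict nodes
    heap = [(p, next(tick), {"sym": s}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, n1 = heapq.heappop(heap)   # lowest probability -> gets bit 0
        p2, _, n2 = heapq.heappop(heap)   # next lowest -> gets bit 1
        heapq.heappush(heap, (p1 + p2, next(tick), {"0": n1, "1": n2}))
    codes = {}

    def walk(node, prefix):
        if "sym" in node:
            codes[node["sym"]] = prefix or "0"
            return
        walk(node["0"], prefix + "0")
        walk(node["1"], prefix + "1")

    walk(heap[0][2], "")
    return codes
```

For the probabilities of Table 2.1, any valid Huffman tree gives the same minimal average length, 2.8 bits, although the individual codewords may differ between runs.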

Table 2.3 shows the Huffman coding. As can be seen from the table, coding was carried out in 7 stages. On the left are the symbol probabilities, on the right the intermediate codes. The arrows show the movement of the newly formed symbols. At each stage the last two symbols differ only in the least significant bit, which corresponds to the coding technique. Let us calculate the average word length:

l_av = 0.3·2 + 0.2·2 + 0.15·3 + 2·0.1·3 + 0.05·4 + 0.05·5 + 0.03·6 + 0.02·6 = 2.8 .

This is even closer to the entropy: the code is more efficient. Fig. 2.12 shows the Huffman code tree.

Table 2.3.

Huffman coding

N    p_i    Code
4    0.30   11
2    0.20   01
6    0.15   101
3    0.10   001
1    0.10   000
9    0.05   1000
5    0.05   10011
7    0.03   100101
8    0.02   100100

(Stages I-VII successively replace the two least probable symbols by one: 0.03 + 0.02 = 0.05; 0.05 + 0.05 = 0.1; 0.05 + 0.1 = 0.15; 0.1 + 0.1 = 0.2; 0.15 + 0.15 = 0.3; 0.2 + 0.2 = 0.4; 0.3 + 0.3 = 0.6. The two symbols remaining at stage VII, 0.6 and 0.4, receive the codes 1 and 0.)

Both codes satisfy the requirement of decoding unambiguity: as can be seen from the tables, shorter combinations are not the beginning of longer codes.

As the number of symbols grows, so does the efficiency of the codes; therefore, in some cases larger blocks are encoded (for texts, for example, one may encode the most frequent syllables, words, and even phrases).

The effect of introducing such codes is determined by comparison with a uniform code:

K = n / l_av , (2.24)

where n is the number of bits of the uniform code that is replaced by the efficient one.
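With the numbers from the example, and reading the relation as the ratio of the uniform length n to the average length of the efficient code, we get:

```python
import math

N = 9                           # alphabet size from Table 2.1
n = math.ceil(math.log2(N))     # bits of the uniform code: 4
l_av = 2.8                      # average Huffman code length from the example
K = n / l_av                    # effect of the efficient code
print(f"uniform code: {n} bits, K = {K:.2f}")  # uniform code: 4 bits, K = 1.43
```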

Huffman codes modifications

The classical Huffman algorithm is a two-pass one: it first requires collecting statistics of the symbols in the messages, and only then performs the procedures described above. This is inconvenient in practice, since it lengthens message processing and dictionary accumulation. One-pass methods, in which accumulation and coding are combined, are used more often. Such methods are known as adaptive Huffman compression [46].

The essence of adaptive Huffman compression is the construction of an initial code tree and its successive modification after each incoming symbol. As before, the trees here are binary, i.e. at most two arcs leave each vertex. The original vertex is called the parent, and the two subsequent vertices connected to it its children. Let us introduce the concept of the weight of a vertex: the number of symbols (words) corresponding to that vertex encountered so far in the input sequence. Obviously, the sum of the children's weights equals the parent's weight.

After the next symbol of the input sequence arrives, the code tree is revised: the weights of the vertices are recalculated and, if necessary, the vertices are rearranged. The rearrangement rule is as follows: the lower vertices have the smallest weights, and among vertices at the same level those on the left of the graph have the smallest weights.

The vertices are also numbered. Numbering starts from the bottom (hanging, i.e. childless) vertices, left to right, then moves up level by level, until the last, root vertex is numbered. This achieves the following: the smaller the weight of a vertex, the smaller its number.

Rearrangement is performed mainly on the hanging vertices. When rearranging, the rule formulated above must be observed: vertices with greater weight must also have greater numbers.

After the sequence (also called the control or test sequence) has been passed, code combinations are assigned to all hanging vertices. The assignment rule is analogous to the one above: the number of code bits equals the number of edges on the route from the root to the given hanging vertex, and the value of each bit corresponds to the direction of the step from parent to child (for example, a step to the left child is 1, to the right 0).

The resulting code combinations, together with their symbol counterparts, are stored in the memory of the compression device and form a dictionary. The algorithm is then used as follows: the sequence to be compressed is split into fragments according to the available dictionary, and each fragment is replaced by its code from the dictionary. Fragments not found in the dictionary form new hanging vertices, acquire weight, and are entered into the dictionary. Thus an adaptive dictionary-replenishment algorithm is obtained.

To improve the efficiency of the method it is desirable to increase the dictionary size, which raises the compression ratio. In practice the dictionary occupies 4-16 KB of memory.


Let us illustrate the algorithm with an example. Fig. 2.13 shows the initial tree (also called a Huffman tree). Each vertex is drawn as a rectangle containing two numbers separated by a slash: the first is the vertex number, the second its weight. As can be seen, the correspondence between vertex weights and numbers holds.

Suppose now that the symbol corresponding to vertex 1 occurs a second time in the test sequence. Its weight changes as shown in Fig. 2.14, so the vertex-numbering rule is violated. At the next step we rearrange the hanging vertices, swapping vertices 1 and 4, and renumber all vertices of the tree. The resulting graph is shown in Fig. 2.15. The procedure then continues in the same way.

It should be remembered that each hanging vertex in the Huffman tree corresponds to a certain symbol or group of symbols. A parent differs from its children in that its group of symbols is one symbol shorter than that of the children, and the children differ from each other in their last symbol. For example, if the parent corresponds to the symbols "kar", its children might correspond to the sequences "kara" and "karp".

This algorithm is far from academic: it is actively used in archiving programs, including for compressing graphic data (discussed below).

Lempel - Ziv Algorithms

These are the most commonly used compression algorithms today. They are used in most archiving programs (e.g. PKZIP, ARJ, LHA). The essence of the algorithms is that a certain set of symbols is replaced during archiving by its number in a specially formed dictionary. For example, the phrase "In reply to your letter ...", frequent in business correspondence, may occupy position 121 in the dictionary; then, instead of transmitting or storing the whole phrase (30 bytes), one can store the phrase number (1.5 bytes in binary-coded decimal form, or 1 byte in binary).

The algorithms are named after the authors, who first proposed them in 1977. The first of them is LZ77. For archiving, a so-called sliding window is created, consisting of two parts. The first, larger part serves as the dictionary and is on the order of several kilobytes in size. The second, smaller part (usually up to about 100 bytes) receives the current characters of the text being scanned. The algorithm tries to find in the dictionary a string of symbols matching those in the viewing window. If it succeeds, a code of three parts is formed: the offset of the matching substring in the dictionary, the length of the substring, and the character following it. For example, suppose the matched substring consists of the characters "app" followed by the character "e"; if the substring is at address 45 in the dictionary, the output entry has the form (45, 3, "e"). After that, the contents of the window shift by one position and the search continues. This is how the dictionary is formed.
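A naive sketch of the LZ77 sliding window (linear search instead of any clever data structure; default sizes follow the orders of magnitude given in the text):

```python
def lz77_compress(data: str, window: int = 4096, lookahead: int = 100):
    """Return (offset, length, next_char) triples -- the classic LZ77 output."""
    i, out = 0, []
    while i < len(data):
        start = max(0, i - window)        # dictionary part of the sliding window
        best_off, best_len = 0, 0
        for j in range(start, i):         # naive search for the longest match
            k = 0
            while (k < lookahead and i + k < len(data) - 1
                   and data[j + k] == data[i + k]):
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        nxt = data[i + best_len]          # character following the match
        out.append((best_off, best_len, nxt))
        i += best_len + 1
    return out

def lz77_decompress(triples):
    out = []
    for off, length, ch in triples:
        for _ in range(length):
            out.append(out[-off])         # copy from already-decoded text
        out.append(ch)
    return "".join(out)
```

Note that decompression needs no dictionary at all: the dictionary is rebuilt from the decoded text itself, exactly the property mentioned below.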

The advantage of the algorithm is its easily formalized dictionary-building procedure. In addition, decompression is possible without the original dictionary (a test sequence is desirable), since the dictionary is rebuilt in the course of decompression.

The disadvantages of the algorithm appear as the dictionary grows: the search time increases. Moreover, if the current window contains a string of characters absent from the dictionary, every character is written out as a three-element code, i.e. the result is not compression but expansion.

The later LZSS algorithm has better characteristics. It differs in the maintenance of the sliding window and in the output codes of the compressor. In addition to the window, the algorithm maintains a binary search tree of the substrings leaving the current window, to speed up the search for matches: each departing substring is added to the tree. This algorithm also allows a larger current window (its size is preferably a power of two: 128, 256, etc. bytes). The sequence codes are formed differently as well: an additional 1-bit prefix distinguishes uncoded characters from (offset, length) pairs.

An even higher compression ratio is obtained with algorithms of the LZW type. The algorithms described earlier have a fixed window size, which makes it impossible to enter phrases longer than the window into the dictionary. In LZW (and its predecessor LZ78) the viewing region is effectively unlimited, and the dictionary accumulates phrases rather than fixed-window substrings. The dictionary has unlimited length, and the coder (and decoder) works in phrase-accumulation mode: when the accumulated phrase stops matching the dictionary, the code of the longest matching phrase is emitted and the new, longer phrase is entered into the dictionary. The result is a recursive procedure that provides fast encoding and decoding.
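A compact sketch of the LZW idea with an unlimited, phrase-accumulating dictionary:

```python
def lzw_compress(data: str):
    """LZW: the dictionary grows without limit; output is a list of phrase codes."""
    dictionary = {chr(c): c for c in range(256)}
    phrase, out = "", []
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                               # keep extending the phrase
        else:
            out.append(dictionary[phrase])             # emit longest known phrase
            dictionary[phrase + ch] = len(dictionary)  # enter new, longer phrase
            phrase = ch
    if phrase:
        out.append(dictionary[phrase])
    return out

def lzw_decompress(codes):
    dictionary = {c: chr(c) for c in range(256)}
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        entry = dictionary.get(code, prev + prev[0])   # handles the KwKwK case
        out.append(entry)
        dictionary[len(dictionary)] = prev + entry[0]
        prev = entry
    return "".join(out)
```

As with LZ77, the decoder rebuilds the dictionary on the fly, so nothing but the code stream needs to be transmitted.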

Additional compression is provided by run-length encoding of repeated symbols. If some symbol repeats many times in a row (spaces in a text, consecutive zeros in a numeric sequence, etc.), it makes sense to replace the run with a "symbol, length" pair. In the general case the code contains a flag indicating an encoded run (usually 1 bit), then the code of the repeated symbol and the length of the run. In the second case (reserved for the most frequently repeated symbols) the prefix itself indicates the repetition attribute.
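The run-length scheme with a marker prefix can be sketched as follows (the marker byte is an arbitrary illustrative choice; it is assumed not to occur in the data, and runs are assumed shorter than 256):

```python
def rle_encode(data: str, min_run: int = 3, marker: str = "\x00"):
    """Replace runs of >= min_run identical characters with (marker, char, length)."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(marker + data[i] + chr(run))   # run length stored as one byte
        else:
            out.append(data[i:j])                     # short runs copied verbatim
        i = j
    return "".join(out)

def rle_decode(data: str, marker: str = "\x00"):
    out, i = [], 0
    while i < len(data):
        if data[i] == marker:
            out.append(data[i + 1] * ord(data[i + 2]))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return "".join(out)
```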


Records, especially older ones recorded and produced before 1982, were far less likely to be mixed loud and compressed. They reproduce music with a natural dynamic range that is retained on the record and lost in most standard digital or high-definition releases.

There are, of course, exceptions: listen to a recently released Steven Wilson album, or to releases from MA Recordings or Reference Recordings, and you will hear how good digital sound can be. But this is rare; most modern recordings are loud and compressed.

Compression of music has come under serious criticism lately, but I would argue that almost all of your favorite recordings are compressed. Some less, some more, but compressed all the same. Dynamic range compression is the scapegoat for poor-sounding music, yet heavily compressed music is not a new trend: listen to 60s Motown albums. The same can be said of the Led Zeppelin classics or the newer albums by Wilco and Radiohead. Dynamic range compression reduces the natural ratio between the loudest and quietest sounds on a recording, so a whisper can be as loud as a scream. It is pretty hard to find pop music of the past 50 years that has not been compressed.

I recently had a nice chat with Tape Op founder and editor Larry Crane about the good, bad and ugly sides of compression. Larry Crane has worked with bands and artists such as Stephen Malkmus, Cat Power, Sleater-Kinney, Jenny Lewis, M. Ward, The Go-Betweens, Jason Lytle, Elliott Smith, Quasi and Richmond Fontaine. He also runs Jackpot! Recording Studio in Portland, Oregon, home to The Breeders, The Decemberists, Eddie Vedder, Pavement, R.E.M., She & Him and many, many more.

As an example of surprisingly unnatural-sounding but still great songs, I cite Spoon's album They Want My Soul, released in 2014. Crane laughs and says he listens to it in the car, because it sounds great there. Which brings us to another answer to the question of why music is compressed: compression and the extra "clarity" make it easier to hear in noisy places.

Larry Crane at work. Photo by Jason Quigley

When people say they like the sound of a recording, I believe they simply like the music, as if sound and music were inseparable terms. For myself, though, I separate these concepts. The sound can be rough and raw; from a music lover's point of view that won't matter to most listeners.

Many are quick to accuse mastering engineers of overusing compression, but compression is applied already during recording, then during mixing, and only then during mastering. Unless you were personally present at each of these stages, you cannot tell how the instruments and vocals sounded at the very beginning of the process.

Crane warms to the subject: "If a musician deliberately wants an insane, distorted sound, like Guided by Voices records, there is nothing wrong with that - the intent always outweighs sound quality." A performer's voice is almost always compressed, and the same goes for bass, drums, guitars and synthesizers. Compression keeps the vocal at the desired level throughout the song, or lets it stand out slightly from the rest of the sounds.

Correct compression can make a drum sound livelier, or intentionally weird. For music to sound great, you have to know how to use the right tools, which is why it takes years to learn how to apply compression without overdoing it. If the mix engineer has over-compressed the guitar part, the mastering engineer can no longer fully restore what was lost.

If musicians wanted you to hear music that had not gone through mixing and mastering, they would ship it to the shelves straight from the studio. Crane says the people who record, edit, mix and master music are not there to get in the artists' way - they have been helping artists from the very beginning, for over a century.

These people are part of the creative process that produces amazing works of art. Crane adds: "You don't want a version of Dark Side of the Moon that hasn't gone through mixing and mastering." Pink Floyd released the album the way they wanted to hear it.

The sound level is the same throughout the entire composition, there are several pauses.

Narrowing the dynamic range

Narrowing the dynamic range, or more simply compression, is necessary for various purposes, the most common of them are:

1) Achieving a uniform volume level throughout a composition (or an instrument part).

2) Achieving a uniform volume level across the songs of an album or a radio broadcast.

3) Improving intelligibility, mainly when compressing a particular part (vocals, bass drum).

How does the dynamic range decrease?

The compressor analyzes the input sound level by comparing it with a user-defined threshold value (Threshold).

If the signal level is below the Threshold, the compressor continues to analyze the sound without changing it. If the sound level exceeds the Threshold, the compressor starts acting. Since the compressor's role is to narrow the dynamic range, it limits the extreme amplitude values of the signal. At the first stage the largest values are limited; the strength of this reduction is called the Ratio. Let's look at an example:

The green curves represent the signal; the greater the amplitude of their oscillations about the X axis, the greater the signal level.

The yellow line is the compressor's threshold (Threshold). Raising the Threshold moves the line away from the X axis; lowering it brings the line closer to the X axis. Clearly, the lower the threshold, the more often the compressor is triggered; the higher it is, the less often. If the Ratio is very high, then once the signal reaches the Threshold, the entire subsequent signal is suppressed down to silence; if the Ratio is very small, nothing happens at all. The choice of Threshold and Ratio values will be discussed later. For now we should ask: what is the point of suppressing all subsequent sound? Indeed, there is none; we only need to get rid of the amplitude values (peaks) that exceed the Threshold (marked in red on the graph). It is to solve this problem that the Release parameter exists, which sets the duration of the compression.
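The Threshold/Ratio behaviour described above is simply a piecewise-linear gain curve; a minimal sketch in decibel terms (parameter names follow the text, the numbers are illustrative):

```python
def compress_sample(level_db: float, threshold_db: float, ratio: float) -> float:
    """Static hard-knee compression curve: levels above the Threshold are
    reduced by the Ratio; anything below the Threshold passes unchanged."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

# A peak at -3 dB with Threshold = -12 dB and Ratio = 3:1 comes out at
# -12 + (-3 - (-12)) / 3 = -9 dB.
```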

The example shows that the first and second crossings of the Threshold last less time than the third. So if the Release parameter is tuned for the first two peaks, part of the third may remain unprocessed (since it exceeds the Threshold for longer). If Release is tuned for the third peak, then after the first and second peaks an unwanted dip in signal level appears behind them.

The same goes for the Ratio parameter: tuned for the first two peaks, it will not suppress the third enough; tuned for the third, it will over-process the first two.

These problems can be solved in two ways:

1) Setting the Attack parameter is a partial solution.

2) Dynamic compression is a complete solution.

The Attack parameter sets the time after which the compressor starts acting once the Threshold has been exceeded. If the parameter is close to zero (equal to zero in the case of parallel compression - see the corresponding article), the compressor begins suppressing the signal immediately and acts for the time set by Release. If the attack time is long, the compressor begins acting only after a certain delay (this is needed to preserve clarity). In our case we can tune the Threshold, Release and compression level (Ratio) for the first two peaks and set the Attack close to zero: the compressor will then suppress the first two peaks, and it will suppress the third for as long as the Threshold is exceeded. However, this does not guarantee high-quality processing and is close to limiting (a rough cutting-off of all amplitude values above the threshold; a compressor used this way is called a limiter).
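Putting Threshold, Ratio, Attack and Release together gives the classic envelope-follower compressor. A simplified sketch (linear amplitudes, one-pole smoothing; the constants are illustrative, not from any plug-in):

```python
import math

def compress_block(samples, threshold, ratio, attack_ms, release_ms, sr=44100):
    """Peak compressor with attack/release smoothing of the envelope (a sketch).
    samples and threshold are linear amplitudes, not decibels."""
    att = math.exp(-1.0 / (sr * attack_ms / 1000.0))    # fast smoothing coefficient
    rel = math.exp(-1.0 / (sr * release_ms / 1000.0))   # slow smoothing coefficient
    env, out = 0.0, []
    for x in samples:
        peak = abs(x)
        coeff = att if peak > env else rel              # rise on attack, fall on release
        env = coeff * env + (1.0 - coeff) * peak
        if env > threshold:
            target = threshold + (env - threshold) / ratio
            gain = target / env                         # gain reduction above Threshold
        else:
            gain = 1.0
        out.append(x * gain)
    return out
```

With attack_ms near zero and a very high ratio this degenerates into the limiter mentioned above.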

Let's look at the result of the sound processing by the compressor:

The peaks are gone. Note that the processing settings were quite gentle and we suppressed only the most prominent amplitude values. In practice the dynamic range is narrowed far more, and the trend is only progressing. Many producers believe this makes the music louder, but in practice it completely strips the dynamics for those listeners who play it at home rather than on the radio.

It remains to consider the last compression parameter, Gain. Gain increases the amplitude of the entire composition and is essentially equivalent to another sound-editor tool, normalize. Let's see the final result:

In our case the compression was justified and improved the sound quality, since the prominent peak was an accident rather than a deliberate result. It is also clear that the music is rhythmic and therefore has a narrow dynamic range. In cases where high amplitude values were made deliberately, compression may be a mistake.

Dynamic compression

The difference between dynamic and ordinary (static) compression is that in dynamic compression the amount of gain reduction (Ratio) depends on the input signal level. Dynamic compressors are available in all modern programs; the Ratio and Threshold parameters are controlled in a window where each parameter has its own axis:

There is no single standard for displaying the graph: on some, the Y axis shows the input signal level; on others, the level after compression. On some, the point (0,0) is in the upper right corner, on others in the lower left. In any case, moving the mouse cursor over this field changes the numbers corresponding to the Ratio and Threshold parameters. That is, you set the compression amount for every Threshold value, which makes the adjustment very flexible.

Side Chain

A side-chain compressor analyzes the signal of one channel and, when its level exceeds the threshold, applies compression to another channel. Side-chaining is most useful for instruments occupying the same frequency region (the bass and kick drum pairing is the classic use), but sometimes instruments in different frequency regions are used, producing an interesting side-chain effect.
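The ducking behaviour can be sketched in its crudest form (no attack/release smoothing; the threshold and gain values are illustrative):

```python
def sidechain_duck(trigger, target, threshold=0.5, duck_gain=0.3):
    """Whenever |trigger| (e.g. the kick drum) exceeds the threshold,
    the target channel (e.g. the bass) is attenuated."""
    return [x * (duck_gain if abs(t) > threshold else 1.0)
            for t, x in zip(trigger, target)]
```

A real side-chain compressor would smooth the gain with attack and release times, as described earlier, instead of switching it instantly.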

Part Two - Stages of Compression

There are three stages of compression:

1) The first stage is the compression of individual sounds (one-shots).

The timbre of any instrument has the following envelope stages: Delay, Attack, Hold, Decay, Sustain, Release.

The stage of compression of individual sounds is divided into two parts:

1.1) Compressing individual rhythm instrument sounds

Often the constituent beats require separate compression to give them crispness. Many people process the bass drum separately from the other rhythm instruments, both at the stage of compressing individual sounds and at the stage of compressing individual parts. This is because it sits in the low-frequency region, where usually only the bass is present besides it. The clarity of a bass drum means the presence of a characteristic click (a bass drum has very short attack and hold times). If there is no click, the drum must be processed with a compressor, setting the threshold to zero and the attack time to 10-50 ms. The compressor's release must end before the kick strikes again. The latter can be computed from the formula 60000 / BPM, where BPM is the tempo of the composition: for example, 60000 / 137 = 437.96 (the time in milliseconds to the next beat of the composition).
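The 60,000 / BPM rule in code (a trivial check):

```python
def beat_interval_ms(bpm: float) -> float:
    """Time between quarter-note beats: 60,000 / BPM milliseconds."""
    return 60_000.0 / bpm

print(round(beat_interval_ms(137), 2))  # 437.96 ms to the next beat
```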

All of the above applies to other rhythmic instruments with a short attack time: they should have an accented click that must not be suppressed by the compressor at any stage of the compression chain.

1.2) Compression of individual sounds of harmonic instruments

Unlike rhythmic instruments, the parts of harmonic instruments rarely consist of separate one-shots. However, this does not mean they should not be processed at the individual-sound level. If you use a sample with an already recorded part, that belongs to the second compression level; this level covers synthesized harmonic instruments only. These can be samplers and synthesizers using various methods of sound synthesis (physical modeling, FM, additive, subtractive, etc.). As you have probably guessed, we are talking about programming synthesizer settings. Yes, this is compression too! Almost all synthesizers have a programmable envelope (ADSR). The envelope sets the Attack, Decay, Sustain, and Release times. And if you tell me that this is not compression of every single sound - you are my enemy for life!

2) Second stage - Compression of individual parts.

By compression of individual parts I mean narrowing the dynamic range of a series of combined individual sounds. This stage also covers recorded parts, including vocals, which need compression to be clear and intelligible. When compressing parts, bear in mind that combining individual sounds can create unwanted peaks, which must be removed at this stage: if you do not do it now, the picture may worsen when the whole composition is mixed. At this stage you must also take into account the compression done at the individual-sound stage: if you have achieved clarity of the bass drum, careless re-processing at the second stage can ruin everything. Processing every part with a compressor is not obligatory, just as not every individual sound needs it. I advise inserting an amplitude analyzer, just in case, to catch unwanted side effects of combining individual sounds. Besides compression, at this stage the parts should, as far as possible, be kept in different frequency ranges. It is also useful to remember that sound exhibits masking (psychoacoustics):

1) A quieter sound is masked by a louder sound that comes just before it.

2) A quieter sound at a low frequency is masked by a louder sound at a high frequency.

For example, in a synthesizer part the notes often start playing before the previous notes have finished sounding. Sometimes this is necessary (harmony, playing style, polyphony), but sometimes it is not, and you can cut off their tails (Delay - Release) if they are audible in solo mode but not when all parts play together. The same applies to effects such as reverb: it should not last until the sound source starts again. By cutting out unnecessary signal you make the sound cleaner, and this too can be considered compression, because you remove unnecessary waves.

3) Third stage - Compression of the composition.

When compressing an entire composition, keep in mind that every part is a combination of many separate sounds. When you combine and then compress them, take care that the final compression does not spoil what was achieved in the first two stages. You also need to distinguish compositions for which a wide dynamic range matters from those for which a narrow one does. When compressing a composition with a wide dynamic range, it is enough to insert a compressor that crushes the short-term peaks formed when the parts are added together. Compressing a composition where a narrow dynamic range matters is much harder. The compressors for this have lately been called maximizers. A maximizer is a plug-in that combines a compressor, limiter, graphic equalizer, enhancer, and other sound-processing tools, and it must include sound-analysis tools as well. Maximizing, the final compressor pass, is largely needed to deal with mistakes made in the earlier stages. These are not so much compression mistakes (although doing at the last stage what could have been done at the first is already a mistake) as mistakes in the initial choice of good samples and instruments that do not interfere with each other in frequency range; this is what frequency-response correction is for. It often happens that strong compression on the master forces you to change the compression and mixing parameters at earlier stages: with a strong narrowing of the dynamic range, quiet sounds that were previously masked come forward, and the sound of individual components of the composition changes.

In these parts I deliberately did not talk about specific compression parameters. I felt it necessary to write about the need to pay attention to all sounds and all parts during compression at every stage of creating a composition. Only this way will you end up with a result that is harmonious not just from the point of view of music theory, but also from the point of view of sound engineering.

The table below provides practical advice for processing individual parts. In compression, however, numbers and presets can only point you to the area in which to search; the ideal compression setting differs from case to case. The Gain and Threshold values assume a normal signal level (sensible use of the entire range).

Part Three - Compression Parameters

Quick reference:

Threshold - the level of the incoming signal at which the compressor starts to act.

Attack - the time the compressor takes to start acting once the signal exceeds the threshold.

Level (Ratio) - the degree of amplitude reduction, relative to the original amplitude value.

Release - the time the compressor takes to stop acting once the signal falls back below the threshold.

Gain - the amount by which the signal level is raised after processing by the compressor (make-up gain).

Compression table (where a cell lists several values separated by slashes, they come from different sources):

Tool | Threshold | Attack | Ratio | Release | Gain | Description
Vocals | 0 dB | 1-2 ms / 2-5 ms / 10 ms / 0.1 ms | under 4:1 / 2.5:1 / 4:1-12:1 / 2:1-8:1 | 150 ms / 50-100 ms / 0.5 s | - | Compression during recording should be minimal; processing at the mixing stage is mandatory to make the vocal clear and intelligible.
Wind instruments | - | 1-5 ms | 6:1-15:1 | 0.3 s | - | -
Kick drum | - | 10-50 ms / 10-100 ms | 4:1 and higher / 10:1 | 50-100 ms / 1 ms | - | The lower the Threshold, the higher the Ratio and the longer the Attack, the more pronounced the click at the start of the kick.
Synthesizers | - | - | - | - | - | Depends on the wave type (ADSR envelopes).
Snare drum | - | 10-40 ms / 1-5 ms | 5:1 / 5:1-10:1 | 50 ms / 0.2 s | - | -
Hi-hat | - | 20 ms | 10:1 | 1 ms | - | -
Overhead microphones | - | 2-5 ms | 5:1 | 1-50 ms | - | -
Drums | - | 5 ms | 5:1-8:1 | 10 ms | - | -
Bass guitar | - | 100-200 ms / 4-10 ms | 5:1 | 1 ms / 10 ms | - | -
Strings | - | 0-40 ms | 3:1 | 500 ms | - | -
Synth bass | - | 4-10 ms | 4:1 | 10 ms | - | Depends on the envelopes.
Percussion | - | 0-20 ms | 10:1 | 50 ms | - | -
Acoustic guitar, piano | - | 10-30 ms / 5-10 ms | 4:1 / 5:1-10:1 | 50-100 ms / 0.5 s | - | -
Electric guitar | - | 2-5 ms | 8:1 | 0.5 s | - | -
Final compression | - | 0.1 ms | 2:1 / 2:1-3:1 | 50 ms / 0.1 ms | 0 dB output | The attack time depends on the goal: whether you want to remove the peaks or make the track smoother.
Limiter after final compression | - | 0 ms | 10:1 | 10-50 ms | 0 dB output | If you need a narrow dynamic range and a coarse "cut" of the waves.

The information was gathered from various sources found on the Internet. The spread in compression parameters reflects differences in sound preferences and in the material being worked on.

Dynamic compression (dynamic range compression, DRC) is the narrowing (or, in the case of an expander, widening) of the dynamic range of a recording. The dynamic range is the difference between the quietest and the loudest sound: typically the quietest sound in a recording sits slightly above the noise floor, and the loudest slightly below the maximum level the medium allows. Hardware devices and programs that perform dynamic compression are called compressors; four main groups are distinguished among them: compressors proper, limiters, expanders, and gates.

Vacuum tube analog compressor DBX 566

Up and down compression

Downward compression reduces the volume of a sound once it exceeds a certain threshold, leaving quieter sounds unchanged. Its extreme case is the limiter. Upward compression, on the contrary, raises the volume of a sound that is below the threshold, without affecting louder sounds. Both kinds of compression narrow the dynamic range of the audio signal.

Down compression

Up-compression

Expander and Gate

Whereas the compressor reduces the dynamic range, the expander increases it. When the signal level rises above the threshold, the expander raises it even further, increasing the difference between loud and quiet sounds. Such devices are often used when recording drum kits to separate the sounds of some drums from others.

The type of expander used not to amplify loud sounds but to mute quiet ones that fail to reach the threshold (background noise, for example) is called a noise gate. In such a device, as soon as the level falls below the threshold, the signal flow stops. A gate is usually used to suppress noise during pauses. On some models the sound can be made to fade out gradually rather than cut off abruptly when the level drops below the threshold; the decay rate is then set with the Decay knob.

A gate, like other types of compressors, can be frequency-dependent (i.e. treat certain frequency bands differently) and can operate in side-chain mode (see below).
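As a sketch of the idea described above (not of any particular device), a gate with a gradual decay can be written per-sample; the threshold and decay constants here are purely illustrative:

```python
def noise_gate(samples, threshold, decay=0.99):
    """Pass samples while their level reaches the threshold; once the
    level falls below it, fade the gain toward zero at `decay` per sample
    (a crude stand-in for the Decay knob)."""
    out, gain = [], 1.0
    for s in samples:
        gain = 1.0 if abs(s) >= threshold else gain * decay
        out.append(s * gain)
    return out
```

With `decay` close to 1.0 the pauses fade out slowly; with a small `decay` the gate snaps shut almost immediately.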

Compressor working principle

The signal entering the compressor is split into two copies. One copy goes to an amplifier whose gain is controlled by an external signal; the second copy forms that control signal. It goes into the part of the device called the side-chain, where its level is measured and, from this data, an envelope describing the change in its volume is derived.
This is how most modern compressors are built, the so-called feed-forward type. In older devices (the feedback type) the signal level is measured after the amplifier.

There are various analog variable-gain amplification technologies, each with its own advantages and disadvantages: tube, optical (using photoresistors), and transistor. When working with digital audio (in a sound editor or DAW), one can use dedicated mathematical algorithms or emulate the behavior of the analog technologies.

Basic parameters of compressors

Threshold

The compressor reduces the level of the audio signal if its amplitude exceeds a certain threshold value. The threshold is usually specified in decibels; a lower threshold (e.g. -60 dB) means more of the signal is processed than with a higher threshold (e.g. -5 dB).

Ratio

The amount of level reduction is determined by the Ratio parameter: a ratio of 4:1 means that if the input level is 4 dB above the threshold, the output level will be 1 dB above the threshold.
For example:
Threshold = −10 dB
Input signal = −6 dB (4 dB above the threshold)
Output signal = −9 dB (1 dB above the threshold)
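The arithmetic above can be captured in a tiny helper. This is just the static input/output law, not a full compressor with attack and release timing:

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Static hard-knee compression law: levels above the threshold
    are scaled down by 1/ratio; levels below it pass unchanged."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

# The 4:1 example from the text: -6 dB in, threshold -10 dB -> -9 dB out
print(compressed_level_db(-6.0, -10.0, 4.0))  # -9.0
```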

It is important to keep in mind that the reduction of the signal level continues for some time after the signal falls below the threshold; this time is determined by the Release parameter.

Compression with the maximum ratio of ∞:1 is called limiting: any signal above the threshold is suppressed down to the threshold level (except for a short period after a sudden increase in input volume). See Limiter below for details.

Examples of different Ratio values

Attack and Release

The compressor gives some control over how quickly it reacts to changes in signal dynamics. The Attack parameter determines the time the compressor needs to reduce the gain to the level determined by the Ratio parameter. Release determines the time over which the compressor, on the contrary, raises the gain again, returning to normal once the input level falls below the threshold.

Attack and Release phases

These parameters specify the time (usually in milliseconds) needed to change the gain by a certain number of decibels, usually 10 dB. For example, with Attack set to 1 ms it will take 1 ms to reduce the gain by 10 dB and 2 ms to reduce it by 20 dB.

In many compressors the Attack and Release parameters are adjustable, but in some they are preset and fixed. Sometimes they are labeled "automatic" or "program-dependent", meaning they vary with the input signal.
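One common way (though not the only one) to realize attack and release behavior is one-pole exponential smoothing of the gain. The sketch below uses an illustrative time-constant convention rather than the 10 dB definition above:

```python
import math

def smooth_gain_db(target_db, current_db, attack_ms, release_ms, fs=48000):
    """Move the gain one sample toward its target.
    The attack time applies when gain reduction deepens (gain falls),
    the release time when the gain recovers toward 0 dB."""
    tau_ms = attack_ms if target_db < current_db else release_ms
    coeff = math.exp(-1.0 / (fs * tau_ms / 1000.0))  # per-sample smoothing factor
    return target_db + coeff * (current_db - target_db)
```

Calling this once per sample, with the target gain taken from the static compression curve, produces the familiar lagging gain envelope.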

Knee

One more compressor parameter is the hard/soft Knee. It determines whether compression sets in abruptly or gradually. A soft knee makes the transition from dry to compressed signal less noticeable, especially at high ratios and sudden jumps in volume.

Hard Knee and Soft Knee Compression
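A soft knee is often formulated as a quadratic blend between the two straight segments of the static curve over a knee of a given width in dB. This is one textbook formulation, sketched for illustration, not the behavior of any particular unit:

```python
def static_gain_db(x_db, threshold_db, ratio, knee_db):
    """Static compression curve with a soft knee of width knee_db."""
    if 2 * (x_db - threshold_db) < -knee_db:
        return x_db  # well below the threshold: signal passes unchanged
    if 2 * (x_db - threshold_db) > knee_db:
        return threshold_db + (x_db - threshold_db) / ratio  # full compression
    # inside the knee: quadratic interpolation between the two segments
    return x_db + (1 / ratio - 1) * (x_db - threshold_db + knee_db / 2) ** 2 / (2 * knee_db)
```

At the knee edges this curve meets the hard-knee segments exactly, so the transition into compression is continuous.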

Peak and RMS

The compressor can respond either to peak (short-term maximum) values or to the average input level. Reacting to peaks can cause sharp fluctuations of the compression ratio and even distortion, so compressors usually apply an averaging function (most often RMS) to the input signal before comparing it with the threshold. This gives a more comfortable compression, closer to the human perception of loudness.

RMS is a parameter reflecting the average loudness of a recording. Mathematically, RMS (Root Mean Square) is the square root of the mean of the squared amplitudes of a certain number of samples: RMS = sqrt((x₁² + x₂² + … + x_N²) / N).
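As a small sketch, the RMS of a block of samples:

```python
import math

def rms(samples):
    """Root mean square: sqrt of the mean of the squared amplitudes."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

print(rms([1.0, -1.0, 1.0, -1.0]))  # 1.0 (a full-scale square wave)
```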

Stereo linking

A compressor in stereo linking mode applies the same gain to both stereo channels. This avoids shifts of the stereo image that could result from processing the left and right channels individually; such a shift can occur when, for example, a loud element is panned off-center.

Makeup gain

Since the compressor reduces the overall signal level, compressors usually offer a fixed output (make-up) gain to bring the signal back to the optimum level.

Look-ahead

The look-ahead function is designed to solve the problems of both too-long and too-short Attack and Release times. An attack time that is too long does not catch transients effectively, while one that is too short can be uncomfortable for the listener. With look-ahead, the main signal is delayed relative to the control signal, which allows compression to begin early, before the signal actually reaches the threshold.
The only drawback of this method is the time delay of the signal, which in some cases is undesirable.

Using dynamic compression

Compression is used everywhere: not only in music recordings, but wherever the overall loudness must be raised without raising peak levels, wherever inexpensive sound-reproducing equipment or a channel of limited capacity is used (public-address and intercom systems, amateur radio, and so on).

Compression is used when playing background music (in shops, restaurants, and so on), where noticeable changes in volume are undesirable.

But the most important field of application for dynamic compression is music production and broadcasting. Compression gives the sound "density" and "drive", helps instruments blend with one another, and is especially useful when processing vocals.

Vocals in rock and pop music are usually compressed to make them stand out against the accompaniment and to add clarity. A special kind of compressor tuned to certain frequencies only, the de-esser, is used to suppress sibilant phonemes.

In instrumental parts, compression is also used for effects not directly related to volume: for example, quickly decaying drum sounds can be made longer.

Side-chaining (see below) is often used in electronic dance music (EDM): for example, the bass line can be ducked by the kick drum to keep the bass and the kick from clashing and to create a rhythmic pumping effect.

Compression is widely used in broadcasting (radio, television, webcasting) to increase the perceived loudness while reducing the dynamic range of the original audio (usually a CD). Most countries have legal restrictions on the instantaneous maximum level that may be broadcast; these restrictions are usually enforced by permanent hardware compressors in the broadcast chain. In addition, raising the perceived loudness improves the "quality" of the sound in the opinion of most listeners.

See also: Loudness war.

The successively increasing loudness of the same song as remastered for CD from 1983 to 2000.

Side-chaining

Another common compressor mode is the side-chain. In this mode the sound is compressed not according to its own level, but according to the level of the signal fed into a connector usually called the side chain.

There are several uses for this. For example, suppose a vocalist has prominent sibilants, and every "s" jumps out of the overall picture. You run the voice through the compressor, and into the side-chain connector you feed the same sound passed through an equalizer, on which you remove all frequencies except those the vocalist uses when pronouncing "s": usually around 5 kHz, though anywhere from 3 kHz to 8 kHz. If you then switch the compressor into side-chain mode, the voice is compressed at the moments when "s" is pronounced. This is how the device known as the de-esser came about. This way of working is called frequency-dependent.

Another use of this feature is called the ducker. For example, at a radio station the music goes through the compressor and the DJ's voice through the side chain: when the DJ starts talking, the volume of the music is automatically reduced. The same effect can be put to good use in recording, for example to lower the volume of the keyboard parts while the vocalist is singing.
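The ducker logic can be sketched per-sample. This is only an illustration of the idea; a real ducker would also smooth the gain with attack and release times, and the threshold and reduction values here are arbitrary:

```python
def duck(music, voice, threshold=0.05, reduction=0.3):
    """While the side-chain (voice) signal exceeds the threshold,
    attenuate the main (music) signal by `reduction`."""
    return [m * (reduction if abs(v) > threshold else 1.0)
            for m, v in zip(music, voice)]
```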

Brick wall limiting

A compressor and a limiter work in roughly the same way; one can say that a limiter is a compressor with a high Ratio (from 10:1) and, usually, a short Attack time.

There is also the notion of brick-wall limiting: limiting with a very high Ratio (20:1 and higher) and a very fast attack. Ideally it does not let the signal exceed the threshold at all. The result can be unpleasant to the ear, but it prevents damage to sound-reproducing equipment and keeps the signal within the capacity of the channel. Many manufacturers build limiters into their devices for exactly this purpose.
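In its crudest digital form, brick-wall behavior amounts to hard clipping at a ceiling. This is a sketch of the limiting idea only, not of a real limiter, which would use look-ahead and gain smoothing to avoid audible distortion:

```python
def brick_wall(samples, ceiling=1.0):
    """Never let the output magnitude exceed the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

print(brick_wall([0.5, 1.7, -2.0]))  # [0.5, 1.0, -1.0]
```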

Clipper vs. Limiter, soft and hard clipping

DRC (Dynamic Range Compression)

An encoding technology used in DVD players with built-in sound decoders and in receivers. Dynamic range compression (reduction) is used to limit sound peaks when watching movies. If the viewer wants to watch a film in which sudden jumps in volume are possible (a war film, for example) without disturbing the rest of the family, DRC should be turned on. Subjectively, by ear, after DRC is switched on the proportion of low frequencies decreases and high sounds lose transparency, so the DRC mode should not be enabled without need.

DreamWeaver (see: FrontPage)

A visual hypertext-document editor developed by the software company Macromedia Inc. This powerful professional program can generate HTML pages of any complexity and scale and has built-in support for large network projects. It is a visual design tool that supports advanced WYSIWYG concepts.

Driver

A software component that enables interaction with computer devices such as a network interface card (NIC), keyboard, printer, or monitor. Network equipment (such as a hub) connected to a PC requires drivers so that the PC can communicate with it.

DRM (Digital Rights Management)

1. A concept that assumes the use of special technologies and methods of protecting digital materials, so that they are provided only to authorized users.

2. A client program for interacting with the Digital Rights Management Services package, which is designed to control access to copyrighted information and its copying. DRM Services runs on Windows Server 2003; the client software runs on Windows 98, Me, 2000, and XP, giving applications such as Office 2003 access to the corresponding services. Microsoft is expected to release a digital rights management plug-in for Internet Explorer, and in the future such a program is planned on any computer that works with DRM-protected content, to protect it against illegal copying.

Droid (Robot) (see: Agent)

DSA (Digital Signature Algorithm)

Public key digital signature algorithm. Developed by NIST (USA) in 1991.

DSL (Digital Subscriber Line)

A modern technology supported by city telephone exchanges for exchanging signals at higher frequencies than those used by conventional analog modems. A DSL modem can work simultaneously with a telephone (analog signal) and a digital line. Since the spectra of the telephone voice signal and the digital DSL signal do not overlap, i.e. do not interfere with each other, DSL allows you to surf the Internet and talk on the phone over the same physical line. Moreover, DSL technology usually uses multiple frequencies, and the DSL modems at both ends of the line try to find the best ones for data transmission. A DSL modem not only transmits data but can also act as a router; equipped with an Ethernet port, it makes it possible to connect several computers.

DSOM (Distributed System Object Model, Distributed SOM)

IBM technology with appropriate software support.

DSR (1) (Data Set Ready)

A serial interface signal indicating that a device (for example, a modem) is ready to send a data bit to the PC.

DSR (2) (Device Status Report)

DSR (3) (Device Status Register)

DSS (Decision Support System)