Presentation on computer science on the topic: "Measuring information. Alphabetical approach" 7th grade

In everyday life, each of us measures something. For example, when we were children, our parents measured our height. It is exciting to find out that in just one year you have grown by as much as 5 centimeters! For this we used a ruler and a door jamb, marking the height on it with a notch every year.

Each measurement requires its own instrument and its own unit of measurement.

Thus, the mass of a body is measured with scales in kilograms, time with a clock in seconds, etc.

For those starting to study computer science, the question naturally arises: in what units should information be measured?

Smallest unit of information

Computer science uses its own special unit to measure information. It is called the “bit”, a contraction of the two English words “binary digit”.

To measure information, it must first be encoded as binary digital data, as you remember. Only then can we determine the size of a set of digital data stored in a file. A bit is the smallest unit of information: there is no unit of information smaller than one bit.

One bit contains a very small piece of information. After all, it can only take one of two specific values (1 or 0).

Measuring information only in bits is therefore extremely inconvenient: the numbers become very large. It would be like measuring our height in millimeters.

For example, 8 bits are enough to encode one character of text. A group of 8 bits is called a byte.

§4. Measuring Information

Main topics of the paragraph:

- alphabetical approach to measuring information;
- alphabet, power of the alphabet;
- information weight of a symbol;
- information volume of a text;
- units of information.

Questions studied:

- Alphabet, the power of the alphabet.
- 1 bit: the information weight of a symbol of the binary alphabet.
- N = 2^b: the formula for determining the information weight of a symbol.
- Information volume of a text.
- Units of measurement of information: byte, kilobyte, megabyte, gigabyte.

Material for in-depth study of the topic “Measuring Information”

Questions studied:

- Content-based (meaningful) approach to measuring information
- Uncertainty of knowledge
- Hartley’s formula

Alphabetical approach to measuring information

Now let's discuss how information can be measured. There are several approaches to measuring information. Here we will look at only one, called the alphabetic approach*.

The alphabetic approach measures the information volume of a text in some language (natural or formal) without regard to the content of that text.

You are well aware that there are units of measurement for quantities such as distance, mass and time. For distance it is the meter, for mass the gram, for time the second. Measurement consists of comparing the measured quantity with a unit of measurement. The number of times the unit fits into the measured quantity is the result of the measurement. Consequently, to measure information, its own unit of measurement must be introduced.

* For another approach to measuring information, see section 1.1 of the in-depth study material in the Supplement to Chapter I.

Alphabet. Power of the alphabet

By the alphabet of a language we mean the set of letters, punctuation marks, digits, brackets and other symbols used in a text. The alphabet also includes the space, that is, the gap between words.

The total number of symbols in an alphabet is called the power of the alphabet. We will denote it by the letter N. For example, the power of the alphabet made up of the Russian letters and the additional symbols just mentioned is 54: 33 letters + 10 digits + 11 punctuation marks, brackets and the space.

Information weight of the symbol

With the alphabetic approach, each character of a text is assumed to have a certain information weight, which depends on the power of the alphabet. What is the smallest number of characters an alphabet can have? Two! You will soon learn that this is the alphabet used in computers. It contains only 2 characters, denoted by the digits 0 and 1, and is called the binary alphabet. By studying the design and operation of a computer, you will learn how any information can be represented using just two characters.

The information weight of a symbol of the binary alphabet is taken as a unit of information and is called 1 bit.

As the power of the alphabet increases, so does the information weight of its symbols. For example, one character of a four-character alphabet (N = 4) “weighs” 2 bits. The reason is that all the characters of such an alphabet can be encoded by all possible combinations of two digits of the binary alphabet: 00, 01, 10, 11. A combination of several (two, three, etc.) characters of the binary alphabet is called a binary code.

Using three binary digits, 8 different combinations can be made.

Therefore, if the power of the alphabet is 8, then the information weight of one character is 3 bits.

All characters in a 16-character alphabet can be encoded in four-digit binary codes, etc.
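The counts above can be checked by listing every binary code of a given length. The following is a minimal Python sketch; the function name binary_codes is an illustrative choice and does not come from the text.

```python
from itertools import product


def binary_codes(b):
    """Return every binary code of length b as a string, in counting order."""
    # product("01", repeat=b) yields all b-long tuples over the symbols 0 and 1
    return ["".join(bits) for bits in product("01", repeat=b)]


codes = binary_codes(3)
print(len(codes))  # number of distinct three-digit binary codes
print(codes)
```

Calling binary_codes(2) or binary_codes(4) in the same way lists the codes for the four-character and 16-character alphabets.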

Let's find the relationship between the power of the alphabet (N) and the number of characters in the code (b) - the bit depth of the binary code.

Note that 2 = 2^1, 4 = 2^2, 8 = 2^3, 16 = 2^4.

In general, this is written as follows:

N = 2^b.

The bit depth of a binary code is the information weight of a symbol.

If N is not an integer power of two, the information weight of a symbol is determined as follows: take the smallest integer power of two M that is greater than N, that is, N < M = 2^b. The resulting value of b is taken as the information weight of the symbol. For example, if N = 12, then M = 16 = 2^4. Hence the information weight of a symbol from an alphabet of power 12 is 4 bits. In other words, the 12 characters of the alphabet are encoded with 4-bit binary codes.
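The rule for finding b from N can be sketched in a few lines of Python. This is only an illustration: the name information_weight is hypothetical, and the sketch relies on the fact that the smallest b with 2^b >= N equals the number of binary digits needed to write N - 1.

```python
def information_weight(n):
    """Smallest whole number of bits b such that 2**b >= n."""
    if n < 2:
        raise ValueError("an alphabet needs at least two symbols")
    # (n - 1).bit_length() counts the binary digits of the largest code index
    return (n - 1).bit_length()


for n in (2, 4, 8, 12, 16, 256):
    print(n, information_weight(n))
```

For N = 12 the function returns 4, matching the worked example, and for exact powers of two it returns the exponent itself.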

Information volume of the text. Units of information

The information volume of a text is the sum of the information weights of its constituent characters. For example, the following text, written using the binary alphabet:

1101001011000101110010101101000111010010

contains 40 characters, therefore, its information volume is 40 bits.
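As a small check of this count, the volume of a text can be computed as the number of characters multiplied by the information weight of one character. The Python sketch below assumes every character has the same weight; the function name information_volume_bits is an illustrative choice.

```python
def information_volume_bits(text, alphabet_size):
    """Information volume of text in bits, for an alphabet of the given size."""
    # Weight of one symbol: smallest b with 2**b >= alphabet_size
    bits_per_symbol = (alphabet_size - 1).bit_length()
    return len(text) * bits_per_symbol


message = "1101001011000101110010101101000111010010"
print(information_volume_bits(message, 2))  # 40
```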

Today, computers are most often used to prepare text documents. The alphabet from which such “computer text” is composed contains 256 characters. An alphabet of this size can accommodate all practically necessary symbols: lowercase and uppercase Latin and Russian letters, numbers, arithmetic operations signs, all kinds of brackets, punctuation marks, etc.

Since 256 = 2^8, one character of the computer alphabet “weighs” 8 bits. A quantity of eight bits is called a byte.

1 byte = 8 bits.

It is easy to calculate the information volume of a text if it is known that the information weight of one character is 1 byte. You just need to count the number of characters in the text. The resulting value will be the information volume of the text, expressed in bytes.

For example, a small book prepared using a computer contains 150 pages. Each page has 40 lines, each line has 60 characters (including spaces between words). This means that the page contains 40 × 60 = 2400 bytes of information. To calculate the information volume of the entire book, you need to multiply the resulting value by the number of pages:

2400 bytes × 150 = 360,000 bytes.

This example already shows that a byte is a “small” unit. Imagine that you need, for example, to measure the information volume of an entire library. In bytes this will be a huge number!

To measure larger information volumes, larger units are used:

1 kilobyte = 1 KB = 2^10 bytes = 1024 bytes

1 megabyte = 1 MB = 2^10 KB = 1024 KB

1 gigabyte = 1 GB = 2^10 MB = 1024 MB

1 terabyte = 1 TB = 2^10 GB = 1024 GB

Treating a kilobyte as roughly 1000 bytes, the information volume of the book above is approximately 360 kilobytes. Calculated more precisely:

360,000 : 1024 = 351.5625 KB.

351.5625 : 1024 ≈ 0.3433 MB.
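The same book calculation can be reproduced in Python. This is a minimal sketch; the function name book_volume_bytes and its parameters are chosen for illustration, and it assumes 1 byte per character as in the text.

```python
def book_volume_bytes(pages, lines_per_page, chars_per_line, bytes_per_char=1):
    """Information volume of a book in bytes (1 byte per character by default)."""
    return pages * lines_per_page * chars_per_line * bytes_per_char


total_bytes = book_volume_bytes(150, 40, 60)
kilobytes = total_bytes / 1024   # 1 KB = 1024 bytes
megabytes = kilobytes / 1024     # 1 MB = 1024 KB

print(total_bytes)           # 360000
print(kilobytes)             # 351.5625
print(round(megabytes, 4))   # 0.3433
```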

In conclusion, let us once again draw attention to an important property of the alphabetic approach considered here. When using it, the content of the text is not taken into account. A text consisting of a meaningless combination of characters will have a non-zero information volume.

Briefly about the main thing

The alphabetic approach is a way of measuring the information volume of a text that is not related to its content.

An alphabet is the entire collection of symbols used in a language to represent information. The power of an alphabet is the number of characters in it.

1 bit is the information weight of one character of a two-character alphabet (N = 2).

The information weight of a symbol (the bit depth of its binary code) (b) and the power of the alphabet (N) are related by the formula: N = 2^b.

If N is not an integer power of two, then the smallest value M = 2^b (b is an integer) greater than N is found, and b, the information weight of the symbol, is determined from this equality.

The information volume of the text is equal to the sum of the information weights of all the characters that make up the text.

1 byte is the information weight of a character from an alphabet with a power of 2^8 = 256 characters. 1 byte = 8 bits.

Byte, kilobyte, megabyte, gigabyte, terabyte are units of measurement of information. Each subsequent unit is 1024 (2^10) times larger than the previous one.

Questions and tasks

1. What is the alphabet?

2. What is the power of the alphabet?

3. How is the information volume of a text determined when using the alphabetical approach?

4. A text is written using an alphabet with a power of 64 characters and contains 100 characters. What is the information volume of the text?

5. What is a byte, kilobyte, megabyte, gigabyte, terabyte?

6. The information volume of the text prepared using a computer is 3.5 KB. How many characters does this text contain?

7. Two texts contain the same number of characters. The first text is composed in an alphabet with a power of 32 characters, the second in an alphabet with a power of 64 characters. By what factor do the information volumes of these texts differ?

Large units of information

Because the byte is such a small unit, larger units of information were introduced in computer science: the kilobyte, megabyte, gigabyte and terabyte, each 1024 times larger than the previous one.

There are also even larger units of information:

1 PB = 1024 TB Petabyte (PByte)
1 EB = 1024 PB Exabyte (EByte)
1 ZB = 1024 EB Zettabyte (ZByte)
1 YB = 1024 ZB Yottabyte (YByte)
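The chain of units can be walked programmatically by dividing by 1024 until the value drops below 1024. The Python sketch below is only an illustration: the names UNITS and to_largest_unit are hypothetical, and it follows the text's convention of a factor of 1024 between neighbouring units.

```python
# Unit names in ascending order; each is 1024 times the previous one
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_largest_unit(n_bytes):
    """Express n_bytes in the largest unit that keeps the value below 1024."""
    value = float(n_bytes)
    index = 0
    while value >= 1024 and index < len(UNITS) - 1:
        value /= 1024
        index += 1
    return value, UNITS[index]


print(to_largest_unit(360000))     # (351.5625, 'KB')
print(to_largest_unit(1024 ** 5))  # (1.0, 'PB')
```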

Let us give examples to compare different volumes of digitized information.

One byte is occupied by a single character typed on the keyboard.

A low-resolution phone image takes about 100 KB.

1 MB is a small work of fiction.

Three gigabytes is roughly 1 hour of good-quality video.

One gigabyte is roughly the amount of text a person can read in an entire lifetime.

Information volume of a text message

How do we find, for example, the information volume of the message “Informatics is the main science of our time.”? Count the total number of characters in the message (the text inside the quotes), including the spaces between words (to a computer, a space is also a character). In total we get 44 characters, or 44 bytes at 1 byte per character.

Now let us find out how much information a 100-page book contains if each page has 50 lines and each line has 60 characters. 100 ⋅ 50 ⋅ 60 = 300,000 characters, which is 300,000 bytes. In kilobytes this is 300,000 bytes / 1024 ≈ 292.97 KB, and in megabytes 292.97 KB / 1024 ≈ 0.29 MB.

Tasks for the topic Alphabetical approach to measuring information

Problem 1. The Multi tribe alphabet consists of 8 letters. How much information does 1 letter of this alphabet carry?

Problem 2. The information volume of one character of a message in the alphabet of the Pulti tribe is 6 bits. How many characters are in the alphabet of this tribe, with which the Pultans composed this message?

Problem 3. A message written in letters from a 128-character alphabet contains 30 characters. How much information does it carry?

Problem 4. A message composed using a 32-character alphabet contains 80 characters. Another message is composed using a 64-character alphabet and contains 70 characters. Compare the amount of information contained in the messages.

Problem 5. A 4 KB information message contains 4096 characters. How many characters does the alphabet with which this message was written contain?

Problem 6. How many kilobytes is a message of 512 characters of a 16-character alphabet?

Problem 7. A 256-character alphabet was used to write the text. Each page contains 30 lines of 70 characters per line. How much information do 5 pages of text contain?

Problem 8. The message takes 3 pages of 25 lines. Each line contains 60 characters. How many characters are in the alphabet used if the entire message contains 1125 bytes?

Problem 9. The user enters text from the keyboard at a speed of 90 characters per minute. How much information is contained in a text that took 15 minutes to type (using a computer alphabet)?

Problem 10. The user entered text from the keyboard for 10 minutes. What is his information input speed if the information volume of the resulting text is 1 KB?

Problem 11. The researcher observes a change in a parameter that can take one of seven values. Values are written using a minimum number of bits. The researcher recorded 120 values. Determine the information volume of the observation results.

Problem 12. If each character is encoded in two bytes, then what is the information volume of the following sentence in Unicode: Today it is 35 degrees Celsius.

Problem solutions

Problem 1. Solution: 2^i = N, 2^i = 8, i = 3 bits. Answer: 3 bits.

Problem 2. Solution: N = 2^i = 2^6 = 64 characters. Answer: 64 characters.

Problem 3. Given: N = 128, K = 30. Find: It. Solution: 1) It = K * i, where i is the information volume of one character; 2) 2^i = N, 2^i = 128, i = 7 bits, the volume of one character; 3) It = 30 * 7 = 210 bits, the volume of the entire message. Answer: the entire message carries 210 bits.

Problem 4. Given: N1 = 32, K1 = 80, N2 = 64, K2 = 70. Find: It1, It2. Solution: 1) It = K * i, where i is the information volume of one character; 2) 2^i1 = N1, 2^i1 = 32, i1 = 5 bits, the volume of one character of the first message; 3) 2^i2 = N2, 2^i2 = 64, i2 = 6 bits, the volume of one character of the second message; 4) It1 = K1 * i1 = 80 * 5 = 400 bits, the volume of the first message; 5) It2 = K2 * i2 = 70 * 6 = 420 bits, the volume of the second message. Answer: the second message contains more information than the first.

Problem 5. Given: K = 4096, It = 4 KB. Find: N. Solution: 1) N = 2^i; 2) It = K * i, so i = It / K = 4 * 1024 * 8 / 4096 = 8 bits, the volume of one character; 3) N = 2^8 = 256 characters, the power of the alphabet. Answer: the alphabet contains 256 characters.

Problem 6. Given: N = 16, K = 512. Find: It. Solution: 1) It = K * i, where i is unknown; 2) N = 2^i, 16 = 2^i, i = 4 bits, the volume of one character; 3) It = 512 * 4 = 2048 bits, the volume of the entire message; 4) 2048 / 8 = 256 bytes, and 256 / 1024 = 0.25 KB. Answer: the entire message occupies 0.25 KB.

Problem 7. Given: N = 256, x = 30 (number of lines), y = 70 (number of characters in a line), M = 5 (number of pages). Find: It. Solution: 1) N = 2^i, 256 = 2^i, i = 8 bits = 1 byte, the volume of one character; 2) K = x * y * M = 30 * 70 * 5 = 10,500 characters in the text; 3) It = i * K = 1 * 10,500 = 10,500 bytes ≈ 10.25 KB, the volume of the entire text. Answer: the volume of the entire text is 10,500 bytes, or about 10.25 KB.

Problem 8. Given: It = 1125 bytes, x = 25 (number of lines), y = 60 (number of characters in a line), M = 3 (number of pages). Find: N. Solution: 1) N = 2^i, where i is unknown; 2) It = K * i, so i = It / K; 3) K = x * y * M = 25 * 60 * 3 = 4500 characters in the text; 4) i = It / K = 1125 * 8 / 4500 = 2 bits, the volume of one character; 5) N = 2^2 = 4 characters in the alphabet. Answer: there are 4 characters in the alphabet.
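The inverse calculation used in Problems 5 and 8 (from the volume of a text and its character count back to the size of the alphabet) can be sketched in Python. The function name alphabet_from_volume is hypothetical, and the sketch assumes the volume divides into a whole number of bits per character.

```python
def alphabet_from_volume(volume_bytes, char_count):
    """Return (bits per character, alphabet size) for a text of known volume."""
    total_bits = volume_bytes * 8
    if total_bits % char_count != 0:
        raise ValueError("volume is not a whole number of bits per character")
    bits_per_char = total_bits // char_count
    # N = 2^i: an i-bit code distinguishes 2**i symbols
    return bits_per_char, 2 ** bits_per_char


print(alphabet_from_volume(4 * 1024, 4096))     # Problem 5: (8, 256)
print(alphabet_from_volume(1125, 25 * 60 * 3))  # Problem 8: (2, 4)
```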

Problem 9. Given: V = 90 characters/min, t = 15 min, N = 256. Find: It. Solution: 1) It = K * i; 2) K = V * t = 90 * 15 = 1350 characters in the text; 3) N = 2^i, 256 = 2^i, i = 8 bits = 1 byte, the volume of one character; 4) It = 1350 * 1 = 1350 bytes ≈ 1.3 KB, the volume of the entire text. Answer: the text contains about 1.3 KB of information.

Problem 10. Given: It = 1 KB, t = 10 min. Find: V. Solution: 1) V = K / t, where K is unknown; 2) K = It / i; since the power of the computer alphabet is 256, i = 1 byte. Therefore K = 1 * 1024 / 1 = 1024 characters in the text. 3) V = 1024 / 10 ≈ 102 characters/min. Answer: the text input speed is about 102 characters per minute.

Problem 11. Solution.

We know the number of distinct values that must be encoded, each with the same number of bits: seven. The alphabet here is the bit, which can take only two values (0 and 1). To determine the minimum number of bits needed to encode one value, we use Hartley’s formula: to what power must two be raised to get seven? We know that 2^2 = 4 and 2^3 = 8, so the exponent lies between 2 and 3 and is fractional. But the number of bits cannot be fractional, so 3 bits are required to encode one value.

Since the researcher recorded 120 values, the total information volume of the observations is 3 * 120 = 360 bits, or 360 / 8 = 45 bytes.

Answer.
The information volume of 120 observations, each taking one of seven different values, is 45 bytes.

Problem 12. Solution.

Let's count the total number of characters in the sentence, including spaces, digits and punctuation marks. The sentence as printed here contains 31 characters. Each character is encoded in two bytes. This means that the information volume of the sentence is 31 * 2 = 62 bytes, or 62 * 8 = 496 bits.

Answer.
The information volume of the sentence is 496 bits.