CODE-BASED HYBRID CRYPTOSYSTEM: COMPARATIVE STUDIES AND ANALYSIS OF EFFICIENCY

1) V. N. Karazin Kharkiv National University, Svobody sq., 6, Kharkiv, 61022, Ukraine, nastyak931@gmail.com 2) JSC “Institute of Information Technologies”, Bakulin St., 12, Kharkiv, 61166, Ukraine, gorbenkou@iit.kharkov.ua 3) State Service of Special Communication and Information Protection, Solomianska 13 str., Kyiv, 03680, Ukraine 4) National Academy of Internal Affairs, 1 Solomjanska Square, Kyiv, 03035, Ukraine, alex_korneiko@meta.ua 5) Central Ukrainian National Technical University, University Avenue 8, Kropyvnytskyi, 25006, Ukraine, smirnov.ser.81@gmail.com


INTRODUCTION
The vast majority of modern cryptographic systems rely on the complexity of solving a particular mathematical problem, such as discrete logarithm or factorization [1][2]. By contrast, cryptosystems based on coding are not currently widely used, but this may change radically in the near future. The change is driven by the worldwide effort to build a full-scale quantum computer, which is expected to speed up computations by dozens or even hundreds of times [3]. With this in mind, research into post-quantum cryptography, that is, algorithms resistant to both quantum and classical cryptanalysis, has become relevant.
There are five main areas of research: hash-based cryptography, lattice-based cryptography, multivariate cryptography, supersingular elliptic curve isogeny cryptography and code-based cryptography [4]. In our study, we focus on the last direction, taking several factors into account. Firstly, code-based systems can provide channel error control as an additional benefit. Secondly, a high speed of cryptographic transformation and resistance to classical and quantum cryptanalysis distinguish code-based systems from their competitors [5].
The most popular code-based cryptosystems are the McEliece and Niederreiter schemes. After analyzing their structures, advantages and disadvantages, we propose a new, so-called hybrid cryptosystem that combines the encryption principles of the two above-mentioned systems and provides additional significant benefits, which are considered below.

McELIECE CRYPTOSYSTEM
The McEliece cryptosystem is the classic code-based cryptosystem. It was proposed more than 30 years ago and is still considered resistant not only to classical but also to quantum cryptanalysis. Its distinctive feature is masking the fast decoding rule by multiplying the generator matrix of an algebraic block code by random matrices, which form the secret key [6]. An attacker who knows only the public key has to use a complex algorithm for non-algebraic decoding; this problem is known to be NP-complete. An authorized user who holds the private key removes the effect of the masking matrices and applies a fast algebraic decoding algorithm.
The encryption algorithm of the McEliece scheme is as follows:
1) Choose the generator matrix $G$ of an algebraic block code of length $n$ and dimension $k$ over $GF(q)$, a non-degenerate $k \times k$ matrix $X$, and matrices $P$ and $D$, which are permutation and diagonal $n \times n$ matrices respectively (for binary codes, $D$ is not used).
2) Form the matrix $G_X = X \cdot G \cdot P \cdot D$. It is the public key of the McEliece scheme. The matrices $X$, $P$ and $D$ are the private key.
3) Form the cryptogram according to the rule $c_X^* = I \cdot G_X + e$, where $e$ is an error vector whose Hamming weight meets the requirement $w_h(e) \le t$, with $t$ the error-correcting capability of the code, and $I$ is the $k$-bit information vector over the field $GF(q)$.
After completing the above steps, we obtain a codeword $c_X = I \cdot G_X$ distorted by the error vector. The vector $e$ should be regarded as a one-time private key. Its weight determines the complexity of decoding the distorted codeword (cryptogram) [7].
The decryption algorithm can be described by the following steps:
1) Remove the masking matrices $P$ and $D$ from the cryptogram: $c^* = c_X^* \cdot D^{-1} \cdot P^{-1}$.
2) Decode $c^*$ with the fast algebraic decoding algorithm of the code with generator matrix $G$, which yields the vector $I' = I \cdot X$.
3) Calculate the initial $k$-bit information vector $I = I' \cdot X^{-1}$ [6][7].
Consequently, McEliece decryption is performed by removing the masking matrices and using an algorithm of polynomial complexity [8].
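As an illustration, the McEliece encryption and decryption steps can be sketched in Python on a toy binary Hamming (7,4) code with $t = 1$. The choice of code, the matrix `X`, the permutation `PERM` and all function names are assumptions made for this example only; practical schemes use large Goppa codes, and this sketch offers no security.

```python
# Toy binary Hamming (7,4) code, t = 1 (illustrative assumption, not a secure code).
K, N = 4, 7
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(K)] + A[i] for i in range(K)]          # G = [I | A]
H = [[A[i][r] for i in range(K)] + [int(r == j) for j in range(N - K)]  # H = [A^T | I]
     for r in range(N - K)]

X = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]  # private, invertible
PERM = [2, 5, 0, 6, 1, 3, 4]                                   # private permutation P


def vec_mat(v, m):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] & m[i][j] for i in range(len(v))) % 2 for j in range(len(m[0]))]


def gf2_inverse(m):
    """Invert a square GF(2) matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


X_INV = gf2_inverse(X)
# Public key G_X = X * G * P (D is omitted in the binary case).
G_PUB = [[row[p] for p in PERM] for row in (vec_mat(x_row, G) for x_row in X)]


def encrypt(info, error):
    """Cryptogram c = I * G_X + e, with Hamming weight of e at most t = 1."""
    assert sum(error) <= 1
    return [c ^ e for c, e in zip(vec_mat(info, G_PUB), error)]


def decrypt(cryptogram):
    y = [0] * N
    for j, p in enumerate(PERM):            # step 1: undo the permutation P
        y[p] = cryptogram[j]
    syndrome = [sum(a & b for a, b in zip(row, y)) % 2 for row in H]
    if any(syndrome):                       # step 2: correct the single error
        columns = [[H[r][j] for r in range(N - K)] for j in range(N)]
        y[columns.index(syndrome)] ^= 1
    return vec_mat(y[:K], X_INV)            # step 3: I = I' * X^-1 (G is systematic)


message = [1, 0, 1, 1]
assert decrypt(encrypt(message, [0, 0, 0, 0, 1, 0, 0])) == message
```

Because the toy generator matrix is systematic, the vector $I' = I \cdot X$ is read directly from the first $k$ positions of the corrected codeword; a general code would need an explicit decoding map at that step.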

NIEDERREITER CRYPTOSYSTEM
We next consider the code-theoretic Niederreiter scheme. Like the McEliece scheme, it relies on masking matrices [7][8][9]. Its encryption algorithm is as follows:
1) Choose the parity-check matrix $H$ of an algebraic block code of length $n$ and dimension $k$ over $GF(q)$.
2) Form a private key containing the following components: $X$ is a non-degenerate $(n-k) \times (n-k)$ matrix with elements of $GF(q)$, $P$ is a permutation $n \times n$ matrix, and $D$ is a diagonal $n \times n$ matrix (this matrix is not used for binary codes).
3) Calculate the public key in accordance with the rule $H_X = X \cdot H \cdot P \cdot D$.
4) Form the cryptogram by multiplying the vector $e$ by the transposed public key: $S_X = e \cdot H_X^T$.
The cryptogram consists of $(n-k)$ elements [10]. The vector $e$ carries the information to be encrypted: the information vector is first transformed into $e$ using equilibrium (constant-weight) coding. Upon receiving a message, a legitimate user, in the same way as in the McEliece cryptosystem, removes the action of the masking matrices and, using the fast decoding algorithm, obtains the vector $e$, which, after equilibrium decoding, yields the initially transmitted information [11].
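The Niederreiter steps can be illustrated with a minimal Python sketch on the same kind of toy code, a binary Hamming (7,4) code with $t = 1$, so that the cryptogram is a 3-element syndrome. The matrices `X`, `X_INV`, the permutation `PERM` and the function names are hypothetical choices for this example and do not come from the scheme's description.

```python
# Parity-check matrix of a toy binary Hamming (7,4) code (illustrative assumption).
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N, R = 7, 3                               # code length n and n - k

X = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]     # private non-degenerate (n-k) x (n-k) matrix
X_INV = [[1, 1, 1], [0, 1, 1], [0, 0, 1]] # its inverse over GF(2)
PERM = [2, 5, 0, 6, 1, 3, 4]              # private permutation P


def mat_vec(m, v):
    """Matrix times column vector over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in m]


def mat_mul(a, b):
    """Matrix product over GF(2)."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))] for i in range(len(a))]


# Sanity check that X_INV really inverts X.
assert mat_mul(X, X_INV) == [[int(i == j) for j in range(R)] for i in range(R)]

# Public key H_X = X * H * P (D is omitted in the binary case).
H_PUB = [[row[p] for p in PERM] for row in mat_mul(X, H)]


def encrypt(e):
    """Cryptogram S_X = e * H_X^T; e must have Hamming weight at most t = 1."""
    assert sum(e) <= 1
    return mat_vec(H_PUB, e)


def decrypt(s_x):
    s = mat_vec(X_INV, s_x)               # remove the masking matrix X
    e_perm = [0] * N
    if any(s):                            # syndrome decoding of a single error
        columns = [[H[r][j] for r in range(R)] for j in range(N)]
        e_perm[columns.index(s)] = 1
    return [e_perm[p] for p in PERM]      # remove the permutation P


e = [0, 0, 0, 1, 0, 0, 0]
assert decrypt(encrypt(e)) == e
```

The cryptogram here has $n - k = 3$ elements while the vector $e$ has $n = 7$, which matches the cryptogram length stated above.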

A NEW HYBRID CRYPTOSYSTEM
Taking into account the proven stability of the considered cryptosystems (discussed in more detail in the following sections), we propose a new hybrid cryptosystem, which retains the advantages of its predecessors and even improves on their performance. The proposed system combines encryption of information according to the McEliece and Niederreiter schemes. The private keys of the hybrid scheme are similar to those of the first two schemes: the matrix $X$ (of size $k \times k$), the matrix $P$ (of size $n \times n$) and, in the case of non-binary coding, the matrix $D$ (of size $n \times n$) [7,11].
The public key is the matrix $G_X = X \cdot G \cdot P \cdot D$, where $G$ is the generator matrix of the code. To encrypt, the information vector is divided into two components, $I_1$ and $I_2$. After that, the cryptogram is formed: $c_X^* = I_1 \cdot G_X + e$. Here the first component of the information is multiplied by the public key, as in the McEliece transformation. The second component is converted according to the Niederreiter scheme, namely, $I_2$ of length $m$ is transformed into an encoded information vector $e$ of length $n$ (for example, using equilibrium coding).
For the generated vector, the following condition must be fulfilled [7]: $w_h(e) \le t$. To provide maximum stability, it is recommended to maximize the Hamming weight of the vector $e$, because enumerating all possible values of this vector then becomes much more complicated. Decryption in the hybrid scheme proceeds just as in the McEliece scheme described above, with the only difference that information is extracted not only from the vector $I_1$, but also from the error vector $e$ [12][13][14]. This significantly increases the relative speed of information transmission, which is discussed below.
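One possible way to turn the component $I_2$ into a vector $e$ of fixed weight is lexicographic enumeration of constant-weight vectors, sketched below in Python. This particular enumeration and the function names are assumptions for illustration; the text only requires some equilibrium coding and does not prescribe this one.

```python
from math import comb


def capacity_bits(n, t):
    """Largest m such that every m-bit value maps to a weight-t vector of length n."""
    return comb(n, t).bit_length() - 1    # floor(log2 C(n, t))


def int_to_constant_weight(value, n, t):
    """Map 0 <= value < C(n, t) to a binary vector of length n and weight t."""
    assert 0 <= value < comb(n, t)
    vector, remaining = [], t
    for pos in range(n):
        # Number of admissible vectors that have 0 at this position.
        with_zero = comb(n - pos - 1, remaining)
        if value < with_zero:
            vector.append(0)
        else:
            vector.append(1)
            value -= with_zero
            remaining -= 1
    return vector


def constant_weight_to_int(vector):
    """Inverse mapping: recover the integer from a constant-weight vector."""
    n, remaining, value = len(vector), sum(vector), 0
    for pos, bit in enumerate(vector):
        if bit:
            value += comb(n - pos - 1, remaining)
            remaining -= 1
    return value


def split_for_hybrid(bits, n, k, t):
    """Split a bit list into I_1 (k bits) and the error vector e that encodes I_2."""
    m = capacity_bits(n, t)
    i1, i2 = bits[:k], bits[k:k + m]
    value = int("".join(map(str, i2)), 2) if i2 else 0
    return i1, int_to_constant_weight(value, n, t)


i1, e = split_for_hybrid([1, 0, 1, 1, 0, 1], n=7, k=4, t=1)
assert i1 == [1, 0, 1, 1] and sum(e) == 1
```

The sender would then form $I_1 \cdot G_X + e$, and the receiver, after decoding, would apply `constant_weight_to_int` to the recovered error vector to obtain $I_2$.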

COMPARATIVE ANALYSIS OF CRYPTOSYSTEMS
To compare the effectiveness of the cryptosystems, we use such factors as the relative speed of information transmission, resistance to classical and quantum cryptanalysis, the volume of key data that a cryptosystem needs, and the length of the ciphertext for each alternative.

RELATIVE SPEED OF INFORMATION TRANSMISSION
First, let us consider the relative speed of information transmission. It describes the amount of information contained in a cryptogram relative to the total length of this cryptogram.
The relative speed for the McEliece scheme is the simplest to estimate, since any cryptogram formed by this algorithm has length $n$, whereas the initial information vector has a length of $k$ bits. Consequently, the relative transmission speed in this case [12][13][14] is equal to $R_M = k / n$.
The relative speed of information transmission for the Niederreiter scheme is discussed in detail in [7]. According to these data, in the binary case with a cryptogram of $(n-k)$ elements it equals $R_N = \lfloor \log_2 C_n^t \rfloor / (n - k)$.
In the hybrid cryptosystem, the ciphertext has length $n$, whereas the information is encoded by combining the principles of McEliece and Niederreiter, dividing it into two components $I_1$ and $I_2$. The component $I_1$ has a length of $k$ bits, and $I_2$ is converted by equilibrium encoding, so the maximum possible number of hidden bits, defined as in the Niederreiter scheme, equals $m = \lfloor \log_2 C_n^t \rfloor$. That is, the relative speed of information transmission for the hybrid cryptosystem can be estimated as $R_H = (k + \lfloor \log_2 C_n^t \rfloor) / n$.
From the above, one can immediately conclude that in terms of relative speed the hybrid system is far ahead of its predecessors, because it encodes two components [7].
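The comparison of relative speeds can be reproduced numerically with a short Python sketch. It assumes the binary case, an error vector of weight exactly $t$ (so that $\lfloor \log_2 C_n^t \rfloor$ bits can be hidden in it), and a Niederreiter cryptogram of $n - k$ elements; the function names and the sample parameters $n = 1024$, $k = 524$, $t = 50$ are illustrative assumptions.

```python
from math import comb


def hidden_bits(n, t):
    """floor(log2 C(n, t)): bits that fit into a weight-t binary vector of length n."""
    return comb(n, t).bit_length() - 1


def rate_mceliece(n, k):
    """k information bits per cryptogram of length n."""
    return k / n


def rate_niederreiter(n, k, t):
    """Bits hidden in e per cryptogram of n - k elements (assumed form)."""
    return hidden_bits(n, t) / (n - k)


def rate_hybrid(n, k, t):
    """k bits from I_1 plus the bits hidden in e, per cryptogram of length n."""
    return (k + hidden_bits(n, t)) / n


n, k, t = 1024, 524, 50   # sample parameters, chosen only for illustration
for name, rate in [("M", rate_mceliece(n, k)),
                   ("N", rate_niederreiter(n, k, t)),
                   ("H", rate_hybrid(n, k, t))]:
    print(f"{name}: R = {rate:.3f}")
```

Under these assumptions the hybrid rate always exceeds the McEliece rate for the same code, since the numerator grows by the number of bits carried in the error vector while the cryptogram length stays $n$.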

STABILITY TO CLASSICAL CRYPTANALYSIS
Researchers have proved that the stability of the McEliece and Niederreiter schemes is equivalent. The proof is as follows. Assume that we know the syndrome $c = e H_X^T$. We can compute a vector $b = a G_X + e$ such that $c = b H_X^T$, and treat $b$ as a ciphertext of the McEliece system. Provided that an attack with complexity $W$ exists for the McEliece system, there is a known algorithm for computing the vector $a$, which is the secret information in the McEliece scheme. Then the vector $e$, which contains the secret information in the Niederreiter system, can be represented in the binary case in the form $e = a G_X + b$, that is, the complexity of determining the vector $e$ coincides with the complexity of determining the vector $a$. Conversely, when there is an effective attack on the Niederreiter scheme, it can be applied to the syndrome $(a G_X + e) H_X^T = e H_X^T$ of a McEliece ciphertext, and the vectors $e$ and $a$ are calculated. From this the equivalence of the stability estimates of the McEliece and Niederreiter cryptosystems and the hybrid cryptosystem follows [15].
The security of all three cryptosystems rests on the intractability of such fundamental problems of coding theory as the general problem of decoding linear codes and the problem of finding a codeword of a given weight [16][17][18][19]. Although the McEliece cryptosystem based on Goppa codes is still considered resistant, Robert McEliece pointed out in his original article two main ways in which an intruder can attack the cryptosystem [6]:
1) An attacker may try to recover the private key from the public key and then decrypt the message;
2) An attacker can decode the message directly, without studying the structure of the Goppa code.
Many researchers are working on these types of attacks, but no efficient version has been found yet. The stability of each cryptosystem to attacks can also be assessed by determining the minimum number of sets covering all errors (covering sets). Their number is calculated according to the formula $N \ge C_n^t / C_{n-k}^t$. Here $C_n^t$ is the total number of error combinations, and $C_{n-k}^t$ is the maximum number of error combinations that can be covered by one set [7]. This assessment is somewhat underestimated, because it does not take into account the computational complexity of forming candidate words with respect to the chosen set.
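The covering-set estimate is easy to evaluate for concrete parameters. The sketch below assumes the bound has the form $C_n^t / C_{n-k}^t$ (total number of weight-$t$ error patterns divided by the number covered by one set) and reports it on a $\log_2$ scale; the function names and the sample parameters are assumptions for illustration.

```python
from math import comb, log2


def min_covering_sets(n, k, t):
    """Smallest integer not below C(n, t) / C(n - k, t); requires n - k >= t."""
    return -(-comb(n, t) // comb(n - k, t))


def log2_min_covering_sets(n, k, t):
    """The same bound on a log2 scale, convenient for large code parameters."""
    return log2(comb(n, t)) - log2(comb(n - k, t))


n, k, t = 1024, 524, 50   # sample parameters, chosen only for illustration
print(f"log2 of the number of covering sets: {log2_min_covering_sets(n, k, t):.1f}")
```

The ratio $C_n^t / C_{n-k}^t$ equals $C_n^k / C_{n-t}^k$, so the same quantity can be read as the inverse probability that $k$ randomly chosen positions avoid all $t$ errors.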

STABILITY TO QUANTUM CRYPTANALYSIS
Various quantum algorithms are now known, among which the most popular are Shor's algorithm, Grover's algorithm for finding an element in an unsorted database, quantum algorithms for the cryptanalysis of transformations in a factor ring, and others [20]. Several sources state that Shor's quantum algorithm is not efficient enough to break the security of the McEliece cryptosystem. The most effective quantum algorithm against the McEliece scheme is Grover's algorithm. It is properly understood not as a "database search" but as a search for the roots of a function. From this point of view, it is worth considering the application of Grover's algorithm within the information-set decoding attack [21]. Grover's algorithm is a general constructive transformation of conventional circuits into quantum root-finding circuits. A detailed implementation of the quantum information-set decoding attack is given in [22,23]. Basic information-set decoding searches for a root of the function at random. With Grover's algorithm, the search uses roughly $\sqrt{C_n^k / (0.29\, C_{n-t}^k)}$ iterations on average on a quantum computer, which corresponds to a complexity of about $c^{(1/2)\, n / \lg n}$ for a constant $c$ that depends on the code rate. Once the set $S$ is found, the message $m$ and the error vector $e$ can be computed with minor additional effort [24].
The relative speed of information transmission and the stability to both types of cryptanalysis are illustrated by the examples given in Table 1. The following notation is used in the table: "M" for the McEliece cryptosystem; "N" for the Niederreiter cryptosystem; "H" for the hybrid cryptosystem. For better visual perception, the data in the table are also presented graphically (Fig. 1-2). Analyzing these data, we can conclude that with the same code parameters all three cryptosystems provide the same level of stability to classical cryptanalysis. However, the results for quantum cryptanalysis differ: the resistance of the McEliece cryptosystem to quantum cryptanalysis begins to decrease when the relative speed of the code drops below 0.66. At the same time, for the hybrid cryptosystem, stability to quantum cryptanalysis increases as the correction ability grows and the relative speed decreases. Further research has shown, however, that this trend changes under the same condition that affects the stability of the McEliece scheme, namely when the relative speed of information transmission falls below 0.66.
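The effect of Grover's algorithm on information-set decoding can be illustrated with a small Python calculation. It assumes an iteration estimate of the form $C_n^k / (0.29\, C_{n-t}^k)$ for the classical random search and the square root of that value for the quantum search; the constant 0.29, the function names and the sample parameters are assumptions of this sketch, and the result counts iterations only, not the cost of each iteration.

```python
from math import comb, log2


def log2_classical_isd_iterations(n, k, t):
    """log2 of C(n, k) / (0.29 * C(n - t, k)); requires n - t >= k."""
    return log2(comb(n, k)) - log2(comb(n - t, k)) - log2(0.29)


def log2_grover_isd_iterations(n, k, t):
    """Grover's search needs about the square root of the classical count."""
    return 0.5 * log2_classical_isd_iterations(n, k, t)


n, k, t = 1024, 524, 50   # sample parameters, chosen only for illustration
print(f"classical: 2^{log2_classical_isd_iterations(n, k, t):.1f} iterations")
print(f"quantum:   2^{log2_grover_isd_iterations(n, k, t):.1f} iterations")
```

On the $\log_2$ scale the quantum estimate is exactly half of the classical one, which is why code parameters must grow substantially to keep the same security level against a quantum attacker.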

COMPARISON OF KEY DATA AND LENGTH OF CIPHERTEXT
The next step is to compare the volume of key data and length of the ciphertext, which is formed according to each of three cryptosystems.
Since the binary case is considered in this work, the matrix $D$ is not taken into account when evaluating the volume of key parameters, and the secret polynomial of the Goppa code is also left out of consideration. The key parameters and the method of ciphertext formation coincide for the hybrid scheme and the McEliece scheme, so their estimates can be considered equivalent. The private key of these schemes consists of the matrices $X$ and $P$. The matrix $X$ has dimensions $k \times k$, and the volume occupied by the matrix $P$ is determined by a permutation vector of $n$ elements. The size of the public key of both schemes is determined by the $k \times n$ matrix. The ciphertext has length $n$; hence, we can also note a disadvantage of these cryptosystems, namely the increased length of the ciphertext relative to the initial information vector.
For the Niederreiter cryptosystem, the matrices $X$ and $P$ also form the secret key. The size of the matrix $P$ is determined as in the previous case, but the matrix $X$ is different and has dimensions $(n-k) \times (n-k)$. The public key of this scheme is the $(n-k) \times n$ matrix. Having analyzed the above, one can conclude that the volume of the secret key in the Niederreiter scheme differs from that in the McEliece and hybrid schemes by $n^2 - 2nk$ elements, and the lengths of the generated ciphertexts differ by $k$ elements; at the same time, the size of the public key in the Niederreiter scheme is greater by $n^2 - 2nk$ elements. We illustrate this by the examples shown in Table 2. To cover different levels of security, the code parameters most commonly found in the scientific literature were chosen. For a more visual understanding, the data listed in the table are also presented as histograms (Fig. 3-5).
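The key and ciphertext volumes described above can be tabulated with a short Python sketch. It assumes the binary case (no matrix $D$), counts the matrix $X$ by its number of elements and the permutation $P$ as a vector of $n$ elements, and ignores the Goppa polynomial; the function name and dictionary layout are hypothetical.

```python
def key_sizes(n, k):
    """Element counts of private key, public key and ciphertext for M, H and N."""
    mceliece = {"private": k * k + n, "public": k * n, "ciphertext": n}
    niederreiter = {"private": (n - k) ** 2 + n,
                    "public": (n - k) * n,
                    "ciphertext": n - k}
    # The hybrid scheme reuses the McEliece keys and ciphertext format.
    return {"M": mceliece, "H": dict(mceliece), "N": niederreiter}


sizes = key_sizes(n=1024, k=524)   # sample parameters, chosen only for illustration
for scheme, volume in sizes.items():
    print(scheme, volume)

# Differences between the Niederreiter and McEliece (or hybrid) schemes.
n, k = 1024, 524
assert sizes["N"]["private"] - sizes["M"]["private"] == n * n - 2 * n * k
assert sizes["M"]["ciphertext"] - sizes["N"]["ciphertext"] == k
```

The sign of $n^2 - 2nk$ depends on the code rate: it is negative whenever $k > n/2$, so which scheme has the larger key depends on the chosen parameters.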
Having analyzed the above data, one can conclude that to provide a level of stability to quantum cryptanalysis similar to that against classical cryptanalysis (for example, comparing the code parameters n = 16384 and n = 8192), the volume of key data must be increased more than threefold. It is also worth noting that the stability of the cryptosystems to quantum cryptanalysis depends directly on their relative speed. Because of its advantage in the latter indicator, the hybrid cryptosystem has shown decent results in stability to quantum cryptanalysis, and the decrease in its resistance is not critical compared to the other schemes. However, the obvious advantage of the hybrid cryptosystem in terms of efficiency, which is worth recalling, is that it allows one to encrypt a larger amount of information using the same volume of key data while providing an adequate level of protection [25][26][27][28][29][30][31]. This research might be useful for improving various methods of information security [10][11][12][13], as well as for other practical uses [32][33][34][35][36].

CONCLUSIONS
Code-based cryptosystems are a promising direction, since they provide a higher speed of cryptographic transformation, control of errors that can occur in the communication channel, and resistance to classical and quantum cryptanalysis. Owing to these advantages of using codes to construct post-quantum cryptographic algorithms, a new hybrid algorithm was proposed, which combines the encryption principles of the McEliece and Niederreiter cryptosystems. A comparative analysis of all three cryptosystems has shown that the key data of the proposed scheme occupies the same volume as the key data of the McEliece cryptosystem. The hybrid cryptosystem provides a higher relative transmission speed than the McEliece cryptosystem and equal resistance to cryptanalysis.
One disadvantage is an increase in decoding time caused by extracting the additional information as in the Niederreiter scheme, but this increase is not critical. Despite the demonstrated benefits, the question of how to reduce the amount of key data remains open for all three cryptosystems; to maintain stability against quantum computers, this amount still needs to be increased. This direction remains a relevant line of research in modern cryptography [26][27][28][29].