HEURISTIC METHODS FOR THE DESIGN OF CRYPTOGRAPHIC BOOLEAN FUNCTIONS

1) Ivan Kozhedub Kharkiv National Air Force University, Sumska str., 77/79, Kharkiv, 61023, Ukraine 2) V. N. Karazin Kharkiv National University, Svobody sq., 6, Kharkiv, 61022, Ukraine, illarion_moskovchenko@ukr.net, kuznetsov@karazin.ua, ivanbelozersevv.jw@gmail.com 3) JSC “Institute of Information Technologies”, Bakulin St., 12, Kharkiv, 61166, Ukraine 4) Kharkiv University of Technology “STEP”, Malom’yasnitska st. 9/11, 61010, Kharkiv, Ukraine, kavserg@gmail.com 5) Yessenov University, 32 microdistricts, 130003, Aktau, The Republic of Kazakhstan, akhmetov@yu.edu.kz 6) Central Ukrainian National Technical University, University Avenue 8, Kropyvnytskyi, 25006, Ukraine, smirnov.ser.81@gmail.com


INTRODUCTION
An important element of most modern symmetric ciphers is the nonlinear substitution block (S-block) [1-7], which is described by Boolean or, more generally, vector cryptographic functions [6]. The cryptographic strength indicators of such functions (balance, nonlinearity, autocorrelation, etc.) directly affect the efficiency of symmetric ciphers and their resistance to most modern cryptanalytic attacks [5-15]. In particular, the algebraic properties of the S-blocks of modern block ciphers are investigated in [5-7], where their influence on resistance to algebraic cryptanalysis is shown. In [8-11], combinatorial properties of nonlinear nodes are investigated in the context of the security evaluation of various encryption modes and key schedules. In [12,13], the influence of S-blocks on the avalanche effect and on the differential and linear properties of block ciphers is investigated. Papers [14,15] study the properties of nonlinear substitution nodes in modern stream ciphers in comparison with the "Strumok" algorithm, proposed as a new stream encryption standard in Ukraine [16].
Methods for constructing S-blocks are investigated by many authors, for example in [17-20]. However, the most developed and widespread tool is the mathematical apparatus of cryptographic Boolean functions [21-28]. In particular, a new recursive construction of a Boolean function with maximum algebraic immunity is presented in [21]; genetic algorithms for constructing Boolean functions with the required cryptographic properties are examined in [22,23]; the simulated annealing method is investigated in [24]; evolutionary methods are studied in [25,26]; and papers [27,28] are devoted to heuristic methods of gradient search.
The purpose of this paper is to continue the study of the gradient descent method first proposed in [28] and to assess its computational complexity in comparison with its closest analogue [27]. To this end, the necessary terms and definitions are introduced in Section 2; the heuristic methods studied in [27,28] are summarized in Section 3, which also gives the calculated number of operations required to perform gradient descent (Table 1). Section 4 evaluates the properties of the gradient lifting method for the formation of highly nonlinear correlation-immune cryptographic Boolean functions. In Section 5, a methodology for assessing the effectiveness of heuristic methods is proposed and the results of comparative studies are presented. In particular, it is shown that the gradient descent method of [28] forms cryptographic Boolean functions with the required indices of nonlinearity and autocorrelation in a significantly smaller number of iterations (tens of times fewer). Section 6 presents the results of an investigation of the cryptographic properties of the formed Boolean functions and compares them with the best known estimates. In the conclusion, the obtained results are summarized and directions for further research are briefly formulated.

INDICES OF STABILITY OF CRYPTOGRAPHIC BOOLEAN FUNCTIONS
The basic concepts and definitions of the mathematical apparatus of Boolean algebra used to evaluate the effectiveness of the nonlinear substitution nodes of symmetric ciphers were introduced in [21-28].
A Boolean function $f$ of $n$ variables is a function [21-28] that maps the field $GF(2^n)$ of all binary vectors $x = (x_1, \dots, x_n)$ of length $n$ to the field $GF(2)$. Boolean functions are usually represented in algebraic normal form (ANF), that is, as a sum of products of the coordinates.
The algebraic degree $\deg(f)$ is the degree of the longest summand of the function in its algebraic normal form. The algebraic degree reflects the resistance to analytical attacks that aim to reduce the function to a cryptographically weak (linear) one.
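The relation between a truth table, the ANF and the algebraic degree can be sketched as follows. This is a minimal illustration, assuming the truth table is stored as a list of $2^n$ bits indexed by the integer whose binary digits are the inputs (bit 0 is $x_1$); the function names are hypothetical and the binary Moebius transform is the standard way to obtain ANF coefficients, not something taken from [21-28].

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficients.

    Coefficient at index m belongs to the monomial formed by the
    variables whose bits are set in m (index 0 is the constant term).
    """
    coeffs = list(truth_table)
    size = len(coeffs)
    step = 1
    while step < size:
        for start in range(0, size, 2 * step):
            for j in range(start, start + step):
                # Fold the lower half of each block into the upper half.
                coeffs[j + step] ^= coeffs[j]
        step *= 2
    return coeffs


def algebraic_degree(truth_table):
    """Largest number of variables in any monomial of the ANF."""
    coeffs = anf_coefficients(truth_table)
    # Assumption: the all-zero function is reported with degree 0.
    return max((bin(m).count("1") for m, c in enumerate(coeffs) if c),
               default=0)


# Example: f(x1, x2, x3) = x1*x2 XOR x3 on three variables.
example = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
example_degree = algebraic_degree(example)
```

Here `example_degree` is 2, because the longest summand of the example is the product $x_1 x_2$.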
The balance of a function is an indicator of stability that reflects the resistance of the output sequence to statistical attacks; a function is balanced if its output sequence contains equally many zeros and ones.
The Hamming distance $d(f, g)$ between the sequences of two functions $f$ and $g$ is the number of positions in which these sequences differ [21-28].
The nonlinearity $N_S$ of a transformation is the minimum Hamming distance between the output sequence $S$ and the output sequences of all affine functions over the given field [21-28]: $N_S = \min_{\varphi \in A} d(S, \varphi)$, where $A$ is the set of affine functions. For an arbitrary function $f$, the nonlinearity $N_f$ over $GF(2^n)$ can reach [21-28]:

$$N_f \le 2^{n-1} - 2^{n/2-1}.$$

For balanced functions the bound takes the form

$$N_f \le \begin{cases} 2^{n-1} - 2^{n/2-1} - 2, & n \text{ even}, \\ \lfloor\lfloor 2^{n-1} - 2^{n/2-1} \rfloor\rfloor, & n \text{ odd}, \end{cases}$$

where $\lfloor\lfloor x \rfloor\rfloor$ is the maximal even integer less than or equal to $x$.
Nonlinearity is an indicator that reflects the resistance of a function to correlation (linear) attacks.
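The definition of nonlinearity as a minimum Hamming distance can be checked numerically against the Walsh-Hadamard spectrum. The sketch below is illustrative only: it assumes the sign convention $F(\omega) = \sum_x (-1)^{f(x) \oplus \langle \omega, x \rangle}$ and the resulting identity $N_f = (2^n - \max_\omega |F(\omega)|)/2$, and all function names are hypothetical.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of the +/-1 form of a truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for j in range(start, start + step):
                a, b = spectrum[j], spectrum[j + step]
                spectrum[j], spectrum[j + step] = a + b, a - b
        step *= 2
    return spectrum


def nonlinearity(truth_table):
    """Nonlinearity from the largest absolute spectral value."""
    peak = max(abs(v) for v in walsh_hadamard(truth_table))
    return (len(truth_table) - peak) // 2


def hamming_distance(seq_a, seq_b):
    """Number of positions in which two sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def nonlinearity_by_definition(truth_table):
    """Minimum distance to every affine function (exhaustive check)."""
    size = len(truth_table)
    best = size
    for w in range(size):
        linear = [bin(w & x).count("1") & 1 for x in range(size)]
        d = hamming_distance(truth_table, linear)
        # size - d is the distance to the complemented (affine) function.
        best = min(best, d, size - d)
    return best


# Example: the bent function x1*x2 XOR x3*x4 on four variables.
bent4 = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
         for x in range(16)]
```

For `bent4` both routines return 6, which equals $2^{n-1} - 2^{n/2-1}$ for $n = 4$.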
The function $f$ has correlation immunity of order $k$ if the output of the function $y \in Y$ is statistically independent of any subset of $k$ input coordinates [21-28]. An equivalent definition of correlation immunity is given in terms of the Walsh transform [21-28]: the function $f$ over the field $GF(2^n)$ has correlation immunity of order $k$, $CI(k)$, if its Walsh transform satisfies $F(\omega) = 0$ for all $\omega \in V_n$ such that $1 \le W(\omega) \le k$, where $W(\omega)$ is the Hamming weight of $\omega$.
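The Walsh-transform criterion translates directly into a small test. The following sketch assumes the spectrum convention $F(\omega) = \sum_x (-1)^{f(x) \oplus \langle \omega, x \rangle}$ and a truth table indexed by the integer value of the input vector; the function names are hypothetical.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of the +/-1 form of a truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for j in range(start, start + step):
                a, b = spectrum[j], spectrum[j + step]
                spectrum[j], spectrum[j + step] = a + b, a - b
        step *= 2
    return spectrum


def correlation_immunity_order(truth_table):
    """Largest k with F(w) = 0 for every w of Hamming weight 1..k."""
    spectrum = walsh_hadamard(truth_table)
    n = len(truth_table).bit_length() - 1
    order = 0
    for k in range(1, n + 1):
        weight_k = (w for w in range(1, len(truth_table))
                    if bin(w).count("1") == k)
        if any(spectrum[w] != 0 for w in weight_k):
            break
        order = k
    return order


# Example: the parity function x1 XOR x2 XOR x3.
parity3 = [bin(x).count("1") & 1 for x in range(8)]
parity3_order = correlation_immunity_order(parity3)
```

For the three-variable parity function the only nonzero coefficient sits at the weight-3 vector, so `parity3_order` is 2.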

HEURISTIC METHODS OF HILL CLIMBING
Heuristic methods of gradient search are investigated in this article, in particular the gradient lifting method of W. Millan, A. Clark and E. Dawson (1997) [27] and the gradient descent method developed on its basis [28].

HEURISTIC METHOD OF GRADIENT LIFTING
The essence of the method is to increase the nonlinearity of an arbitrary Boolean function by complementing a position in the truth table of the original function. Each position of the truth table corresponds to a unique input. The method makes it possible to build the complete list of inputs such that complementing the output position corresponding to any one of them increases the nonlinearity of the function. This list of truth-table positions is called the 1-Improvement Set of $f$ and is denoted $1\text{-}IS_f$. In [27], a fast systematic method for determining the set $1\text{-}IS_f$ of a given Boolean function from its truth table and Walsh-Hadamard transform (WHT) is presented. To find the set $1\text{-}IS_f$, it is first necessary to identify the WHT coefficients whose absolute values are equal or close to the maximum absolute value, $WH_{max}$.
Definition 2. Let $f(x)$ be a Boolean function with Walsh-Hadamard transform $F(\omega)$, and let $WH_{max}$ denote the maximum absolute value of $F(\omega)$. Then there exist one or more linear functions $L_\omega(x)$ at minimal distance from $f(x)$, and for these $\omega$ the equality $|F(\omega)| = WH_{max}$ holds.
The following sets are defined: $W_1^+ = \{\omega : F(\omega) = WH_{max}\}$ and $W_1^- = \{\omega : F(\omega) = -WH_{max}\}$. Sets of $\omega$ for which the WHT values are close to the maximum are also defined: $W_2^+ = \{\omega : F(\omega) = WH_{max} - 2\}$ and $W_2^- = \{\omega : F(\omega) = -(WH_{max} - 2)\}$. When the truth table changes in exactly one place, every WHT value changes by +2 or -2. It follows that, to increase the nonlinearity, all WHT values in the set $W_1^+$ must change by -2 and all WHT values in the set $W_1^-$ must change by +2; in addition, all WHT values in the set $W_2^+$ must change by -2 and all WHT values in the set $W_2^-$ must change by +2. The first two conditions are obvious; the other two are required so that all other values $|F(\omega)|$ remain smaller than $WH_{max}$. These conditions can be expressed as simple tests.
Theorem 1 [27]. Let a Boolean function $f(x)$ with WHT $F(\omega)$ be given, and let the sets $W^+ = W_1^+ \cup W_2^+$ and $W^- = W_1^- \cup W_2^-$ be defined. Then for an input $x$ to be an element of the Improvement Set, the following two conditions must both be met: $f(x) = L_\omega(x)$ for all $\omega \in W^+$, and $f(x) \ne L_\omega(x)$ for all $\omega \in W^-$.
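The two tests of Theorem 1 can be sketched in a few lines. This is an illustrative reading of the theorem rather than the implementation of [27]: it assumes $F(\omega) = \sum_x (-1)^{f(x) \oplus L_\omega(x)}$ with $L_\omega(x) = \langle \omega, x \rangle$, takes the near-maximum sets at the values $\pm(WH_{max} - 2)$, and uses hypothetical function names.

```python
def parity(value):
    """Parity of the set bits, i.e. the scalar product over GF(2)."""
    return bin(value).count("1") & 1


def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of the +/-1 form of a truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for j in range(start, start + step):
                a, b = spectrum[j], spectrum[j + step]
                spectrum[j], spectrum[j + step] = a + b, a - b
        step *= 2
    return spectrum


def improvement_set(truth_table):
    """Inputs x whose complementation raises the nonlinearity by one."""
    spectrum = walsh_hadamard(truth_table)
    wh_max = max(abs(v) for v in spectrum)
    # W+ and W- collect the maximal and near-maximal coefficients.
    w_plus = [w for w, v in enumerate(spectrum)
              if v in (wh_max, wh_max - 2)]
    w_minus = [w for w, v in enumerate(spectrum)
               if v in (-wh_max, -(wh_max - 2))]
    members = []
    for x, bit in enumerate(truth_table):
        agrees = all(bit == parity(w & x) for w in w_plus)
        differs = all(bit != parity(w & x) for w in w_minus)
        if agrees and differs:
            members.append(x)
    return members
```

For the all-zero function on four variables every position belongs to the set, since complementing any single output moves the function away from the zero linear function.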
The criterion of gradient search is the maximization of the Hamming distance between the generated sequence and the sequences of linear functions. After the Boolean function has been updated, the same operations are repeated: the Walsh-Hadamard transform (WHT) is computed and the maximum absolute values of its coefficients are found; the Improvement Set is formed; the elements of the function sequence that coincide with the elements of the sequence of the nearest linear function are found; these matching elements are inverted, which increases the nonlinearity of the function, that is, its distance from the nearest linear function. Subsequent iterations are performed in the same way.
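The iteration can be pictured with the brute-force variant below. It is a simplified stand-in and not the fast procedure of [27]: instead of building the Improvement Set from the spectrum, it tries every single-position complementation and accepts the first one that raises the nonlinearity. Function names are hypothetical.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of the +/-1 form of a truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for j in range(start, start + step):
                a, b = spectrum[j], spectrum[j + step]
                spectrum[j], spectrum[j + step] = a + b, a - b
        step *= 2
    return spectrum


def nonlinearity(truth_table):
    """Nonlinearity from the largest absolute spectral value."""
    peak = max(abs(v) for v in walsh_hadamard(truth_table))
    return (len(truth_table) - peak) // 2


def hill_climb(truth_table):
    """Complement single positions while the nonlinearity keeps rising."""
    current = list(truth_table)
    best = nonlinearity(current)
    improved = True
    while improved:
        improved = False
        for x in range(len(current)):
            current[x] ^= 1            # tentative complementation
            candidate = nonlinearity(current)
            if candidate > best:
                best = candidate       # keep the change and restart
                improved = True
                break
            current[x] ^= 1            # undo: no improvement
    return current, best
```

The loop stops at a local maximum, that is, at a function whose Improvement Set is empty; it does not guarantee that the global bound is reached.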
The studies conducted have shown that the gradient lifting method considered above is computationally expensive and, for a large number of arguments of the Boolean function, requires a significant number of repeated iterations. To reduce the computational complexity, a gradient descent method that takes bent sequences as input data is proposed in [28].

HEURISTIC METHOD OF GRADIENT DESCENT
The proposed method of constructing cryptographic Boolean functions is a further development of the heuristic method of gradient lifting. It is based on the properties of nonlinear sequences. It differs from the well-known heuristic methods in that an iterative procedure complements positions of bent sequences in a gradient search for balanced Boolean functions, using as the criterion the maximization of the Hamming distance between the generated sequences and the sequences of all linear functions. This makes it possible to find Boolean functions with the required cryptographic properties with less computational effort.
The main idea of the gradient descent method is the effective lowering of the nonlinearity of the given bent sequences at each of the $2^{n/2-1}$ obligatory complementations. Table 1 presents the calculated data for the vector spaces $V_4$ to $V_{12}$. Column 2 shows the nonlinearity (Walsh transform value) of the bent sequences taken as input, column 3 shows the maximum achievable nonlinearity (the maximum value of the Walsh transform) of the functions obtained as output, and column 4 gives the number of bits that must be changed in the bent sequences to obtain the desired result. Fig. 1 shows the possible loss of nonlinearity when the required number of positions of the bent sequence is complemented. To achieve the given upper bound of the nonlinearity, it is necessary to determine, out of the total number $x$ of truth-table positions to be complemented, the number $y$ of positions whose change shifts $WH$ by +2 and the number $z$ of positions whose change shifts $WH$ by -2, with $x = y + z$. Table 2 presents the calculated number of required complementations of the bent sequence for a given vector space in accordance with Theorem 2.1 of [27].
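The quantities involved can be tabulated with a few lines of arithmetic. The sketch below is an assumption-laden illustration, not a reproduction of Table 1: it uses the bent-function nonlinearity $2^{n-1} - 2^{n/2-1}$, the fact that a bent sequence is $2^{n/2-1}$ positions away from balance, the observation that one complementation lowers the nonlinearity by at most 1, and the classical upper bound $2^{n-1} - 2^{n/2-1} - 2$ for balanced functions with even $n$. The function and key names are hypothetical.

```python
def bent_based_bounds(n):
    """Nonlinearity figures for balancing a bent sequence (even n)."""
    if n % 2:
        raise ValueError("bent functions exist only for even n")
    flips = 2 ** (n // 2 - 1)             # obligatory complementations
    bent_nl = 2 ** (n - 1) - flips        # nonlinearity of a bent function
    return {
        "n": n,
        "bent_nl": bent_nl,
        "bent_wh": 2 ** (n // 2),         # |F(w)| of every bent coefficient
        "flips": flips,
        "nl_floor": bent_nl - flips,      # each flip costs at most 1
        "nl_ceiling": bent_nl - 2,        # assumed balanced upper bound
    }


rows = [bent_based_bounds(n) for n in range(4, 13, 2)]
```

For $n = 8$ this gives a bent nonlinearity of 120, 8 complementations, and a range of 112 to 118 for the balanced result.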

Figure 1 - Possible loss of nonlinearity in the case of complementation
After the necessary number of complementations of the bent sequence has been calculated, the Walsh-Hadamard transform (WHT) is performed at the first step of the heuristic search, and the maximum Hamming distance to one or more sequences of linear functions $L_i(x)$ is determined. This operation corresponds to selecting the zero value of the WHT coefficients, after which the set of linear functions constituting the Improvement Set is formed. Next, the elements of the bent function sequence that coincide with the elements of the sequences of linear functions from the Improvement Set are inverted. As a result, the imbalance of the function is reduced, but the nonlinearity also decreases, i.e. the sequence of the function is no longer as far from the sequences of the linear functions $L_i(x)$. At the next iteration, all operations are repeated. Thus, the criterion of gradient search in the proposed method is the minimization of the minimal Hamming distance between the generated sequence and the sequences of linear functions.
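The descent from a bent sequence towards a balanced function can be pictured with the greedy sketch below. It is a simplified stand-in and not the procedure of [28]: it starts from an inner-product bent function (an assumption made only to have a concrete input), and at each of the $2^{n/2-1}$ steps it complements the position of the majority value that leaves the smallest maximum absolute WHT coefficient. All names are hypothetical.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of the +/-1 form of a truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for j in range(start, start + step):
                a, b = spectrum[j], spectrum[j + step]
                spectrum[j], spectrum[j + step] = a + b, a - b
        step *= 2
    return spectrum


def nonlinearity(truth_table):
    peak = max(abs(v) for v in walsh_hadamard(truth_table))
    return (len(truth_table) - peak) // 2


def inner_product_bent(n):
    """Bent function <u, v> on n = 2k variables, with x = (u, v)."""
    half = n // 2
    mask = (1 << half) - 1
    return [bin((x & mask) & (x >> half)).count("1") & 1
            for x in range(1 << n)]


def greedy_balance(truth_table):
    """Complement majority-valued positions until the table is balanced."""
    table = list(truth_table)
    missing_ones = len(table) // 2 - sum(table)
    majority = 0 if missing_ones > 0 else 1
    for _ in range(abs(missing_ones)):
        best_x, best_peak = None, None
        for x, bit in enumerate(table):
            if bit != majority:
                continue
            table[x] ^= 1                                  # try this position
            peak = max(abs(v) for v in walsh_hadamard(table))
            table[x] ^= 1                                  # undo
            if best_peak is None or peak < best_peak:
                best_x, best_peak = x, peak
        table[best_x] ^= 1                                 # keep the best one
    return table


balanced6 = greedy_balance(inner_product_bent(6))
```

Because each complementation shifts every WHT value by exactly 2, the result for $n = 6$ is balanced with nonlinearity between 24 and 26, whichever positions the greedy rule selects.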
In general, the proposed method consists of three main stages.
At the first stage, gradient descent procedures are used to obtain a highly nonlinear sequence.
At the second stage, procedures for recovering the algebraic normal form of the function from the output sequence are used.
At the third stage, depending on the practical application environment, a procedure for modifying the algebraic normal form of the function $f(x)$ is used.
So, the developed method allows us to form balanced cryptographic functions with high nonlinearity. In this case, as shown in Fig. 1, the values of nonlinearity lie in a narrow range that depends on the dimension of the vector space.
It should be noted that for modern stream ciphers an important indicator of effectiveness is also the correlation immunity, which characterizes the resistance of the encryption scheme to correlation attacks [41-44]. We now evaluate the nonlinearity and correlation immunity of the Boolean functions that can be synthesized by the developed method.

EVALUATIONS OF NONLINEARITY AND CORRELATION IMMUNITY OF THE FORMED FUNCTIONS
For cryptographic Boolean functions, the relationship between the attainable degree of correlation immunity $m$ and the nonlinearity $N_f$ is known [30] and is given by expressions (1) and (2). As can be seen from (1), an increase in the degree of correlation immunity $m$ leads to a decrease in nonlinearity, and vice versa. Therefore, developers of cryptographic protection facilities have to find a compromise, depending on the conditions of practical use, between the required nonlinearity and the desired degree of correlation immunity. The advantage of the developed method is the ability to build functions with different values of the cryptographic indicators.
For example, Table 3 shows the achievable degree of correlation immunity $CI_{max}(k)$ together with the corresponding nonlinearity $N_{f\,min}$ according to (1) and (2). In effect, the data in the table correspond to the lower limit of nonlinearity guaranteed by the developed method. Table 4 shows the achievable degree of correlation immunity $CI_{max}(k)$ together with the maximum possible nonlinearity $N_{f\,max}$ for balanced functions, that is, the upper limit of nonlinearity attainable with the developed method. For clarity, the table data are plotted in Fig. 2.

Figure 2 - Boundary indicators of correlation immunity
As the analysis of these data shows, the developed method makes it possible to form Boolean functions which, in addition to high nonlinearity, can potentially be correlation-immune. When used in stream ciphers, they will be highly resistant to various cryptographic attacks. For example, applying the developed method over the space $V_8$ allows us to form functions with the nonlinearity index $N_{f\,min} = 112$ and the degree of correlation immunity $CI_{max}(3)$, which is the best result known to date.
It should be noted that probabilistic search by heuristic methods is described by a random process, a specific realization of which is a random variable: the values of the stability indices of the function found (see Section 2).
The corresponding probabilities of the occurrence of the desired random events determine the average number of attempts needed to succeed, that is, to construct a cryptographic Boolean function with the required properties.

METHODOLOGY OF ESTIMATION OF THE EFFICIENCY AND RESULTS OF THE RESEARCH
The proposed methodology uses, as an index of computational efficiency, the average number of attempts that a heuristic method must perform to generate a cryptographic function with the required indicators of stability.
In accordance with the main provisions of probability theory and mathematical statistics, the unknown distribution function of the random variable under consideration is determined from the results of observations, that is, from a sample [29]. A sample of volume $L$ for a random variable $A$ is a sequence $X_1, X_2, \dots, X_L$ of $L$ independent observations of this quantity, that is, a set of values taken by $L$ independent random variables $A_1, A_2, \dots, A_L$ with the same distribution law $F_A(x)$ as the considered quantity $A$. In this case, the sample $X_1, X_2, \dots, X_L$ is said to be taken from the general population of $A$, and the distribution law of the general population is understood as the distribution law of the random variable $A$. The values $X_1, X_2, \dots, X_L$ are called sample values [29].
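One standard way to estimate a distribution function from such a sample is the empirical distribution function, the fraction of sample values lying to the left of a point $x$. The sketch below is a generic illustration with hypothetical names and made-up sample values; it uses the strict inequality "less than $x$".

```python
from bisect import bisect_left


def empirical_cdf(sample, x):
    """Fraction of sample values strictly less than x."""
    ordered = sorted(sample)
    # bisect_left returns the count of elements strictly below x.
    return bisect_left(ordered, x) / len(ordered)


# Hypothetical sample of nonlinearity values from L = 4 search attempts.
observed = [112, 114, 114, 116]
share_below_114 = empirical_cdf(observed, 114)
```

With this sample one of the four values is below 114, so `share_below_114` equals 0.25.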
Using the distribution function, the indicator of computational efficiency of heuristic methods can be introduced as the average number $K_{av}$ of attempts of probabilistic formation of a Boolean function with the required properties. The probability of success is determined by the probability of a joint event, written as the product of the probabilities of independent events, and $K_{av}$ is calculated from this probability. Two main indicators are of greatest interest for cryptography: the nonlinearity $N_f$ and the autocorrelation $AC$ [21-28]; it is necessary to maximize nonlinearity and minimize autocorrelation. To estimate the computational efficiency for these two stability indicators, the expression for $K_{av}$ is rewritten in terms of the theoretical and empirical probabilities of the corresponding events, such as the event $\{AC < x\}$.
Using the indicator $K_{av}$, comparative studies of the computational effectiveness of heuristic methods of probabilistic formation of cryptographic Boolean functions are performed. The objects of investigation are the method of random generation [21-28], the method of gradient lifting [27] and the heuristic method of gradient descent proposed in [28]. As the analysis shows, the heuristic method of gradient descent is highly competitive with its closest analogue, the method of gradient lifting. It makes it possible to form Boolean functions with a low autocorrelation index.
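The average number of attempts can be estimated from a sample of search outcomes as sketched below. The sketch rests on stated assumptions rather than on the exact expressions of the paper: attempts are treated as independent trials, so the mean number of attempts is the reciprocal of the success probability; the two indicators are treated as independent, as in the product form above; and success is taken to mean nonlinearity at least a threshold and autocorrelation at most a threshold. The names and sample values are hypothetical.

```python
def average_attempts(samples, min_nonlinearity, max_autocorrelation):
    """Estimate K_av = 1 / (P(Nf >= a) * P(AC <= b)) from a sample.

    samples is a list of (nonlinearity, autocorrelation) pairs, one per
    heuristic search attempt.
    """
    total = len(samples)
    p_nl = sum(nl >= min_nonlinearity for nl, _ in samples) / total
    p_ac = sum(ac <= max_autocorrelation for _, ac in samples) / total
    p_success = p_nl * p_ac          # independence assumption
    if p_success == 0:
        return float("inf")          # never observed in this sample
    return 1.0 / p_success


# Hypothetical outcomes of four attempts over V8.
outcomes = [(116, 24), (114, 24), (116, 32), (112, 40)]
k_av = average_attempts(outcomes, min_nonlinearity=116, max_autocorrelation=24)
```

In this made-up sample each condition holds in half of the attempts, so the estimate is `k_av = 4.0`.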
Fig. 4 shows the dependencies of $K_{av}$ for:
- the method of random generation with AC = 80 (RG, AC = 80);
- the method of random generation with AC = 120 (RG, AC = 120);
- the method of gradient lifting with AC = 24 (MCD, AC = 24);
- the method of gradient lifting with AC = 32 (MCD, AC = 32);
- the method of gradient descent with AC = 24 (IKK, AC = 24);
- the method of gradient descent with AC = 32 (IKK, AC = 32).
Analysis of the dependencies in Fig. 4 shows that the gradient descent method forms Boolean functions with high cryptographic indices (nonlinearity and autocorrelation) in fewer attempts on average. For example, forming a cryptographic function with AC = 24 and N = 116 by the random generation method is computationally unattainable because of the extremely high average number of attempts. For the same parameters, the gradient lifting method requires about 8000 attempts on average, whereas the gradient descent method requires about 4 attempts on average, i.e. the average number of attempts decreases by a factor of about 2000. When the requirements on the cryptographic properties are AC = 24 and N = 114, the gradient lifting method requires about 15 attempts on average, and the gradient descent method about 3.

CRYPTOGRAPHIC PROPERTIES OF FORMED BOOLEAN FUNCTIONS
We compare the properties of the formed cryptographic Boolean functions with the best known analogues: the genetic algorithm [31] and the NLT and ACT algorithms [32], which belong to the class of heuristic methods.
Table 5 presents the results of a comparative assessment of the nonlinearity of functions obtained using the developed method of gradient descent, the prototype method (the heuristic method of gradient lifting) and the best known heuristic methods (all data except the last line are taken from [31]).
These data indicate that, among the heuristic methods, the developed method achieves the highest nonlinearity. High nonlinearity indicates a high degree of data mixing, which determines the resistance of cryptographic transformations. For the first time, functions with the highest known nonlinearity among the heuristic methods have been constructed: $N_f = 488$ for $V_{10}$ and $N_f = 2002$ for $V_{12}$.

Table 5 - Nonlinearity $N_f$ of functions obtained by heuristic methods
Method                   V6    V8    V10   V12
[31]                     26    116   484   1976
NLT [32]                 26    116   486   1992
ACT [32]                 26    116   484   1986
Developed method [28]    26    116   488   2002

Table 6 shows the comparative characteristics of the best known methods for forming functions with low autocorrelation values [31]. As can be seen from this table, the developed method forms functions with low autocorrelation values. Over $V_8$, the NLT and ACT methods produce functions with AC = 16, but their nonlinearity is equal to 112, whereas the developed method forms functions with nonlinearity 116. Over all other vector spaces, the obtained values are comparable to the results of the other methods.

Table 6 - Autocorrelation $AC$ of functions obtained by the best known methods
Method                   V6    V8    V10   V12
[33,34]                  16    24    48    96
Maitra [35,36]           16    24    40    80
NLT [32]                 16    16    64    144
ACT [32]                 16    16    56    128
Developed method [28]    16    24    40    72

Figs. 5-9 show the spectral properties of Boolean functions formed in various ways. The indicators are given in parentheses: $(n, \deg(f), N_f, AC)$. These data show that cryptographic Boolean functions constructed in accordance with the developed method [28] have the maximum attainable algebraic degree, high nonlinearity and low autocorrelation. By the majority of resistance indicators, the formed functions are on par with those produced by known methods.
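The autocorrelation indicator used in these comparisons can be computed as sketched below. The paper does not spell out the definition, so the sketch assumes the usual one: $AC = \max_{s \ne 0} \left| \sum_x (-1)^{f(x) \oplus f(x \oplus s)} \right|$. The function name is hypothetical and the direct double loop is meant for small $n$ only.

```python
def autocorrelation(truth_table):
    """Maximum absolute autocorrelation over all nonzero shifts s."""
    size = len(truth_table)
    worst = 0
    for s in range(1, size):
        # Sum of +1 where f(x) = f(x XOR s) and -1 where they differ.
        total = sum(1 - 2 * (truth_table[x] ^ truth_table[x ^ s])
                    for x in range(size))
        worst = max(worst, abs(total))
    return worst


# Bent example x1*x2 XOR x3*x4 and the linear parity function on V4.
bent4 = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
         for x in range(16)]
parity4 = [bin(x).count("1") & 1 for x in range(16)]
```

A bent function has zero autocorrelation for every nonzero shift, while a linear function attains the worst possible value $2^n$; practical balanced functions fall between these extremes.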

CONCLUSIONS
The studies of the computational efficiency of heuristic methods have shown that gradient search methods form cryptographic Boolean functions with high nonlinearity and low autocorrelation in an acceptable number of iterations. The formed functions are not inferior to the best known results in the remaining cryptographic indicators.
The gradient descent method, first proposed by us in [28], is developed further in this paper. In particular, we obtained estimates of the computational complexity of this method and compared it with its closest prototype, the Hill Climbing method. The gradient descent method proved to be more effective than the Hill Climbing method of [27]. In particular, the results of experimental studies show that the gradient descent method requires ten times fewer iterations, i.e. it is more effective computationally.
We have compared the cryptographic properties of Boolean functions formed by various methods. Comparisons were made with the following evolutionary computational approaches: the Hill Climbing method, the Simulated Annealing method and the Genetic Algorithm. Comparative studies of the cryptographic properties of Boolean functions have shown that the functions formed by the proposed computational method have high indicators: the nonlinearity index approaches the upper theoretical limit; the autocorrelation index is one of the lowest in comparison with other methods of synthesis; with equal indices of nonlinearity, the formed functions have the maximum attainable algebraic degree; all known methods of synthesis are inferior in the spectral characteristics of the functions. Thus, on the basis of the conducted studies, it can be concluded that the functions constructed in accordance with the developed method have high stability indices and exceed the known functions in these indicators.

PROSPECTS FOR FURTHER RESEARCH
A promising research direction is the development of a probabilistic model for the synthesis of nonlinear substitution nodes with high cryptographic properties, together with experimental studies and the substantiation of practical recommendations for implementing the obtained results. This research might be useful for the improvement of various methods of information security, as well as for other practical uses [45-52]. In particular, the obtained results can be used to build nonlinear substitution nodes for modern block symmetric ciphers, including the formation of S-blocks of the Ukrainian national block encryption standard Kalyna (DSTU 7624:2014) [3,53], the cryptographic hashing algorithm Kupyna [54-56], as well as the recently approved stream encryption standard Strumok [16].
The estimates and calculated values given in this paper (see Tables 5 and 6) confirm the conclusion that the developed gradient descent method is not inferior to the best known results in the basic cryptographic indicators (nonlinearity and autocorrelation). In addition, as can be seen from the diagrams (see Figs. 2, 3 and 4), the developed method is significantly (several times) more efficient computationally. Thus, the obtained results are of great practical importance for the development of methods and computational algorithms for the formation of nonlinear nodes of modern symmetric cryptographic algorithms.
The proposed methodology for estimating the computational effectiveness of heuristic methods can be applied to other methods, including with an extended set of stability indices. This direction is an area of our further research.
The nonlinearity $N_f$ of a function $f$ is the minimal Hamming distance between $f$ and all affine functions over $GF(2^n)$ [21-28]: $N_f = \min_{\varphi \in A} d(f, \varphi)$, where $A$ is the set of affine functions.
The Walsh transform of $f$ is $F(\omega) = \sum_{x \in V_n} (-1)^{f(x) \oplus \langle \omega, x \rangle}$, $\omega \in V_n$, where $\langle \omega, x \rangle$ is the scalar product $\omega_1 x_1 \oplus \dots \oplus \omega_n x_n$. A correlation-immune function of order $k$ is a function possessing correlation immunity of order $k$. Balanced correlation-immune functions are called resilient (elastic) functions. A function $f$ over the field $GF(2^n)$ satisfies [21-28]:
At the third stage of the method, the modification of the algebraic normal form of the function $f(x)$ makes it possible to maintain the basic indicators of stability (balance and nonlinearity) by applying affine transformations and to improve either the dynamic properties of the nonlinear transformation or the correlation characteristics.
To evaluate the computational efficiency of heuristic methods, it is necessary to assess the probability distribution of the formation of Boolean functions with different cryptographic indices.

The following notation is introduced: $SI_i$ is a random variable whose values represent the outcomes of a heuristic search, that is, a numerical expression of the $i$-th indicator of the strength of a cryptographic Boolean function; $X_1, X_2, \dots, X_L$ is a sample of volume $L$ of the random variable $SI_i$; $F_{SI_i}(x)$ is the distribution function of the random variable $SI_i$. The values of the theoretical distribution functions $F_{SI_i}(x)$, which are the probabilities of the events $\{SI_i < x\}$, should be estimated using the frequencies of these events in the sample of volume $L$. Let $v_x$ denote the number of sample values less than $x$. Then the frequencies $v_x / L$ of sample values falling to the left of the point $x$ are the frequencies of the events $\{SI_i < x\}$. The frequency of an event in $L$ independent experiments is an estimate of the probability of this event, i.e. $F^*_{SI_i}(x) = v_x / L \approx F_{SI_i}(x)$.

Fig. 3 shows the frequencies of the events $\{AC < x\}$ for balanced Boolean functions constructed over $V_8$ with sample size $L = 10000$.