NETWORK APPLICATION-LAYER PROTOCOL CLASSIFICATION BASED ON FUZZY DATA AND NEURAL NETWORK PROCESSING

A technique for classifying network packets at the application layer is proposed. It is based on fuzzy data processing and artificial neural networks and determines whether a network packet belongs to one of the known network protocols. The technique distinguishes two main data processing stages. At the first stage, data is preprocessed by fuzzy logic methods. At the second stage, the packets are classified by an artificial neural network. A neural network with the proposed architecture determines the type of secure network protocol, the internal state of the network protocol (based on logical decision rules), and the type of network application using the identified protocol. The architecture of the bench environment for field tests is considered. The experiments used traffic from real network applications that are in use around the world. Experimental assessment of the proposed technique showed rather high classification quality and speed of the developed classifier.


INTRODUCTION
The current stage of development of almost all sectors of the economy, including energy, manufacturing, and finance, is characterized by wide informatization, which is based on the intensive development of communication infrastructure and the massive use of information technology. However, the boundlessness of the global information and communication space makes it possible for malefactors to intercept information and to use information technology unlawfully. First of all, this is due to the possibility of carrying out computer attacks against the information resources of corporations and enterprises. As a result, telecom operators and information system developers are forced to pay increased attention to the construction of security systems. One of the most important issues of this kind is the preservation of confidential information in compliance with international standards and protocols of information and communication interaction.
International Journal of Computing
combined and be information providers for other systems of this type. For example, NTA systems can provide information on security incidents for IDS, IPS, NMS, DDoS PS, and others. Reliable classification of Internet protocols will provide the above systems with the information needed to identify:
- network attacks, abnormal and (or) fake traffic [1];
- network devices [2];
- application-level software operating at the seventh layer of the OSI model (Skype, Facebook, Viber, Telegram and others) [3,4] and their states.
The classification of network protocols is implemented by the following methods:
- signature, behavioral and hybrid analysis of network packets. The signature method is based on the analysis of the header and the information block (payload) of packets. The behavioral method is based on the study of the statistical properties of network traffic: the sequence, number and size of network packets, time intervals between packets, etc. [5][6][7];
- artificial intelligence (machine learning algorithms [8], neural networks [9], fuzzy logic, genetic algorithms [10], etc.);
- traditional mathematical research (fractal and wavelet analysis [4], cluster analysis [11], etc.);
- combination and development of various classification methods.
Methods for traffic classification in the interests of solving the problem of detecting network attacks are presented in [12][13][14]. This list of publications is not exhaustive.
The indicators of the classification efficiency of network protocols are as follows: classification accuracy, computational complexity of the classification algorithm, the ability of the algorithm to parallelize calculations, classification time, and others. Improving these indicators has an important impact on the functioning of NTA, DPI [15,16], IDS / IPS [17], DDoS PS and other systems.
The main method of traffic classification in these systems is the signature-based approach, which is characterized by high resource consumption. A qualified estimate of the complexity of signature-based algorithms is given in [12]. In addition, these classical methods have a number of disadvantages:
- the number of false positives is potentially higher due to the use of encryption mechanisms;
- the accuracy of network packet classification depends on the competence of system administrators in configuring the system;
- they do not allow classification of protocols that have a variable part of packet attributes (variable port, IP address, encryption);
- it is not possible to compile a signature for each protocol.
The following approaches are distinguished by the level of classification of network packets:
- Shallow Packet Inspection (SPI);
- Medium Packet Inspection (MPI);
- Deep Packet Inspection (DPI).
The analyzers of the "shallow" level function in the simplest firewalls, where the decision to block packets is usually made according to a list of prohibited IP addresses and port numbers.
The analyzers of "medium" level allow one to carry out traffic filtering by using the information on the transmitted data format and also on more complete localization of the sender. These tools usually act as the intermediary (proxy) between an access provider to the Internet and an internal network.
The systems of "deep" packet analysis are intended to identify the applications participating in network interactions and define the states of information exchange protocols. Therefore the "deep" analysis of network packets assumes the analysis of content of these packets on all levels. The purpose of the DPI systems is ensuring the control over execution of the requirements for information security of info-telecommunication infrastructure and monitoring the quality of functioning of communication channels.
The paper suggests a technique of network packet classification on the application layer which can be used both on the level of Medium Packet Inspection and Deep Packet Inspection (DPI). The novelty of the paper is in a new approach for combining fuzzy data processing and artificial neural networks to define the belongingness of network packets to one of the known network protocols. Thus, the paper suggests using two stages in the offered technique: at the first stage data is preprocessed by fuzzy logic methods. At the second stage, the packets are classified by means of an artificial neural network. Experimental assessment of the offered technique showed rather high quality and work speed of the developed classifier.
Statement of the network protocol classification problem is considered in the second section. The third section suggests the general description of the traffic classification technique used. The fourth section specifies the preprocessing stage of the suggested approach. The technique for neural network processing is considered in the fifth section. The sixth section presents the network packet classifier implementation and the results of experiments. The seventh section summarizes the main results and reveals the direction of further research.

STATEMENT OF THE NETWORK PROTOCOL CLASSIFICATION PROBLEM
The problem of the network protocol classification can be formulated as follows.
There is a set of investigated objects, IP packets of the application layer:
$$S = \{s_1, s_2, \dots, s_W\},$$
where $s_w$ is an analyzed packet from the sequence of packets (traffic), $w = 1, \dots, W$.
Each object (IP packet) is characterized by a set of variables (attributes):
$$s_w = \{X_1^w, X_2^w, \dots, X_{13}^w, P, Z\},$$
where $X_i^w$ is the i-th observable attribute of the w-th packet, whose value is defined in the "Request for Comments" (RFC) of the known classified protocol, $i = 1, \dots, 13$; P is the useful packet data (payload); Z is the dependent set of feature values determined when the protocol is classified.
The set Z includes:
$$Z = \{z_1, z_2, z_3\},$$
where $z_1$ is the type of the protocol; $z_2$ is the state of the protocol; $z_3$ is the type of the application. Each attribute takes a value from some set:
$$X_i \in \{x_i^1, x_i^2, \dots, x_i^M\},$$
where $x_i^m$ is the m-th variant of the attribute value out of the M possible variants described in the RFC.
Thus, the classification problem comes down to determining the set Z from the attribute values of the packet sequence. In solving an applied classification problem, and taking into account the analysis of the studies in [14,18], the following set of important packet attributes (factor space) was defined:
- Markers (used to maintain the quality of service and to indicate priority when processing the packet);
- Length (number of bytes in the hexadecimal payload set);
- Teaching (the protocol whose value is known a priori during training; null in other cases);
- $P = \{p_1, p_2, \dots, p_J\}$, a hexadecimal Payload set with a dimension of J bytes (IP packet payload).

GENERAL DESCRIPTION OF THE TRAFFIC CLASSIFICATION TECHNIQUE
For "deep" packet analysis, the present paper considers a combined method of traffic classification based on neural networks and fuzzy sets [19][20][21][22][23][24]. A significant gain in calculation time is achieved by reducing the factor space through a two-stage processing method (Fig. 1), which consists of a preprocessing stage and a neural network processing stage.
The use of fuzzy sets extends the notion of ordinary mathematical sets. The binary nature of membership, in which the membership function takes the value 1 ("true") when the element belongs to the set and the value 0 ("false") when it does not [25], is rejected. Instead, the membership function of a fuzzy set can take any value in the interval [0, 1].
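The difference between crisp and fuzzy membership can be illustrated with a minimal Python sketch. The function names and the breakpoints `a`, `b`, `c`, `d` are hypothetical and chosen only for illustration; the paper does not prescribe them.

```python
def crisp_membership(x, low, high):
    """Ordinary (binary) set: 1 inside [low, high], 0 outside."""
    return 1.0 if low <= x <= high else 0.0


def trapezoid_membership(x, a, b, c, d):
    """Fuzzy set with a trapezoidal membership function.

    Assumes a < b <= c < d. The value rises linearly on (a, b),
    equals 1 on [b, c], and falls linearly on (c, d).
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# An element on the slope belongs to the fuzzy set only partially,
# while the crisp set rejects it outright.
print(crisp_membership(5, 10, 20))            # crisp: not a member
print(trapezoid_membership(5, 0, 10, 20, 30)) # fuzzy: partial member
```

The crisp function returns only 0 or 1, whereas the trapezoidal function returns intermediate degrees of membership on its slopes.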
The advantage of artificial neural models is expressed in the ability to analyze incomplete input data or data with natural noise, or data obtained as a result of exposure to the system. Algorithms based on neural networks process each event that has its own weight, which is important for traffic analysis. Algorithms are implemented by elementary mathematical operations, due to which they have a high speed of operation. They have the possibility of self-learning and the ability to predict further events in the system. The indicated possibilities of mathematical methods allow one to suggest that their implementation in software will minimize the time of network traffic classification and increase the volume of the transmitted traffic, which is an urgent task in the condition of increased workload in information and communication networks.
Many authors have already applied the presented mathematical methods separately [21,26,27,28]. It is proposed here to apply them together in order to reduce the dimension of the problem being solved and to increase the efficiency of solving the IP packet classification problem. On the one hand, this will reduce the requirements for the computational resources of NTA, IDS/IPS, DDoS PS and other systems and, on the other hand, it will increase their efficiency.
In the first stage (preprocessing stage) the following operations are performed:
1. Primary determination of whether the analyzed network packet belongs to one of the specific groups;
2. Fuzzification and normalization of attribute values;
3. Reduction of the dimension of the factor space of features (convolution);
4. Defuzzification.
In the second stage (neural network processing stage), the traffic classification is completed by neural network processing using the method of logistic regression. The result of this stage is the calculation of the dependent set Z.

PREPROCESSING STAGE
When preprocessing the packets of the analyzed traffic, the following operations are performed:
1. Allocation of the structured data of the IP packet;
2. Identification of the internal state of the protocol (Fig. 1, block A). Preparation for classification uses additional features of the internal state of the protocol (connection of subscribers, key exchange, identification, data transmission, session completion, etc.);
3. Division of the classified network packets of the relevant protocols (DHCP, DNS, FTP, NTP, HTTP, HTTPS, SSL, TLS, etc.) into homogeneous groups (Fig. 1, block B);
4. Normalization of attributes (Fig. 1, block B). Normalization is the process of bringing attribute values to the same scale;
5. Preliminary classification based on fuzzy set theory and convolution (Fig. 1, block C). The calculations transform the normalized attributes using logical rules and fuzzy logic algorithms. The stage is completed by convolution of the normalized attributes to reduce the dimension of the problem.
Three groups of protocols are defined as follows:
- the first group: protocols in which connections between subscribers are established (TCP protocols);
- the second group: protocols in which connections between subscribers are not established (UDP protocols), and traffic is processed by two subscribers;
- the third group: protocols in which connections between subscribers are not established (UDP protocols), and traffic from one subscriber is processed simultaneously by several subscribers.
Division is carried out on the basis of the values of the attributes EtherType, Multicast and IPprotocol, using the database of the public information resource Internet Assigned Numbers Authority (IANA) [29].

IDENTIFICATION OF THE INTERNAL STATE OF THE PROTOCOL
To identify the state of the protocol (Fig. 1, block A), data from the packet header alone is not enough. Data must additionally be retrieved from the packet Payload field (attribute P). In this case, the identifying features of the protocol state are the data retrieved from the hexadecimal payload of the transport layer packets. The internal state is determined by logical decision rules built from the RFC data (TLS 1.0 RFC 2246, TLS 1.1 RFC 4346, TLS 1.2 RFC 5246, TLS 1.3 RFC 8446) [30], which specify the operating logic of the protected transport layer protocol. The main states of a TLS exchange session are initial connection, exchange of cryptographic keys, determination of connection parameters, authentication, warning, data exchange, and session completion.
For example, the rules for determining the connection state may be:

Thus, when block A establishes the protocol state Y1, its output has a positive value. The value at the Y1 output provides additional input data for the neural network, which works more accurately with it.
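As an illustration of such decision rules, the following Python sketch reads the TLS record header from a hexadecimal payload. The ContentType values (20, 21, 22, 23) and handshake message types (1, 2, 11, 12, 16) are taken from RFC 5246; the mapping of these values to the session states named above, and the function name `tls_state`, are assumptions made for this example and are not the authors' rule base.

```python
# TLS record ContentType values (RFC 5246, section 6.2.1).
CHANGE_CIPHER_SPEC, ALERT, HANDSHAKE, APPLICATION_DATA = 20, 21, 22, 23

# Assumed mapping of handshake message types to the session states in the text.
HANDSHAKE_STATES = {
    1: "initial connection",                       # client_hello
    2: "determination of connection parameters",   # server_hello
    11: "authentication",                          # certificate
    12: "exchange of cryptographic keys",          # server_key_exchange
    16: "exchange of cryptographic keys",          # client_key_exchange
}


def tls_state(payload_hex):
    """Return a session state for a payload that starts with a TLS record, else None."""
    data = bytes.fromhex(payload_hex)
    # Record header: 1 byte type, 2 bytes version (major is 3), 2 bytes length.
    if len(data) < 5 or data[1] != 0x03:
        return None
    content_type = data[0]
    if content_type == APPLICATION_DATA:
        return "data exchange"
    if content_type == ALERT:
        return "warning"
    if content_type == CHANGE_CIPHER_SPEC:
        return "determination of connection parameters"
    if content_type == HANDSHAKE and len(data) >= 6:
        # Byte 5 is the handshake message type while the handshake is unencrypted.
        return HANDSHAKE_STATES.get(data[5], "handshake (other)")
    return None


state = tls_state("160301002a01")   # handshake record carrying client_hello
y1 = 1.0 if state is not None else 0.0  # positive block A output when a state is found
print(state, y1)
```

Encrypted handshake messages hide the message type, so the sketch falls back to a generic label for unknown values instead of guessing.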

DIVISION
The division of classified network packets is based on the following logical rules: The distribution of network packets to these sets may be represented as follows:
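A minimal Python sketch of a division into the three groups described in the preprocessing stage follows. The EtherType values (0x0800, 0x86DD) and IP protocol numbers (6 for TCP, 17 for UDP) are the IANA-registered ones; the function name, the group numbering 1 to 3, and the use of the destination address to detect multicast are assumptions for illustration only.

```python
import ipaddress

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
IPPROTO_TCP = 6
IPPROTO_UDP = 17


def protocol_group(ethertype, ip_protocol, dst_ip):
    """Return 1 (TCP), 2 (unicast UDP), 3 (multicast UDP), or None if out of scope."""
    if ethertype not in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return None  # not an IP packet
    if ip_protocol == IPPROTO_TCP:
        return 1     # connection-oriented protocols
    if ip_protocol == IPPROTO_UDP:
        # Multicast destination: one sender, several receivers.
        if ipaddress.ip_address(dst_ip).is_multicast:
            return 3
        return 2
    return None


print(protocol_group(0x0800, 6, "93.184.216.34"))   # TCP packet
print(protocol_group(0x0800, 17, "224.0.0.251"))    # UDP to a multicast address
```

Broadcast destinations are not treated as multicast in this sketch; whether they belong to the second or third group is left open here.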

FUZZIFICATION AND NORMALIZATION
At the fuzzification and normalization stage, the input attributes of a packet $\{X_1, \dots, X_{13}, P\}$ of the protocol are processed using membership functions (µ). A sequentially formed array of W IP packets arrives at the input of the processing unit. The array contains the values of all input attributes. The purpose of the stage is to obtain the membership function values for all conditions from the rule base. The result is a matrix of the values $\mu(X_n^w)$ of the membership function calculated for the n-th attribute of the w-th IP packet, where w = 1, …, W is the number of the classified packet and n = 1, …, 13 is the number of the investigated packet attribute.
We used membership functions of standard shapes (sigmoid, triangular, trapezoidal and others) based on the rules, for example: In the presented rule set, the port numbers of the TLS protocol, defined on the basis of data from the RFC [30], are used. On the basis of simple logical processing rules for the port number X7, the studied encrypted protocols TLSv1.0, TLSv1.1, TLSv1.2 and TLSv1.3 are classified. As a rule, the sending and receiving parties use port 443 and ports from an interval of integers whose lower bound exceeds 50000. A graphical representation of the membership function constructed according to the rule base (8) for X7 is presented in Fig. 2-a. Fig. 2-b depicts the membership function constructed according to the rule base (9) for X8. In general, different models of normalization functions, both linear and nonlinear, can be applied. The influence of different types of membership functions on the quality of classification will be the next stage of the investigation.
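Figure 2 - Membership functions constructed for the attributes X7 (a) and X8 (b)
In the normalization, $X_{\min}$ is the minimal value of the attribute X and $X_{\max}$ is the maximal value of the attribute X.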
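The following Python sketch illustrates one way to normalize a port attribute and to evaluate a simple port rule. Min-max scaling is an assumption here, suggested by the reference to minimal and maximal attribute values; the crisp rule for X7 (port 443 or a port above 50000) follows the verbal description in the text, and both function names are hypothetical.

```python
PORT_MIN, PORT_MAX = 0, 65535  # universe of the port attributes X7 and X8


def min_max_normalize(value, v_min, v_max):
    """Scale value to [0, 1]; values outside [v_min, v_max] are clamped."""
    if v_max == v_min:
        raise ValueError("v_max must differ from v_min")
    clamped = min(max(value, v_min), v_max)
    return (clamped - v_min) / (v_max - v_min)


def tls_port_membership(port):
    """Assumed rule for X7: 1 for port 443 or ports above 50000, else 0."""
    return 1.0 if port == 443 or port > 50000 else 0.0


for port in (80, 443, 51234):
    print(port, min_max_normalize(port, PORT_MIN, PORT_MAX), tls_port_membership(port))
```

A smoother membership function (for example, a trapezoid with a slope near 50000) could replace the crisp rule without changing the rest of the pipeline.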

DEFUZZIFICATION
Defuzzification of the output variables is carried out on the basis of the Mamdani-Zadeh fuzzy inference algorithm [31][32][33], whose purpose is to obtain a quantitative value for each of the output linguistic variables. Formally, it proceeds as follows: the output variable and the corresponding fuzzy sets (i = 1, …, 13) are considered, and then a total quantitative value of each output variable is calculated. The value at the output of the model is calculated by the center of gravity method, in which the n-th value of the output attribute is given by the following expression:
$$Y_n^w = \frac{\int_{\min}^{\max} x\,\mu(x)\,dx}{\int_{\min}^{\max} \mu(x)\,dx},$$
where $\mu(x)$ is the membership function of the corresponding fuzzy set; min and max are the borders of the universe of the fuzzy variable for the n-th attribute of the w-th packet (in our case, for the attributes X7 and X8: min = 0, max = 65535); $Y_n^w$ is an element of the matrix of defuzzification results for the w-th packet.
The center of gravity method was chosen for such advantages as the separation of the control solution from the statement, the universality of the method, and the use of the established and proven apparatus of fuzzy logic.
This method is the least demanding of computing resources, so its use is appropriate in the considered application field.
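The center of gravity method can be sketched numerically in Python as follows. The midpoint-rule integration, the function name `centroid`, and the example membership function are assumptions made for illustration; they are not the authors' implementation.

```python
def centroid(mu, lower, upper, steps=1000):
    """Approximate the center of gravity of mu on [lower, upper] by the midpoint rule.

    Returns None when the membership function is zero everywhere on the grid.
    """
    width = (upper - lower) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width  # midpoint of the k-th sub-interval
        m = mu(x)
        numerator += x * m
        denominator += m
    if denominator == 0.0:
        return None
    return numerator / denominator


# Hypothetical symmetric triangular fuzzy set centered at 5 on the universe [0, 10].
def triangle(x):
    return max(0.0, 1.0 - abs(x - 5.0) / 5.0)


print(centroid(triangle, 0.0, 10.0))
```

For a symmetric membership function the result coincides with the center of symmetry, which gives a quick sanity check of the calculation.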

CONVOLUTION
We have defined the following output parameters of the preprocessing stage (they are the input parameters of the classification stage, i.e., the neural network input):
- $Y_1^w$ is the value of the protocol state at transmission of the w-th packet;
- $Y_2^w$ is the membership of the source port number of the w-th packet in the values related to the TLS protocol;
- $Y_3^w$ is the membership of the network packet length of the w-th packet in the values related to the TLS protocol;
- $Y_4^w$ is the membership of the integer ContentType value of the Payload field of the w-th packet in the values defined in the RFC for the protocols TLSv1.0, TLSv1.1, TLSv1.2, and TLSv1.3.
The number of input parameters is reduced compared to the number of attributes in order to reduce the dimension of the artificial neural network. The procedure for reducing the dimension of the attribute input space is to apply fuzzy arithmetic rules over the sets Y. We applied:

Besides the specified parameters, the values of the attributes X7 and X8 are supplied to the neural network input to improve classification quality.
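The reduction step can be illustrated with a short Python sketch. The standard Zadeh operators (minimum for fuzzy AND, maximum for fuzzy OR) are used here as an assumption, since the specific fuzzy arithmetic rules are not reproduced in the text; the function names and sample values are hypothetical.

```python
def fuzzy_and(*memberships):
    """Zadeh conjunction: the weakest membership determines the result."""
    return min(memberships)


def fuzzy_or(*memberships):
    """Zadeh disjunction: the strongest membership determines the result."""
    return max(memberships)


def build_network_input(y1, y2, y3, y4, x7_norm, x8_norm):
    """Assemble the six-element input vector: Y1..Y4 plus normalized X7 and X8."""
    return [y1, y2, y3, y4, x7_norm, x8_norm]


# Several attribute memberships collapse into one aggregated value,
# so fewer inputs reach the neural network than there are packet attributes.
aggregated = fuzzy_and(0.9, 0.7, 1.0)
vector = build_network_input(1.0, aggregated, 0.8, 1.0, 0.0068, 0.78)
print(aggregated, len(vector))
```

Other t-norms, such as the algebraic product, could be substituted for `min` without changing the structure of the sketch.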

NEURAL NETWORK PROCESSING FOR CLASSIFICATION
The most widely accepted neural network architecture at present is the multilayer feedforward network (also called the multilayer perceptron), which was proposed in [34] and developed further in [22,26]. In addition, we applied logistic regression to determine whether network packets belong to the protected protocols. The method of logistic regression yields probabilistic estimates of the protocol classification.
After numerous experiments, we selected the architecture of a multilayer feedforward network with one hidden layer containing L neurons N. The number of neurons in the hidden layer for classifying the protected protocols was determined with a genetic algorithm [22]: L = 11 neurons for TLSv1.1 and L = 12 neurons for TLSv1.2.
The neural network was trained by gradient descent [35], which provided high-quality protocol classification with the shortest computation time.
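Gradient descent training can be sketched in Python on the simplest case, a single sigmoid unit (logistic regression). This is a deliberately reduced stand-in for the full network; the learning rate, epoch count, function names, and toy data are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that sample x belongs to the positive class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the cross-entropy loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # derivative of the loss w.r.t. z
            for i in range(n_features):
                grad_w[i] += error * x[i]
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Toy data: one normalized feature, label 1 for the "protected protocol" class.
w, b = train_logistic([[0.0], [0.1], [0.9], [1.0]], [0, 0, 1, 1])
print(predict(w, b, [0.0]), predict(w, b, [1.0]))
```

The output of `predict` is a probability, which matches the probabilistic estimates that logistic regression is used for in the text.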
In solving the classification problem, the weighted sum of the input signals of a hidden-layer neuron is converted into the output of the neuron by means of a memoryless non-linear activation function σ:
$$y = \sigma\left(\sum_{i} w_i x_i + b\right),$$
where $x_i$ are the input signals, $w_i$ are the weights, and $b$ is the bias of the neuron. The choice of the activation function σ depends on the specifics of the applied task being solved. In this work, a sigmoidal function was applied as the activation function:
$$\sigma(x) = \frac{1}{1 + e^{-x}}.$$
The impact of different types of activation functions on the quality of classification is the next stage of research.
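A forward pass through a network with one hidden layer and sigmoid activation can be sketched in Python as follows. The weights, the two-neuron hidden layer, and the function names are hypothetical; a trained network of the kind described above would use L = 11 or 12 hidden neurons and learned weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer: each neuron applies sigmoid to its weighted input sum."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    # The single output neuron yields a probability-like score in (0, 1).
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical weights for a network with 2 inputs and 2 hidden neurons.
score = forward(
    [1.0, 0.5],
    hidden_weights=[[0.4, -0.2], [0.3, 0.8]],
    hidden_biases=[0.0, -0.1],
    output_weights=[1.2, -0.7],
    output_bias=0.05,
)
print(score)
```

Because the output passes through a sigmoid, it can be read as the estimated probability that the packet belongs to the target protocol.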

NETWORK PACKET CLASSIFIER IMPLEMENTATION AND EXPERIMENTAL RESULTS
A comparative analysis of machine learning algorithms is given in [27,28,36,37]. In order to obtain an assessment of the effectiveness of the proposed approach, the software was developed and a bench environment was implemented.
In our work, to ensure cross-platform operation on different processors and operating systems, a classification program was specially developed in C++. In addition, a software module in C++ was developed to check the presented mathematical apparatus.
The developed software was validated on a hardware platform with the following characteristics:
- Central processing unit: Intel Core i5-6400 2.7 GHz
- RAM: 8 GB
- Operating system: MS Windows 10 Pro 64 bit
- Network interface: 100 Mbps.
The architecture of the implemented bench environment is depicted in Fig. 3. During testing on a smartphone and on computers with installed applications (Viber, WhatsApp, Google Chrome), information was exchanged with the relevant services via the Internet. The traffic generated by the applications was mirrored to the server and recorded in the form of a dump. Next, the traffic dump was sent to the traffic classification software module.
Figure 3 - The architecture of the bench environment
As a training sequence, 1250 packets were sent to the module, and as a testing sequence, we used a series of six dumps with a total length of 9128 packets.
As a training sequence, 1250 packets were used; and as a testing sequence, 1996 packets were sent to the module. The prepared sets of network packets for training and testing had the distribution presented in Fig. 4. The test results for evaluating the testing time are presented in Table 3. Thus, during the testing of the developed classifier, the following results were obtained:
- Probability of a recognition error of the second kind for the protected protocols (TLSv1, TLSv2, etc.) is not less than 0.95%
- Probability of a recognition error of the first kind for the protected protocols (TLSv1, TLSv2, etc.) is not more than 0.05 %
- Average time of classification of the protocol is equal to 0.6 milliseconds.
These results suggest that the developed classifier has, on the one hand, a rather high speed of operation and, on the other hand, a rather high quality of classification. In this case, when creating a classification software module, optimization mechanisms for executable code were not used and hardware accelerators were not involved.
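Error rates of the first and second kind can be computed from labeled test results as in the following Python sketch. It assumes the usual convention that an error of the first kind is a false positive and an error of the second kind is a false negative; the function name and the sample labels are hypothetical and do not reproduce the experimental data.

```python
def error_rates(y_true, y_pred):
    """Return (first-kind rate, second-kind rate) for binary labels 0 and 1.

    First kind: a negative packet classified as positive (false positive).
    Second kind: a positive packet classified as negative (false negative).
    A rate is None when the corresponding class is absent from y_true.
    """
    pairs = list(zip(y_true, y_pred))
    negatives = sum(1 for t, _ in pairs if t == 0)
    positives = sum(1 for t, _ in pairs if t == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    first_kind = false_pos / negatives if negatives else None
    second_kind = false_neg / positives if positives else None
    return first_kind, second_kind


# Hypothetical labels: 1 marks a packet of the protected protocol.
print(error_rates([1, 1, 0, 0], [1, 0, 0, 1]))
```

Reporting both rates separately matters because a classifier can keep one of them low at the expense of the other.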

CONCLUSION
The technique for classifying protected application-layer protocols for information exchange presented in this work illustrates a modern approach to applying fuzzy logic and neural networks. This approach can be applied to create efficient information security support systems (IDS, IPS, NMS, DDoS PS, etc.). It differs significantly from classification algorithms based on the analysis of sequences of rules prepared in advance by highly qualified specialists in the information security field.
The main advantages of the approach suggested are a high computational performance of classification and a high quality of classification.
The practical results achieved in testing the suggested technique make it possible to put forward a hypothesis about the possibility of moving away from routine methods of building chains of rules based on signatures, to the construction of adaptive self-configured systems for classification of IP packets of secure application layer protocols, based on methods of fuzzy sets and neural networks. Taking into account the avalanche-like growth of application level protocols, the application of the presented technique in the software of secure information systems will reduce the requirements for the knowledge of system administrators and increase the efficiency of protection of information resources. This direction can be considered as the main direction of further research.
Funding: Research is carried out with support of Ministry of Education and Science of the Russian Federation as part of Agreement No. 05.607.21.0322 (identifier RFMEFI60719X0322).
Conflicts of Interest: The authors declare that there is no conflict of interest.