THE METHOD OF INTERACTION MODELING ON BASIS OF DEEP LEARNING THE NEURAL NETWORKS IN COMPLEX IT-PROJECTS

In this paper, we propose a method that uses neural networks to model the effects of changes on the parameters of complex IT projects aimed at creating distributed information systems. The method makes it possible to predict the level of change in the results of project activity at any time during project execution, depending on changes in the time parameters of the project. An integrated information system that uses cloud data warehouses is developed to model changes in the key parameters of IT projects. Modern project management information technologies from leading developers are integrated into the modelling process. The effects of changes on the timing of work are evaluated taking into account the context characteristics of projects, including the distribution of resources both over time and across project works, the distribution of cost, etc. A deep learning model of the neural network is proposed on the basis of the experimental representation of the input and output data of numerical experiments. The result is a method for analysing the effects of changes on the terms of project execution.


INTRODUCTION
The dynamics of technology development in the IT area are determined by the growing number of new developments and by the creation of more complex and powerful distributed information systems. This is facilitated by the rapid development of cloud computing and related technologies such as edge computing, which creates new services in the cloud [1], as well as artificial intelligence, blockchain, big data analytics, quantum computing systems, etc. According to the American analyst company IDC (International Data Corporation), which specializes in IT studies, over the next 5-10 years more than 90% of all enterprises will organize their own digital IT environment for further development in the digital economy and will create new digital business models, and more than 60% of global GDP will come from digital technologies and solutions [1].
In such a situation, companies will urgently need to change their strategies and switch to new business conditions using integrated hybrid and multi-node tools and strategies implemented through distributed information systems [2,3]. Increasing competition, growing turbulence of the processes in the external environment and increasingly complex, unpredictable challenges lead to the need to move to "super-flexible applications": containers, serverless computing and other technologies characterized by modularity, distribution, constant updating and the use of cloud computing [4,5].
Recent IT trends indicate that the use of distributed information technology is useful and relevant. It is regarded as an effective way of resolving the current problems and challenges that arise from rapid development, globalization, technological complexity and the intensified turbulence of the environment [5,6].
Creating complex hybrid systems that can meet today's business needs involves considerable design and technological complexity in the development of such distributed information systems (DIS). It is advisable to use the project approach [7,8] to solve these complex tasks, which applies the methods and information technology of project management. Over the past 20 years, the use of such tools has proven its effectiveness [9].

International Journal of Computing, Print ISSN 1727-6209, On-line ISSN 2312-5381, computing@computingonline.net, www.computingonline.net
The use of such information technologies for managing IT projects, together with artificial intelligence methods, will have a significant impact on the effectiveness of the projects themselves.
At the same time, the priority task for project management in the development and implementation of modern integrated software applications with intelligent support is to optimize the costs [8,9] of such development and integration, as well as to optimize (reduce) the development time. Reducing the time to develop a new IT product often becomes a top priority in today's market environment.
Therefore, finding the best options for distributing resources in IT projects can become an important task. A successful solution will shorten individual project tasks and, consequently, the whole project, and it can also reduce the cost of the project. At the same time, predicting the state [10] of such projects is a multidimensional problem that can be solved using modern neural network technologies [8]. In addition, the significant number of changes that appear at different stages of project implementation, and that strongly affect its results, should be taken into account.
Thus, investigating the experimental use of neural networks to study how numerical changes affect the parameters of IT projects, and to determine the optimal states of those projects, is an urgent task.

RELATED WORK
Support for project coordination was studied in publications of Ukrainian and foreign scientists such as Evaristo R. [9], Zieja M. [10], Gogunskiy V. [11], Sachenko A. [12], Morozov V. [13,14], Fabiano F. [15] and others. In particular, project and change management, as well as the synthesis of project product configurations in various subject areas, including distributed projects, have been studied in depth. However, the problem of choosing the optimal set of controlled project elements affected by changes has not been investigated thoroughly enough to offer practical solutions, including in the design and implementation of DIS.
Unresolved parts of the general problem. The works mentioned above present partially formalized models of DIS elements in distributed projects. However, for such complex projects the impact factors of the IT environment require further development in order to determine the reactions of the management system to changes in the key project parameters and to create a mathematical model for experimental research.
Purpose of the research. The purpose of this research is to develop a method, based on deep neural network modelling, for forecasting the reaction of the system when the parameters of the project elements change as a result of informational influences of the external environment. The emphasis is placed on finding an optimal resource allocation option that shortens the duration of the project. This purpose will be achieved by experimentally modelling changes in the implementation of complex IT projects. To do this, the technological sequence of IT product development, the resource parameters and their distribution across tasks and time will be determined on the basis of the implementation technology of such projects, and experimental simulation will be conducted in order to find the optimal option at the system output [9].

DEVELOPMENT OF DIS ARCHITECTURE
The complexity of projects for the development and integration of distributed information systems lies in the need to solve multi-criteria tasks, to provide a wide range of functionality and to use multiple technologies, which results in a multi-level architecture of such systems. Therefore, the complexity of DIS determines the complexity of managing these types of projects. This in turn determines the features of the project elements, as a second-order configuration. In other words, the DIS architecture must be identified first so that the project configuration can then be formed.
The procedure for forming a distributed information system can be presented as a sequence of stages: design, project implementation, prototype design, implementation and preparation for use. At the design stage, the structure of the information system is determined, together with the rules for exchanging information between the different databases that are part of the distributed information system and the rules governing the introduction of changes in such databases.
The system presented in Fig. 1 demonstrates the structural interaction of the elements of the distance learning system developed by the authors, which implements distributed data management functions and fully supports the learning process for distributed users.
The authors are conducting research on the use of such a system in the banking sector for creating client-distributed systems. The principles of interaction based on the proposed models and methods are transferred almost unchanged.
Among the main elements of such a system are blocks that form the primary database at local workplaces; these databases are then combined at the central portal in the project office. Interaction with remote users takes place via satellite communication.

DEVELOPMENT OF MATHEMATICAL MODEL
As the initial data set $X$, we use the description of the project tasks. The input parameters of the project model can then be presented as

$$X = \{x_i(t)\}.$$

Since the project requires a certain amount of resources, a list of these resources must be specified in the form of a set $R$. Experience in implementing IT projects suggests that 90-95% of the resources are labour costs. Therefore, for modelling we consider only one type of resource: labour resources. Thus, we have

$$R = \{r_1, r_2, \dots, r_m\},$$

where $m$ is the number of labour resources provided for the implementation of the project. Each $r_j$ consists of

$$r_j = \langle IR, NR, CR, MR, OR \rangle,$$

where $IR$ is the identifier of a particular resource, $NR$ is the name of the resource, $CR$ is the price characteristic of the resource (the price), $MR$ is the maximum possible load of the resource and $OR$ is the nature of the resource.
As described above, we must now load each project task $x_i(t)$ with the resources $R^l$. We obtain the following matrix:

$$R^l = \left[ r^l_{i,j} \right],$$

where $r^l_{i,j}$ is the volume $l$ of the $j$-th resource assigned to the $i$-th work.
This loading is assigned by the expert method. However, as noted in [16], the main approach to estimating the time of project execution is the PERT (Program Evaluation and Review Technique) method. It is based on the assumption that the duration of project tasks is a random variable with a beta distribution. Therefore, performing the appropriate calculations with this method and the corresponding distribution function, we obtain the allocation of resources $R^d$ for each time period of the project's execution, $R^d = \left[ r^d_{n,t} \right]$, where $r^d_{n,t}$ is the amount of resources assigned to the $n$-th work in time period $t$.
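As a minimal sketch of the PERT estimate mentioned above, the classical three-point formulas give the expected duration of a task and its standard deviation from an optimistic, a most likely and a pessimistic estimate. The function name `pert_estimate` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Classical PERT three-point estimate (beta-distribution approximation).

    Returns (expected_duration, standard_deviation).
    """
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Hypothetical task estimated at 4, 6 and 14 working days.
expected, std_dev = pert_estimate(4, 6, 14)
print(expected, std_dev)
```

The expected durations obtained this way could then serve as the task durations over which the resource volumes are spread.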
When analysing this allocation, it is necessary to check that the resource volumes obtained for each moment of time $t$ do not exceed the value of $MR$ (the maximum possible resource load) for the resource in question [17]. Where the resource limit is exceeded, some optimization is required. For example, suppose the resource $r^d_{1,1}$ works more than 40 hours in the first week of project implementation. This indicates that it is overloaded, and the problem has to be solved by increasing the duration of the work or works that use this resource in the week under consideration. This decision is appropriate in the context of the task stated in this article, since other solutions to the problem cannot be used. After the duration of the work or works is increased, with the amount of resources for the given work or works kept constant, the use of the resource in the period under consideration decreases. In other words, when the resource is loaded proportionally, its volume is distributed evenly over a longer period of time [18].
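The levelling rule described above (keep the work volume constant and extend the duration until the per-period load fits under $MR$) can be sketched as follows. This is an illustrative simplification for a single resource on a single work; the function name `level_by_extension` and the numbers are assumptions.

```python
import math


def level_by_extension(total_hours, max_load_per_period):
    """Spread a fixed work volume evenly over the fewest periods such that
    the load in each period does not exceed the resource limit (MR)."""
    if total_hours < 0 or max_load_per_period <= 0:
        raise ValueError("total_hours must be >= 0 and the limit must be > 0")
    periods = max(1, math.ceil(total_hours / max_load_per_period))
    return periods, total_hours / periods


# Hypothetical case: 100 hours planned for one week with MR = 40 hours per week.
periods, load = level_by_extension(100, 40)
print(periods, load)
```

Extending the work from one week to three brings the weekly load under the 40-hour limit while the total volume of 100 hours stays unchanged.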
In addition, each work has a certain duration, and we can specify the set $T^D$ of the durations of all project works. The next step is an attempt to revise the project duration by reducing the duration of individual works, with a mandatory check that the maximum permissible values of $MR$ are not exceeded for any resource. There can be many such iterations, continuing until we obtain the optimal duration of the project and eliminate all resource overloads; here $k \in K$ is the number of modelling attempts (the number of matrices $R^d$).
Having obtained the distribution of resources over time and knowing their prices, we can apply a cost function to determine the distribution of the project cost over time for each project task.
The planned cost of the project is then the total of this distribution over all tasks and time periods. Under the conditions of the experiment on the use and training of the neural network, we need several such attempts, which in the modelling process give us the corresponding distributions. A neural network trained in this way can then be used to obtain the optimal loading of resources for each project. That is, from the family of cost distribution curves we need to choose the one that corresponds to the minimum acceptable duration.
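A minimal sketch of the cost distribution described above, assuming that the resource allocation is stored as a matrix of hours (rows are resources, columns are time periods) and that each resource has a single hourly price $CR$. The matrix orientation, the function name `cost_distribution` and the sample values are assumptions made for illustration.

```python
def cost_distribution(hours_by_period, hourly_rates):
    """Return the per-period cost and the cumulative cost curve.

    hours_by_period[j][t] -- hours of resource j in period t (assumed layout)
    hourly_rates[j]       -- price CR of resource j
    """
    n_periods = len(hours_by_period[0])
    per_period = [
        sum(row[t] * rate for row, rate in zip(hours_by_period, hourly_rates))
        for t in range(n_periods)
    ]
    cumulative = []
    running_total = 0.0
    for cost in per_period:
        running_total += cost
        cumulative.append(running_total)
    return per_period, cumulative


# Hypothetical data: two resources over three weekly periods.
hours = [[40, 20, 0], [10, 30, 40]]
rates = [25.0, 30.0]
per_period, cumulative = cost_distribution(hours, rates)
planned_cost = cumulative[-1]  # total planned cost of this fragment
print(per_period, cumulative, planned_cost)
```

Repeating this calculation for every variant of task durations would give one cumulative curve per variant, that is, a family of cost curves from which the variant with the shortest duration can be selected.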
In the future, with proactive management of IT projects, each of the possible options $R^d$ (which simulate changes) will yield a different deviation of the projected project duration $T$ from the optimal option obtained with the developed neural network model.
A multilayer perceptron architecture is suitable as a basis for data forecasting [19]. The input of the neural network receives 100 values: data on the allocation of project resource volumes by period for the specified works (with changes in the duration of the works). At the output we obtain the value of the minimum project duration.
We faced the task of choosing the network architecture (the number of layers, the number of neurons in a layer and the number of training samples) so that the network correctly recognized the distribution submitted to its input. For each function, 30 experiments were performed at each stage. One experiment consists of feeding the neural network one of the known functions with noisy values (according to the model in Figure 2). Activation functions were selected from linear, quadratic, cubic and sigmoid functions. We stopped using the sigmoid activation function in the hidden and output layers because the program run time increased significantly as a result of the extra computation performed in the inner loop. The neural network was trained by the error backpropagation method. At the first stage, the required number of neurons in one hidden layer was chosen by increasing it from 30 to 200 in steps of 20.
In the course of the experimental selection of network characteristics, a hidden layer of 50 neurons was chosen. The availability of a large array of source data for training is one of the main factors that determine whether our problem can be solved successfully. Not only the volume but also the origin of the data is important [20]. Thus, for predicting the results of resource allocation, the statistics collected on similar projects are important.
The training data were presented in a form that the program could interpret, that is, they were normalized. Given an input vector, the program must perform the following steps in order to calculate the outputs of the network: 1) transpose the input vector; 2) multiply the first matrix (layer) by this vector, thus obtaining an intermediate vector $T$; 3) obtain a new vector by passing each element of the vector $T$ to the activation function of the $n$-th neuron; this vector is the input vector for the second layer.
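The three steps above can be sketched in plain Python as follows. This is an illustrative simplification: biases are omitted, the weights and inputs are made-up values, and ReLU is used only as an example activation (the paper names it later as a commonly used choice). With plain lists the transposition in step 1 is implicit, because a list has no row or column orientation.

```python
def mat_vec(matrix, vector):
    """Multiply a weight matrix (one row per neuron) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(value):
    """Example activation function; any scalar function could be passed instead."""
    return max(0.0, value)


def forward(layers, inputs, activation):
    """Propagate an input vector through a list of weight matrices."""
    signal = list(inputs)
    for weights in layers:
        intermediate = mat_vec(weights, signal)          # step 2: matrix times vector
        signal = [activation(v) for v in intermediate]   # step 3: apply activation
    return signal


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
layers = [
    [[1.0, -1.0], [0.5, 0.5]],  # hidden layer weights
    [[1.0, 1.0]],               # output layer weights
]
print(forward(layers, [2.0, 1.0], relu))
```

The output of each layer becomes the input of the next, so the same two operations are repeated once per layer.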
The main idea of the error backpropagation method is to propagate the error signals from the outputs of the network to its inputs, in the direction opposite to the forward propagation of signals in the normal mode of operation. In backward propagation of the signal, the behaviour of the neuron is determined by the behaviour of its constituent elements considered earlier. In general, the output signal is determined by an expression in which $y_{bi}$ is the output signal of the network in backward propagation, $x_b$ is the input signal of the network in backward propagation, $g$ is the gain of the functional converter in backward propagation and $w_i$ is the synaptic weight.

EXPERIMENTATION
For the experimental studies, we first determine the list of tasks or works that serve as inputs and define the content of the IT project itself. Such a list usually contains from 500 to 1500 tasks, but for these studies we limit ourselves to a fragment of 10 tasks. A description of these tasks is shown in Fig. 3. The duration of each task can also be determined by certain methods as a parameter of the task.
Going into the neural network learning phase, we further define the numerical variants of the task duration schemes that were considered for determining the optimal duration of the project under a certain loading of tasks with resources. An example of a basic version of loading tasks with resources is shown in Table 1. Variants of changes in the duration of tasks are given in Table 2. There should be from 10 to 50 such variants. These loading options act as the generator of changes in the project under investigation.
Using machine learning tools, we solved a prediction problem. Data processing was based on a neural network built on the LSTM (long short-term memory) model. The main parameters (plug-ins, data for connecting to the database and the Twilio account) were set in the engine.py file.
The data set was divided into two parts, one for testing and the other for training. The training data accounted for 80% of the total volume and covered sets of project works.
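A minimal sketch of such an 80/20 split in plain Python. The shuffling with a fixed seed and the function name `train_test_split` are assumptions made for illustration; the paper does not state how the samples were assigned to the two parts. The sample count of 824 matches the number of lines reported for the first data file, assuming one sample per line.

```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples reproducibly and cut them into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers, one per line of the data file.
train, test = train_test_split(range(824))
print(len(train), len(test))
```

For sequence models the order of the samples may matter, in which case a chronological cut without shuffling would be the more cautious choice.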
To create the neural network, Python packages and the TensorFlow library were used. The following libraries were needed: numpy, matplotlib, keras, jupyter. Whereas a decision tree used in project management automation can be trained on all types of attributes, a neural network accepts only numerical input data and learns only from quantitative attributes.
The set of work parameters therefore had to be converted so that it could be fed to the input of a neural network with a constant number of inputs.
All data are presented in the form of two files. The first contains 824 lines and 10 columns. Each line describes one work, and each column holds one of the 10 work parameters, labelled with an abbreviation of the full parameter name.
Once the data are collected, they need to be prepared. This stage is called preprocessing. The main task of preprocessing is to bring the data into a format suitable for training the model. Of the three main manipulations on the data at the preprocessing stage, we created a vector feature space in which the examples of the training sample are represented. We also carried out classic data normalization, the process by which we ensure, for example, that the average value of each attribute over all data is zero and its variance is one.
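The normalization described above (zero mean and unit variance for each attribute) can be sketched with the standard library alone. The function name `standardize_columns` and the sample rows are illustrative assumptions; constant columns are mapped to zero here to avoid division by zero, which is a design choice of this sketch.

```python
import statistics


def standardize_columns(rows):
    """Scale every column of a row-oriented table to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # population standard deviation
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical table: three works, two numeric parameters each.
rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
for row in standardize_columns(rows):
    print(row)
```

The means and standard deviations should be computed on the training part only and then reused for the test part, so that no information from the test data leaks into training.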
After loading the tasks with resources, we obtain two corresponding distributions, variants of which are shown in Fig. 2 and Fig. 3. The differentiated distribution shows peaks or dips in the use of resources (Fig. 3). The cumulative diagram shows which resources are loaded most over time and which least. For training the neural network, each variant of loading tasks with resources must also have the corresponding output differentiated and cumulative (running total) distributions of the resources involved over time.
To obtain such results, we use simulation in standard project planning and monitoring software, such as Microsoft software [21].
The resources of the basic version of the project tasks were loaded, and certain excesses over the resource usage limits were found. These excesses were subsequently eliminated, which greatly increased the duration of the project.
We also load all these graphs in tabular form into the neural network at the learning stage. The author's analysis of experience in managing IT projects indicates that in such cases the critical source information is a project cost schedule or a timetable for the distribution of project costs.
As a result of modelling the options for task durations in our project, we obtain a family of curves of resource distribution over time in terms of cost, an example of which is given in Fig. 4.
All the data obtained are also fed to the neural network at the training stage. When analysing a family of cost curves, one can determine visually and analytically the maximum and minimum distributions over time for the same total project cost. In practice, from a commercial point of view, this indicates that there may be optimistic and pessimistic options for completing any IT project. The most interesting option, of course, is the distribution of project cost over time that gives the minimum project duration. This option is economically attractive and can deliver the finished product as soon as possible.
The question remains of how to determine the optimal task durations that give the minimum-time cost schedule. The solution can be obtained by numerical calculations that gradually approach a given result, but this takes a lot of time and is a rather expensive process.
The neural network was trained by selecting a set of its coefficients W that solves our problem. Initially, these coefficients were not set to predefined values; a vector of input signals X was fed to the network input, and the activation function was then used to compute the vector of output values Y, which was compared with the known values Y_test based on the experience of successful projects. The deviation of the calculated value from the given one was then calculated, and if it exceeded the specified threshold, the coefficients W_hidden were changed and the process was repeated. Batch mode is used [22]. The hidden layers of the network are transformed by activation functions. Thanks to these elements of the network infrastructure, the system becomes nonlinear. The activation function most commonly used in programming the network is the rectified linear unit (ReLU), as shown in Fig. 5.

Figure 5 -Fragment of program code with activation function
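Separately from the fragment in Fig. 5, the training procedure described above (compute the outputs, compare them with known values, change the coefficients while the deviation exceeds a threshold, repeat) can be illustrated with a deliberately reduced sketch. It trains a single weight of a linear neuron in batch mode and is not the authors' network; the function name, the learning rate, the tolerance and the data are assumptions.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, tolerance=1e-6, max_cycles=1000):
    """Batch gradient descent for one weight w so that w * x approximates y."""
    w = 0.0                      # assumed starting value of the coefficient
    mse = float("inf")
    for _ in range(max_cycles):
        errors = [w * x - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / len(xs)
        if mse <= tolerance:     # deviation is small enough: stop training
            break
        # Batch mode: a single update from the gradient averaged over all samples.
        gradient = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
        w -= learning_rate * gradient
    return w, mse


# Hypothetical data generated by y = 2 * x.
w, mse = train_linear_neuron([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(w, mse)
```

In a multilayer network the same loop applies, with the gradient of every weight obtained by backpropagation instead of the closed-form expression used here.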
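Below is the dependence of the root-mean-square error (RMSE) on the training cycle (Fig. 6):

$$RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( y_t - \hat{y}_t \right)^2}, \quad (15)$$

where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value and $N$ is the number of observations.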

Figure 6 - Root-mean-square error
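For reference, the RMSE between actual and predicted values can be computed as in the following minimal sketch. The function name `rmse` and the sample values are illustrative and are not taken from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical project durations (actual vs. predicted), in weeks.
print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```

Evaluating this function after every training cycle yields the kind of error curve plotted against the cycle number.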
The training of the neural network consisted of 50 cycles in 5 episodes and took about 16 minutes. Part of the software code used is shown in Fig. 7.
In the course of the computational experiments, neurons were added to the hidden layer one at a time; after each addition the network was retrained and the mean-square error of learning was recorded according to equation (11). The hidden layer was enlarged until the mean-square error stabilized at a constant value, which corresponds to the error of the empirical data. In all experiments, increasing the number of layers did not reduce the learning error. For this reason we worked with two hidden layers.
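The stopping rule described above (enlarge the hidden layer until the error stops changing) can be sketched as a small selection function. The function name `smallest_stable_size`, the tolerance and the error values are hypothetical; the errors are made-up numbers for illustration and are not measurements from the paper.

```python
def smallest_stable_size(errors_by_size, tolerance=0.01):
    """Return the first hidden-layer size after which the error stops changing.

    errors_by_size -- list of (hidden_neurons, training_error) pairs,
                      ordered by increasing number of neurons.
    """
    for (size, error), (_, next_error) in zip(errors_by_size, errors_by_size[1:]):
        if abs(error - next_error) <= tolerance:
            return size
    return errors_by_size[-1][0]  # never stabilized: fall back to the largest size


# Made-up errors recorded after retraining with 1 to 5 hidden neurons.
history = [(1, 0.9), (2, 0.5), (3, 0.31), (4, 0.305), (5, 0.304)]
print(smallest_stable_size(history))
```

In a real experiment each error in the list would come from a full retraining run, so the loop over sizes dominates the computation time.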
After obtaining the elements of the vector of the optimal distribution of task durations that forms the minimum project cost curve, we converted the responses of the neural network to the required units of measurement. The proposed method studies how environmental impacts on a project affect its parameters, the results of its implementation and the responses of the management system of complex IT projects. It uses deep learning neural networks and is based on an integrated approach to the use of information systems that considers the processes of creating the complex products of such projects.
A distinctive feature of this approach is a coherent presentation and analysis of the environmental impact on the whole set of project elements, with their strong interrelationships and mutual influences. This made it possible to identify the control elements for formalizing the development processes of distributed information systems.
In the future, this approach will make it possible to build the necessary conceptual and mathematical models and to approach the solution of problems of effective change management in IT projects. We can therefore conclude that this approach can be used to manage project portfolios of diversified and multifunctional IT enterprises.