A HYBRID OPTIMIZATION APPROACH FOR COMPLEX NONLINEAR OBJECTIVE FUNCTIONS

According to the 'no free lunch' theorem, no single algorithm outperforms a completely stochastic algorithm on all objective functions. Consequently, choosing the best algorithm for a particular problem is often more of an art than a science. The complexity of an objective function is determined by features such as its modality, basins, valleys, separability, and dimensionality. While separability and modality contribute to the complexity of the function, the dimensionality and domain range increase the function's search space exponentially. In this paper, the authors analyze the algorithmic constructs of Simulated Annealing (SA), Cuckoo Search (CK), Particle Swarm Optimization (PSO) and Genetic Algorithm (GA), along with two hybrid paradigms. In addition, an extensive comparative study was conducted using 30 standard benchmark functions to demonstrate how an ingenious hybrid algorithm could significantly reduce the number of function calls (generations) needed to attain the optimal, or near-optimal, solution for almost any complex objective function. Results from the empirical analysis demonstrate the precision, robustness and success of the hybrid algorithm (without compromising run-time complexity) over its counterparts.


INTRODUCTION
Optimization is a branch of science that involves the search for the best parametric values to a solution of a specific problem [1]. The objective is to find the solution to a predefined objective function via an iterative process, towards an optimal value. In optimization problems, a mathematical representation of the objective function is clearly defined along with its constraints.
Optimization problems are usually classified under various paradigms such as linear programming, quadratic programming, combinatorial optimization and meta-heuristics. While classical algorithms use methods based on the Hessian matrix [2] and gradient descent (especially for objective functions with computable derivatives), meta-heuristic algorithms are deployed on non-differentiable and non-linear objective functions. Such functions are either impracticable or very difficult to solve using classical methods. However, solutions provided by meta-heuristic algorithms are usually referred to as suboptimal solutions.
Some of the most popular meta-heuristic algorithms include the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Simulated Annealing (SA), the Differential Evolution Algorithm (DE) and the Cuckoo Search Algorithm (CK). These algorithms rely on a model matrix that evolves random solutions to the predefined objective function. In addition, some of these meta-heuristic algorithms use variants of the basic genetic algorithm schema (selection, mutation, and crossover) while evolving solutions. These variants can be summarized into two basic strategies, namely exploration and exploitation. While exploitation targets the best local solution within the search space, exploration leverages diversification in an attempt to locate the best solution, which in most cases lies around one of the local solutions.
A good meta-heuristic algorithm can be characterized by the rate at which it finds the global optimal solution to the objective function. In this paper, the authors propose a framework for the hybridization of multiple meta-heuristic algorithms. This framework leverages the strengths of each meta-heuristic algorithm in order to rapidly converge a search process to an optimal or suboptimal solution with minimal computational complexity.
Objective functions (Table 1) can be grouped into categories: continuous or discontinuous, linear or polynomial, differentiable or non-differentiable, uni-modal or multi-modal, separable or non-separable. In this paper, some artificial multifarious problems, also referred to as test functions, have been chosen to evaluate the robustness of our proposed algorithm. Artificial problems have the advantage of making it easy to modify and manipulate the test algorithm in diverse scenarios. In addition, objective functions can be sorted or grouped by their modality, basins, valleys, separability and dimensionality [3].
Modality: This represents the number of peaks in the function's topology. When an algorithm comes across such peaks during a search cycle, there is a likelihood that the algorithm will converge at a local optimum (a maximum or minimum, depending on the predefined search criteria).
Basins: Unlike peaks, these are steep declines around a large area. The presence of basins could have a significant impact on the success of an algorithm because there is insufficient information to guide the algorithm towards the global minimum.
Valleys: These occur when narrow domains of minimal difference are surrounded by multiple basins. The floor of a valley could have a significant impact on the success of a search algorithm.
Separability: This measures the difficulty of a function. It is easier for a search algorithm to traverse a separable function than a non-separable function. When the variables of a function are independent of each other, the derivative of the function can be decomposed into sub-functions. This separable feature makes the function easier for an algorithm to solve. On the other hand, if the variables are dependent on each other, the function becomes non-separable, making it more difficult for an algorithm to solve.
Dimensionality: The number of parametric variables defines the dimensionality of the objective function. Every one-step increase in the number of parameters multiplies the size of the search space, so the search space grows exponentially with the number of parameters. Almost every meta-heuristic algorithm has dimensionality as a major bottleneck.
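The effect of separability and dimensionality can be illustrated with a minimal sketch. The sphere and Rosenbrock functions, the grid and the helper names below are illustrative assumptions and are not taken from Table 1. A single coordinate-wise pass finds the grid minimum of the separable function but not that of the non-separable one, and an exhaustive grid grows as points_per_axis raised to the number of dimensions.

```python
def sphere(x):
    """Separable: a sum of independent one-variable terms."""
    return sum(xi ** 2 for xi in x)


def rosenbrock(x):
    """Non-separable: each term couples neighbouring variables."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))


def grid_size(points_per_axis, dimensions):
    """Number of candidates an exhaustive grid search would have to visit."""
    return points_per_axis ** dimensions


def coordinate_wise_minimum(f, grid, dimensions):
    """One pass that minimises each variable in turn, holding the others fixed."""
    x = [grid[0]] * dimensions
    for i in range(dimensions):
        x[i] = min(grid, key=lambda v: f(x[:i] + [v] + x[i + 1:]))
    return x


grid = [-2, -1, 0, 1, 2]
print(coordinate_wise_minimum(sphere, grid, 2))      # reaches the grid minimum
print(coordinate_wise_minimum(rosenbrock, grid, 2))  # misses the minimum at [1, 1]
print(grid_size(10, 2), grid_size(10, 3))            # each extra dimension multiplies the grid
```
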
The complexity of the function can be determined by two major factors, modality and separability, as depicted in Table 1(a).

LITERATURE REVIEW
Evolutionary computation is a combination of genetic programming and genetic algorithms. It incorporates models such as selection and mutation, which form the core of the entire evolutionary computation algorithm.
The genetic algorithm (GA) is a classical optimization meta-heuristic based on the biological model of natural selection. The algorithm involves a clever manipulation of an objective function, a vector or matrix of objective variables, the definition of variable constraints, selection, crossover and mutation (Fig. 1). The total number of iterations (epochs) usually depends on the chosen termination criterion, which could be either a predetermined number of epochs or the convergence of the algorithm. The GA is said to have converged when little or no significant improvement is observed in the population. Two key processes guide the GA towards an optimal solution: selection and crossover [19]. The roulette wheel selection model is a typical and intuitive probabilistic approach which favors the best pairs of objective variables (variables with high fitness scores) for participation in the mating process.
The mating process is implemented by the crossover model. A typical procedure is binary fission of binary-encoded objective variables. A random split point is chosen for both selected strings of binary-coded bits; thereafter, an exchange (crossover) is conducted.
A subtle but very efficient model is the mutation process. Given a predetermined probabilistic mutation rate, the string resulting from the crossover process may undergo an alteration in one or more of its bits. This procedure helps the algorithm's convergence progress towards the global optimum.
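The three operators described above can be sketched as follows. This is a minimal illustration, assuming non-negative fitness scores that are to be maximised, binary chromosomes stored as Python lists, and hypothetical function names; it is not the implementation used in the experiments.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness score."""
    pick = rng.uniform(0.0, sum(fitness))
    running = 0.0
    for individual, score in zip(population, fitness):
        running += score
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def single_point_crossover(a, b, rng):
    """Split both bit strings at one random point and exchange the tails."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


rng = random.Random(42)
population = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1], [1, 0, 1, 0, 1, 0]]
fitness = [1.0, 6.0, 3.0]  # illustrative scores only
parent_a = roulette_select(population, fitness, rng)
parent_b = roulette_select(population, fitness, rng)
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
print(mutate(child_a, 0.05, rng), mutate(child_b, 0.05, rng))
```
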

SIMULATED ANNEALING (SA)
The simulated annealing algorithm as a meta-heuristic optimizer was the 'brain child' of Kirkpatrick et al. in 1983 [20]. The algorithm mimics the formation of a crystal-like lattice via a quick heating and slow cooling process. Like other standard procedures, the algorithm begins by generating a random number of objective variables that are modified via some parametric tuning before being passed to a fitness function, which then outputs a fitness score for each pair or vector of variables.
It is important to note that most of the literature refers to a set of objective variables as a chromosome. In the iterative simulated annealing algorithm, a new set of objective values replaces the old one if its fitness score has improved. However, some of the less fit sets may proceed to the next generation even if their fitness scores worsened, as long as they satisfy an acceptance condition defined in terms of r, a uniform stochastic variable, and T, the temperature; otherwise, they are rejected. The algorithm slowly reduces the value of T until it gets close to zero before terminating. During this cooling process, the algorithm applies a percentage-wise reduction of the d value in an attempt to enhance the fitness scores of the population.
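A minimal sketch of the acceptance step is given below. It assumes the standard Metropolis criterion for minimisation, in which a worse candidate is accepted when r <= exp((f_old - f_new) / T); this form, the geometric cooling schedule, the fixed step size and all names are assumptions made for illustration rather than details confirmed by the text.

```python
import math
import random


def accept(old_cost, new_cost, temperature, rng):
    """Metropolis rule (assumed): always keep improvements, sometimes keep worse moves."""
    if new_cost <= old_cost:
        return True
    r = rng.random()  # uniform stochastic variable
    return r <= math.exp((old_cost - new_cost) / temperature)


def simulated_annealing(f, start, rng, t_start=1.0, t_end=1e-3,
                        cooling=0.95, step=0.5):
    """Minimise f from `start`, reducing T geometrically until it is close to zero."""
    current, current_cost = list(start), f(start)
    best, best_cost = list(current), current_cost
    temperature = t_start
    while temperature > t_end:
        candidate = [x + rng.uniform(-step, step) for x in current]
        candidate_cost = f(candidate)
        if accept(current_cost, candidate_cost, temperature, rng):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= cooling  # slow cooling
    return best, best_cost


def sphere(x):
    return sum(v * v for v in x)


print(simulated_annealing(sphere, [3.0, -2.0], random.Random(0)))
```

At high temperature almost any move is accepted, which favours exploration; as T approaches zero the rule reduces to accepting improvements only.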

PARTICLE SWARM OPTIMIZATION (PSO)
The PSO algorithm, first proposed by Kennedy and Eberhart [21], is a relatively less involved algorithm with few parameters to manipulate. Like the genetic algorithm, the PSO meta-heuristic mimics a biological model; however, it omits the crossover and mutation procedures of the GA.
The PSO algorithm iteratively updates the objective variables via a velocity vector. In brief, the algorithm modifies each set of objective values by updating its velocity vector using the global best and local best solutions. While the global best is indicative of the best fitness score thus far in the iterative process, the local best is indicative of the best fitness score within the current run. The simplicity of this elegant algorithm is reflected in the few quantities its update equation involves: vel = velocity of each particle, P = particle variables, P_local best = best local fitness for each particle, P_global best = best global fitness, γ = learning rate (constant), and r1, r2 = stochastic variables. The ease of implementation is another significant advantage of this algorithm.
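The update can be sketched as below, assuming the commonly used form vel_new = vel + γ·r1·(P_local best − P) + γ·r2·(P_global best − P) followed by P_new = P + vel_new. This form, the interpretation of the local best as each particle's own best position, the clamping of positions and velocities, and all function names are assumptions for illustration.

```python
import random


def update_particle(p, vel, p_local_best, p_global_best, gamma, r1, r2):
    """One velocity and position update for a single particle (assumed standard form)."""
    new_vel = [v + gamma * r1 * (pl - x) + gamma * r2 * (pg - x)
               for x, v, pl, pg in zip(p, vel, p_local_best, p_global_best)]
    new_p = [x + v for x, v in zip(p, new_vel)]
    return new_p, new_vel


def pso(f, bounds, rng, n_particles=20, iterations=100, gamma=1.0):
    """Minimise f inside the box `bounds`, given as a list of (low, high) pairs."""
    particles = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    local_best = [list(p) for p in particles]
    local_cost = [f(p) for p in particles]
    g = min(range(n_particles), key=local_cost.__getitem__)
    global_best, global_cost = list(local_best[g]), local_cost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            p, v = update_particle(particles[i], velocities[i], local_best[i],
                                   global_best, gamma, rng.random(), rng.random())
            # Assumption: clamp positions to the box and velocities to its width.
            p = [min(max(x, lo), hi) for x, (lo, hi) in zip(p, bounds)]
            v = [min(max(s, lo - hi), hi - lo) for s, (lo, hi) in zip(v, bounds)]
            particles[i], velocities[i] = p, v
            cost = f(p)
            if cost < local_cost[i]:
                local_best[i], local_cost[i] = list(p), cost
            if cost < global_cost:
                global_best, global_cost = list(p), cost
    return global_best, global_cost


def sphere(x):
    return sum(v * v for v in x)


print(pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)], random.Random(0)))
```
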

CUCKOO SEARCH ALGORITHM (CS)
The cuckoo search meta-heuristic algorithm was first proposed by Yang and Deb [22]. The algorithm was inspired by the interesting reproductive characteristics of the cuckoo. The cuckoo is an opportunistic bird that lays its eggs among other eggs in a host nest. On its return, the host bird may or may not detect the presence of the cuckoo's egg. If the cuckoo's egg is undetected, all eggs are hatched; otherwise, the nest is completely abandoned or ruined.
The CS meta-heuristic combines the behavior of the host bird and the cuckoo. Intuitively, each nest is a representation of a set of objective variables.
The algorithm first generates a population of N vectors of objective variables, usually referred to in the literature as candidate solutions. Next, the cuckoo's egg is laid in a randomly chosen nest using a typical random-walk 'Lévy flight' approach. The fitness of the nest with the cuckoo's egg is then compared with that of the host nest. The host nest is replaced if it has a worse fitness score than the cuckoo's nest. However, if the host bird notices the presence of the cuckoo's egg, the nest is discarded, usually with a probability p < 0.25, and a new nest is created.
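The steps above can be sketched as follows. The Lévy step is drawn with Mantegna's algorithm, a common choice that is assumed here rather than taken from the text; the step scale alpha, the exponent beta, the decision to always retain the best nest, and all names are likewise illustrative assumptions.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm (assumed)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def random_nest(bounds, rng):
    return [rng.uniform(lo, hi) for lo, hi in bounds]


def cuckoo_search(f, bounds, rng, n_nests=15, iterations=200,
                  p_abandon=0.25, alpha=0.01):
    """Minimise f inside the box `bounds`, given as a list of (low, high) pairs."""
    nests = [random_nest(bounds, rng) for _ in range(n_nests)]
    costs = [f(nest) for nest in nests]
    for _ in range(iterations):
        # Lay a cuckoo egg: Levy flight away from a randomly chosen nest.
        source = nests[rng.randrange(n_nests)]
        egg = [min(max(x + alpha * (hi - lo) * levy_step(rng), lo), hi)
               for x, (lo, hi) in zip(source, bounds)]
        egg_cost = f(egg)
        # Compare the egg with a randomly chosen host nest.
        host = rng.randrange(n_nests)
        if egg_cost < costs[host]:
            nests[host], costs[host] = egg, egg_cost
        # Discovered nests are discarded and rebuilt (the best nest is kept).
        best = min(range(n_nests), key=costs.__getitem__)
        for k in range(n_nests):
            if k != best and rng.random() < p_abandon:
                nests[k] = random_nest(bounds, rng)
                costs[k] = f(nests[k])
    best = min(range(n_nests), key=costs.__getitem__)
    return nests[best], costs[best]


def sphere(x):
    return sum(v * v for v in x)


print(cuckoo_search(sphere, [(-5.0, 5.0), (-5.0, 5.0)], random.Random(0)))
```

The heavy tail of the Lévy distribution produces occasional long jumps, which is one way to read the highly explorative character attributed to this algorithm.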

HYBRID OPTIMIZATION
A typical hybrid algorithm blends the strengths of genetic algorithms with the convergence speed of a local optimizer. Some authors, such as Kazarlis et al. [23], implemented a scaled-down genetic algorithm with a minute population size as a local optimizing strategy. The rationale behind hybridization is often to combine the power of GAs (exploration) with the swiftness at which a local optimizer converges (exploitation). When the GA appears to converge gradually, it is assumed to be at least in the domain of the global solution; thereafter, the local optimizing algorithm takes over the search process in an attempt to obtain the optimal solution. Hybridization could take any of the forms below:
1. Begin with a GA and run it until it decelerates, then seed a local optimizer.
2. Start the GA with some local minima obtained from random starting points in the population.
3. After a predefined number of iterations, seed a local optimizer on a selected elite population using elitism and incorporate the resulting chromosome into the population.
Haupt [2] demonstrated finding the global optimum by combining a continuous GA with the Nelder-Mead downhill simplex algorithm.
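The first form (run a GA until it decelerates, then seed a local optimizer) can be sketched as below. For brevity the local optimizer is a simple compass (pattern) search rather than the Nelder-Mead simplex, and the continuous GA uses tournament selection, blend crossover and Gaussian mutation; these substitutions, the stagnation test and all names are assumptions for illustration.

```python
import random


def compass_search(f, x, step=0.5, tol=1e-6):
    """Local optimizer: probe +/- step on each axis, halving step on failure."""
    x, fx = list(x), f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:i] + [x[i] + delta] + x[i + 1:]
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step /= 2.0
    return x, fx


def ga_then_local(f, bounds, rng, pop_size=20, max_gen=200, patience=10):
    """Continuous GA until `patience` stagnant generations, then compass search."""
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=f)
    best_cost, stall = f(best), 0
    for _ in range(max_gen):
        children = []
        while len(children) < pop_size:
            a = min(rng.sample(pop, 2), key=f)  # tournament selection
            b = min(rng.sample(pop, 2), key=f)
            w = rng.random()                    # blend crossover weight
            children.append([w * ai + (1 - w) * bi + rng.gauss(0.0, 0.1)
                             for ai, bi in zip(a, b)])
        pop = children
        gen_best = min(pop, key=f)
        if f(gen_best) < best_cost - 1e-9:
            best, best_cost, stall = gen_best, f(gen_best), 0
        else:
            stall += 1
        if stall >= patience:  # the GA has decelerated
            break
    return compass_search(f, best)


def sphere(x):
    return sum(v * v for v in x)


print(ga_then_local(sphere, [(-5.0, 5.0), (-5.0, 5.0)], random.Random(1)))
```

The compass search assumes an objective that is bounded below; on an unbounded function it would not terminate.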

HYBRID METHODOLOGY
Preliminary analysis of the reviewed meta-heuristic algorithms revealed the strengths of each algorithm on the 30 benchmark objective functions. The classical GA employs a moderately balanced explorative and exploitative strategy, while the cuckoo search algorithm is highly explorative. This quality gives the cuckoo search algorithm an edge over the classical GA when deployed on complex objective functions. The polygamy-induced GA provides a highly exploitative strategy. These diverse capabilities informed our choice of algorithms for the hybridized model. The proposed hybrid model can leverage any combination of three different meta-heuristics for solving the optimization problem. The three meta-heuristics of choice are the GA, CK, and POLY (i.e. GA with polygamy). Using 3 meta-heuristics gives 3 factorial (3! = 6) possible unique orderings of the meta-heuristics. For example, the ordering [CK, GA, POLY] may be encoded as [CK^2, GA^4, POLY^5], where the superscripts represent the duration of the sub-epoch assigned to each meta-heuristic.
The algorithm (Table 2) begins by sampling a random population of 50 chromosomes with a scalable dimension size of 2 continuous variables (x1, x2). Simultaneously, a sub-population (no larger than the population) of random combinations of meta-heuristics is spawned to evolve, or optimize, the matrix of chromosomes towards an optimal solution to the given objective function. It is important to note that each meta-heuristic randomly obtains a duration. At the end of the first main epoch, a typical GA-style algorithm (without the crossover operator) is used to select and mutate the chromosomes. Omitting the crossover operator reduces the computational complexity and helps the hybrid algorithm evolve faster. Elitism is used to preserve the 5 best-performing chromosomes for the next generation. Fig. 3 shows the best-performing combinations over all the benchmark objective functions used in this research.
The mutation process alters the combination or order of meta-heuristics along with their respective superscript durations.
Using tournament selection as the selection strategy, the fittest chromosomes are passed to the next generation. x* (the global minimum) is updated at the end of each main epoch. For this research, the termination criterion was set at 50 epochs (MAX) or when the optimal solution has been found.
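One way to read the encoding described in this section is as a schedule: an ordered list of (meta-heuristic, duration) pairs that is itself mutated between main epochs. The sketch below illustrates that reading only. The duration range of 1 to 5, the equal split between the two mutation moves, and the idea of passing one-generation stepper callables for CK, GA and POLY are assumptions, and every name is hypothetical.

```python
import itertools
import random

METAHEURISTICS = ("CK", "GA", "POLY")


def all_orderings():
    """The 3! = 6 unique orderings of the three meta-heuristics."""
    return list(itertools.permutations(METAHEURISTICS))


def random_schedule(rng, max_duration=5):
    """A random ordering in which each meta-heuristic gets a random sub-epoch duration."""
    order = list(METAHEURISTICS)
    rng.shuffle(order)
    return [(name, rng.randint(1, max_duration)) for name in order]


def mutate_schedule(schedule, rng, max_duration=5):
    """Alter either the order of the meta-heuristics or one of their durations."""
    mutated = list(schedule)
    if rng.random() < 0.5:
        i, j = rng.sample(range(len(mutated)), 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    else:
        i = rng.randrange(len(mutated))
        mutated[i] = (mutated[i][0], rng.randint(1, max_duration))
    return mutated


def run_schedule(schedule, population, steppers):
    """Apply each meta-heuristic for its duration; steppers maps name -> one-step function."""
    for name, duration in schedule:
        for _ in range(duration):
            population = steppers[name](population)
    return population


rng = random.Random(7)
schedule = random_schedule(rng)
print(len(all_orderings()), schedule, mutate_schedule(schedule, rng))
```

For example, the schedule [("CK", 2), ("GA", 4), ("POLY", 5)] corresponds to [CK^2, GA^4, POLY^5] in the notation above.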

Polygamy as an Exploitative Strategy for GAs
Diversity and exploitation are two paradigms that have contributed immensely to the success of the GA. While diversity attempts to prevent the algorithm from stagnating within a local optimum, exploitation attempts to achieve faster convergence of the algorithm.
One approach to maintaining diversity within a population is to replace existing identical solution strains with newly formed strains, especially in cases where there exist multiple similar strains in the population [24]. Other methods include a mechanism for favoring dissimilar strains while discouraging similar ones, leading to convergence on multiple peaks [25]. Another related approach is to restrict mating among similar strains while encouraging mating among dissimilar ones, thereby increasing diversity [26]. Similarly, a tag stamping mechanism has been used to indicate strains that are eligible to mate as they pass from one generation to another [27].
Polygamy, on the other hand, attempts to explore the power of exploitation. Using this approach, every strain within the population is forced to mate with the fittest strain within each generation. A clever implementation of this strategy helped the algorithm converge faster to the global optimum. Polygamous behavior was allowed only when little improvement was observed within five consecutive generations, thus fine-tuning the population towards the global optimum.
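The polygamy strategy can be sketched as follows, assuming real-valued chromosomes, minimisation, and a blend crossover between each strain and the fittest strain. The crossover type, the decision to carry the fittest strain over unchanged, the improvement tolerance, and all names are assumptions for illustration.

```python
import random


def polygamy_generation(population, f, rng):
    """Mate every strain with the fittest strain of the current generation."""
    fittest = min(population, key=f)
    children = [list(fittest)]  # assumption: the fittest strain survives unchanged
    for mate in population:
        if mate is fittest:
            continue
        w = rng.random()  # blend weight between the fittest strain and its mate
        children.append([w * a + (1 - w) * m for a, m in zip(fittest, mate)])
    return children


def should_use_polygamy(best_history, window=5, tol=1e-6):
    """True when the best cost improved by less than tol over `window` generations."""
    if len(best_history) <= window:
        return False
    return best_history[-window - 1] - best_history[-1] < tol


def sphere(x):
    return sum(v * v for v in x)


rng = random.Random(0)
population = [[3.0, 1.0], [0.5, 0.5], [-2.0, 4.0], [1.0, -1.0]]
print(polygamy_generation(population, sphere, rng))
print(should_use_polygamy([5.0, 4.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]))
```

Because every child lies between its mate and the fittest strain, the population contracts around the current best solution, which is the exploitative behavior described above.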

EXPERIMENTS
On the basis of ANOVA, we reject the null hypothesis (H0) on the premise of a significant difference between the mean optimal values of our tested meta-heuristics. Consequently, we proceed with our analysis to discover the pairs of meta-heuristics that differ significantly. For this purpose, we use Scheffé's test. The test states that "the critical difference (CD) for each pair of metaheuristic can be obtained using the equation below:"
$$CD_{i,j} = \sqrt{MS_{within}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)(k - 1)\,F_{\alpha}} \tag{7}$$
where $F_{\alpha}$ = critical value at 5% significance, $k$ = number of meta-heuristics, $n_i, n_j$ = number of test benchmark functions, and $MS_{within}$ = ANOVA mean square within samples.
In this paper, we use 30 benchmark functions to test the success of the CK, PSO, SA, POLY (a proposed exploitative strategy), and HYB (our proposed hybrid framework). Table 3 shows the best optimal values for each meta-heuristic on the benchmark functions compared against the standard expected optimal solutions f(x*). A fixed dimension size of 2 was implemented for all benchmark functions. The population size was preset at 50, while the maximum number of function evaluations for each iteration was set to 500. For consistency, global minimum values below 10^-15 were considered as zero (0) in all experiments.
In this paper, the global optimum (minimum, X_best) of each benchmark function was evaluated 20 times, using a random initial population at every instance. The mean performance of the optimal solutions and the durations (number of function calls) during each phase of the experiments were recorded for further analysis.
Also, two ANOVA tests were conducted for multiple comparisons of the performance of each meta-heuristic algorithm.
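The statistical procedure can be sketched as below: a one-way ANOVA yields the F statistic and the mean square within samples, and Scheffé's critical difference is then computed for a pair of groups. The sketch uses the standard form of Scheffé's test, CD = sqrt((k − 1) · F_crit · MS_within · (1/n_i + 1/n_j)), and expects the critical F value to be supplied by the caller from a table; the sample data and all names are illustrative assumptions and are not results from this study.

```python
import math


def one_way_anova(groups):
    """Return (F statistic, mean square within samples) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, ms_within


def scheffe_critical_difference(ms_within, n_i, n_j, k, f_critical):
    """Smallest difference in means that is significant for the pair (i, j)."""
    return math.sqrt((k - 1) * f_critical * ms_within * (1.0 / n_i + 1.0 / n_j))


# Illustrative data only: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, ms_within = one_way_anova(groups)
# f_critical must come from an F table for (k - 1, N - k) degrees of freedom.
cd = scheffe_critical_difference(ms_within, 3, 3, k=3, f_critical=5.14)
print(f_stat, ms_within, cd)
```

Two meta-heuristics would be declared significantly different when the absolute difference between their means exceeds the critical difference for that pair.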

HYPOTHESIS 1 (H1)
The null hypothesis for the first ANOVA test is stated as follows: "There is no difference in the speed of convergence to the global optima among test meta-heuristic algorithms" (Table 4).

HYPOTHESIS 2 (H2)
The null hypothesis for the second ANOVA test is stated as follows: "There is no difference in the mean global solutions between the tested metaheuristics" (Table 5). Both hypotheses were tested at 95% confidence (α = 0.05).
Considering the relative complexity of the hybrid algorithm, it was allowed to run for 1/10th of the maximum allowed 500 epochs. The rationale was to give all algorithms a level playing ground, as each one ran for approximately equal CPU time.

HYPOTHESIS 1 (H1)
In this paper, the success of the proposed hybrid model, CK, GA, SA, and a modified GA (polygamy induced) has been statistically compared. When compared over hypothesis 1, which is based on the mean number of function calls (speed of convergence), the proposed hybrid algorithm displayed superiority over all other models. Next to the hybridized model was the GA, followed by the CK search algorithm. It was observed that the CK model outperformed the GA (with a statistically significant difference) in just 2 out of the 30 benchmark functions (Table 1(b)), the F9 'Brent Function' and the F28 'Adjiman Function', which are relatively complex functions in terms of their differentiability, separability and modality. The performance of the PSO is attributable to its stability problem and to the fact that the number of epochs permitted for evaluation in most PSO implementations is usually high (approx. 2 million) [28,29,30] for each benchmark function, as against the 500 used in this research. There exists no significant difference in the speed of convergence (in cases where convergence occurred) between the SA, the PSO and the polygamy-induced GA. However, the polygamy-induced GA performed better than the CK in two of the benchmark functions (F4 'Bird Function' and F5 'Bohachevsky 1 Function').

HYPOTHESIS 2 (H2)
The success of the proposed hybrid model can also be seen from the mean global optimal value, as shown by the ANOVA analysis (Table 5). In addition to the convergence speed advantage, the proposed hybrid model performs significantly better than the other meta-heuristics, with an impressive advantage over the traditional GA in F8 'Branin RCOS-2 Function', F9 'Brent Function', F22 'Rosenbrock Modified Function', F28 'Adjiman Function' and F30 'Damavandi Function'. These functions are recognized for their complexity in terms of their differentiability, separability and modality. We can deduce from the statistical results that the speed of convergence is a major advantage of our proposed model, while its ability to consistently converge at the global optimum within a minimal number of epochs is an added advantage. Figs. 4-8 show a box plot representation of the mean difference between the tested algorithms in the frequency with which the optimal solution is found within the predefined number of epochs.

CONCLUSION
This paper empirically compares the success of a hybrid algorithm with the CK, SA, GA, PSO and a modified GA in solving optimization problems. The framework systematically combines the strengths of multiple meta-heuristics, leveraging the traditional GA mutation strategy. Empirical analysis revealed faster convergence to the global optimum with minimal computational complexity.
The algorithm provides a platform for future research work on its scalability (maximum number of meta-heuristics) and on its efficiency when combined with meta-heuristics other than those used in this research, such as ABC (Artificial Bee Colony), DE (Differential Evolution) and Ant Colony. In addition, the success of such a framework could be analyzed on multi-objective optimization problems.