CLOUDINESS IMAGES MULTILEVEL SEGMENTATION BY PIECEWISE LINEAR APPROXIMATION OF CUMULATIVE HISTOGRAM

The Ramer-Douglas-Peucker algorithm for piecewise linear approximation is used for multilevel image segmentation. The cumulative histogram is chosen as the function to be approximated. The algorithm determines threshold values for both continuous and discrete images. It is used to separate cloudiness from the background and to calculate cloudiness intensity. The points found for the approximating function are used to change pixel intensities according to the proposed formulas. The efficiency of the algorithm is compared with that of algorithms based on ordinary and cumulative histograms. By controlling the number of points of the piecewise linear approximating function, the necessary segmentation accuracy can be achieved. The algorithm complexity is linear in the number of image pixels and in the number of intensity steps. The developed algorithm is applied to satellite map images to separate clouds of different intensity. The extracted clouds are then used to classify regions by cloudiness with a developed clustering algorithm. Testing and experimental results are presented. Copyright © Research Institute for Intelligent Computer Systems, 2020. All rights reserved.


INTRODUCTION
Determining image features requires fast image segmentation algorithms. There is now a great variety of publications on image segmentation methods. They can generally be divided into two classes: those based on finding an intensity threshold and those that divide the image into regions having certain features. Methods of the first class determine intensity thresholds and are based on histograms. Among them, algorithms that use the minimal intensity [1], convexity [2], moments [3], entropy [4], minimal errors [5,6], etc. can be distinguished. A typical example of the second class is graph-based image segmentation [7]. In [8], an improved Otsu method that constrains the search interval of gray levels is proposed. In [9], the required threshold is determined in the receiver operating characteristic space.
Multilevel thresholding is of great importance. However, in many algorithms its computational complexity increases exponentially with the number of thresholds, so authors try to minimize the algorithm complexity. In [10], a criterion for maximizing a modified between-class variance is proposed, and a recursive algorithm is designed to find the optimal threshold efficiently using values stored in a look-up table. The publication [11] proposes a new method based on the quantum particle swarm optimization (QPSO) algorithm. In [12], starting from the extreme pixel values at both ends of the histogram plot, the algorithm is applied recursively on sub-ranges computed in the previous step, so as to find a threshold level and a new sub-range for the next step, until no significant improvement in image quality can be achieved.
The paper [13] proposes multilevel image thresholding for image segmentation using several recently presented P-metaheuristic algorithms, including the whale optimization algorithm, the grey wolf optimizer (GWO), the teaching-learning-based optimization algorithm and some others.
The paper [14] proposes a method that uses cluster organization derived from the image histogram. The proposed similarity measure is based on the inter-class variance of the clusters to be merged and the intra-class variance of the new merged cluster.
A new method of multilevel thresholding for image segmentation using the grey wolf optimizer is proposed in [15]. This metaheuristic algorithm is applied to the multilevel thresholding problem using Kapur's entropy and Otsu's between-class variance functions.
A drawback of the abovementioned and some other algorithms is that they yield different thresholds for similar images, even among algorithms of the same class. Most algorithms are fairly cumbersome, especially those using graph models or statistical calculations. Modern CBIR systems process millions of images in real time and therefore need extremely fast and sufficiently accurate tools for determining image features. Segmentation algorithms are an important part of these tools.
In this paper we present a very fast and simple algorithm that has linear algorithmic complexity and a clear physical interpretation. It requires calculating the cumulative histogram of the image and finding the extreme coordinate in the interval 0-255 once for each required segmentation level.
In contrast to the work [16], where cloudiness was considered as a single object of analysis, in this paper cloudiness is divided into three objects: low, middle and high clouds.
The paper is based on research supported by a grant of the State Fund for Fundamental Research (project No 33651).

PIECEWISE-LINEAR APPROXIMATION
We want to approximate a function $f(x)$ by a piecewise-linear function $g(x)$ defined on the interval [...] (1). A convex function $f(x)$ can be approximated by a piecewise-linear function $g(x)$ that has $N$ segments and is defined on the interval [...], where [...] equals 1 when [...] and 0 otherwise. For the piecewise-linear approximation of the cumulative histogram function, the Ramer-Douglas-Peucker (RDP) algorithm [17,18] was taken as the basis.
The essence of the algorithm is to approximate an initial curve $f_0$, represented by a set of points $(x_i, y_i)$ indexed from $i = 1$, [...]. The algorithm defines the maximum distance (tolerance) between the original and the approximating function: [...]. This distance must be less than the assigned maximum allowable tolerance, which determines the desired accuracy of the approximation.
The initial curve is an ordered set of points or lines, and the tolerance is $\varepsilon > 0$. The process of the curve approximation is shown in Fig. 1. The algorithm first finds the point that is farthest from the line segment joining the first and the last points. If this point lies at a distance less than $\varepsilon$, then all the points that have not yet been marked for storing may be removed from the set, and the resulting line smooths the curve with an error not exceeding $\varepsilon$.
If the distance is greater than $\varepsilon$, the algorithm recursively calls itself on the set of points from the first to the current one and from the current one to the endpoint (which means that the current point is marked for storing). After all the recursive calls, the output polyline is built only from the points that were marked for storing.
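The recursive procedure described above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' implementation: the names `rdp` and `perpendicular_distance`, the parameter `eps` (the tolerance) and the sample curve are chosen here only for the example.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord: fall back to the distance to the single point a.
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def rdp(points, eps):
    """Return the points kept by the Ramer-Douglas-Peucker simplification."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    best_i, best_d = 0, -1.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > best_d:
            best_i, best_d = i, d
    if best_d <= eps:
        # Every interior point is within the tolerance: keep only the ends.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = rdp(points[:best_i + 1], eps)
    right = rdp(points[best_i:], eps)
    return left[:-1] + right  # drop the duplicated split point


# Hypothetical sample curve: flat up to x = 3, then rising linearly.
curve = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 1), (5, 2), (6, 3)]
fine = rdp(curve, eps=0.5)    # keeps the corner at (3, 0)
coarse = rdp(curve, eps=2.0)  # keeps only the two endpoints
```

With the smaller tolerance the corner point survives, and with the larger one only the endpoints remain, which illustrates how the tolerance controls the number of points of the approximating function.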
The results of approximating the cumulative histogram function with the RDP algorithm for the test image, for some value of the maximum allowable deviation (tolerance), are presented in Fig. 2. [...] $I(m, n)$ and [...] are the original and transformed images, respectively, of size $M \times N$. After the present algorithm has been applied recursively a few times, the PSNR of the segmented image is found to saturate. This property can be used to obtain the appropriate number of thresholds and to select the best quality of approximation from formulas (13)-(16).
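To make the PSNR criterion concrete, the sketch below uses the commonly used definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$ for two images of the same size. It is an assumption that this is the form used in the paper; the function name `psnr`, the parameter `max_value` and the representation of images as nested lists (chosen to keep the example free of dependencies) are illustrative.

```python
import math


def psnr(original, transformed, max_value=255):
    """Peak signal-to-noise ratio, in dB, of two equally sized gray images."""
    rows, cols = len(original), len(original[0])
    squared_error = 0
    for r in range(rows):
        for c in range(cols):
            diff = original[r][c] - transformed[r][c]
            squared_error += diff * diff
    mse = squared_error / (rows * cols)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


black = [[0, 0], [0, 0]]
white = [[255, 255], [255, 255]]
worst_case = psnr(black, white)  # maximal possible error for 8-bit images
identical = psnr(black, black)
```

A stopping rule of the kind described above could compare the PSNR obtained with k and k + 1 thresholds and stop when the gain becomes negligible.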
The computational complexity of the approximation algorithm is determined by the number of thresholds.
When this number is $k$, the complexity of the approximation algorithm is linear in the number of pixels and equals $O(MN + 256k)$.

MULTILEVEL THRESHOLDS FOR IMAGE SEGMENTATION
For the image histogram it is easy to calculate the cumulative histogram [...], where $V$ is the overall number of image pixels, $V(i)$ are the intensity frequencies, [...].
To confirm or refute the robustness of the proposed approach, we use two "model" algorithms: the well-known Otsu algorithm [1] and our own approach [19]. We consider two types of images: those with discrete intensity steps and those of a "continuous" type with an intensity step of 1. In Fig. 3a, an example of the "pyramid" image is shown. In Fig. 3b, we see its cumulative histogram and the two-line approximating function obtained by the RDP algorithm. The middle point indicates the main threshold value $h = 51$ for the "pyramid" image. The Otsu algorithm gives the same value. The lower and the upper parts of the segmented image are shown in Fig. 4. [...] (Table 1). They do not contain the points obtained from the Otsu algorithm. The tolerances and the number of points in the piecewise linear approximating function are given in Table 1. The values of the approximating function, all the threshold values and the order in which they were determined are given in Table 2. For the "pyramid" example we have 5 levels of pixels. If we want to continue dividing pixels by assigning additional colors, we should divide the pixels of any of the given levels.
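The main threshold of a two-line approximation is the point of the cumulative histogram that lies farthest from the chord joining its endpoints, which is the first split made by the RDP algorithm. The sketch below illustrates this on a tiny synthetic image. It is an illustration under assumptions: the function names, the normalization of the cumulative histogram by the number of pixels, the choice of (0, F(0)) as the first curve point, and the convention that a pixel equal to the threshold belongs to the lower part are not taken from the paper.

```python
def cumulative_histogram(image, levels=256):
    """Normalized cumulative histogram of a gray image given as nested lists."""
    freq = [0] * levels
    for row in image:
        for value in row:
            freq[value] += 1
    total = sum(freq)
    cumulative, running = [], 0
    for count in freq:
        running += count
        cumulative.append(running / total)
    return cumulative


def first_threshold(cumulative):
    """Intensity whose histogram point is farthest from the end-to-end chord."""
    n = len(cumulative) - 1
    dx, dy = n, cumulative[n] - cumulative[0]
    best_s, best_d = None, -1.0
    for s in range(1, n):
        # Proportional to the perpendicular distance; the common divisor
        # (the chord length) does not change which point is farthest.
        d = abs(dy * s - dx * (cumulative[s] - cumulative[0]))
        if d > best_d:
            best_s, best_d = s, d
    return best_s


def split_by_threshold(image, t):
    """Boolean masks of the lower (<= t) and upper (> t) parts of the image."""
    lower = [[v <= t for v in row] for row in image]
    upper = [[v > t for v in row] for row in image]
    return lower, upper


# Hypothetical 8-level image with six dark pixels and two bright ones.
image = [[0, 0, 0, 0], [0, 0, 7, 7]]
cum = cumulative_histogram(image, levels=8)
t = first_threshold(cum)
lower, upper = split_by_threshold(image, t)
```

Applying the same search again inside each of the two resulting intensity ranges yields further thresholds, which is how the recursive approximation produces a multilevel segmentation.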
For the next test we have chosen the widely known image "12". The image and its cumulative histogram are shown in Fig. 6. In Fig. 6b, the cumulative histogram is approximated by the RDP algorithm. The first point obtained indicates the threshold value $h_t = 77$. The Otsu algorithm gives $h_t = 107$. The approximating function has 7 points. The tolerances and the number of points in the piecewise linear approximating function are given in Table 3. The values of the approximating function for the seven points, their threshold values and the order in which they were determined are given in Table 4. We can observe these points in Fig. 6. The approximating algorithm gives at most 40 points; their placement can be seen in Fig. 7. One of the points found ($h_t = 12$) is difficult to determine by the known algorithms. The lower segment for this point is shown in Fig. 8.
For a hypothetical image, a normalized cumulative histogram is constructed according to the following formula: [...], where $V_{FH}(s)$ is the number of pixels (accumulated frequency) of the hypothetical image within the intensity interval from 1 to $s$.
We construct a function $D(s)$ of the difference between the cumulative histograms of the real and hypothetical images: [...]. In Fig. 9, the plot of the function $D(s)$ for the "12" image is given. The function $D(s)$ indicates the intervals in which the frequencies of image pixels are larger or smaller than the corresponding values of the hypothetical image, and where they are increasing or decreasing. The function is characterized by special points: extrema, inflection points or breaks.
In particular, in Fig. 9 we can see that the function $D(s)$ has three extrema. We consider that the coordinates of the extrema indicate possible thresholds for image segmentation. In Fig. 9, the three extremum points have the following threshold values: $h = 12$, $h = 38$ and $h = 77$. They coincide with the first, second and fourth points from Table 4. As we can see, there are no other extremum points on the plot of $D(s)$, so further threshold values cannot be discovered in this way. The approximation algorithm, however, allows us to find all possible threshold values.
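The search for extrema of the difference function can be sketched as follows. The text here does not specify the hypothetical image, so the sketch assumes, purely for illustration, that it has a uniform histogram, which makes its normalized cumulative histogram a straight line. The names `difference_function` and `local_extrema` and the sample histogram are likewise hypothetical.

```python
def difference_function(freq):
    """D(s): real minus hypothetical (uniform) normalized cumulative histogram."""
    levels = len(freq)
    total = sum(freq)
    diffs, running = [], 0
    for s, count in enumerate(freq):
        running += count
        # Integer numerator of running / total - (s + 1) / levels,
        # computed exactly to avoid spurious sign changes from rounding.
        numerator = running * levels - (s + 1) * total
        diffs.append(numerator / (total * levels))
    return diffs


def local_extrema(values):
    """Indices at which the sequence switches between rising and falling."""
    extrema = []
    direction = 0  # +1 rising, -1 falling, 0 not yet known
    for i in range(1, len(values)):
        step = values[i] - values[i - 1]
        if step == 0:
            continue  # plateaus do not change the direction
        new_direction = 1 if step > 0 else -1
        if direction != 0 and new_direction != direction:
            extrema.append(i - 1)
        direction = new_direction
    return extrema


# Hypothetical 8-level histogram dominated by dark pixels.
freq = [4, 2, 0, 0, 0, 0, 1, 1]
d = difference_function(freq)
candidates = local_extrema(d)  # candidate threshold intensities
```

Only intensities at which D(s) actually turns are reported, which mirrors the limitation noted above: thresholds that do not correspond to an extremum of D(s) are not found this way.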

MULTILEVEL SATELLITE MAP IMAGE SEGMENTATION
Satellite map images are widely used for cloudiness monitoring in different regions of the Earth's surface. Popular free services [20][21][22] provide map images for different parts of the world. In this way, satellite maps of Ukraine with cloudiness are obtained and stored. For every interval of cloudiness, we need to calculate the intensity, mass, direction of movement and other cloud properties. To get access to the corresponding cloud areas, the developed approximation algorithm was applied to the original satellite image in Fig. 10.
In Fig. 11, we see how the cumulative histogram is approximated by the piecewise linear function with 12 intervals. The values of the approximating function for six points, their threshold values and the order in which they were determined are given in Table 5. We can observe these points in Fig. 13. [...] In Fig. 12, these three types of cloudiness are shown. To exclude the influence of other colors, they are marked in black.
The black color allows us to perform clustering of the map and to evaluate the intensity of cloudiness in different regions of the territory. In Fig. 12, an example of the clustered map and the corresponding dendrogram are given. Computational time and PSNR, which is used to determine the quality of the segmented image, are chosen as the performance metrics for checking the effectiveness of the method. PSNR values for the different ways of changing pixel intensities within the determined intensity intervals are given in Table 6. At the same time, approximation converts images of the continuous type into discrete ones, so the approximated images are compressed. Some examples of compression ratios for different image formats are given in Table 7.
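One possible way of changing pixel intensities within the determined intervals is to replace every pixel by the mean intensity of its interval. The sketch below shows only this variant and is an assumption made for illustration: the paper's own replacement formulas are not reproduced here, and the function name `quantize` and the sample values are hypothetical.

```python
import bisect


def quantize(image, thresholds):
    """Replace each pixel by the rounded mean intensity of its interval.

    Interval k holds the pixels v with thresholds[k - 1] < v <= thresholds[k].
    """
    cuts = sorted(thresholds)
    sums = [0] * (len(cuts) + 1)
    counts = [0] * (len(cuts) + 1)
    for row in image:
        for v in row:
            k = bisect.bisect_left(cuts, v)
            sums[k] += v
            counts[k] += 1
    # Empty intervals are never looked up, so their value is irrelevant.
    means = [round(s / c) if c else 0 for s, c in zip(sums, counts)]
    return [[means[bisect.bisect_left(cuts, v)] for v in row] for row in image]


# Hypothetical one-row image split by a single threshold.
row_image = [[10, 20, 200, 220]]
quantized = quantize(row_image, thresholds=[77])
```

After such a replacement the image contains at most one intensity value per interval, which is why a continuous-type image becomes a discrete one and compresses better.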

CONCLUSION
Thus, software implementing the method of multilevel thresholds for image segmentation based on piecewise linear approximation was developed. The method is based on the original RDP algorithm and reduces the algorithmic complexity because the cumulative histogram is calculated only once. The developed algorithm was applied to segment cloudiness on satellite map images. In future investigations the approximation algorithm is planned to be used for feature extraction from face images and from images with defects.