Non-destructive Photosynthetic Pigments Prediction using Multispectral Imagery and 2D-CNN

Rapid assessment of plant photosynthetic pigment content is essential in precision farming, because pigment content reflects the status of a plant throughout its growth stages. We have developed a new 2-Dimensional Convolutional Neural Network (2D-CNN) architecture, P3MNet, which simultaneously predicts the content of the three main photosynthetic pigments of a plant leaf (chlorophyll, carotenoid, and anthocyanin) non-destructively and in real time from multispectral images. Under visible-light illumination, the reflectance of an individual plant leaf at 10 wavelengths (350, 400, 450, 500, 550, 600, 650, 700, 750, and 800 nm) was captured as 10 digital images, which were then used as the 2D-CNN input. Our results suggest that P3MNet outperformed AlexNet and VGG-9. After training with the Adadelta optimization method for 1000 epochs, P3MNet achieved an average MAE (Mean Absolute Error) of 0.000778 ± 0.0001 for training and 0.000817 ± 0.0007 for validation (data range 0-1).


I. INTRODUCTION
PHOTOSYNTHETIC pigments play essential roles in plant growth. They harvest solar light energy and use it for photosynthesis. Therefore, changes in their contents can indicate various conditions such as nutritional status, senescence, responses to environmental changes [1], and pest and disease attacks [2]. Hence, the development of rapid methods for analyzing photosynthetic pigment content is an important topic in agricultural research.
In recent years, advances in computer technology and electronics have led agricultural research toward non-destructive pigment analysis methods [3]. Non-destructive methods allow pigment contents to be quantified in situ and in real time, so rapid analysis is easy to perform. Gitelson et al. [4][5][6] developed several non-destructive methods based on spectral reflectance from spectrophotometer-based measurements to predict the content of the three main photosynthetic pigments (chlorophyll, carotenoid, and anthocyanin) in plant leaves. Although the predictions are reported to be quite good, spectrometer-based measurements are generally costly. Other researchers therefore developed image-based measurement as an alternative [7]. These methods have proven to be efficient, accurate, and easy to implement. In general, the RGB format is used [8][9][10]. However, the RGB format is not sufficient for simultaneous measurement of several pigments, because each type of pigment has a unique light reflectance behavior at certain wavelengths. RGB images are mostly produced by camera sensors filtered to capture reflections only in the red (620-750 nm), green (495-570 nm), and blue (450-495 nm) ranges. Using the RGB format as the raw data therefore discards much important information available at other wavelengths. One example is the reflectance in the near-infrared range (750-800 nm), which carries information about the diversity of leaf structure and thickness and serves as a correction when calculating pigment content in leaves containing both chlorophyll and anthocyanin [11]. Hence, in this study, we propose the use of multispectral digital images consisting of 10 channels.
With reference to the positive results of our preliminary study, which used different species and no wet chemical procedure [12], we hypothesized that the prediction of photosynthetic pigment content in plant leaves from 10-channel images would approach the results given by spectrophotometer-based measurements and would be better than that from 3-channel (RGB) images. In 10-channel images, the light reflectance that represents the content of the photosynthetic pigments is captured more thoroughly, which guides the system toward better predictions.
To provide real-time analysis, an image-based measurement needs to be paired with a machine learning method that performs either a classification [13] or a regression [14] task on the image. In this study, we used the 2D-Convolutional Neural Network (2D-CNN) to perform the regression task in our pigment prediction system. This method is a variant of the Artificial Neural Network (ANN) that is now widely used to handle input in the form of digital images. By applying convolution, a 2D-CNN uses its parameters more efficiently, so that learning remains fast even with high-dimensional inputs [15], and it reduces the dependence on human knowledge in determining the main features of the input images [16]. In a conventional ANN, manually determining the main input features is the most critical task for ensuring model performance. In a 2D-CNN, by contrast, the features that represent the input image are generated automatically and adjusted continuously throughout the learning process, without human bias. Moreover, a 2D-CNN can extract information simultaneously from a digital image that consists of many color components (a multispectral image). In pigment analysis, each color component of a multispectral image carries unique information. We therefore need a tool, such as convolution, that can extract all color components simultaneously and automatically. This makes the 2D-CNN superior to a conventional ANN, especially in handling multispectral digital images.

A. THE SAMPLE
Four species of Indonesian herbal plants were used in the experiment: Syzygium oleana, Piper betle, Jasminum sp., and Graptophyllum pictum. The species were chosen carefully to represent the diversity of chlorophyll, carotenoid, and anthocyanin content [17]. These pigments have a major role in photosynthesis and, compared to other pigments, are more easily observed visually. S. oleana contained high concentrations of carotenoid and anthocyanin, G. pictum contained high concentrations of anthocyanin and chlorophyll, and Jasminum sp. contained high concentrations of chlorophyll and carotenoid. Whereas certain pigments predominate in these three species, no pigment predominates in P. betle. Hence, P. betle complemented the other three species by supplying pigment compositions that they could not provide. Leaves were the plant part sampled in this study. Diversity in leaf color, age, and position from the terminal bud was taken into account during sample preparation. A total of 212 fresh leaves were collected from several regions in Malang, East Java, Indonesia.

B. DATA ACQUISITION
Two data acquisition processes were applied to each leaf on the same day. The first was the acquisition of its multispectral image and the second was the measurement of its photosynthetic pigment content by wet chemical methods.

B.1. MULTISPECTRAL IMAGE ACQUISITION
Fig. 1 depicts the arrangement of the devices. The leaf sample was placed in a tray with special clips. A bandpass filter with 10 channels (350, 400, 450, 500, 550, 600, 650, 700, 750, and 800 nm) was placed between the CCD camera and the leaf sample. A tungsten halogen lamp was used as the light source since it provides a wide range of electromagnetic wavelengths, from 360 up to 2400 nm.
An image of the leaf was then taken with a CCD camera (PCO pixelfly, 14 bit). A reflectance spectrophotometer (Ocean Optics USB4000) was also used to capture the reflectance spectrum of each leaf sample. The spectra of the samples were used for calibration and validation. Fig. 2 depicts an example of the multispectral images taken from a leaf sample. Dark images indicate low reflectance of the leaf sample at the corresponding wavelength, and bright images indicate high reflectance. The images labeled 350, 400, 450, and 500 nm appear quite dark, which shows that the leaf sample reflects very little or no light in these ranges.
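The ten single-band images described above have to be combined into one multi-channel array before they can be fed to a 2D-CNN. The following is a minimal sketch of that step, not the authors' code: the function name `stack_bands`, the 64x64 frame size, the synthetic data, and the scaling by the 14-bit maximum are all assumptions made for illustration.

```python
import numpy as np

# Centre wavelengths (nm) of the ten bandpass channels listed in the text.
WAVELENGTHS_NM = [350, 400, 450, 500, 550, 600, 650, 700, 750, 800]

def stack_bands(band_images, max_value=2**14 - 1):
    """Stack single-band images into one (H, W, 10) float array in [0, 1].

    Dividing by the 14-bit maximum is an assumed preprocessing step.
    """
    if len(band_images) != len(WAVELENGTHS_NM):
        raise ValueError("expected one image per wavelength")
    cube = np.stack(band_images, axis=-1).astype(np.float32)
    return cube / max_value

# Synthetic stand-ins for real captures (hypothetical 64x64 frames).
rng = np.random.default_rng(0)
frames = [rng.integers(0, 2**14, size=(64, 64)) for _ in WAVELENGTHS_NM]
cube = stack_bands(frames)
print(cube.shape)
```

Keeping the channel axis last matches the default image layout of Keras with the TensorFlow backend, which is the toolchain named in the experimental setup.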

B.2. PIGMENT CONTENTS MEASUREMENT
Each leaf was divided into 2 parts, one for the chlorophyll and carotenoid measurement and the other for the anthocyanin measurement. Each part was ground into small pieces with a mortar and pestle, and 0.05 g of the ground material was then transferred into a separate tube. The pigments were extracted by adding CaCO3 and sodium ascorbate powder with 1.5 mL of solvent to the tubes. The solvent for chlorophyll and carotenoid was 100% acetone, whereas a mixture of methanol, concentrated hydrochloric acid, and distilled water was used for anthocyanin. The mixture was homogenized for 1 minute using a vortex mixer, and the tubes were then immersed in ice for 1 minute. The homogenization process was repeated 3 times before all tubes were centrifuged at 14000 rpm for two minutes and cooled again in ice. The absorbance was measured using a double-beam UV/VIS scanning spectrophotometer (Shimadzu Corp., Kyoto, Japan), and the absorbance values were converted into pigment contents (µg/g) using the formulas of Lichtenthaler [18] and of Sims and Gamon [19]. In the end, 3 values were acquired for each leaf: the chlorophyll content, the carotenoid content, and the anthocyanin content.

C. DESIGN OF THE 2D-CNN ARCHITECTURES
We developed 3 original 2D-CNN architectures and compared them with 2 well-known architectures, AlexNet and VGG-9. Our original architecture is named P3MNet, which stands for Plant Pigment Prediction Multispectral Network. The input was a 10-channel digital image of a plant leaf, and the output was the predicted chlorophyll, carotenoid, and anthocyanin contents. Table 1 shows the details of the five architectures used in the experiment. The five architectures are arranged in order of network complexity: those in the left-hand columns are less complex than those in the right-hand columns. The architecture with the highest complexity is VGG-9 and the one with the lowest is P3MNet_1. Because AlexNet and VGG-9 are generally used for classification tasks, we modified them for regression prior to the experiment.
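One simple way to compare the complexity of architectures such as those described above is to count their trainable parameters. The sketch below uses the standard counting formulas for convolutional and fully connected layers; the layer sizes are hypothetical and are not taken from Table 1.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a 2D convolution layer (weights plus one bias per filter)."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return (in_units + 1) * out_units

# Hypothetical small network on a 10-channel input with 3 regression outputs.
total = (
    conv2d_params(3, 3, 10, 16)      # first convolution sees the 10 spectral channels
    + conv2d_params(3, 3, 16, 32)    # second convolution
    + dense_params(32 * 8 * 8, 64)   # flattened 8x8x32 feature map (assumed size)
    + dense_params(64, 3)            # one output per pigment
)
print(total)              # 137427
print(total * 4 / 1e6)    # approximate size in MB if stored as 32-bit floats
```

Note that only the first convolution depends on the number of input channels, so moving from 3-channel to 10-channel images adds relatively few parameters; in this sketch the first fully connected layer dominates the count.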
We used a total of 212 10-channel multispectral images for the training process. Each architecture was trained using 7 gradient descent-based optimization methods: Stochastic Gradient Descent (SGD) [20], Adaptive Gradient (Adagrad) [21], Adaptive Delta (Adadelta) [22], Root Mean Square Propagation (RMSProp) [23], Adaptive Moment Estimation (Adam) [24], Adamax, a variant of Adam based on the infinity norm [24], and Nesterov-accelerated Adaptive Moment Estimation (Nadam) [25]. We carefully selected the optimization method most suitable for each architecture. Thus, the results reported in this article are the most accurate among the optimization methods tested. We also tried a Hessian-based optimization method. However, its training time was almost twice that of the gradient-based methods. This is due to the heavy computation needed to build the Hessian matrix of the loss; in addition, without the concept of a batch, the training process must receive the whole training set at once in each epoch. Moreover, we found that the MAE of the Hessian-based method was roughly equal to that of the gradient-based methods. Therefore, we did not continue the experiment with the Hessian-based optimization method.
The ReLU activation function was used in the convolutional and fully connected layers. In the output nodes we used LeakyReLU [26] to avoid dead ReLU units. The LeakyReLU activation function is:
$$h^{(i)} = \begin{cases} w^{(i)\top} x, & w^{(i)\top} x > 0 \\ \alpha \, w^{(i)\top} x, & \text{otherwise} \end{cases}$$
where $w^{(i)}$ is the weight vector for the $i$-th hidden unit, $x$ is the input, and $\alpha$ is a small positive slope.
Because the pigment contents differ in magnitude (chlorophyll is in the hundreds, whereas anthocyanin and carotenoid are in the tens), we normalized the target data to the 0-1 range. The normalization was done using
$$y' = \frac{y - y_{\min}}{y_{\max} - y_{\min}}$$
where $y'$ is the normalized data, $y$ is the raw data, and $y_{\min}$ and $y_{\max}$ are the minimum and maximum of the raw data. Normalization ensures that each neuron (kernel) in the 2D-CNN has an equal chance to learn the variation of each pigment content. In addition, normalization speeds up the calculation of the parameter updates.
Mean Absolute Error (MAE) was used as the performance indicator:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $y_i$ is the actual pigment content, $\hat{y}_i$ is the predicted pigment content, and $n$ is the sample size. The task of the optimization method is to minimize the MAE.
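The activation, target normalization, and error metric described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the slope `alpha=0.01`, the per-pigment min-max scaling, and the sample values are assumptions, since the text only states that targets were mapped to the 0-1 range.

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    """LeakyReLU: identity for positive inputs, small slope alpha otherwise (alpha assumed)."""
    return np.where(z > 0, z, alpha * z)

def min_max_normalize(y):
    """Scale each column (pigment) of y to the 0-1 range (assumed min-max scaling)."""
    y = np.asarray(y, dtype=float)
    y_min, y_max = y.min(axis=0), y.max(axis=0)
    return (y - y_min) / (y_max - y_min)

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Hypothetical raw targets; columns: chlorophyll, carotenoid, anthocyanin.
raw = np.array([[300.0, 40.0, 10.0],
                [500.0, 60.0, 30.0],
                [700.0, 80.0, 50.0]])
targets = min_max_normalize(raw)
predictions = targets + 0.01  # pretend every prediction is off by 0.01
mae = mean_absolute_error(targets, predictions)
print(mae)
```

Because the MAE is computed on normalized targets, a value such as 0.01 means an average error of 1% of the observed range of each pigment, which is how the MAE figures reported on the 0-1 data range can be read.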

D. DATA AUGMENTATION
One of the issues we faced in this study is the time and cost of the wet chemical procedures. This precludes the production of a large dataset, which is generally a requirement for CNNs. However, some previous studies have reported that CNNs can still perform well with small datasets [27][28]. Augmentation techniques can be applied to overcome the underfitting and overfitting problems that often occur when learning from small datasets [29]. In this study, we applied spatial augmentation: we created variations in the position of the leaf image using rotation. Fig. 3 compares the MAE of P3MNet_3 with and without augmentation. Without augmentation, the MAE fluctuates strongly in both the training and validation processes, which shows that the predictions are unstable. With augmentation, the fluctuations in the MAE decrease considerably. In addition to the fluctuation problem, we also faced the problem of underfitting, which can be seen in the MAE value at the end of training. Without augmentation, the MAE is much greater (approximately 0.0203) than with augmentation (approximately 0.000778).
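Rotation-based augmentation as described above can be sketched as follows. The text does not state which rotation angles were used, so this sketch assumes multiples of 90 degrees, which need no interpolation; the function name `rotate_augment` and the synthetic data are hypothetical.

```python
import numpy as np

def rotate_augment(cube, target):
    """Return the four 90-degree rotations of a square (H, W, C) image.

    The pigment target is unchanged by rotation, so it is repeated for each copy.
    """
    images = [np.rot90(cube, k=k, axes=(0, 1)) for k in range(4)]
    targets = [target.copy() for _ in range(4)]
    return np.stack(images), np.stack(targets)

# Synthetic 10-channel leaf image and normalized pigment targets (hypothetical values).
rng = np.random.default_rng(0)
cube = rng.random((32, 32, 10))
target = np.array([0.4, 0.2, 0.7])
images, targets = rotate_augment(cube, target)
print(images.shape, targets.shape)
```

Rotating only the two spatial axes keeps the ten spectral channels aligned, so each augmented sample remains a valid reflectance image of the same leaf.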

E. EXPERIMENTAL SETUP
Python 2 was used to develop the 2D-CNN architectures. The TensorFlow backend along with the Keras API and GPU support was used to enable fast calculations. The experiment was run with the support of Google Colaboratory facilities and on a personal computer with the macOS Sierra operating system, Intel Core i5 1.6 GHz processor, and 8 GB of DDR3 RAM.

A. PIGMENT CONTENTS AND COMPOSITION
The wet chemical procedure produced the pigment content data for each leaf sample, as summarized in Table 2. S. oleana and G. pictum leaves contain a higher concentration of anthocyanin than the other two species. Jasminum sp. contains more chlorophyll and carotenoid, whereas P. betle does not appear to have a clearly dominant pigment. It therefore supplies pigment compositions that are not covered by the other three species. Since variation in the samples is the most important success factor in the 2D-CNN learning process, it is important to ensure that the required variation is sufficiently represented by the leaf samples. From Table 2, we can see that the pigment content varies widely within each species, as shown by the large standard deviations. Thus, we confirmed that sufficient variation was achieved in this study. An example of leaf reflectance at various wavelengths can be seen in Fig. 4, which compares the reflectance spectra of 4 individual leaves from 4 different species measured with a spectrometer. In general, the chlorophyll, carotenoid, and anthocyanin content of a plant leaf can be observed from the combination of its reflectance in three spectral ranges: 540-560 nm, 710-720 nm, and 770-800 nm [4]. In Fig. 4, the differences between the leaves are quite clear in those spectral ranges. The differences in pattern arise from the simultaneous influence of the photosynthetic pigments contained in each leaf.
However, leaf thickness and water content also contribute to the diversity of the patterns. Reflectance in other spectral ranges is therefore also important for adjusting the predicted pigment content and producing more accurate results. Together, these factors make the light reflectance behavior of a plant leaf complex, which results in a highly non-linear relationship between the reflectance pattern and the content of photosynthetic pigments. Therefore, in this study, a 2D-CNN is used to represent this difficult relationship.

B. MODEL DEVELOPMENT AND VALIDATION
Adadelta was the best optimization method to train AlexNet and VGG-9 [30]. Fig. 5 compares the overall MAE of P3MNet_3 trained with the seven optimization methods. Adadelta again gives the smallest MAE, and the same holds for the other two P3MNet architectures. Therefore, all P3MNet architectures in this experiment were trained using the Adadelta optimization method. The experimental results are summarized in Table 3. Each architecture underwent 10 learning processes to explore its behavior, and each process ran for 1000 epochs. Beyond the 1000th epoch, there were no significant changes in the MAE. Of the five architectures tested, P3MNet_3, AlexNet, and VGG-9 provide the lowest average MAE. We also found that a large increase in the number of kernels in the convolution layers and of nodes in the fully connected layers (see Table 1) did not significantly reduce the MAE. The MAE of AlexNet and VGG-9 is slightly smaller than that of P3MNet_3. However, considering the number of parameters that must be managed, this small gain is not encouraging. Fig. 6 compares the file sizes of the trained models (in .h5 format) for the three CNN models. The trained P3MNet_3 model is far more efficient in storage than AlexNet and VGG-9, both of which require storage that is 7 times larger. This is unfavorable for the development of portable devices.
Fig. 6. MAE comparison of P3MNet_3, AlexNet, and VGG-9, together with the file sizes of their trained models.
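For readers unfamiliar with Adadelta, the optimizer selected above, its update rule can be sketched on a one-dimensional toy problem. This follows the standard published rule (running averages of squared gradients and squared steps, with no global learning rate); the decay `rho`, the constant `eps`, the step count, and the toy objective are assumptions for illustration and are not the settings used in the experiment.

```python
import math

def adadelta_minimize(grad, x0, steps=50, rho=0.95, eps=1e-6):
    """Minimize a scalar function with the Adadelta update rule.

    grad: function returning the gradient at x.
    """
    x = x0
    avg_sq_grad = 0.0  # running average of squared gradients
    avg_sq_step = 0.0  # running average of squared parameter updates
    for _ in range(steps):
        g = grad(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # The step size is the ratio of the two RMS terms, so no learning rate is needed.
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step * step
        x += step
    return x

# Toy objective f(x) = x^2 with gradient 2x and minimum at x = 0.
x_final = adadelta_minimize(lambda x: 2.0 * x, x0=1.0)
print(x_final)
```

With these settings the iterate moves from the starting point toward the minimum in small, self-scaled steps; in practice a framework implementation such as the Adadelta optimizer in Keras would be used instead of hand-written code.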
It can be concluded that increasing the number of convolution layers does not make the predictions significantly better. This behavior is in line with the theory that, with a limited number of samples, the complexity of a CNN cannot be too high [31]. Therefore, P3MNet_3 was chosen as the best model to represent the relationship between the 10-channel leaf images and their photosynthetic pigment contents. Fig. 7 shows the performance of P3MNet_3 for each pigment. For all 3 pigments, the correlation between the observed and the estimated pigment content is very good: it is positive and the correlation coefficient (R) is close to one. This result also compares favorably with the non-destructive technique developed by Gitelson et al. [32]. Using a spectrum measured with a spectrometer, Gitelson et al. created indices that are reported to be very good at predicting the chlorophyll, carotenoid, and anthocyanin content of plant leaves non-destructively. The indices are the Chlorophyll Reflectance Index ((Chl)RI), the Carotenoid Reflectance Index (CRI), and the Anthocyanin Reflectance Index (ARI), which give correlation values (R) of 0.959, 0.969, and 0.967, respectively. These data show that a 2D-CNN with a 10-channel digital image can replace reflectance indices computed from reflectance spectra measured with a spectrometer, since both provide almost the same accuracy.
However, when examined in more detail, we found small differences in the MAE among the pigments. Fig. 8 depicts a boxplot of the validation MAE for all three pigments. Anthocyanin appears to be the most predictable: its MAE is lower than those of the other two pigments, and its MAE also fluctuates the least. This indicates that P3MNet_3 provides better consistency in anthocyanin prediction. Of the three pigments, chlorophyll has the worst MAE, and its fluctuation also appears to be greater than that of anthocyanin. Meanwhile, the behavior of the carotenoid MAE tends to be more similar to that of chlorophyll. This phenomenon occurs because the reflectance of chlorophyll (especially in the green spectral range) can be reduced in the presence of anthocyanin, which is usually the case in leaves that contain both pigments. Huang [33] reported a similar phenomenon for the prediction of chlorophyll in sweet potatoes using spectral reflectance data and developed a new index to correct for it. However, a method for dealing with the same phenomenon in multispectral digital images remains unknown. This issue will be the focus of our next research project.
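The correlation coefficient R used above to compare observed and estimated pigment contents is the Pearson correlation. A minimal sketch of its computation is shown below; the sample values are hypothetical and are not data from this study.

```python
import numpy as np

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    o = observed - observed.mean()
    p = predicted - predicted.mean()
    return float(np.sum(o * p) / np.sqrt(np.sum(o ** 2) * np.sum(p ** 2)))

# Hypothetical normalized pigment contents for five leaves.
observed = [0.10, 0.35, 0.50, 0.72, 0.90]
predicted = [0.12, 0.33, 0.52, 0.70, 0.91]
r = pearson_r(observed, predicted)
print(r)
```

R measures how well the predictions follow the observations linearly, but it does not capture a constant offset, which is one reason to report it alongside the MAE rather than instead of it.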

IV. CONCLUSION
In this study, we compared five 2D-CNN architectures for predicting the content of photosynthetic pigments in a plant leaf from its multispectral image. Three of the five architectures are our original models, named P3MNet. We found that the P3MNet_3 architecture is the best among them, with training MAE = 0.000778 ± 0.0005 and validation MAE = 0.000817 ± 0.0007 (data range 0-1). The input was a digital image of a plant leaf consisting of 10 channels, each representing the reflectance of the leaf sample at one of the wavelengths 350, 400, 450, 500, 550, 600, 650, 700, 750, and 800 nm. P3MNet_3 was trained using the Adadelta optimization method for 1000 epochs. The three main photosynthetic pigments (chlorophyll, carotenoid, and anthocyanin) were predicted with high accuracy. However, the prediction performance differs slightly among the three pigments: P3MNet_3 tends to give a smaller MAE for anthocyanin than for carotenoid and chlorophyll. Based on these positive results, we have shown that a multispectral digital image together with a 2D-CNN can be used as a good non-destructive tool for measuring photosynthetic pigment content in plant leaves. Its accuracy is competitive with that of other non-destructive instruments developed by previous researchers.