ECG Arrhythmia Classification Using Recurrence Plot and ResNet-18

Cardiovascular diseases are the leading cause of death worldwide, claiming approximately 17.9 million lives each year. In this study, a novel CAD system to detect and classify electrocardiogram (ECG) signals is presented. The designed system employs the recurrence plot (RP) approach, which transforms an ECG signal into a 2D colour image, and then classifies the images with a deep learning architecture (ResNet-18). The system comprises two steps: in the first, preprocessing step, the data are segmented into two-second intervals and converted into images via the RP approach; in the second step, the RP images are classified by the ResNet-18 network. The proposed method is evaluated on the MIT-BIH arrhythmia database, where the five principal types of arrhythmia that have medical relevance are to be classified. The system classifies these five types according to the AAMI standard and achieves an overall accuracy of 97.62%, precision of 95.42%, recall of 95.42%, F1-Score of 95.06%, and AUC of 95.7%, results that are competitive with the best state-of-the-art systems. Additionally, the method demonstrated the ability to mitigate the problem of imbalanced samples.


I. INTRODUCTION
The heart is a peculiar organ with the ability to generate electrical impulses and contract rhythmically. Sometimes these impulses are formed or propagated incorrectly, leading to arrhythmias, which can be either fast (tachycardia) or slow (bradycardia) [1].
These signals are usually recorded with an electrocardiograph; the resulting electrocardiogram (ECG) is the graphical record of the heart's electrical activity [1].
Nowadays, ECG is one of the most used tools for diagnosing cardiovascular issues. Depending on the morphology of the ECG waveform, physicians or specialists can perform different diagnoses.
According to the World Health Organization (WHO), cardiovascular diseases are the leading cause of death worldwide. They claim an estimated 17.9 million lives each year, a figure expected to rise to 23.6 million by 2030 [2].
In general, a normal ECG has three important components [1]:
• Atrial Activation
- Wave P is due to depolarization of the atria.
• Ventricular Activation
- Waves of the QRS complex are due to ventricular depolarization from the beginning to the end of the signal.
* Wave Q is the first negative wave before the first positive wave.
* Wave R is a positive wave.
* Wave S represents every negative wave after a positive one.
• Ventricular Repolarization
- Wave T appears due to repolarization of the ventricles.
- Wave U is of uncertain meaning.
ECG studies commonly use features obtained from the P-QRS-T complex [3]. These features include the location, duration, and amplitude of the five prominent deflections of the ECG signal (the P, Q, R, S, and T waves), as well as the minor deflection known as the U wave [4]. Figure 1 illustrates all these waves and deflections.
Figure 1. A normal ECG waveform (P, Q, R, S, T and U) [5].
In the literature, numerous classifiers have been used to classify diseases from ECG signals, among them Support Vector Machine (SVM) [6], Decision Tree [6], and Bayesian belief network [7]. It is noteworthy that these methods process the signal in one dimension (1D). In recent years, several authors have turned to machine learning (ML) [8] and deep learning (DL) [9] approaches for signal analysis in order to achieve better-designed tools that diagnose a patient faster. In this setting, a 2D representation of the signal can be preferable to the 1D signal.
For example, study [10] employs the continuous wavelet transform (CWT) to convert the ECG signal into scalograms. Those scalograms are then fed to a custom Convolutional Neural Network (CNN) model that classifies them into different classes.
In paper [11], the authors proposed two frameworks for ECG heartbeat classification, called Multimodal Image Fusion (MIF) and Multimodal Feature Fusion (MFF). As input to these frameworks, the raw ECG data are transformed into three different images using the Gramian Angular Field (GAF), Recurrence Plot (RP), and Markov Transition Field (MTF). The obtained images are then fed into a custom CNN model, trained from scratch, to detect five classes of cardiac issues.
Study [12] addressed the cardiac arrhythmia (CA) classification problem with a DL approach based on Transfer Learning, converting the ECG into 2D images via the RP approach. The obtained images are then fed to the Inception-ResNet-v2 network, which classifies nine diseases into different classes.
Paper [13] performed a short-duration segmentation of the ECG signal (2 s) and then applied the RP technique to transform each segment into a 2D image. The classification stage distinguishes up to six cardiac diseases using a CNN architecture that has been evaluated on different datasets. In [14], the authors proposed a method to characterize abnormal heart rhythms employing the RP technique. They present a CNN architecture for ECG arrhythmia classification that was evaluated using the MIT-BIH arrhythmia dataset. To address the overfitting issue, the authors employ a fine-tuning phase. However, it is unclear which weights are used: whether the network is first trained from scratch on the aforementioned database or pretrained on the well-known ImageNet classification dataset. The sizes of the training and test sets were chosen by trial and error. However, the imbalanced data problem remains in that work after this process.
As one can see, there is a wide variety of methods for the classification of ECG signals. Existing methods present several drawbacks, among them class imbalance in the datasets, caused by significant differences in size between majority and minority disease classes; meanwhile, the commonly used remedy, which is based on generating synthetic data, could result in erroneous classification. The method designed in this study attempts to resolve these drawbacks.
The principal contributions of our system, called ECG-Recurrence-plot-classification (ECG-RPC), to the classification of arrhythmia are summarized as follows:
• This study proposes a method to solve the imbalance problem that avoids a synthetic sample generator, which could result in erroneous classifications.
• A lightweight ResNet-18 network trained from scratch is employed, which trains quickly and does not require a large data volume.
• Experimental validation indicates that the ECG-RPC system is competitive with the best existing systems and achieves the best performance in terms of the F1-Score and IBA measures, which balance precision and recall and account for differences in class sizes.
• The ECG-RPC system allows an ECG signal to be analysed and classified according to the AAMI standard.

II. MATERIALS AND METHODS
In this section, the detailed methodology of the proposed system is discussed. A block diagram of the system is illustrated in figure 2. ECG signals can be easily acquired through PhysioNet or other databases available on the internet. The ECG signals utilised in this paper were obtained from the MIT-BIH arrhythmia database [15], which contains 16 original classes of arrhythmias. However, according to the Association for the Advancement of Medical Instrumentation (AAMI) standard [16], only five types of arrhythmias have medical relevance, which are: normal (N), ventricular ectopic (V), supraventricular ectopic (S), fusion (F), and unknown (Q). These types of arrhythmias are summarized in table 1.
In the first step, we segment the ECG signals into two-second intervals, similarly to [13], and then transform these segments into images with the Recurrence Plot (RP) method. To address the problem of imbalanced classes, the Random UnderSampling (RUS) method is employed, leaving every class with the same number of images. As a classifier, a ResNet-18 architecture trained from scratch is used. Each stage is explained in detail in the following sections.
VOLUME 22(2), 2023

A. SEGMENTATION
The ECG signals come from the MIT-BIH arrhythmia database, which contains 48 half-hour excerpts of two-lead recordings collected from inpatients and outpatients. A standard ECG comprises 12 leads; the electrodes are attached to specific parts of the body, such as the chest, legs and arms, to obtain the graphical representation of the heart's electrical activity. In this work, we focus on the frontal plane leads commonly known as the Einthoven Triangle (Leads I, II and III). These leads are used in the Intensive Care Unit (ICU), where Lead II is monitored because it represents the P wave in the ECG cycle. The database includes annotations made by physicians for each class. In total, approximately 110,000 annotations are described in [17]. Table 2 shows the number of signals contained in each class.
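The two-second segmentation can be illustrated with a minimal sketch. It assumes the 360 Hz sampling rate of the MIT-BIH recordings (so one segment holds 720 samples) and non-overlapping windows; the paper does not state whether windows overlap or how trailing samples are handled, so those choices and the function name `segment_signal` are illustrative assumptions, and a synthetic sine wave stands in for a real lead.

```python
import numpy as np

FS_HZ = 360          # sampling rate of MIT-BIH recordings (samples per second)
WINDOW_SECONDS = 2   # segment length used in this paper


def segment_signal(signal, fs=FS_HZ, seconds=WINDOW_SECONDS):
    """Split a 1D signal into non-overlapping fixed-length windows.

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the paper does not specify this detail).
    """
    signal = np.asarray(signal, dtype=float)
    window = int(fs * seconds)
    n_segments = len(signal) // window
    return signal[: n_segments * window].reshape(n_segments, window)


# Synthetic stand-in for one lead: 5 seconds of a 1.2 Hz sine wave.
t = np.arange(5 * FS_HZ) / FS_HZ
segments = segment_signal(np.sin(2 * np.pi * 1.2 * t))
print(segments.shape)  # (2, 720)
```

Each row of `segments` would then be converted into one RP image.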

B. RECURRENCE PLOT
The recurrence plot technique proposed in [18] is used to analyse dynamical systems. The initial purpose of RPs is to visualise trajectories in phase space, which is particularly advantageous in the case of high-dimensional systems. RPs yield insights into the time evolution of these trajectories because typical patterns in RPs are linked to a specific behavior of the system. ECG signals exhibit typical recurrent behaviors, including periodicity and irregular cyclicities [19], which may be difficult to visualize in the time domain.
An RP can be formulated as follows:
\[ R_{i,j} = \Theta\left(\varepsilon_i - \lVert x_i - x_j \rVert\right), \quad i, j = 1, \dots, N \tag{1} \]
where ε_i is the threshold distance, x_i and x_j are the observed sequences at points i and j, ||·|| is the Euclidean norm, N is the number of considered states, and Θ is the Heaviside function, which is defined as:
\[ \Theta(z) = \begin{cases} 0, & z < 0 \\ 1, & \text{otherwise} \end{cases} \]
The original formulation in equation (1) is binary because of the threshold distance ε. In this paper, the unthresholded approach proposed by [20] was adopted to avoid the information loss caused by binarization of the R matrix, using the Euclidean norm. The R-matrix is then defined as:
\[ R_{i,j} = \lVert x_i - x_j \rVert \]
In the present study, the 1D ECG signals have been transformed into 2D RP images and concatenated, finally generating RGB images that contain the colour information of the RP images, to be used in the following stage.
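The thresholded and unthresholded R-matrices described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: it assumes each state x_i is a single sample (no time-delay embedding), so the Euclidean norm reduces to an absolute difference, and it assumes the convention Θ(0) = 1. The function names and the grey-level rescaling helper are hypothetical; the colour mapping used to build the RGB images is not described in the text and is therefore omitted.

```python
import numpy as np


def recurrence_matrix(x, eps=None):
    """Recurrence matrix of a 1D signal.

    eps=None  -> unthresholded distances |x_i - x_j|
    eps=float -> binary form Theta(eps - |x_i - x_j|), assuming Theta(0) = 1
    """
    x = np.asarray(x, dtype=float)
    # Pairwise distances via broadcasting; for scalar states the
    # Euclidean norm is simply the absolute difference.
    dist = np.abs(x[:, None] - x[None, :])
    if eps is None:
        return dist
    return (dist <= eps).astype(np.uint8)


def to_uint8_image(matrix):
    """Rescale a matrix to the range 0..255 for storage as a grey-level image."""
    span = matrix.max() - matrix.min()
    if span == 0:
        return np.zeros(matrix.shape, dtype=np.uint8)
    return np.round(255 * (matrix - matrix.min()) / span).astype(np.uint8)


x = np.array([0.0, 1.0, 3.0])
print(recurrence_matrix(x))           # unthresholded distances
print(recurrence_matrix(x, eps=1.0))  # binary recurrence plot
```

The unthresholded matrix keeps the full range of distances, which is the information that binarization would discard.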

C. NETWORK ARCHITECTURE
The ResNet architecture was proposed in [21], introducing the residual block shown in figure 3. These blocks connect the feature maps of a previous layer with the feature maps of subsequent layers through a skip connection. In addition, these blocks help to solve the vanishing gradient problem and allow deeper networks to be built without increasing the loss during training. There are many variants of the ResNet architecture, such as ResNet-34, ResNet-50, ResNet-101 and ResNet-152. In the proposed system, we employed the ResNet-18 architecture because of its enhanced performance compared with other deep neural networks such as AlexNet and VGG-1x, among others. Moreover, ResNet-18 is relatively lightweight compared with other deep neural network architectures found in state-of-the-art techniques, which means that it can be trained faster and with fewer computational resources, an important consideration for low-income institutions.
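The idea of the skip connection can be illustrated with a deliberately simplified sketch. It uses two dense layers in place of the convolutions and batch normalisation of an actual ResNet-18 block, so it is an analogue for intuition only; the function names and weight shapes are assumptions made for this example.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def residual_block(x, w1, w2):
    """Compute y = ReLU(F(x) + x) with F(x) = W2 @ ReLU(W1 @ x)."""
    fx = w2 @ relu(w1 @ x)
    return relu(fx + x)  # the skip connection adds the input back


x = np.array([1.0, 2.0, 3.0])
zeros = np.zeros((3, 3))
# With all-zero weights F(x) = 0, so a non-negative input passes through unchanged.
print(residual_block(x, zeros, zeros))  # [1. 2. 3.]
```

Because the block only has to learn a correction F(x) on top of the identity, stacking more blocks need not degrade the signal, which is the intuition behind training deeper networks.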

B. EVALUATION METRICS
To validate the performance of the proposed system, we use common evaluation metrics: Accuracy, Precision, Recall, F1-Score, the Receiver Operating Characteristic (ROC) curve, Geometric Mean and Index of Balanced Accuracy (IBA), where:
• Accuracy is the proportion of correct classifications among all evaluated elements.
• Precision is the proportion of elements predicted as positive that are actually positive.
• Recall is the proportion of actual positive cases that are correctly predicted; it measures the coverage of the actual positive cases.
• F1-Score is the harmonic mean of precision and recall.
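The first four metrics can be computed directly from a confusion matrix, as in the following sketch. It assumes rows hold the true classes and columns the predicted classes, and it treats each class one-vs-rest; the function name and the toy two-class matrix are illustrative, and whether the reported figures are macro-averaged is not stated in the text.

```python
import numpy as np


def per_class_metrics(cm):
    """Accuracy plus per-class precision, recall and F1 from a confusion matrix.

    cm[i, j] counts samples of true class i that were predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    predicted = cm.sum(axis=0)  # column sums: TP + FP
    actual = cm.sum(axis=1)     # row sums: TP + FN
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1


# Toy two-class example (not data from the paper).
accuracy, precision, recall, f1 = per_class_metrics([[8, 2], [1, 9]])
print(accuracy)
print(precision, recall, f1)
print(f1.mean())  # macro-averaged F1
```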
A Receiver Operating Characteristic curve (ROC) is a two-dimensional plot that illustrates the performance of a classifier as the discrimination cut-off value is changed over the range of the predictor variable. The x-axis or independent variable is the False Positive Rate (FPR) for the predictive test. The y-axis or dependent variable is the True Positive Rate (TPR) for the predictive test [25].
The area under the curve (AUC) summarizes the performance of the classifier in a single quantitative metric and is used to determine which classifier has superior performance. A classifier with a larger AUC is better than one with a smaller AUC [26].
Study [27] proposed a measure for assessing accuracy on imbalanced datasets, called the Index of Balanced Accuracy (IBA). The idea of this measure is to moderately favour classification models with a higher prediction rate on the minority class without underestimating the relevance of the majority class. In other words, this measure quantifies the accuracy of the model independently of whether the classes have different numbers of samples, which would otherwise introduce a bias. This is discussed in section III-E.
The measures used to characterize IBA are:
• Geometric Mean attempts to maximize the accuracy on each class while keeping those accuracies balanced.
• Dominance quantifies the prevalence relation between the majority and minority classes and is used to analyze the behavior of a binary classifier.
• IBA is a classification performance metric designed to be more sensitive in imbalanced domains.
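For a binary (one-vs-rest) problem, these three measures are commonly written in terms of the true positive rate (TPR) and true negative rate (TNR): G-Mean = sqrt(TPR · TNR), Dominance = TPR − TNR, and IBA = (1 + α · Dominance) · TPR · TNR. The sketch below follows that formulation; the weighting factor α = 0.1 is an assumed default, since the text does not state the value used, and the function names are illustrative.

```python
import math


def g_mean(tpr, tnr):
    """Geometric mean of the accuracies on the positive and negative class."""
    return math.sqrt(tpr * tnr)


def dominance(tpr, tnr):
    """Signed gap between the two class accuracies, in the range [-1, 1]."""
    return tpr - tnr


def iba(tpr, tnr, alpha=0.1):
    """Index of Balanced Accuracy: squared G-Mean weighted by the dominance.

    alpha = 0.1 is an assumed value, not one taken from the paper.
    """
    return (1 + alpha * dominance(tpr, tnr)) * tpr * tnr


print(g_mean(0.9, 0.8), dominance(0.9, 0.8), iba(0.9, 0.8))
```

When both rates are equal the dominance vanishes and IBA reduces to the squared G-Mean; a classifier that is better on the positive class is rewarded slightly, in proportion to α.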

C. TRAINING SCHEME
It is known that the ResNet-18 architecture expects input images of size 224×224×3. In the proposed ECG-RPC system, the image for each heartbeat is generated via the RP method, which transforms the signal into a 2D image (fig. 4). As the examples in this figure show, an image generated by RP depends on the samples obtained in the short-duration segmentation. The network is initialized with the He-Normal strategy, which draws the weights from N(µ, σ²) with µ = 0 and σ = √(2/fan_in), where fan_in is the number of inputs from the previous layer. During training of the proposed system, the Learning Rate is reduced with the ReduceLROnPlateau technique, which lowers the Learning Rate when the accuracy has stopped increasing. The Learning Rate enters the weight update as follows:
\[ W \leftarrow W - \alpha \frac{\partial L}{\partial W} \]
where α is the Learning Rate and the gradient ∂L/∂W is the partial derivative of the loss function with respect to the weight parameter. This gradient defines the rate of change of the error with respect to changes in the weight parameter.
Figure 5 shows the number of samples in each class after the proposed segmentation. The images are clearly not evenly distributed among the classes, which would produce incorrect classification of the minority classes because they have a lower probability than the other classes. Almost 85% of all the samples belong to class N, which has 90,630 samples, whereas class F has only 803 samples.
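The initialisation and learning-rate handling described above can be sketched as follows. This is a simplified stand-in, not the library scheduler itself: the class `ReduceOnPlateau`, its `factor` and `patience` defaults, and the helper names are assumptions chosen for illustration, since the text does not report the values used.

```python
import numpy as np


def he_normal(fan_in, fan_out, rng):
    """Draw a (fan_out, fan_in) weight matrix from N(0, 2 / fan_in)."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(loc=0.0, scale=std, size=(fan_out, fan_in))


def sgd_update(w, grad, lr):
    """One gradient step: W <- W - lr * dL/dW."""
    return w - lr * grad


class ReduceOnPlateau:
    """Scale the learning rate by `factor` once accuracy has failed to
    improve for more than `patience` consecutive epochs (assumed values)."""

    def __init__(self, lr, factor=0.1, patience=2):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = -float("inf")
        self.bad_epochs = 0

    def step(self, accuracy):
        if accuracy > self.best:
            self.best = accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr


rng = np.random.default_rng(0)
weights = he_normal(fan_in=1000, fan_out=200, rng=rng)
print(weights.std(), np.sqrt(2.0 / 1000))  # empirical vs. target std

scheduler = ReduceOnPlateau(lr=0.01)
history = [scheduler.step(acc) for acc in [0.80, 0.85, 0.85, 0.85, 0.85]]
print(history)  # the rate drops only after three stagnant epochs
```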

D. TRAINING ON COMPLETE MIT-BIH DATASET
The results of the ECG-RPC system evaluated on the complete MIT-BIH arrhythmia database (the Training-Loss curves, the Confusion Matrix and the ROC-AUC) are shown in figure 6. Table 4 shows that the designed ECG-RPC system achieves 98.41% overall accuracy, 92.06% Precision, 92.66% Recall and 93.62% F1-Score. The designed system thus seems to perform well. However, as one can see in figure 6, the Confusion Matrix for the imbalanced dataset shows that the system fails to classify some samples of class F, misclassifying them as class S. This behaviour could be due to the imbalanced dataset: the deep features that the ECG-RPC system extracts for one class may coincide with those of another (F and S).

E. TRAINING ON BALANCED MIT-BIH DATASET
As noted above, if the problem of imbalanced data is not resolved, the ECG-RPC system could classify samples of classes F and S incorrectly. To mitigate this issue, we employ an undersampling technique so that training and evaluation use the same number of samples for every class.
In this study, we propose the usage of Random Under-Sampling (RUS). This technique reduces the number of samples per class until every class matches the class with the fewest samples. Unlike other data-balancing methods, RUS does not generate synthetic data from the dataset, which in the case of ECG signals could result in erroneous classification. In our case, the RUS technique reduces the complete dataset to 4015 samples, that is, 803 samples for each class, based on the minority class F contained in the MIT-BIH arrhythmia database. The new distribution of the balanced dataset is shown in figure 7. Taking advantage of the dataset reduction offered by the RUS technique, we also investigated the effect of lowering the number of samples of each class to a fixed value. In table 6, one can observe that the performance of the proposed system improves as this value increases, up to the maximum value of 803 samples, which, as mentioned before, is the size of the minority class (class F).
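Random undersampling can be sketched as below. This is a minimal illustration rather than the authors' code: the function name `random_undersample`, the fixed seed, and the toy class counts are assumptions, and the toy labels are not the distribution of the MIT-BIH data.

```python
import numpy as np


def random_undersample(labels, rng):
    """Return sorted indices that keep an equal number of samples per class.

    Every class is reduced, without replacement, to the size of the
    smallest class; no synthetic samples are created.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.min()
    keep = [
        rng.choice(np.flatnonzero(labels == c), size=target, replace=False)
        for c in classes
    ]
    return np.sort(np.concatenate(keep))


# Toy label vector with an AAMI-style imbalance (illustrative counts only).
labels = np.array(["N"] * 50 + ["S"] * 8 + ["V"] * 12 + ["F"] * 5 + ["Q"] * 7)
idx = random_undersample(labels, np.random.default_rng(42))
balanced = labels[idx]
print(dict(zip(*np.unique(balanced, return_counts=True))))
```

The same indices would be used to select the corresponding RP images, so that every class contributes as many images as the minority class.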

F. EXPERIMENTAL RESULTS
The experimental results presented in table 7 indicate that the novel ECG-RPC system shows a visible improvement over several methods, supporting the effectiveness of the RP technique and of balancing the classes via the RUS algorithm. Several state-of-the-art methods use different techniques to classify arrhythmia. For example, in [11], the authors suggested employing the SMOTE oversampling technique to address the imbalanced data problem. However, this is problematic, since the synthetic data obtained may not correspond to real ECG signals and can bias the system. The authors also proposed the 1D-to-2D transformation with several preprocessing techniques (GAF, MTF, and RP), standalone or combined, obtaining an accuracy of 98.6% with the MTF preprocessing technique. Still, the problem is that the system is trained on an imbalanced dataset, which can lead to a bias towards the majority class. Bhekumuzi et al. [13] proposed a system with two classification stages employing ResNet architectures. Their system did not fix the imbalance problem, as can be seen in the values of the G-Mean and IBA criteria (table 7), which can result in incorrect classification. Finally, study [14] proposes a system that employs the binary recurrence plot. However, their system did not resolve the imbalanced data problem.
Figure 8 presents the training and loss curves obtained with the RUS technique; one can see the difference in performance with respect to figure 6, where the corresponding results without the RUS technique are shown.
Another way to check the performance of the novel system is the confusion matrix in figure 8(c), where each column of the matrix represents the predictions for a class and each row represents the instances of the actual class. In practical terms, this allows us to see which kinds of correct and incorrect predictions the model makes on the data. Figure 8(d) shows the ROC curves; overall, an AUC of 95.7% is achieved across all classes.

IV. CONCLUSIONS
In this study, a competitive ECG-RPC system has been designed to classify five principal classes from the MIT-BIH arrhythmia database. The novel system addresses the class imbalance problem while avoiding a synthetic sample generator, which could result in erroneous classifications. The ECG-RPC system can be trained without a large data volume while obtaining competitive classification results. The designed system classifies the diseases according to the AAMI standard and achieves an overall accuracy of 97.62%, precision of 95.42%, recall of 95.42%, F1-Score of 95.06%, and AUC of 95.7%, which is competitive with the best state-of-the-art systems. Other metrics used for quality characterisation, such as F1-Score, ROC-AUC, G-Mean and IBA, indicate better performance than methods that do not take the imbalanced classes problem into account. The proposed system can be useful for inexperienced physicians, as rare morphologies can be encountered. The designed CAD system can also be used to provide a second classification opinion.