Performance Comparison of the SVM and SVM-PSO Algorithms for Heart Disease Prediction

ABSTRACT


INTRODUCTION
In health and medicine, accurate disease prediction is very important: it supports appropriate and effective decisions when analyzing and predicting a patient's condition, and heart disease in particular demands accuracy in recognizing existing symptoms. Heart disease is a non-communicable disease (NCD) that is prone to occur especially when an individual is of productive age. The high mortality from heart disease is partly due to the public's limited knowledge of the symptoms and signs that appear when a person has this disease. Heart disease is quite dangerous when it strikes, and its main causes lie in an unhealthy lifestyle: consuming high-cholesterol foods, alcohol and tobacco use, extreme diets, and other factors.
Heart disease is a disturbance of the balance between the blood supply and the heart's demand for blood, caused by blockage of the blood vessels. Deaths due to heart disease reached 959,227 patients, or 41.4% of all deaths; every day 2,600 people die from heart disease [1], [2].
Many methods for predicting heart disease have been proposed, using Genetic Algorithms, naive Bayes, decision trees supporting naive Bayes, and the Multilayer Perceptron [3]. A previous study [4] used the Statlog (heart disease) dataset, where the highest accuracy (92.59%) was obtained with a C4.5 decision tree ensemble classifier compared to the other algorithms. Another study [5], also using the Statlog (heart disease) dataset, obtained its highest accuracy, 84%, with the Naive Bayes method. Symptom factors used to diagnose heart disease include the type of chest pain (cp), resting blood pressure (trestbps), cholesterol (chol), the resting electrocardiographic result (restecg), maximum heart rate (thalach), fasting blood sugar (fbs), and several other factors that indicate that a person has heart disease.
Heart disease includes aortic regurgitation, cardiogenic shock, congenital heart disease, cardiomyopathy, peripartum cardiomyopathy, and tricuspid regurgitation, which affect both children and adults and remain a major problem in developing countries [6].
In this study, we compare the Support Vector Machine (SVM) as a single classification algorithm against the Support Vector Machine combined with Particle Swarm Optimization (PSO), to determine which gives the more accurate prediction of heart disease [7].

RESEARCH METHOD
One way to deal with the problem of high-dimensional data is feature selection: a process that reduces the feature dimension by selecting important attributes and eliminating irrelevant, redundant and noisy ones to obtain a more accurate classification. Feature selection is thus an important step in classification and directly affects classification performance [11], [12].
Classification places specific objects into groups based on their properties. It aims to learn a function that maps each selected data item into one of the predefined classes [13].

Support Vector Machine (SVM)
Support Vector Machine (SVM) is a machine learning method that works on the principle of Structural Risk Minimization (SRM), with the aim of finding the best hyperplane that separates two classes in the input space [14], [15], [16]. The best hyperplane lies midway between the two sets of objects of the two classes, and it can be found by measuring the margin of the hyperplane and maximizing it. The margin is the distance between the hyperplane and the closest pattern of each class; these closest patterns are called support vectors [14]. SVM is also defined as a set of related learning methods that analyze data and recognize patterns, which are then used for classification and regression analysis [17]. SVM takes a set of input data and predicts, for each given input, which of two classes it belongs to, by finding the best hyperplane. SVM is a classification method that finds the hyperplane giving a globally optimal solution [18], [19], so its accuracy does not change easily, because not all training data are involved in every training iteration [20]. The data that do contribute are called support vectors, which is why the method is named Support Vector Machine.
The characteristics of the Support Vector Machine (SVM) are as follows:
1. In principle, SVM is a linear classifier.
2. Pattern recognition is done by transforming the data from the input space to a higher-dimensional space, and optimization is carried out in this new vector space. This distinguishes SVM from typical pattern recognition solutions, which perform parameter optimization in a result space of lower dimension than the input space.
3. SVM implements the Structural Risk Minimization (SRM) strategy.
4. In its basic form, SVM can only handle two-class classification.
In simple terms, the SVM concept is an attempt to find the best hyperplane that separates the two classes in the input space.
Figure 1. SVM concept for finding the best hyperplane
The figure shows patterns belonging to two data classes, +1 and -1. Data in class -1 are symbolized by circles, while data in class +1 are symbolized by squares.
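To make the hyperplane search concrete, the following is a minimal sketch of a linear SVM trained by stochastic sub-gradient descent on the hinge loss (a Pegasos-style update) on hypothetical two-class data. It is an illustration only, not the RapidMiner setup used in this study; for simplicity the bias term is omitted, so the toy data are centered around the origin.

```python
def train_linear_svm(X, y, lam=0.01, epochs=300):
    """Linear SVM via stochastic sub-gradient descent on the hinge loss
    (Pegasos-style). X: feature vectors, y: labels in {-1, +1}.
    Returns the weight vector w of the separating hyperplane w.x = 0."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1:
                # point violates the margin: hinge-loss gradient is active
                w = [(1 - eta * lam) * wj + eta * yi * xj
                     for wj, xj in zip(w, xi)]
            else:
                # only the L2 regularizer shrinks w
                w = [(1 - eta * lam) * wj for wj in w]
    return w

def predict(w, x):
    """Classify by which side of the hyperplane w.x = 0 the point falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Two linearly separable clusters: class +1 (squares) and class -1 (circles)
X = [[2.0, 2.5], [3.0, 3.0], [2.5, 3.5],
     [-2.0, -2.5], [-3.0, -3.0], [-2.5, -1.5]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
print([predict(w, x) for x in X])  # should recover the training labels
```

In practice the SVM in this study is a kernel method run inside RapidMiner; the sketch above only shows the linear, margin-maximizing core of the idea.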

Particle Swarm Optimization (PSO)
Particle Swarm Optimization (PSO) is a global heuristic optimization technique introduced by Kennedy and Eberhart in 1995, inspired by the social behavior of flocks of birds trying to reach an unknown destination [21]. PSO is a swarm intelligence algorithm that is able to optimize the relevant variables effectively.
According to Liu, Particle Swarm Optimization (PSO) is an evolutionary computation technique. Similar to genetic algorithms, PSO is an optimization tool inspired by social behavior between individuals. Particles (individuals) representing potential solutions to the problem move through an n-dimensional search space, and each particle i keeps a record of its best-performing position in a vector called pbest [22].
According to Y. Yin et al., Particle Swarm Optimization (PSO) is a computational method that iteratively optimizes a problem by improving candidate solutions with respect to a given measure of quality. The movement of each particle is influenced by its own best position and is also guided toward the best-known position in the search space, which is updated whenever other particles find better positions [23]. PSO is also an evolutionary computational technique capable of generating globally optimal solutions in the search space through the interaction of the individual particles in a swarm. Each particle conveys its best position to the other particles and adjusts its own position and velocity based on the information received about the best positions [24].
According to Zhao, Liu, Zhang, & Wang, Particle Swarm Optimization (PSO) is likewise an evolutionary computational technique that generates a globally optimal solution in the search space through the interaction of the individuals in a swarm of particles [25].
Particle Swarm Optimization (PSO) is a tool for dealing with optimization problems. Although relatively new, the PSO algorithm has been widely adopted, because it is quite simple and has a faster computational speed than other optimization algorithms such as the Genetic Algorithm (GA). Each particle in PSO is associated with a velocity; particles fly through the search space with velocities that are dynamically adjusted according to their historical behavior, so they tend to fly toward better search areas during the search process [26]. From the descriptions above, it can be concluded that PSO is an optimization method able to tune the relevant variables to achieve maximum accuracy. Swarm Intelligence (SI) is a distributed intelligence paradigm for solving optimization problems that originally took its inspiration from the biological phenomena of swarming, flocking and herding in vertebrates. PSO combines the swarming behavior of animals in flocks of birds, schools of fish, or swarms of bees with social behavior in humans [27].
To find the optimal solution, each particle moves toward its previous best position (pbest) and the global best position (gbest). The i-th particle in d-dimensional space is written as x_i = (x_{i,1}, x_{i,2}, ..., x_{i,d}), and its previous best position is stored as pbest_i = (pbest_{i,1}, pbest_{i,2}, ..., pbest_{i,d}). The velocity and position of each particle are updated from the current velocity and the distances to pbest and gbest by the following equations:

v_{i,m} = w * v_{i,m} + c1 * R * (pbest_{i,m} - x_{i,m}) + c2 * R * (gbest_m - x_{i,m})
x_{i,m} = x_{i,m} + v_{i,m}

where:
n : number of particles in the group
d : dimension
v_{i,m} : velocity of particle i in component m
w : inertia weight factor
c1, c2 : acceleration constants (learning rates)
R : a random number in (0, 1)
x_{i,m} : the current position of the i-th particle in component m
pbest_i : the previous best position of the i-th particle
gbest : the best position among all particles in the group or population

The first equation computes a new velocity for each particle (potential solution) based on its previous velocity v_{i,m}, the location at which that particle achieved its best fitness value (pbest), and the location at which the best fitness value in the population has been achieved (gbest in the global version of the algorithm, or lbest for the local neighborhood in the local version).
The second equation updates the position of each particle in the solution space. The two random numbers R are generated independently for the pbest and gbest terms, and the use of the inertia weight w has improved performance in a number of applications. Broadly speaking, the basic structure of PSO can be depicted in the graph below:
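The update equations above can be sketched in a few lines of code. The following is a minimal, self-contained PSO that minimizes a hypothetical test function (the sphere function); the swarm size, inertia weight and acceleration constants are illustrative choices, not the values used in the RapidMiner experiments.

```python
import random

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimize f over R^dim using the PSO velocity/position updates."""
    rng = random.Random(0)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]              # best position seen by each particle
    pbest_val = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position in the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for m in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][m] = (w * V[i][m]
                           + c1 * r1 * (pbest[i][m] - X[i][m])
                           + c2 * r2 * (gbest[m] - X[i][m]))
                X[i][m] += V[i][m]
            val = f(X[i])
            if val < pbest_val[i]:         # update the particle's personal best
                pbest[i], pbest_val[i] = X[i][:], val
                if val < gbest_val:        # update the swarm's global best
                    gbest, gbest_val = X[i][:], val
    return gbest, gbest_val

best, best_val = pso(lambda x: sum(xi * xi for xi in x), dim=2)
print(best_val)  # close to 0, the minimum of the sphere function
```

In the SVM-PSO combination used in this study, the fitness function is not the sphere function but the cross-validated performance of the SVM, and the particle positions encode the attribute weights and parameters being optimized.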

K-Fold Cross Validation Test
One alternative approach to "train and test" that is often adopted, regardless of dataset size, is k-fold cross-validation. Cross-validation is a validation technique that divides the data randomly into k parts, each of which takes a turn in the classification process [28]. Using cross-validation, k tests are carried out, and the training data in each experiment are used to find the overall error rate. In general, k = 10 is used to estimate accuracy; in this study the value of k is likewise 10, i.e. 10-fold cross-validation. Each trial uses one part as the test data while the remaining k-1 parts serve as the training data; the test part is then exchanged with a training part, so every trial uses different test data. Training data are the data used for learning, while test data are data never used for learning, used to assess the correctness or accuracy of the learning results [29].

Confusion Matrix
The confusion matrix summarizes the decisions obtained in training and testing: it evaluates classification performance based on which objects are classified correctly or incorrectly [30]. The confusion matrix contains the actual and the predicted class information of the classification system.
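For a binary task such as this one, the matrix reduces to four counts, from which the performance measures used later in this paper follow directly. A minimal sketch on hypothetical labels:

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count TP, FP, FN, TN for a binary classifier."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    tn = sum(a != positive and p != positive for a, p in zip(actual, predicted))
    return tp, fp, fn, tn

# Hypothetical actual vs. predicted classes
actual    = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]
tp, fp, fn, tn = confusion_matrix(actual, predicted)
accuracy  = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
print(tp, fp, fn, tn)  # 4 1 1 2
```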

ROC Curve
The ROC (Receiver Operating Characteristic) curve, commonly summarized by its AUC value, is a useful visual tool for comparing two classification models. The ROC curve is derived from the confusion matrix: it is a two-dimensional graph with the false positive rate on the horizontal axis and the true positive rate on the vertical axis [9]. Using the ROC curve, we can see the trade-off between the model's true positive rate and its false positive rate.
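The curve and its AUC can be computed directly from classifier scores by sweeping a decision threshold, a minimal sketch on hypothetical scores (RapidMiner produces these values automatically in the experiments below):

```python
def roc_points(labels, scores):
    """Sweep the threshold over the scores; each point is (FPR, TPR)."""
    P = sum(labels)            # number of positives (labels are 0/1)
    N = len(labels) - P        # number of negatives
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(l == 1 and s >= t for l, s in zip(labels, scores))
        fp = sum(l == 0 and s >= t for l, s in zip(labels, scores))
        pts.append((fp / N, tp / P))
    return pts

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical labels and classifier scores
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
print(round(auc(roc_points(labels, scores)), 4))  # 0.9375
```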

RESULTS AND DISCUSSION
This research is a systematic problem-solving activity, carried out carefully and attentively in the context of the situation at hand; in the academic field, research refers to a diligent and systematic investigation of an area with the aim of finding or revising facts, theories and applications, and of discovering and disseminating new knowledge. This study uses an experimental research method involving the Heart Disease dataset. Heart Disease is a heart database whose variables must be used to predict whether a patient has symptoms of heart disease or not. The database contains 76 attributes, but all published experiments use a subset of 14 of them. In particular, the Cleveland database is the only one used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient; it is an integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply distinguishing presence (values 1, 2, 3, 4) from absence (value 0). The patients' names and social security numbers were removed from the database and replaced with dummy values. Attribute information:

Testing the Support Vector Machine (SVM) Method
The following are the results of the Support Vector Machine test with cross-validation in RapidMiner. The results of the SVM test on the heart disease dataset are shown in Figure 5: the accuracy is 81.59% and the AUC is 0.823, and these values will be used in this study. Based on the data mining classification scale according to Gorunescu, the classification performed by SVM on the heart disease dataset has an AUC value between 0.80 and 0.90, which means good classification.
The performance evaluation of this method yields a precision of 80.59% and a recall of 88.68%, calculated as precision = TP / (TP + FP) and recall = TP / (TP + FN).

Testing the Support Vector Machine (SVM) and Particle Swarm Optimization (PSO) Method
The following are the results of the Support Vector Machine with Particle Swarm Optimization test with cross-validation in RapidMiner. From the results of the SVM-PSO test on the heart disease dataset in Figure 7, the accuracy is 84.81% and the AUC is 0.898, and these values will be used in this study. Based on the data mining classification scale according to Gorunescu, the classification performed by SVM-PSO on the heart (Statlog) dataset has an AUC value between 0.80 and 0.90, which means good classification.
The performance evaluation of this method yields a precision of 80.84% and a recall of 96.97%, calculated as precision = TP / (TP + FP) and recall = TP / (TP + FN).

Evaluation Analysis and Result Validation
The tests were carried out by making predictions directly with the support vector machine (SVM) as a single method, and comparing them with the support vector machine optimized by particle swarm optimization (PSO), to determine the accuracy and AUC (Area Under the Curve) values. A classification model can be evaluated on criteria such as accuracy, speed, reliability, scalability and interpretability (Vercellis, 2009). The accuracy and AUC results of the SVM vs. SVM-PSO algorithms are summarized in Table 1; the comparison focuses on the accuracy and AUC values of each method. The performance evaluation of the proposed method also computes precision, recall and f-measure, which in general are calculated using the formulas:

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f-measure = 2 * precision * recall / (precision + recall)
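As a worked example of the f-measure formula, the harmonic mean can be applied to the precision and recall values reported in the two tests above (80.59%/88.68% for SVM and 80.84%/96.97% for SVM-PSO); the resulting f-measures are derived here for illustration and are not figures reported in the original experiments.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Reported test parameters: SVM (precision 80.59%, recall 88.68%)
# versus SVM-PSO (precision 80.84%, recall 96.97%)
f1_svm = f_measure(0.8059, 0.8868)
f1_svm_pso = f_measure(0.8084, 0.9697)
print(round(f1_svm, 4), round(f1_svm_pso, 4))
```

On these numbers the SVM-PSO combination also comes out ahead on f-measure, consistent with its higher accuracy and AUC.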

CONCLUSION
In data analysis research on large, high-dimensional datasets, classification is essential for prediction. This study compared methods for classifying such data, where the data are processed to obtain the desired prediction information. In the experiments conducted, the SVM-PSO algorithm achieved an accuracy of 84.81% and an AUC of 0.898, while the single SVM algorithm achieved only an accuracy of 81.85% and an AUC of 0.823. From the experiments and tests that were carried out, it can be concluded that the SVM-PSO method is better than the single SVM algorithm for predicting and classifying this data.