Developing Heart Attack Prediction Model Based on Hyperparameter Tuning and Machine Learning Approach
Keywords:
Machine Learning, Imbalance Data, Hyperparameter Tuning, Catboost, Xgboost, Naive Bayes, Decision Trees
Abstract
Heart attacks pose a severe threat to human health worldwide. If a heart attack is detected in its early stages, it can be treated, saving lives and preventing severe complications. Machine learning can predict heart attack risk so that appropriate interventions are taken in time, which also helps medical institutions. However, heart attack data sets are highly imbalanced: heart attack cases form the minority class in clinical data sets, and this imbalance biases machine learning models. Modeling an imbalanced data set therefore requires special methods to reduce model bias. Finding good hyperparameters is also a critical step in finding the best model. This paper proposes a boosting-based approach (CatBoost) to develop a heart attack prediction model for this type of data and uses the Hyperopt method to find hyperparameters that reduce model bias. The study concludes that CatBoost is better at predicting the minority target class than decision tree, naive Bayes, and XGBoost models, and that finding hyperparameters with Hyperopt increases model accuracy and reduces bias.
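The bias that class imbalance introduces can be illustrated with a minimal sketch. The labels below are invented toy data, not the study's data set, and the helper names `accuracy` and `minority_recall` are hypothetical. The sketch shows that a model which never predicts a heart attack can still score high accuracy while missing every minority-class case, which is why minority-class performance matters when comparing models.

```python
# Toy illustration only: labels are invented, not taken from the study.
# 1 = heart attack (minority class), 0 = no heart attack (majority class).

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def minority_recall(y_true, y_pred, positive=1):
    """Fraction of actual positive (minority) cases that were predicted positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0

# Assumed imbalance for illustration: 5 positive cases out of 100.
y_true = [1] * 5 + [0] * 95

# A biased "model" that always predicts the majority class.
always_negative = [0] * 100

print("accuracy:", accuracy(y_true, always_negative))
print("minority recall:", minority_recall(y_true, always_negative))
```

In this toy case the accuracy is high even though no heart attack case is identified, so accuracy alone would hide the bias toward the majority class.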