Machine learning analysis of oral solid dosage formulation solubility variations by adjusting pressure and temperature.
Ahmed A Lahiq, Abdullah A Alshehri, Shaker T Alsharif
Abstract
In this work, we present a comprehensive study on predicting the solubility of tolfenamic acid and the density of supercritical carbon dioxide (SC-CO2) using machine learning models combined with hyperparameter tuning. The dataset has two input features, temperature and pressure, and two target outputs, the density of SC-CO2 and the solubility of tolfenamic acid. Three models, ADA-GPR (AdaBoost on Gaussian Process Regression), ADA-SVR (AdaBoost on Support Vector Regression), and ADA-LR (AdaBoost on Linear Regression), were used to model the relationship between the inputs and outputs. Their hyperparameters were optimized with the Chimp Optimization Algorithm (ChOA) to improve performance. For tolfenamic acid solubility, ADA-GPR performed best, with an R-squared value of 0.98806, an RMSE of 0.10133, and an MAE of 0.07790; ADA-SVR and ADA-LR attained R-squared scores of 0.96056 and 0.86815, respectively. For SC-CO2 density, ADA-GPR was again the best performer, with an R-squared score of 0.99265, an RMSE of 9.7870, and an MAE of 7.81506; ADA-SVR and ADA-LR achieved R-squared scores of 0.8841 and 0.87774, respectively. These results can help pharmaceutical and chemical companies predict tolfenamic acid solubility and SC-CO2 density, and the proposed models with ChOA hyperparameter optimization offer a practical approach to solubility and density prediction in research and industry.
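For reference, the three metrics reported in the abstract (R-squared, RMSE, and MAE) can be computed as in the minimal sketch below. It uses the standard definitions of these metrics; the function name `regression_metrics` and the sample values are illustrative assumptions and are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for paired observed and predicted values.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when observed values are constant")

    r2 = 1.0 - ss_res / ss_tot                # coefficient of determination
    rmse = math.sqrt(ss_res / n)              # root mean squared error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    return r2, rmse, mae


# Made-up values for illustration only (NOT data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.4f} MAE={mae:.4f}")
```

RMSE and MAE carry the units of the predicted quantity, so their magnitudes for solubility and for SC-CO2 density are not directly comparable, whereas R-squared is dimensionless.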