Data-Driven Optimization of Healthcare Recommender System Retraining Pipelines in MLOps with Wearable IoT Data.
Yohan Park, Jonghyeok Mun, Yejung Lee, Jihwan Um, Jongsun Choi, Jaeyoung Choi
Abstract
Personalized healthcare recommender systems are increasingly deployed in edge AI environments through wearable devices. In such environments, cloud servers use high-performance GPUs to train base models, which are then optimized with data reduction for deployment on edge devices, enabling the delivery of personalized services. However, the accuracy of a base model may decline gradually over time, a phenomenon known as model drift. Recommender systems that do not keep up with changes in user preferences risk generating predictions based on outdated behavior, which can degrade the user experience. It is therefore essential to adopt retraining approaches that incorporate both past training data and new data from wearable devices. To address the drift problem, we propose a dynamic data management strategy integrated into an automated training pipeline based on machine learning operations (MLOps). This approach enables adaptive model updates in response to continuously evolving IoT data. To preserve base model performance, our strategy applies data reduction and feature selection algorithms. By dynamically managing data with these techniques, we mitigate data drift and improve resource efficiency during model retraining. We validated our approach through experiments on personalized fitness recommendations using FitRec wearable data from 1104 users, achieving improved computational efficiency during retraining while preserving model accuracy. Consequently, our dynamic data management method enables faster training and sustains the performance of the data-reduced base models essential for edge AI applications. Moreover, the approach offers a practical way to continuously refine personalized recommendation services in line with evolving user preferences.
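As a rough illustration of the idea of combining past and new data under data reduction and feature selection, the following minimal sketch may help. It is not the authors' pipeline: the abstract does not name the specific algorithms, so random subsampling stands in for data reduction and a variance ranking stands in for feature selection. The function name `build_retraining_set` and the parameters `keep_fraction` and `top_k` are hypothetical.

```python
import numpy as np

def build_retraining_set(old_X, new_X, keep_fraction=0.5, top_k=2, seed=0):
    """Hypothetical helper, not the paper's method.

    Data reduction (assumed): keep a random fraction of the past data.
    Feature selection (assumed): keep the top_k highest-variance columns.
    """
    rng = np.random.default_rng(seed)
    # Reduce the past training data by uniform random subsampling.
    n_keep = max(1, int(len(old_X) * keep_fraction))
    keep_idx = rng.choice(len(old_X), size=n_keep, replace=False)
    # Combine the reduced past data with all newly collected data.
    combined = np.vstack([old_X[keep_idx], new_X])
    # Rank features by variance over the combined set and keep the top_k.
    variances = combined.var(axis=0)
    selected = np.sort(np.argsort(variances)[-top_k:])
    return combined[:, selected], selected
```

In a retraining pipeline, a step of this kind would run each time a new batch of wearable data arrives, so the model is retrained on a smaller matrix than the full accumulated history.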