Team 35
Team Members: Vishal Saminathan
Faculty Advisor: Chuxu Zhang
Sponsor: UConn, Dr. Chuxu Zhang
Improving Healthiness in a Multi-Objective Personalized Health-Aware Food Recommendation System using Advantage Actor-Critic Reinforcement Learning Networks
Our project extends the Multi-objective Personalized Health-aware Food Recommendation System (MOPI-HFRS), a state-of-the-art framework that balances user preference, personalized health, and nutritional diversity using graph neural networks (GNNs) and Pareto-based optimization. While effective, the original system generates its recommendations in a single step, which limits its ability to model interactions and trade-offs across an entire top-K recommendation list. To address this, we introduce a sequential reinforcement learning extension that reframes recommendation as a step-by-step decision-making process. Operating on frozen GNN embeddings, we construct each recommendation list within a Markov Decision Process (MDP), where every selection depends on the evolving state of the list. We train the policy with an Advantage Actor-Critic (A2C) framework, enabling stable optimization of a reward function that captures both relevance and personalized healthiness. To improve training efficiency and stability, we add a warm-start pretraining phase in which the policy learns by imitation from the baseline model. We evaluate our approach against the original one-shot ranking method on metrics such as NDCG, health alignment, and diversity, demonstrating the effectiveness of modeling recommendation as a sequential, multi-objective decision process.
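To make the sequential formulation concrete, the following is a minimal sketch of one episode and the resulting A2C losses, not the project's actual implementation. It assumes several things the abstract does not specify: per-item policy logits that are fixed within an episode (a real policy would recompute them from the frozen GNN embeddings and the current list state), a reward that is a simple weighted mix of relevance and health scores controlled by a hypothetical `health_weight`, and Monte Carlo returns for the advantage estimate. The names `rollout` and `a2c_losses` are illustrative only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(logits, relevance, health, k, health_weight, rng):
    """Build a top-k list one item at a time (one MDP episode).

    The state is the list built so far; already chosen items are
    removed from the candidate set, so each step depends on the
    earlier selections. Logits are assumed fixed here for brevity.
    """
    candidates = list(range(len(logits)))
    chosen, log_probs, rewards = [], [], []
    for _ in range(k):
        probs = softmax([logits[i] for i in candidates])
        pick = rng.choices(range(len(candidates)), weights=probs)[0]
        item = candidates.pop(pick)
        chosen.append(item)
        log_probs.append(math.log(probs[pick]))
        # Assumed reward: convex mix of relevance and healthiness.
        rewards.append((1.0 - health_weight) * relevance[item]
                       + health_weight * health[item])
    return chosen, log_probs, rewards


def a2c_losses(log_probs, rewards, values, gamma=0.99):
    """Actor and critic losses for one episode.

    Advantage = discounted Monte Carlo return minus the critic's
    value estimate for that step.
    """
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    advantages = [ret - v for ret, v in zip(returns, values)]
    n = len(rewards)
    actor_loss = -sum(lp * a for lp, a in zip(log_probs, advantages)) / n
    critic_loss = sum(a * a for a in advantages) / n
    return actor_loss, critic_loss, returns


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 1.0, 0.5, 0.0, -1.0]       # toy policy scores
    relevance = [1.0, 0.0, 1.0, 0.0, 0.0]     # toy relevance labels
    health = [0.2, 0.9, 0.6, 0.8, 0.1]        # toy health scores
    chosen, log_probs, rewards = rollout(logits, relevance, health,
                                         k=3, health_weight=0.5, rng=rng)
    values = [0.0] * len(rewards)             # placeholder critic output
    print(chosen, a2c_losses(log_probs, rewards, values))
```

In a full system the two losses would be minimized by gradient descent on the actor and critic networks. The warm-start phase described above could reuse the same actor with a cross-entropy loss against the baseline model's rankings before A2C training begins.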