Abstract
Polyhydroxyalkanoates (PHAs) are biodegradable, biologically produced thermoplastics with the potential to replace traditional petroleum-based plastics. Mixed microbial consortia (MMC), operated under aerobic dynamic feeding (ADF) conditions, offer a cost-effective and sustainable approach to PHA production using volatile fatty acids (VFAs) derived from otherwise unused carbon-rich waste streams. However, the absence of real-time VFA monitoring is a critical limitation for both PHA accumulation and scalability to commercial application. This study aimed to develop and deploy a supervised machine learning (ML) model capable of predicting real-time VFA concentrations during multi-pulse production runs (PRs). Five ML models, Linear Regression, Decision Tree, Random Forest, Support Vector Regressor (SVR), and Artificial Neural Network (ANN), were compared on their ability to predict VFA consumption by VFA-fed MMC, using mean squared error (MSE), mean absolute error (MAE), and the coefficient of determination (R²) as performance metrics. The models were trained on readily measured input parameters, including dissolved oxygen (DO), oxidation-reduction potential (ORP), and oxygen uptake rate (OUR), among others. The Random Forest model was selected for deployment because of its robust generalization and highly accurate prediction of VFA depletion points during training. The model was deployed in three PRs of six, eight, and ten pulses, lasting 12, 20, and 28 hours, respectively. The second PR yielded the best performance (MAE of 1.12 Cmmol/L), while suboptimal conditions during the first and third runs highlighted limitations relating to sensor accuracy and model robustness. Nevertheless, this research demonstrates the potential of ML models to provide real-time VFA monitoring in MMC-based PHA systems, while also highlighting important development and deployment challenges.
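
For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how MSE, MAE, and R² are computed from paired measured and predicted VFA concentrations. It is an illustration only and not the code used in this study; the function names and the example concentrations are hypothetical, chosen to make the arithmetic easy to follow.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute residuals."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical VFA concentrations (Cmmol/L) over one feeding pulse;
# these values are illustrative and are not data from the study.
measured = [10.0, 8.0, 6.0, 4.0, 2.0]
predicted = [9.0, 8.5, 6.0, 3.0, 2.5]

print(f"MSE = {mse(measured, predicted):.4f}")
print(f"MAE = {mae(measured, predicted):.4f}")
print(f"R2  = {r2(measured, predicted):.4f}")
```

MAE is expressed in the same units as the VFA concentration (here Cmmol/L), which makes it the most directly interpretable of the three when judging how far a prediction may sit from a measured value.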