Radu-Emil Precup is with the Politehnica University of Timisoara (UPT), Romania, where he became a Professor in the Department of Automation and Applied Informatics in 2000, and he is a Ph.D. supervisor in automation and systems engineering. Since 2022, he has also been a senior researcher (CS I) and the head of the Data Science and Engineering Laboratory of the Center for Fundamental and Advanced Technical Research, Romanian Academy – Timisoara Branch, Romania. From 2016 to 2022, he was an Adjunct Professor within the School of Engineering, Edith Cowan University, Joondalup, WA, Australia. He is currently the director of the Automatic Systems Engineering Research Centre of the UPT. From 1999 to 2009, he held research and teaching positions with the Université de Savoie, Chambéry and Annecy, France; Budapest Tech Polytechnical Institution, Budapest, Hungary; Vienna University of Technology, Vienna, Austria; and Budapest University of Technology and Economics, Budapest, Hungary.
He has been an Associate Editor of IEEE Transactions on Fuzzy Systems (2018-2022, Certificate of Commendation in 2022), Information Sciences (Elsevier, 2021-2024), Engineering Applications of Artificial Intelligence (Elsevier, 2021-2024), Applied Soft Computing (Elsevier, 2014-2024), Expert Systems with Applications (Elsevier, 2021-2024), Communications in Transportation Research (Elsevier, 2021-2024), Applied Artificial Intelligence (Taylor & Francis, 2022-2024), and Healthcare Analytics (Elsevier, 2021-2024). He is the Editor-in-Chief of the Romanian Journal of Information Science and Technology and an editorial board member of several prestigious journals, including IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Cybernetics (IEEE SMC Best Associate Editor Award in 2025), IEEE Open Journal of the Computer Society, Evolving Systems (Springer Nature Editorial Contribution Award in 2025), Journal of Engineering-JOE (IET and Wiley), and Journal of Intelligent and Connected Vehicles (Tsinghua University Press, China, and IEEE).
Prof. Precup is a Fellow of IEEE, a corresponding member of the Romanian Academy, a Doctor Honoris Causa of the Óbuda University, Budapest, Hungary, and a Doctor Honoris Causa of the Széchenyi István University, Győr, Hungary. He received the Elsevier Scopus Award for Excellence in Global Contribution (2017), was named a 2022 academic data leader by Chief Data Officer (CDO) Magazine, and was listed by IIoT World (as of July 2017) among the top 10 researchers in artificial intelligence and automation.
Presentation Title: Reinforcement Learning-Based Control Structures with Transportation Applications
Presentation Abstract: Reinforcement Learning (RL) is an attractive branch of machine learning and artificial intelligence for solving control and decision-making problems in various fields, such as transportation. RL is closely related to optimization and neural networks. This plenary lecture presents research results from the Process Control Group at Politehnica University of Timisoara, Romania. The group's research focuses on RL-based control and combines it with metaheuristic algorithms to improve performance in four control structures.
- First, a Policy Iteration (PI) RL-based control structure is presented, where the weights and biases of the control policy neural network (NN), i.e., the parameter vector, are searched using the metaheuristic Grey Wolf Optimizer (GWO) algorithm, aiming to overcome known drawbacks of classical Gradient Descent (GD)-based implementations.
- Second, a Deep Q-Learning (DQL) control structure is described, where the NNs involved in the RL optimization process are initialized with the metaheuristic Gravitational Search Algorithm (GSA), resulting in a more efficient initial version of the control policy NN.
- Third, an efficient way is presented to implement RL in the form of Proximal Policy Optimization (PPO), where the metaheuristic Slime Mould Algorithm (SMA) is used within the PPO algorithm to dynamically adjust the learning rates involved in the RL process, resulting in an adaptive SMA-based PPO optimal control structure.
- Fourth, the safety issue of RL implementations is discussed using a Deep Deterministic Policy Gradient (DDPG) algorithm, where the NNs are initialized using the metaheuristic SMA. The safety constraints from the safe RL framework are introduced into the DDPG-based optimization process by computing a state safety penalty value that is added to the global cost function at each iteration. The safety penalty value is also introduced into the SMA-based initialization process, resulting in an initial version of the control policy NN that is able to safely explore the environment before the actual learning or control process starts.
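The common thread in the first two structures is a derivative-free, population-based search over the policy NN's flat parameter vector. As a minimal illustration only, the sketch below implements a generic Grey Wolf Optimizer loop minimizing a user-supplied cost over a parameter vector; the population size, bounds, and quadratic test cost are illustrative assumptions, not the group's actual implementation.

```python
import numpy as np

def gwo_minimize(cost, dim, n_wolves=20, n_iter=150, bounds=(-1.0, 1.0), seed=0):
    """Generic Grey Wolf Optimizer sketch: searches a flat parameter vector
    (e.g., policy NN weights and biases) by moving a population of candidate
    solutions ("wolves") toward the three best solutions found so far."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    best_w, best_f = None, np.inf
    for t in range(n_iter):
        fitness = np.array([cost(w) for w in wolves])
        order = np.argsort(fitness)
        if fitness[order[0]] < best_f:          # keep the best-so-far solution
            best_f = float(fitness[order[0]])
            best_w = wolves[order[0]].copy()
        # alpha, beta, delta: the three leading wolves (copied before updates)
        alpha, beta, delta = (wolves[j].copy() for j in order[:3])
        a = 2.0 * (1.0 - t / n_iter)            # decreases linearly from 2 to 0
        for i in range(n_wolves):
            x_new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2.0 * a * r1 - a            # exploration/exploitation factor
                C = 2.0 * r2
                D = np.abs(C * leader - wolves[i])
                x_new += leader - A * D         # attraction toward this leader
            wolves[i] = np.clip(x_new / 3.0, lo, hi)
    return best_w, best_f

# Illustrative use on a quadratic cost standing in for a control cost:
best, val = gwo_minimize(lambda w: float(np.sum(w ** 2)), dim=4)
```

In a PI RL setting, `cost` would instead evaluate the closed-loop performance of the policy NN parameterized by the candidate vector; the loop itself is unchanged.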
Experimental results for servo system and tower crane system control applications illustrate the four approaches.
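The safe-RL idea in the fourth structure, adding a state safety penalty to the global cost at each iteration, can be sketched generically as follows; the box-constraint limits, quadratic penalty form, and weight below are illustrative assumptions rather than the group's actual formulation.

```python
import numpy as np

def safety_penalty(state, low, high, weight=10.0):
    """Quadratic penalty on how far each state component leaves its safe box
    [low, high]: zero inside the limits, growing with the violation size."""
    state = np.asarray(state, dtype=float)
    violation = np.maximum(0.0, low - state) + np.maximum(0.0, state - high)
    return weight * float(np.sum(violation ** 2))

def augmented_stage_cost(stage_cost, state, low, high, weight=10.0):
    """Cost seen by the optimizer: control cost plus safety penalty, so unsafe
    states are discouraged both during learning and during any metaheuristic
    (e.g., SMA-based) initialization of the policy NN."""
    return stage_cost + safety_penalty(state, low, high, weight)
```

Because the same augmented cost can serve as the fitness in the SMA-based initialization, the initial policy NN is already steered away from constraint-violating states before learning starts.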