Prioritized Reward of Deep Reinforcement Learning Applied Mobile Manipulation Reaching Tasks
DOI: 10.23977/jaip.2025.080313
Author(s)
Zunchao Zheng 1
Affiliation(s)
1 Intelligent Process Automation and Robotics Lab, Karlsruhe Institute of Technology, Karlsruhe, Germany
Corresponding Author
Zunchao Zheng
ABSTRACT
In this paper, we apply deep reinforcement learning (DRL) to the task of reaching target positions with a mobile manipulator while coordinating the mobile base and the arm, and we study how different reward functions affect the success rate and movement efficiency. The reward is defined primarily as a function of the distance between the robot and the goal. We propose principles for constructing reward functions based on geometric series theory and discuss possible reward forms that combine different elements. We also present a prioritized reward function for mobile manipulation that weights the movements of the robot's different parts, and we provide a method for determining these weights. Experiments are carried out in both two-dimensional and three-dimensional collision-free environments, and a related task of passing through an open doorway is evaluated at the end.
KEYWORDS
Mobile Manipulation, Deep Reinforcement Learning, Reward Engineering
CITE THIS PAPER
Zunchao Zheng, Prioritized Reward of Deep Reinforcement Learning Applied Mobile Manipulation Reaching Tasks. Journal of Artificial Intelligence Practice (2025) Vol. 8: 102-114. DOI: http://dx.doi.org/10.23977/jaip.2025.080313.