File(s) under permanent embargo
Reason: This item is currently closed access.
An MDP model-based reinforcement learning approach for production station ramp-up optimization: Q-learning analysis
journal contribution
posted on 2017-11-01, 14:54 authored by Stefanos Doltsinis, Pedro Ferreira, Niels Lohse
Ramp-up is a significant bottleneck for the introduction
of new or adapted manufacturing systems. The effort
and time required to ramp up a system are largely dependent on
the effectiveness of the human decision-making process to select
the most promising sequence of actions to improve the system to
the required level of performance. Although existing work has
identified significant factors influencing the effectiveness of ramp-up,
little has been done to support the decision making during
the process. This paper approaches ramp-up as a sequential
adjustment and tuning process that aims to get a manufacturing
system to a desirable performance in the fastest possible time.
Production stations and machines are the key resources in a
manufacturing system. They are often functionally decoupled
and can be treated in the first instance as independent ramp-up
problems. Hence, this paper focuses on developing a Markov
decision process (MDP) model to formalize ramp-up of production
stations and enable their formal analysis. The aim is to
capture the cause-and-effect relationships between an operator’s
adaptation or adjustment of a station and the station’s response to
improve the effectiveness of the process. Reinforcement learning
has been identified as a promising approach to learn from ramp-up
experience and discover more successful decision-making
policies. Batch learning in particular can perform well with little
data. This paper investigates the application of a Q-batch learning
algorithm combined with an MDP model of the ramp-up process.
The approach has been applied to a highly automated production
station where several ramp-up processes are carried out. The
convergence of the Q-learning algorithm has been analyzed
along with the variation of its parameters. Finally, the learned
policy has been applied and compared against previous ramp-up
cases.
Funding
This work was supported by the European Commission as a part of the CP-FP 246083-2 IDEAS and CP-FP 229208-2 FRAME Project.
History
School
- Mechanical, Electrical and Manufacturing Engineering
Published in
IEEE Transactions on Systems, Man, and Cybernetics: Systems
Volume
44
Issue
9
Pages
1125 - 1138
Citation
DOLTSINIS, S., FERREIRA, P. and LOHSE, N., 2014. An MDP model-based reinforcement learning approach for production station ramp-up optimization: Q-learning analysis. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 44 (9), pp.1125-1138.
Publisher
© IEEE
Version
- NA (Not Applicable or Unknown)
Publisher statement
This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/
Publication date
2014-01-09
ISSN
2168-2216
eISSN
2168-2232
Publisher version
Language
- en