Considering the inherent complexity of the Flow-shop scheduling problem, a Genetic Reinforcement Learning (GRL) algorithm is designed to solve it. First, state variables and action variables are introduced to transform the combinatorial-optimization scheduling problem into a sequential decision problem. Second, a Q-learning algorithm is integrated with a genetic algorithm based on combined operators: the genetic algorithm uses the good schemata of chromosomes and their fitness information to guide the agent's learning, improving learning efficiency and effectiveness, while reinforcement learning locally optimizes each chromosome and thereby improves the genetic population, so the two components jointly solve the Flow-shop scheduling problem. Third, several adaptive policies are proposed so that key parameters of the algorithm vary periodically, achieving a better balance between exploitation (depth search) and exploration (breadth search). Finally, simulation experiments validate the effectiveness of the algorithm.
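The abstract only outlines the GRL scheme, so the following Python sketch is an illustrative reconstruction rather than the paper's implementation. It assumes a simple positional state with a swap/keep action for the Q-learning stage, order crossover and swap mutation for the genetic stage, and a periodically varying exploration rate as the adaptive parameter; the processing-time matrix, population size, and all other constants are hypothetical.

```python
import random

# Hypothetical 5-job x 3-machine processing-time matrix (illustrative data only).
PROC = [
    [3, 6, 2],
    [5, 2, 4],
    [2, 5, 3],
    [6, 3, 5],
    [4, 4, 2],
]
N_JOBS, N_MACHINES = len(PROC), len(PROC[0])


def makespan(perm):
    """Completion time of the last job on the last machine for a permutation."""
    finish = [0.0] * N_MACHINES
    for job in perm:
        finish[0] += PROC[job][0]
        for m in range(1, N_MACHINES):
            finish[m] = max(finish[m], finish[m - 1]) + PROC[job][m]
    return finish[-1]


def order_crossover(p1, p2):
    """Classic OX operator: keep a slice of p1, fill the rest in p2's order."""
    a, b = sorted(random.sample(range(N_JOBS), 2))
    child = [None] * N_JOBS
    child[a:b + 1] = p1[a:b + 1]
    rest = [j for j in p2 if j not in child]
    idx = 0
    for i in range(N_JOBS):
        if child[i] is None:
            child[i] = rest[idx]
            idx += 1
    return child


def q_learning_local_search(perm, q_table, epsilon, episodes=20, alpha=0.1, gamma=0.9):
    """Assumed sequential-decision formulation: state = current position,
    action = swap with the next position or keep. Rewards are makespan
    improvements, so the agent locally refines a chromosome from the GA."""
    best = perm[:]
    for _ in range(episodes):
        current = best[:]
        for state in range(N_JOBS - 1):
            # epsilon-greedy choice between 'keep' (0) and 'swap' (1)
            if random.random() < epsilon:
                action = random.randint(0, 1)
            else:
                action = max((0, 1), key=lambda a: q_table[state][a])
            old = makespan(current)
            if action == 1:
                current[state], current[state + 1] = current[state + 1], current[state]
            reward = old - makespan(current)
            next_best = max(q_table[min(state + 1, N_JOBS - 2)])
            q_table[state][action] += alpha * (reward + gamma * next_best - q_table[state][action])
        if makespan(current) < makespan(best):
            best = current[:]
    return best


def grl(pop_size=20, generations=50):
    """Minimal GA loop in which every offspring is refined by Q-learning."""
    population = [random.sample(range(N_JOBS), N_JOBS) for _ in range(pop_size)]
    q_table = [[0.0, 0.0] for _ in range(N_JOBS - 1)]
    for gen in range(generations):
        # Illustrative adaptive policy: the exploration rate varies periodically
        # to trade off exploitation (depth search) and exploration (breadth search).
        epsilon = 0.05 + 0.25 * (gen % 10) / 10
        population.sort(key=makespan)
        elite = population[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(elite):
            p1, p2 = random.sample(elite, 2)
            child = order_crossover(p1, p2)
            if random.random() < 0.2:  # swap mutation
                i, j = random.sample(range(N_JOBS), 2)
                child[i], child[j] = child[j], child[i]
            # RL refines the chromosome; the improved individual re-enters the population.
            children.append(q_learning_local_search(child, q_table, epsilon))
        population = elite + children
    return min(population, key=makespan)


if __name__ == "__main__":
    best = grl()
    print("best permutation:", best, "makespan:", makespan(best))
```

The sketch only illustrates the division of labour described in the abstract: the GA supplies and recombines candidate schedules, while the Q-learning stage performs local optimization of each chromosome and feeds the improved individuals back into the population.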