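Because of unit operating constraints, a generation company cannot optimize its bidding strategy for a single time point in isolation; it must also account for the effect of each decision on adjacent periods. In the setting of an hour-ahead electricity market, this paper builds a Q-learning-based bidding strategy model for a generation company that accounts for unit operating constraints (capacity, ramp rate, and minimum up/down times) and unit start-up cost. Starting from the market clearing result of the previous hour, the model forms the unit's bid for the current hour subject to the operating constraints and, through repeated interaction with the environment, arrives at the bidding decisions that maximize the cumulative return for the day in a market with large random fluctuations. Finally, the model is validated by simulation on a 10-unit system.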
Because unit operating constraints couple decisions across time, a power generation company's (Genco's) bidding strategy should be optimal over an entire period rather than at a single point in time. This paper presents a model for obtaining a Genco's optimal bidding strategy in the hour-ahead power market through Q-learning, with unit operating constraints and start-up cost incorporated. The bidding strategy that is optimal in terms of cumulative return is obtained through an iterative learning process. Numerical testing results show that the method is effective.
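
As a rough illustration of the learning loop summarized above, the following is a minimal sketch of one tabular Q-learning update in which the action set is restricted by a ramp limit. It is not the paper's model: the state encoding `(hour, previous output level)`, the discretized output levels, the function names `feasible_actions` and `q_update`, and the values of `alpha` and `gamma` are all assumptions made for illustration, and minimum up/down times and start-up cost are left out.

```python
from collections import defaultdict


def feasible_actions(prev_level, n_levels, ramp_limit):
    """Return the discrete output levels reachable from prev_level.

    Assumption: output is discretized into n_levels steps and the ramp
    constraint allows a change of at most ramp_limit steps per hour.
    """
    lo = max(0, prev_level - ramp_limit)
    hi = min(n_levels - 1, prev_level + ramp_limit)
    return list(range(lo, hi + 1))


def q_update(Q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q value.

    Q maps (state, action) pairs to estimated cumulative return.
    next_actions is the constraint-feasible action set in next_state.
    """
    # Greedy value of the next state over feasible actions only.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


def new_q_table():
    """Create an empty Q table whose unseen entries default to 0.0."""
    return defaultdict(float)
```

In this sketch the reward would be the hourly profit returned by the market clearing, and restricting the maximization to `next_actions` is one simple way to keep the learned policy inside the ramp constraint.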