\documentclass[12pt]{article}
\usepackage{amsmath,amssymb,amsfonts}
\begin{document}
In a game of incomplete information, a key challenge is how agents can best
exploit the available information for optimal decision making. In this paper, two decision-making methods, namely model-based and learning-based bidding strategies, are proposed and compared for repeated Cournot competition among generators in a day-ahead electricity market. The sum of the rivals' offered quantities (SROQ) is taken as the state of the agent, and its value is estimated using an adaptive expectation method. In the model-based approach, the convergence of the agents' strategies to the Nash equilibrium point is also studied in two different cases. In the learning-based
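As a minimal sketch of the adaptive expectation method, the estimate of the SROQ can be updated recursively; here $\hat{q}^{-i}_{t}$ denotes agent $i$'s estimate of its rivals' total offered quantity at period $t$, $q^{-i}_{t}$ the realized value, and $\lambda \in (0,1]$ a smoothing factor (these symbols are illustrative, not taken from the source):
\begin{equation}
\hat{q}^{-i}_{t+1} = \hat{q}^{-i}_{t} + \lambda \left( q^{-i}_{t} - \hat{q}^{-i}_{t} \right).
\end{equation}
With $\lambda = 1$ this reduces to naive expectations (the last observed value), while smaller $\lambda$ weights the history of past observations more heavily.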
approach, the optimal bidding strategy is learned by combining state estimation with a reinforcement learning method: using the estimated state (SROQ), the optimal decision is learned through a fuzzy Q-learning algorithm. Through a case study performed on the three-bus benchmark Cournot model, the convergence of the generators' bids to the Nash-Cournot equilibrium is examined.
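The learning-based approach builds on the standard Q-learning update, which for a state $s_t$, action $a_t$, reward $r_t$, learning rate $\alpha$, and discount factor $\gamma$ reads (a generic sketch; the paper's fuzzy variant attaches Q-values to fuzzy rules and weights this update by the rules' firing strengths over the estimated SROQ state):
\begin{equation}
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \right].
\end{equation}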
\end{document}