Efficient State Representation for Multi-Agent Reinforcement Learning in Traffic Signal Control
DOI: https://doi.org/10.54097/bj3pnm85

Keywords: Intelligent transportation system, Traffic signal control, Multi-agent reinforcement learning, Graph neural network

Abstract
With the continuous advancement of urbanization, traffic congestion has emerged as a critical bottleneck limiting the efficiency of urban transportation systems. Conventional traffic signal control strategies, which rely on fixed-time schemes or rule-based adaptive methods, struggle to cope with the highly dynamic and stochastic nature of real-world traffic conditions. In recent years, reinforcement learning (RL) has gained increasing attention in the field of traffic signal control (TSC) due to its ability to autonomously optimize decision-making in dynamic environments. As the foundation of agent decision-making, the representation of environmental states plays a decisive role in control performance. However, most existing studies construct traffic states using only a limited set of representative features, such as queue lengths and signal phase information, which are insufficient to comprehensively capture the complex spatiotemporal dynamics of traffic flows, thereby constraining the learning capability of agents in complex environments. To address these limitations, this paper proposes an Efficient State Representation for Multi-Agent Reinforcement Learning (ESR-MARL) framework. The proposed method incorporates richer traffic information for fine-grained modeling and employs a channel-wise attention mechanism to independently learn and effectively fuse heterogeneous traffic features, enabling the extraction of a more comprehensive and informative traffic state representation. Extensive experiments conducted on both synthetic and real-world traffic datasets demonstrate that ESR-MARL achieves at least a 27.37% improvement in average travel time compared with state-of-the-art baseline methods, thereby validating the effectiveness and superiority of the proposed approach.
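The abstract describes fusing heterogeneous traffic features (e.g. queue lengths, waiting times, vehicle counts) with a channel-wise attention mechanism. The paper's actual architecture is not shown on this page, so the sketch below is only an illustration of the general idea: a squeeze-and-excitation-style gate that learns a weight per feature channel and rescales each channel before fusion. All names, shapes, and the two-layer gating MLP (`w1`, `w2`) are assumptions, not the authors' implementation.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Illustrative SE-style channel-wise attention over per-lane traffic features.

    features: (C, L) array -- one row per feature type (queue length,
              waiting time, vehicle count, ...), one column per lane.
    w1, w2:   weights of a hypothetical two-layer gating MLP.
    Returns the feature map rescaled by learned per-channel weights.
    """
    squeeze = features.mean(axis=1)               # (C,) global descriptor per channel
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid weights in (0, 1), one per channel
    return features * gate[:, None]               # reweight each feature channel

rng = np.random.default_rng(0)
C, L, H = 3, 12, 4                                # 3 feature types, 12 lanes, hidden size 4
x = rng.random((C, L))
w1 = rng.standard_normal((H, C))
w2 = rng.standard_normal((C, H))
out = channel_attention(x, w1, w2)
```

In a full model the gated feature map would then be flattened (or passed through a graph neural network over neighboring intersections) to form each agent's state vector; here the point is only that each feature type gets its own learned importance rather than being concatenated with equal weight.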
Copyright (c) 2026 Journal of Computing and Electronic Information Management

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.