AAAI 2024 Full-Day Workshop
Cooperative Multi-Agent Systems Decision-making and Learning:
From Individual Needs to Swarm Intelligence
Vancouver Convention Centre, Vancouver, BC, Canada, Feb. 26, 2024
[This workshop will be held in person]
[AAAI 2024 registration (workshop-only) is required]

About

With the tremendous growth of AI technology, robotics, IoT, and high-speed wireless sensor networks (such as 5G) in recent years, an artificial ecosystem, termed artificial social systems, is gradually forming that involves AI agents of all kinds, from software entities to hardware devices. How to integrate artificial social systems into human society so that the two coexist harmoniously is a critical issue for the sustainable development of human beings. To this end, rational decision-making and efficient learning from multi-agent systems (MAS) interaction are preconditions for ensuring that multiple agents work safely, balance group utilities and system costs in the long term, and satisfy group members' needs in their cooperation.

The main interest of this workshop is the technique of modeling cooperative MAS decision-making and learning from the cognitive modeling perspective. It aims to bring researchers in different communities together to present their research, discuss future research directions, and cross-fertilize the different communities. Researchers and practitioners whose research might apply to cooperative MAS decision-making and learning or who might be able to use those techniques in their research are welcome. The workshop will consist of invited speakers, presentations from researchers with original research papers, and poster sessions. We hope that through our multi-faceted workshop and the talks of our expert speakers, we can attract the interest of the AI and robotics community in research challenges specific to cooperative MAS decision-making and learning.

Topics

We solicit contributions from topics including but not limited to:

  • MAS cognitive modeling
  • Intrinsically motivated AI agent modeling in MAS
  • Innate-values-driven reinforcement learning
  • MAS deep reinforcement learning
  • Multi-Objective MAS decision-making and learning
  • Adaptive learning with social rewards
  • Cognitive models in swarm intelligence and robotics
  • Game-theoretic approaches in MAS decision-making
  • Consensus in MAS collaboration
  • Trust-based MAS decision-making and learning
  • Trustworthy AI agents in human-robot interaction
  • Cognitive model application in intelligent social systems

Workshop Video: Part 1 (9:00 am -- 12:30 pm)

Workshop Video: Part 2 (2:00 pm -- 5:30 pm)

Invited Speakers

Maria Gini

EIF RAS, IEEE Fellow,
University of Minnesota

Michael L. Littman

AAAI/AAAS/ACM Fellow,
Brown University

Kevin Leyton-Brown

AAAI/ACM/RSC Fellow,
University of British Columbia

Marco Pavone

Director of Autonomous Systems Lab,
Stanford University

Sven Koenig

AAAI/ACM/IEEE/AAAS Fellow,
University of Southern California

Aaron Courville

CIFAR Fellow and Chair,
Université de Montréal

Giovanni Beltrame

Director of Making Innovative Space Technology Lab,
Polytechnique Montreal

Christopher Amato

Director of Lab for Learning and Planning in Robotics,
Northeastern University

Workshop Schedule



All times in PST (GMT-8), Feb. 26, 2024
9:00a Welcome Remarks: Matthew E. Taylor (University of Alberta)
9:10a Invited Session 1
9:10-9:40a
Maria Gini (University of Minnesota) - Topic: What would you do if you had 10, 1000, or 10000 robots?

Abstract: If you had robots to do tasks, how many robots would you use? Does the answer depend on the tasks? Are the tasks independent of each other, so they can be done in parallel with no coordination, or must some tasks be completed before others? Do the robots need a central controller to coordinate their work, or can each robot work independently? Who will decide which robot does what task? These are central research questions that I will address in the talk.

9:40-10:10a
Aaron Courville (Université de Montréal) - Topic: Q-value Shaping

Abstract: In various real-world scenarios, interactions among agents often resemble the dynamics of general-sum games, where each agent strives to optimize its own utility. Despite the ubiquitous relevance of such settings, decentralized machine learning algorithms have struggled to find equilibria that maximize individual utility while preserving social welfare. In this talk I will discuss our latest efforts in this direction by introducing Q-value Shaping, a novel decentralized RL algorithm tailored to optimizing an agent's individual utility while fostering cooperation among adversaries in partially competitive environments. We assume that during training, the opponent samples actions proportionally to their action-value function Q, and we further assume that the agent has access to this Q function. Experimental results demonstrate the effectiveness of our approach at achieving state-of-the-art performance in benchmark scenarios such as the Iterated Prisoner's Dilemma and the Coin Game. We believe this method to be an important step toward training agents for practical multi-agent applications.

10:10-10:15a Q/A and Discussion for Invited Session 1
10:15a
Contributed Papers - Oral Presentations

10:15-10:27a --- Communicating Unexpectedness for Out-of-Distribution Multi-Agent Reinforcement Learning
10:27-10:39a --- Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment
10:39-10:51a --- Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning
10:51-11:03a --- Learning Communication Policies for Different Follower Behaviors in a Collaborative Reference Game
11:03-11:15a --- Developing a Unified Training Framework for Multi-agent Imperfect-information Games: a Case Study of Mahjong and Bridge

11:15a Invited Session 2
11:15-11:45a
Giovanni Beltrame (Polytechnique Montreal) - Topic: The role of hierarchy in multi-agent decision making

Abstract: The emerging behaviors of swarms have fascinated scientists and gathered significant interest in the field of robotics. While most robot swarms seen in the literature are egalitarian (i.e., all robots have identical roles and capabilities), recent evidence suggests that introducing hierarchy (i.e., some robots act as local decision makers) is essential to successfully deploy robot swarms in a wider range of practical applications. While their abundance in nature hints that hierarchies may have certain advantages over egalitarian swarms, the conditions favoring hierarchies have not been empirically demonstrated. We show evidence that egalitarian swarms perform well in environments that are comparable in size to the collective sensing capability of the swarm, but will eventually fail as environments become larger or more complex. We show how hierarchies extend the overall sensing capability of the swarm in a resource-effective manner, succeeding in larger and less structured environments with fewer resources.

11:45-12:15p
Michael L. Littman (Brown University) - Topic: Interacting Agents and Safe(r) AI

Abstract: RL plays an important role in creating modern chatbots. This talk explores how some of the current shortcomings in chatbot creation can be mitigated by taking a multiagent perspective and proposes that considerably more human feedback is needed to create chatbots that would generally be seen as "safe" and reliable.

12:15-12:20p Q/A and Discussion for Invited Session 2
12:20p Break
2:00p
Contributed Papers - Poster Session I

2:00-2:05p --- Cognitive Multi-agent Q-Learning for Cooperation in Competitive Environments
2:05-2:10p --- SocialGFs: Learning Social Gradient Fields for Multi-Agent Reinforcement Learning
2:10-2:15p --- Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems
2:15-2:20p --- Bridging Agent Dynamics and Population Behaviors: Scalable Learning for Mean Field Games on Graph via Neural Operators
2:20-2:25p --- Edge Computing based Human-Robot Cognitive Fusion: A Medical Case Study in the Autism Spectrum Disorder Therapy
2:25-2:30p --- Q/A

2:30p Invited Session 3
2:30-3:00p
Sven Koenig (USC) - Topic: Multi-Agent Pathfinding and Its Applications

Abstract: The coordination of robots and other agents is becoming increasingly important for industry. For example, on the order of one thousand robots navigate autonomously in Amazon fulfillment centers to move inventory pods all the way from their storage locations to the picking stations that need the products they store (and vice versa). Optimal and, in some cases, even approximately optimal path planning for these robots is NP-hard, yet one must find high-quality collision-free paths for them in real time. Algorithms for such multi-agent path-finding problems have been studied in robotics and theoretical computer science for a long time but have been insufficient, since they are either fast but yield poor solution quality or yield good solution quality but are too slow. In this talk, I will discuss different variants of multi-agent path-finding problems and cool ideas for both solving them (in centralized and decentralized ways) and executing the resulting plans robustly. I will also discuss several applications of the technology (funded by NSF and Amazon Robotics), including warehousing, manufacturing, and train scheduling.

3:00-3:30p
Christopher Amato (Northeastern University) - Topic: Correcting Some Misconceptions about MARL

Abstract: Multi-agent reinforcement learning (MARL) has exploded in popularity, but there is a lack of understanding of when current methods work and what the best way to learn in multi-agent settings is. This talk will cover some of the fundamental challenges and misunderstandings of multi-agent reinforcement learning. In particular, it will discuss how 1) centralized critics are not strictly better than decentralized critics in MARL (and can be worse), and 2) state-based critics are unsound and work well only in fully observable multi-agent problems. Furthermore, it will discuss related methods in value-based MARL.

3:30-3:35p Q/A and Discussion for Invited Session 3
3:35p
Contributed Papers - Poster Session II

3:35-3:40p --- User Models and Bayesian Decision-Making for Human-in-the-Loop Problems
3:40-3:45p --- Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach
3:45-3:50p --- Exploratory Training: When Annotators Learn About Data
3:50-3:55p --- Exploiting Relational Planning and Task-Specific Abstractions for Multiagent Reinforcement Learning in Relational Domains
3:55-4:00p --- Q/A

4:00p Invited Session 4
4:00-4:30p
Marco Pavone (Stanford University) - Topic: Artificial Currency Based Government Welfare Programs: From Optimal Design to Audit Games

Abstract: Artificial currencies have grown in prominence in many real-world resource allocation settings, helping alleviate fairness and equity concerns of traditional monetary mechanisms that often favor users with higher incomes. In particular, artificial currencies have gained traction in government welfare programs that support eligible users in the population, e.g., transit benefits programs provide eligible users with subsidized public transit. While such artificial currency based welfare programs are typically well-intentioned and offer immense potential in improving the outcomes for the eligible group of users, the deployment of many such programs in practice is nascent; hence, the efficacy and optimal design of such programs still need to be formalized. Moreover, such programs are susceptible to several fraud mechanisms, with a notable concern being misreporting fraud, wherein users can misreport their private attributes to gain access to more artificial currency (credits) than they are entitled to. This talk introduces models and methods to study the equilibrium outcomes and the optimal design of such artificial currency based welfare programs to achieve particular societal objectives of an administrator running the benefits program. Moreover, to address the issue of misreporting fraud, we propose a natural audit game, wherein the administrator can audit users at some cost and levy fines against them for misreporting information. Methodologically, we propose a bi-level optimization framework to optimally design artificial currency based welfare mechanisms and develop convex and linear programming approaches to compute the associated equilibrium outcomes.
Finally, to highlight the practical viability of our proposed methods, we present case studies in the context of two welfare programs: (i) San Mateo County’s Community Transport Benefits Program, wherein users are provided with travel credits to offset some of their payments for using tolled express lanes on highways, and (ii) Washington D.C.’s federal transit benefits programs that provide subsidized public transit to federal employees.

4:30-5:10p
Kevin Leyton-Brown (UBC) - Topic: Modeling Nonstrategic Human Play in Games

Abstract: It is common to assume that players in a game will adopt Nash equilibrium strategies. However, experimental studies have demonstrated that Nash equilibrium is often a poor description of human players' behavior, even in unrepeated normal-form games. Nevertheless, human behavior in such settings is far from random. Drawing on data from real human play, the field of behavioral game theory has developed a variety of models that aim to capture these patterns. This talk will survey over a decade of work on this topic, built around the core idea of treating behavioral game theory as a machine learning problem. It will touch on questions such as:

  • Which human biases are most important to model in single-shot game theoretic settings?
  • What loss function should be used to evaluate and fit behavioral models?
  • What can be learned by examining the parameters of these models?
  • How can richer models of nonstrategic play be leveraged to improve models of strategic agents?
  • When does a description of nonstrategic behavior "cross the line" and deserve to be called strategic?
  • How can advances in deep learning be used to yield stronger (albeit harder to interpret) models?

Finally, there has been much recent excitement about large language models such as GPT-4. Time permitting, the talk will conclude by describing how the economic rationality of such models can be assessed and presenting some initial experimental findings showing the extent to which these models replicate human-like cognitive biases.

5:10-5:15p Q/A and Discussion for Invited Session 4
5:15-5:20p Concluding Remarks

Contributions

Oral Presentations


[#12] Communicating Unexpectedness for Out-of-Distribution Multi-Agent Reinforcement Learning,
Min Whoo Lee, Kibeom Kim, Soo Wung Shin, Minsu Lee and Byoung-Tak Zhang
[Link to Paper] [Link to Poster]
[#3] Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment,
Devansh Jalota, Kiril Solovey, Matthew Tsao, Stephen Zoepf and Marco Pavone
[Link to Paper] [Link to Poster]
[#5] Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning,
Qin Yang and Ramviyas Parasuraman
[Link to Paper] [Link to Poster]
[#8] Learning Communication Policies for Different Follower Behaviors in a Collaborative Reference Game,
Philipp Sadler, Sherzod Hakimov and David Schlangen
[Link to Paper] [Link to Poster]
[#11] Developing a Unified Training Framework for Multi-agent Imperfect-information Games: a Case Study of Mahjong and Bridge,
Zhilei Fan, Wengang Zhou and Houqiang Li
[Link to Paper] [Link to Poster]

Posters


[#14] Cognitive Multi-agent Q-Learning for Cooperation in Competitive Environments,
Duy Nhat Phan, Patrick Hytla, Andrew Rice and Thuy Ngoc Nguyen
[Link to Paper] [Link to Poster]
[#4] SocialGFs: Learning Social Gradient Fields for Multi-Agent Reinforcement Learning,
Qian Long, Fangwei Zhong, Mingdong Wu, Yizhou Wang and Song-Chun Zhu
[Link to Paper] [Link to Poster]
[#9] Bridging Agent Dynamics and Population Behaviors: Scalable Learning for Mean Field Games on Graph via Neural Operators,
Xu Chen, Shuo Liu and Xuan Di
[Link to Paper] [Link to Poster]
[#2] Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems,
Qin Yang
[Link to Paper] [Link to Poster]
[#7] Exploratory Training: When Annotators Learn About Data,
Rajesh Shrestha, Omeed Habibelahian, Arash Termehchy and Paolo Papotti
[Link to Paper] [Link to Poster]
[#20] User Models and Bayesian Decision-Making for Human-in-the-Loop Problems,
Sammie Katt and Samuel Kaski
[Link to Paper] [Link to Poster]
[#6] Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach,
Viet-Anh Le, Vaishnav Tadiparthi, Behdad Chalaki, Hossein Nourkhiz Mahjoub, Jovin D'Sa, Ehsan Moradi-Pari and Andreas A. Malikopoulos
[Link to Paper] [Link to Poster]
[#1] Edge Computing based Human-Robot Cognitive Fusion: A Medical Case Study in the Autism Spectrum Disorder Therapy,
Qin Yang
[Link to Paper] [Link to Poster]
[#17] Exploiting Relational Planning and Task-Specific Abstractions for Multiagent Reinforcement Learning in Relational Domains,
Nikhilesh Prabhakar, Ranveer Singh, Prasad Tadepalli and Sriraam Natarajan
[Link to Paper] [Link to Poster]

Organizing Committee

Qin Yang

Assistant Professor,
Intelligent Social Systems and Swarm Robotics Lab (IS3R),
Bradley University

Matthew E. Taylor

Associate Professor,
Intelligent Robot Learning Lab,
University of Alberta

Rui Liu

Assistant Professor,
Cognitive Robotics and AI Lab (CRAI),
Kent State University

Tianpei Yang

Postdoctoral Fellow,
Intelligent Robot Learning Lab,
University of Alberta

Submission

Submissions can contain relevant work at any stage, including work that was recently published, is under submission elsewhere, was only recently finished, or is still ongoing. Authors of papers published or under submission elsewhere are encouraged to submit these papers or short versions (including abstracts) to the workshop to inform other researchers about their work, as long as resubmissions are clearly labeled to avoid copyright violations.

We welcome contributions of both short (2-4 pages) and long (6-8 pages) papers related to our stated vision, prepared in the AAAI 2024 proceedings format. Position papers and surveys are also welcome. The contributions will be non-archival but will be hosted on our workshop website. All contributions will be peer reviewed (single-blind).

Accepted papers will be made publicly available on the workshop website. These non-archival papers and their corresponding posters will also remain available on this website after the workshop. The authors will retain copyright of their papers.