by Lin, Huh, Stauffer, Lim (NeurIPS 2021)
Communication requires a common language, a lingua franca. How could such a lingua franca emerge?
Studies in cognitive science and evolutionary linguistics offer some evidence that communication began with sounds that had grounded meaning.
Emergent communication in multi-agent RL (MARL) is a challenging problem due to non-stationary and non-Markovian transition dynamics. MARL models are unable to solve tasks that rely on emergent communication, even with centralized learning and shared policy parameters.
Previous work shows that because there is no grounded information for agents to associate symbolic utterances with, the communication channel goes unused.
Consider N fully cooperative MARL agents in a partially observable MDP.
Policy networks are trained using A3C with Generalized Advantage Estimation (GAE).
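As a reminder of what GAE computes, here is a minimal sketch of the standard estimator for a single rollout. It is not the authors' code: the function name, the default `gamma` and `lam` values, and the choice to ignore episode termination are all assumptions made for illustration.

```python
def gae_advantages(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one rollout.

    rewards[t] and values[t] are the reward and value estimate at step t;
    bootstrap_value is the value estimate of the state after the last step.
    Assumption: the rollout contains no terminal states.
    """
    advantages = [0.0] * len(rewards)
    next_value = bootstrap_value
    running = 0.0
    # Walk backwards so each step can reuse the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # one-step TD error
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline.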
No centralized training or self-play. Each agent has an independent policy.
More specifically, each agent is equipped with (1) an image autoencoder for its own observations, and (2) a communication autoencoder that maps from the image autoencoder’s latent to the communication symbols. The autoencoded image is concatenated with the output of the message encoder.
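The data flow through the two autoencoders can be sketched in plain Python. This is a toy illustration, not the paper's architecture: the class `GroundedAgent`, the random linear layers, the dimensions, the binary thresholding of symbols, and reading "autoencoded image" as the image latent are all assumptions, and no training is performed.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim rows); stands in for a learned layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Matrix-vector product on plain lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


class GroundedAgent:
    """Hypothetical agent holding an image autoencoder and a communication autoencoder."""

    def __init__(self, obs_dim=16, latent_dim=8, msg_len=4, seed=0):
        rng = random.Random(seed)
        self.img_enc = make_linear(obs_dim, latent_dim, rng)   # observation -> latent
        self.img_dec = make_linear(latent_dim, obs_dim, rng)   # latent -> observation
        self.comm_enc = make_linear(latent_dim, msg_len, rng)  # latent -> symbols
        self.comm_dec = make_linear(msg_len, latent_dim, rng)  # symbols -> latent

    def encode_image(self, obs):
        return apply_linear(self.img_enc, obs)

    def speak(self, latent):
        # Assumption: symbols are binary, obtained by thresholding at zero.
        return [1 if z > 0 else 0 for z in apply_linear(self.comm_enc, latent)]

    def reconstruction_losses(self, obs):
        """Return (image loss, communication loss) for one observation."""
        latent = self.encode_image(obs)
        message = self.speak(latent)
        image_loss = mse(apply_linear(self.img_dec, latent), obs)
        comm_loss = mse(apply_linear(self.comm_dec, message), latent)
        return image_loss, comm_loss

    def policy_input(self, obs, message_features):
        # Assumption: "autoencoded image" is taken to mean the image latent.
        return self.encode_image(obs) + list(message_features)


agent = GroundedAgent()
obs = [0.1 * i for i in range(16)]
message = agent.speak(agent.encode_image(obs))
image_loss, comm_loss = agent.reconstruction_losses(obs)
features = agent.policy_input(obs, message_features=[0.0, 1.0, 0.0])
```

The point of the sketch is that the message is a function of the agent's own observation latent, so the symbols carry grounded information even before any reward signal shapes them.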
An autoencoder trained jointly with the GRU policy performed worse than a separately trained autoencoder and GRU policy.
tags: emergent-communication - multiagent - autoencoders