Autocurriculum

NOTE: Curricula is the plural form of curriculum.

The term autocurricula refers to the idea that agents can create their own learning objectives and curricula through their interactions with other agents.

It allows agents to learn from each other and adapt to changing environments in a decentralized and scalable way.

I need to understand this to understand the MAESTRO paper.

Is this the same thing as Curriculum Reinforcement Learning, which is used to avoid getting stuck in local minima?

Well, in curriculum learning, we hand-craft the curricula, in increasing levels of difficulty. With autocurriculum, however, the curriculum is generated automatically from data / feedback from the learner (so it is very personalized). The RL agent decides this for itself, best exemplified in the Hide and Seek video (paper).
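To make the contrast concrete for myself, here is a minimal sketch, assuming a made-up learner whose skill only grows when a task is just out of reach. ToyLearner, gain, fixed_curriculum and feedback_curriculum are all hypothetical names invented for this note; none of this comes from the papers. The fixed schedule ignores the learner, while the feedback-driven one always picks the easiest task the learner still fails.

```python
from dataclasses import dataclass


@dataclass
class ToyLearner:
    """Hypothetical learner: a single skill number and a learning speed."""
    skill: float = 0.0
    gain: float = 0.5  # how much one useful practice attempt adds to skill

    def attempt(self, difficulty: float) -> bool:
        """Try a task. Practice only helps if the task is at most 1.0 above skill."""
        solved = self.skill >= difficulty
        if not solved and difficulty - self.skill <= 1.0:
            self.skill += self.gain
        return solved


def fixed_curriculum(learner, difficulties, attempts_per_task):
    """Hand-crafted: same schedule for every learner, feedback is ignored."""
    for d in difficulties:
        for _ in range(attempts_per_task):
            learner.attempt(d)
    return learner.skill


def feedback_curriculum(learner, difficulties, total_attempts):
    """Automatic: always practice the easiest task the learner still fails."""
    solved = {d: False for d in difficulties}
    for _ in range(total_attempts):
        unsolved = [d for d in difficulties if not solved[d]]
        if not unsolved:
            break
        d = min(unsolved)
        solved[d] = learner.attempt(d)
    return learner.skill


tasks = [1, 2, 3, 4]
# A schedule tuned for a fast learner (3 attempts per task) works for that learner...
print(fixed_curriculum(ToyLearner(gain=0.5), tasks, 3))      # 4.0
# ...but leaves a slow learner stuck after the first task.
print(fixed_curriculum(ToyLearner(gain=0.25), tasks, 3))     # 0.75
# With the same 12-attempt budget, the feedback-driven schedule keeps the slow learner moving.
print(feedback_curriculum(ToyLearner(gain=0.25), tasks, 12))  # 2.5
```

This only covers the "personalized schedule" part. In the multi-agent setting the tasks are not even listed in advance, because the other agents create them.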

Analogy

School makes a good analogy. If you go to a public school, you go through Curriculum Learning: the curriculum is the same for everyone and not personalized to you. Autocurricula is more like being homeschooled. Actually, not even that: it is more like an extremely smart self-taught learner. The RL agent generates a sequence of increasingly challenging tasks for itself to learn from.

Now, I don’t understand how autocurriculum is implemented.

Is autocurriculum needed mainly in cases where rewards are sparse? The idea would be to split the difficult task into simpler, more manageable tasks.
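One way to get a feel for how a curriculum can arise without anyone designing it is a toy two-player game. The sketch below is my own illustration, not the method from either paper (the hide-and-seek paper trains neural-network policies with RL self-play). It uses fictitious play, where each player best-responds to what the opponent has done so far. The function name and the "spots" setup are hypothetical. The seeker searches the spot the hider has used most, and the hider picks the spot the seeker has searched least.

```python
def best_response_hide_and_seek(num_spots=3, rounds=9):
    """Toy hide-and-seek via fictitious play (hypothetical helper for this note)."""
    hider_counts = [0] * num_spots   # how often the hider used each spot
    seeker_counts = [0] * num_spots  # how often the seeker searched each spot
    hider_history, seeker_history = [], []
    for _ in range(rounds):
        # Both players choose simultaneously, based on counts from earlier rounds.
        # Ties go to the lowest spot index (max/min return the first extreme).
        seek = max(range(num_spots), key=lambda i: hider_counts[i])
        hide = min(range(num_spots), key=lambda i: seeker_counts[i])
        hider_counts[hide] += 1
        seeker_counts[seek] += 1
        hider_history.append(hide)
        seeker_history.append(seek)
    return hider_history, seeker_history


hides, seeks = best_response_hide_and_seek()
caught = [t for t, (h, s) in enumerate(zip(hides, seeks)) if h == s]
print(hides)   # [0, 1, 1, 1, 2, 2, 2, 2, 2]
print(seeks)   # [0, 0, 0, 1, 1, 1, 1, 1, 2]
print(caught)  # [0, 3, 8]
```

Each time the seeker catches up with the hider's current spot, the hider has already been pushed to a new one, and the seeker's problem changes again. Nobody wrote down a list of tasks: the "curriculum" is just the sequence of problems the two players create for each other.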

Links

Had my chat with ChatGPT: https://chat.openai.com/chat/b74fbc79-c438-4756-80e9-ee94ed0885da

Notes on the original paper: “Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research”

Paper posted to arXiv in March 2019

Summary From ChatGPT: The paper argues that current approaches to artificial intelligence (AI) focus too much on individual agents and not enough on the interactions between agents in a multi-agent system. The authors propose a new research direction that focuses on “multi-agent intelligence,” which involves studying how agents interact with each other and how their interactions lead to the emergence of new behaviors and innovations.

A focus on multi-agent intelligence and autocurricula can lead to the development of more robust, adaptable, and innovative AI systems that can better handle complex, real-world problems.

OpenAI Paper on Hide And Seek

Paper published in 2019: “Emergent Tool Use from Multi-Agent Autocurricula”

The paper builds upon the concept of “autocurricula” introduced in the paper above and focuses specifically on how multi-agent interaction can lead to the emergence of tool use.

The authors present a simulation environment in which two teams, hiders and seekers, play hide-and-seek. Hiders try to stay out of the seekers' line of sight, and seekers try to keep the hiders in view. The environment contains walls plus movable boxes and ramps, and agents can grab these objects and lock them in place. At the start of each episode the hiders get a preparation phase, during which the seekers cannot move, so the hiders have time to run away or rearrange the environment.
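In the hide-and-seek game the reward is, as far as I understand the paper, team-based and zero-sum: hiders get +1 if every hider is hidden and -1 if any hider is seen by a seeker, seekers get the opposite, and nobody is rewarded during the preparation phase. A minimal sketch of that rule follows. The function name and argument layout are my own, and I leave out other details such as the penalty for leaving the play area.

```python
def team_rewards(hider_visible, in_preparation_phase=False):
    """Per-step team rewards for the hide-and-seek game (simplified sketch).

    hider_visible: one bool per hider, True if any seeker can see that hider.
    """
    if in_preparation_phase:
        # Seekers are frozen and no team is rewarded yet.
        return {"hiders": 0.0, "seekers": 0.0}
    hider_reward = -1.0 if any(hider_visible) else 1.0
    # Zero-sum: whatever the hiders gain, the seekers lose.
    return {"hiders": hider_reward, "seekers": -hider_reward}


print(team_rewards([False, False]))  # {'hiders': 1.0, 'seekers': -1.0}
print(team_rewards([False, True]))   # {'hiders': -1.0, 'seekers': 1.0}
```

Nothing in this reward mentions boxes or ramps. Because it is zero-sum, any improvement by one team directly makes the other team's task harder, which is where the pressure to keep adapting comes from.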

The agents are given no explicit incentive to interact with the objects; they are trained with self-play on the hide-and-seek objective alone. The autocurriculum comes from the competition itself: every time one team finds a new strategy, the other team faces a new problem and is pressured to adapt. In that sense the agents create their own learning objectives, and the learning is decentralized and scalable.

The paper presents experiments that demonstrate the emergence of tool use as a sequence of strategies and counter-strategies. Hiders learn to build shelters out of boxes, seekers learn to use ramps to climb over walls, hiders learn to lock the ramps away before the seekers can reach them, and seekers eventually learn to "surf" on top of a box to get into the shelter.

Coordination within a team also emerges. For example, hiders learn to split up the work of blocking doors and locking objects during the preparation phase. (I could not find results on explicit communication between agents in the paper.)

Overall, the paper demonstrates how the autocurricula approach can lead to the emergence of new and creative behaviors in multi-agent systems. The approach allows agents to learn from their environment and each other in a decentralized and scalable way, leading to the emergence of complex and adaptive behaviors.