AGENTGYM Evolves Agents towards General AI from Specific Tasks: Utilizing Various Environments and Independent Learning

Artificial intelligence (AI) research aims to create adaptable, self-learning agents that can handle diverse tasks across different environments. Achieving this level of versatility and autonomy remains a significant challenge: current models often require extensive human supervision, which limits their scalability.

Prior work in this area includes AgentBench, AgentBoard, and AgentOhana, which focus mainly on developing and evaluating agents based on large language models (LLMs). These approaches, however, often rely on behavioural cloning or on training in isolated environments, which restricts their scalability and adaptability. Researchers have also explored models such as GPT-3.5-Turbo, GPT-4-Turbo, and Llama-2-Chat as agents, along with methods such as ReAct and self-improvement approaches that use environmental feedback and interactive learning.

To tackle this issue, researchers from the Fudan NLP Lab & Fudan Vision and Learning Lab introduced the AGENTGYM framework. The framework supports a wide range of environments and tasks, enabling agents to explore broadly in real time. AGENTGYM includes tools for training and evaluating LLM-based agents, with a focus on improving their adaptability and performance across different tasks.

The AGENTGYM framework comprises a platform with numerous environments and tasks, an expanded instruction database, and a set of high-quality trajectories. It employs a new method called AGENTEVOL, which lets agents learn from new experience gathered by interacting with different environments, strengthening their ability to generalize and adapt to new tasks. The framework also comes with a benchmark suite, AGENTEVAL, for evaluating agents' performance and generalization abilities. Training and evaluation draw on a dataset of diverse instructions collected from the various environments and expanded through crowdsourcing and AI-based methods.
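The explore-then-learn idea behind a method like AGENTEVOL can be illustrated with a toy loop. The sketch below is not the authors' implementation: `ToyEnv`, `TablePolicy`, and `evolve` are hypothetical names, the "agent" is a simple preference table rather than an LLM, and the update rule (reinforce only rewarded actions) is an assumption chosen for brevity.

```python
import random
from collections import defaultdict


class ToyEnv:
    """Hypothetical one-step environment: reward 1.0 for the hidden target action."""

    def __init__(self, name, actions, target):
        self.name = name
        self.actions = actions
        self.target = target

    def step(self, action):
        return 1.0 if action == self.target else 0.0


class TablePolicy:
    """Stand-in for an LLM agent: one preference weight per (environment, action)."""

    def __init__(self):
        # Every action starts with weight 1.0, so early exploration is uniform.
        self.counts = defaultdict(lambda: defaultdict(lambda: 1.0))

    def act(self, env, rng):
        weights = [self.counts[env.name][a] for a in env.actions]
        return rng.choices(env.actions, weights=weights, k=1)[0]

    def learn(self, trajectories):
        # Assumed update rule: add the reward to the weight of the action taken,
        # so only successful experience changes the policy.
        for env_name, action, reward in trajectories:
            self.counts[env_name][action] += reward


def evolve(policy, envs, iterations=5, episodes=20, seed=0):
    """Alternate between exploring every environment and learning from the results."""
    rng = random.Random(seed)
    for _ in range(iterations):
        collected = []
        for env in envs:
            for _ in range(episodes):
                action = policy.act(env, rng)
                collected.append((env.name, action, env.step(action)))
        policy.learn(collected)
    return policy


envs = [
    ToyEnv("shop", ["search", "click", "buy"], target="buy"),
    ToyEnv("house", ["open", "take", "put"], target="put"),
]
policy = evolve(TablePolicy(), envs)
```

A single policy is trained across both environments at once, which mirrors the article's point about learning from several environments together rather than in isolation.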

Experimental results showed that agents trained with AGENTEVOL performed comparably to leading models across a range of tasks. Integrating diverse instructions and tasks into training produced more versatile agents that can handle a broader set of challenges. The results indicate that these agents evolve to succeed at significantly higher rates in diverse environments, reaching for example 77.0% in WebShop and 88.0% in ALFWorld and surpassing several baseline models. AGENTGYM is therefore a potential advance in the development of generalist AI agents, improving their effectiveness and efficiency in real-world applications.
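Per-environment figures like these are typically percentages of successful episodes. As a minimal sketch of that arithmetic, the hypothetical helper `success_rates` below takes made-up episode outcomes, not the paper's data or its actual scoring code.

```python
from collections import defaultdict


def success_rates(results):
    """Return the percentage of successful episodes per environment.

    `results` is an iterable of (environment_name, succeeded) pairs.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for env_name, succeeded in results:
        totals[env_name] += 1
        wins[env_name] += int(bool(succeeded))
    return {name: 100.0 * wins[name] / totals[name] for name in totals}


# Illustrative outcomes only: 3 of 4 episodes succeed in "env_a", 1 of 2 in "env_b".
episodes = [
    ("env_a", True), ("env_a", True), ("env_a", False), ("env_a", True),
    ("env_b", True), ("env_b", False),
]
rates = success_rates(episodes)  # {"env_a": 75.0, "env_b": 50.0}
```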

To conclude, the AGENTGYM framework represents a substantial step toward adaptable, generally capable AI agents. The work from the research team at Fudan NLP Lab & Fudan Vision and Learning Lab addresses limitations of current approaches by allowing agents to evolve autonomously across diverse environments. The promising results could shape the future direction of AI research on more robust and adaptable agents. By combining assorted environments with autonomous learning, the AGENTGYM framework and the AGENTEVOL method support the development of more skilled, generalist AI agents.

Credit for this research goes to the project's researchers, as acknowledged in the published paper.
