Codestory, a team of researchers, has developed a multi-agent coding framework called Aide. Aide achieved a 40.3% rate of accepted solutions on the SWE-Bench-Lite benchmark, setting a new record on that benchmark. The framework is designed to enhance developer productivity and to integrate easily into development environments.
Central to this software framework is the concept of multiple agents, each responsible for a specific code symbol such as a function, class, enum or type. This granular approach allows for natural language communication among agents. The communication is facilitated through the Language Server Protocol (LSP), ensuring efficient and accurate information exchange.
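The one-agent-per-symbol idea can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the names `SymbolAgent` and `AgentRegistry`, the message format, and the hard-coded reference list are hypothetical, and no real LSP client is involved. In a real system the references between symbols would presumably come from LSP queries such as go-to-definition or find-references rather than being written by hand.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class SymbolKind(Enum):
    FUNCTION = "function"
    CLASS = "class"
    ENUM = "enum"
    TYPE = "type"


@dataclass
class SymbolAgent:
    """Hypothetical agent that owns exactly one code symbol."""

    name: str
    kind: SymbolKind
    # Names of symbols this symbol refers to (hard-coded here; an LSP
    # server would normally supply this information).
    references: list[str] = field(default_factory=list)
    inbox: list[str] = field(default_factory=list)

    def receive(self, sender: str, message: str) -> None:
        # Store the natural-language message together with its sender.
        self.inbox.append(f"{sender}: {message}")


class AgentRegistry:
    """Hypothetical lookup table from symbol name to its agent."""

    def __init__(self) -> None:
        self._agents: dict[str, SymbolAgent] = {}

    def register(self, agent: SymbolAgent) -> None:
        self._agents[agent.name] = agent

    def get(self, name: str) -> SymbolAgent:
        return self._agents[name]

    def notify_references(self, sender_name: str, message: str) -> list[str]:
        """Send a message to every registered symbol the sender refers to."""
        sender = self._agents[sender_name]
        notified = []
        for ref in sender.references:
            target = self._agents.get(ref)
            if target is not None:
                target.receive(sender_name, message)
                notified.append(ref)
        return notified


if __name__ == "__main__":
    registry = AgentRegistry()
    registry.register(SymbolAgent("Config", SymbolKind.CLASS))
    registry.register(
        SymbolAgent("parse_config", SymbolKind.FUNCTION, references=["Config"])
    )
    registry.notify_references("parse_config", "I need a new timeout field.")
    print(registry.get("Config").inbox)
```

The point of the sketch is the granularity: each agent knows only its own symbol and the symbols it touches, so a change request travels along the same edges that connect the code itself.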
In practice, up to 30 agents may operate simultaneously during a single programming run. The agents collaborate, sharing information and making decisions jointly. Codestory cites the framework’s result on the SWE-Bench-Lite benchmark as evidence that this approach improves productivity.
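As a rough illustration of running many agents at once under a fixed ceiling, the sketch below uses Python's `asyncio.Semaphore` to cap concurrency at 30, the figure quoted above. The function `run_agents` and the sleep that stands in for a model call are assumptions for illustration, not a description of how Aide schedules its agents.

```python
import asyncio

MAX_CONCURRENT_AGENTS = 30  # upper bound quoted in the text above


async def run_agents(symbols, limit=MAX_CONCURRENT_AGENTS):
    """Run one hypothetical agent per symbol, at most `limit` at a time.

    Returns (results, peak), where peak is the highest number of agents
    that were active at the same moment.
    """
    limiter = asyncio.Semaphore(limit)  # created inside the running loop
    active = 0
    peak = 0

    async def run_agent(symbol):
        nonlocal active, peak
        async with limiter:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # placeholder for a model call
            active -= 1
            return f"{symbol}: done"

    results = await asyncio.gather(*(run_agent(s) for s in symbols))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_agents([f"symbol_{i}" for i in range(45)]))
    print(len(results), "agents finished; peak concurrency:", peak)
```

A semaphore is only one way to bound concurrency; it is used here because it keeps the cap explicit and lets waiting agents start as soon as a slot frees up.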
The editor environment for these agents was built using Claude Sonnet 3.5 and GPT-4o, both of which are known for their strong coding capabilities. GPT-4o excels at code editing, while Sonnet 3.5 is particularly adept at managing and navigating the codebase.
Sonnet 3.5’s agentic behavior plays a crucial role in this framework. It favors splitting work into separate functions rather than compounding complexity within a single one. This tendency, coupled with GPT-4o’s code editing skills, has enabled better performance than previous systems achieved.
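The division of labor described here, navigation by one model and editing by the other, can be expressed as a small routing table. This is a hedged sketch only: the `Task` enum, the `plan` function, and the two-step navigate-then-edit order are assumptions, and the model names are display labels rather than real API identifiers.

```python
from enum import Enum


class Task(Enum):
    NAVIGATE = "navigate"  # finding and reading the relevant code
    EDIT = "edit"          # rewriting a symbol's source


# Display labels only, not real API model identifiers.
MODEL_FOR_TASK = {
    Task.NAVIGATE: "Sonnet 3.5",
    Task.EDIT: "GPT-4o",
}


def plan(symbol: str) -> list:
    """Return hypothetical (task, model, symbol) steps for changing one symbol."""
    return [
        (task.value, MODEL_FOR_TASK[task], symbol)
        for task in (Task.NAVIGATE, Task.EDIT)
    ]


if __name__ == "__main__":
    for step in plan("parse_config"):
        print(step)
```

Keeping the mapping in one table means a model can be swapped for a given task without touching the code that plans the steps.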
The SWE-Bench-Lite benchmark was chosen to gauge the agents’ performance because it closely mirrors real-world coding challenges. The results indicated that the collaborative approach improved code quality and underlined the potential of these agents to handle complex coding tasks independently.
Despite these advances, several challenges must be addressed before the framework can be fully integrated into development environments. Research is under way to improve communication between developers and the agents, to manage concurrent code modifications, and to keep the code stable. The team also aims to make the framework more efficient, specifically in terms of inference speed and the cost of intelligence.
Codestory’s objective is not to replace human developers but to enhance their capabilities. Delegating detailed tasks to multiple software agents will let developers focus on more complicated problems. In conclusion, the Aide framework promises to improve the efficiency and accuracy of the software development process.