AI Governance

Planetarium: A Novel Benchmark for Assessing LLMs in Converting Natural Language Descriptions of Planning Issues into Planning Domain Definition Language (PDDL)

Large language models (LLMs) have shown promise in solving planning problems, but their success has been limited, particularly when translating natural language planning descriptions into structured planning languages such as the Planning Domain Definition Language (PDDL). Current models, including GPT-4, have achieved only 35% accuracy on simple planning tasks, emphasizing the need…

Introducing Inspect: The Most Recent AI Safety Assessment Platform Launched by the UK’s AI Safety Institute

The UK government-backed AI Safety Institute has launched a new tool called Inspect, aimed at enhancing the safety and accountability of Artificial Intelligence (AI) technologies. The software library is a significant innovation in AI technology and is expected to increase the robustness of AI safety assessments globally and promote cooperation in AI R&D. As anticipated…
