In an effort to improve AI systems and their ability to collaborate with humans, scientists are trying to better understand human decision-making, including its suboptimal aspects, and model it in AI. A model of human or AI agent behaviour, developed by researchers at MIT and the University of Washington, takes into account an agent’s unknown computational constraints, which may limit its problem-solving ability. The model estimates this constraint, termed an “inference budget”, from the agent’s prior actions and uses it to forecast future behaviour.
Such a model could help an AI system infer a human collaborator’s goals from their behaviour, anticipate and prevent their mistakes, or adapt to their weaknesses. The scientists used navigation and chess-playing examples to show how their method outperforms other popular techniques for modelling this type of decision-making.
Traditional computational models of human behaviour tended to insert noise into an otherwise optimal decision-maker to simulate suboptimal choices. However, this fails to capture the varied ways in which humans actually err. To build their model, the scientists observed chess players, noting that simple moves required less thinking time, while stronger players generally spent more time planning during complex matches. The amount of planning, or “depth”, thus proved a reliable indicator of human behaviour.
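For contrast, here is a minimal sketch of such a noise-based model: a “noisy-rational” agent that acts optimally most of the time and errs uniformly at random otherwise. The function name and the epsilon value are illustrative assumptions, not details from the paper.

```python
import random

def noisy_rational_action(optimal_action, all_actions, epsilon=0.1):
    # Classic noise-based model of suboptimality: with probability
    # epsilon the agent picks a uniformly random action; otherwise
    # it takes the optimal one. Every mistake looks the same, which
    # is exactly the limitation the researchers set out to address.
    if random.random() < epsilon:
        return random.choice(all_actions)
    return optimal_action
```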
To estimate a budget, a problem-solving algorithm is run for a fixed amount of time on the problem being studied, and the decision it would make at each step is recorded. Comparing these decisions with the choices an agent made on the same problem reveals the step at which the agent stopped planning, which yields an estimate of how long the agent plans for a given problem. That estimate can then be used to predict how the agent would act on a similar problem. Notably, these “inference budgets” are highly interpretable: more complex problems require more planning, and stronger players plan for longer.
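The following sketch captures the core of this procedure under strong simplifying assumptions. The toy planner, the graph encoding, and the names (planner_action, infer_budget, predict_action) are hypothetical stand-ins rather than the researchers’ implementation, and the budget here is a deliberately simplified single point estimate inferred from past moves.

```python
from collections import Counter

def planner_action(state, depth, transitions, goal):
    # Toy anytime planner: breadth-first search limited to `depth`
    # expansion steps; returns the first move on the earliest path
    # found to `goal`, or None if no path is found within the budget.
    frontier = [(state, None)]  # (current node, first move taken)
    for _ in range(depth):
        next_frontier = []
        for node, first_move in frontier:
            for move, successor in transitions.get(node, {}).items():
                chosen = first_move if first_move is not None else move
                if successor == goal:
                    return chosen
                next_frontier.append((successor, chosen))
        frontier = next_frontier
    return None

def infer_budget(observations, max_depth, transitions, goal):
    # For each observed (state, action) pair, find the planning depths
    # whose recommendation matches the agent's actual move, then return
    # the shallowest depth that explains the most observations.
    votes = Counter()
    for state, action in observations:
        for d in range(1, max_depth + 1):
            if planner_action(state, d, transitions, goal) == action:
                votes[d] += 1
    if not votes:
        return max_depth
    best = max(votes.values())
    return min(d for d, v in votes.items() if v == best)

def predict_action(state, budget, transitions, goal):
    # Forecast the agent's next move by planning only as deeply as
    # the inferred budget allows.
    return planner_action(state, budget, transitions, goal)

# Usage on a small, made-up navigation graph:
transitions = {
    "A": {"right": "B", "down": "C"},
    "B": {"right": "GOAL"},
    "C": {"right": "D"},
    "D": {"up": "GOAL"},
}
observed = [("A", "right"), ("B", "right")]  # the agent's logged moves
budget = infer_budget(observed, 4, transitions, goal="GOAL")
print(budget)                                            # -> 2
print(predict_action("A", budget, transitions, "GOAL"))  # -> "right"
```

Breaking ties toward the shallowest depth reflects the idea that the agent stopped planning at the earliest step consistent with its observed behaviour.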
The approach was tested on three different modelling tasks and matched or outperformed other popular methods in each case. In the future, the researchers aim to apply it to other domains, such as reinforcement learning methods used in robotics. The work was funded in part by the MIT Schwarzman College of Computing Artificial Intelligence for Augmentation and Productivity program and the National Science Foundation.