Research Overview

Just as planning and decision-making are an important part of our everyday lives, they also play a major role in modern autonomous systems. For such an agent (e.g. a robot) to display intelligent behavior, it typically needs to reason about the outcomes of its actions, and whether those actions help it achieve its goals. Typically, we are interested in obtaining the best possible sequence of actions the agent can take (or a good approximation of it). Planning in Artificial Intelligence, the problem of obtaining such a course of action (a plan) for autonomous agents, has been a prominent field of study for several decades.

To tackle this problem, various frameworks have appeared over the years that allow us to formulate, and solve, such planning problems mathematically. One such framework, the theory of Markov Decision Processes (MDPs), is particularly interesting in that it allows the agent to reason about stochasticity in the outcomes of its actions: a given action may have more than one possible outcome, and we only know the probability with which each can occur. The best possible plan can then be formulated as the one that maximizes the agent's expected performance, taking into account all possible contingencies.
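To make this concrete, the sketch below runs value iteration, a standard algorithm for computing such an expectation-maximizing plan, on a tiny MDP. The states, actions, transition probabilities and rewards are entirely made up for illustration; only the Bellman backup itself is standard.

```python
# Value iteration on a toy 2-state MDP (all numbers are hypothetical).
states = ["s0", "s1"]
actions = ["a0", "a1"]

# P[s][a] = list of (next_state, probability); R[s][a] = expected reward
P = {
    "s0": {"a0": [("s0", 0.7), ("s1", 0.3)], "a1": [("s1", 1.0)]},
    "s1": {"a0": [("s0", 0.4), ("s1", 0.6)], "a1": [("s1", 1.0)]},
}
R = {
    "s0": {"a0": 5.0, "a1": 0.0},
    "s1": {"a0": -1.0, "a1": 1.0},
}

gamma = 0.9  # discount factor
V = {s: 0.0 for s in states}

# Repeatedly apply the Bellman optimality backup until (approximate) convergence.
for _ in range(1000):
    V = {
        s: max(
            R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
            for a in actions
        )
        for s in states
    }

# Extract the greedy policy with respect to the converged values.
policy = {
    s: max(
        actions,
        key=lambda a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a]),
    )
    for s in states
}
```

The resulting `policy` maps each state to the action that maximizes expected discounted reward over all stochastic outcomes, which is exactly the notion of "best possible plan" described above.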

A harder problem arises when the knowledge the agent possesses about its environment is limited and uncertain. For example, a mobile robot can only know its position in its environment with a certain confidence. A vision system faces the same uncertainty when it tries to track a moving target. In short, most information available to an autonomous agent is noisy, and this evidently impacts its performance. The agent may then explicitly reason about this additional source of uncertainty while planning. To deal with this issue, the framework of Partially Observable MDPs (POMDPs) was introduced.
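In a POMDP, the agent maintains a belief (a probability distribution over states) and updates it with Bayes' rule after each action and observation. The sketch below shows this standard update on a hypothetical two-state tracking example; the transition and observation probabilities are invented for illustration.

```python
# Bayesian belief update for a POMDP:
#   b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)

def belief_update(b, a, o, T, O, states):
    """Return the updated belief after taking action a and observing o."""
    unnorm = {
        s2: O[s2][a][o] * sum(T[s][a][s2] * b[s] for s in states)
        for s2 in states
    }
    total = sum(unnorm.values())
    return {s2: v / total for s2, v in unnorm.items()}

states = ["near", "far"]
T = {  # T[s][a][s']: transition model (made-up numbers)
    "near": {"move": {"near": 0.8, "far": 0.2}},
    "far": {"move": {"near": 0.3, "far": 0.7}},
}
O = {  # O[s'][a][o]: observation model (made-up numbers)
    "near": {"move": {"ping": 0.9, "silence": 0.1}},
    "far": {"move": {"ping": 0.2, "silence": 0.8}},
}

b0 = {"near": 0.5, "far": 0.5}  # initially uncertain
b1 = belief_update(b0, "move", "ping", T, O, states)
```

Starting from complete uncertainty, a single noisy "ping" observation already shifts most of the probability mass toward the "near" state; planning in a POMDP means choosing actions based on such beliefs rather than on the (unknown) true state.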

More recently, interest has grown in studying multi-agent systems (e.g. teams of robots) and the ways in which multiple agents can cooperate in order to achieve a common goal. This problem is generally much harder than planning for a single agent, since the agents must reason about every possible combination of actions they can jointly take, and each of them may receive multiple individual, imperfect observations. Communication between agents, which may itself be imperfect or costly, must also be considered. On the other hand, a team of cooperative agents can produce more complex plans and solve a much wider array of problems. The framework of Decentralized POMDPs (Dec-POMDPs) allows this kind of cooperative multi-agent problem to be modeled. Its main drawback is computational complexity: solving these problems involves reasoning over a very large number of possible sequences of actions and measurements, particularly when communication between agents is not possible, even for very small problems. For this reason, Dec-POMDPs have so far been applied only to theoretical problems and small proof-of-concept scenarios.

Applying this powerful framework to real teams of robots, in practical scenarios, is the main focus of my PhD work. If successful, this would certainly advance the current state of planning for multi-agent systems. My work is therefore focused on studying ways to reduce the overall complexity of Dec-POMDPs, for example by finding approximate solutions to these models, or by exploiting communication between agents.
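The scale of this complexity can be illustrated with a standard counting argument. A deterministic finite-horizon policy for one agent is a tree mapping observation histories to actions; with |A| actions, |O| observations and horizon h, such a tree has (|O|^h - 1) / (|O| - 1) nodes. The numbers below are purely illustrative.

```python
# Counting deterministic policies in a finite-horizon Dec-POMDP
# (a back-of-the-envelope illustration; the problem sizes are made up).

def num_policies(num_actions, num_obs, horizon):
    """Number of deterministic policy trees for a single agent."""
    nodes = (num_obs**horizon - 1) // (num_obs - 1)
    return num_actions**nodes

def num_joint_policies(num_agents, num_actions, num_obs, horizon):
    """Number of joint policies for a team of identical agents."""
    return num_policies(num_actions, num_obs, horizon) ** num_agents

# A tiny problem: 2 agents, 2 actions, 2 observations, horizon 3.
print(num_joint_policies(2, 2, 2, 3))  # prints 16384
```

Even this minimal problem has 2^7 = 128 policies per agent, hence 16384 joint policies; increasing the horizon to 4 already yields over a billion, which is why approximate solutions and communication, as mentioned above, are attractive ways forward.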