Research Overview
Just as planning and decision-making are an important part of our
everyday lives, they also play a major role in modern autonomous
systems. For such an agent (e.g. a robot) to display intelligent
behavior, it typically needs to be able to reason about the outcomes
of its actions, and about whether those actions might help it achieve
its goals. We are usually interested in obtaining the best possible
sequence of actions that the agent can take (or a good approximation
of it). Planning in Artificial Intelligence, the problem of obtaining
such a course of action (a plan) for autonomous agents, has been a
prominent field of study for several decades.
To tackle this problem, various frameworks have appeared over the
years that allow such planning problems to be formulated, and solved,
mathematically. One such framework, the theory of Markov Decision
Processes (MDPs), is particularly interesting in that it allows the
agent to reason about stochasticity in the outcomes of its actions:
a given action may have more than one possible outcome, and we only
know the probability with which each of these outcomes can occur.
This allows the best possible plan to be formulated as the one that
maximizes the expected performance of the agent, taking all possible
contingencies into account.
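To make this concrete, consider the standard MDP notation (assumed
here for illustration, since this overview does not define any): a
state space S, an action set A, a transition function T, a reward
function R, and a discount factor γ. The value function of an optimal
plan then satisfies the Bellman optimality equation,

\[
V^*(s) = \max_{a \in A} \Big[ R(s,a) + \gamma \sum_{s' \in S} T(s' \mid s, a)\, V^*(s') \Big],
\]

where the sum over successor states s' is precisely the expectation
over all possible contingencies described above.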
A harder problem arises if we consider that the knowledge the agent
possesses about its environment is limited and uncertain. For
example, a mobile robot can only know its position in its environment
with a certain confidence, and a vision system faces the same
uncertainty when tracking a moving target. In short, most information
available to an autonomous agent is noisy, and this evidently impacts
its performance. The agent may then need to reason explicitly about
this additional source of uncertainty while planning. To deal with
this issue, the framework of Partially Observable MDPs (POMDPs) was
introduced.
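In a POMDP, the agent cannot observe the true state directly.
Instead, it maintains a belief b, a probability distribution over
states, which it updates after taking action a and receiving
observation o. Using the same assumed notation as above, plus an
observation function O, the standard Bayesian belief update is

\[
b'(s') = \frac{O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}.
\]

Planning then takes place over beliefs rather than states, which is
one source of the added difficulty relative to MDPs.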
More recently, interest has grown in studying multi-agent systems
(e.g. teams of robots) and the ways in which multiple agents can
cooperate in order to achieve a common goal. This problem is
generally much harder than planning for a single agent, since the
agents have to reason about every possible combination of actions
they can jointly take, and also about the fact that each of them may
receive its own individual, imperfect observations. The problem of
communication between agents, which may itself be imperfect or
costly, must also be considered. On the other hand, a team of
cooperative agents can produce more complex plans, and solve a much
wider array of problems. The framework of Decentralized POMDPs
(Dec-POMDPs) allows such cooperative multi-agent problems to be
modeled. Its main drawback is computational complexity: solving these
problems involves reasoning over a very large number of possible
sequences of actions and observations, particularly when
communication between agents is not possible, and this holds even for
very small problems (solving finite-horizon Dec-POMDPs optimally is,
in fact, NEXP-complete). For this reason, Dec-POMDPs have so far been
applied only to theoretical problems and small proof-of-concept
scenarios.
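Formally, a Dec-POMDP extends the POMDP model to a team of n agents.
In one common formulation (again, notation assumed here rather than
taken from this overview), it is a tuple

\[
\langle n,\; S,\; \{A_i\}_{i=1}^{n},\; T,\; R,\; \{\Omega_i\}_{i=1}^{n},\; O \rangle,
\]

where each agent i has its own action set A_i and observation set
Ω_i, while the transition function T, observation function O, and the
single shared reward R all depend on the joint action of the team.
The combinatorial blow-up mentioned above comes from the joint action
space A_1 × … × A_n and the joint observation space Ω_1 × … × Ω_n,
both of which grow exponentially with the number of agents.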
Applying this powerful framework to real teams of robots, in
practical scenarios, is the main focus of my PhD work. If successful,
this would certainly advance the current state of planning for
multi-agent systems. My work is therefore focused on studying ways to
reduce the overall complexity of Dec-POMDPs, for example by finding
approximate solutions to these models, or by exploiting communication
between agents.