Education & Training

  • PhD 2010-2015

    PhD in Computer Science & Engineering
    PhD Thesis: Machine Learning for Intelligent Agents

    Department of Computer Science & Engineering, University of Ioannina

  • MSc 2008-2010

    Master in Computer Science
    MSc Thesis: Autonomous Mobile Robot Navigation using Reinforcement Learning

    Department of Computer Science, University of Ioannina

  • BSc 2002-2007

    Bachelor in Computer Science

    Department of Computer Science, University of Ioannina

A Bayesian Ensemble Regression Framework on the Angry Birds Game

N. Tziortziotis, G. Papagiannis, K. Blekas
Journal Paper IEEE Transactions on Computational Intelligence and AI in Games (TCIAIG), (99), 2015


In this article we introduce AngryBER, an intelligent agent architecture on the Angry Birds domain that employs a Bayesian ensemble inference mechanism to promote decision making abilities. It is based on an efficient tree-like structure for encoding and representing game screenshots, where it exploits its enhanced modeling capabilities. This has the advantage of establishing an informative feature space and translating the task of game playing into a regression analysis problem. A Bayesian ensemble regression framework is presented by considering that every combination of objects’ material and bird type has its own regression model. We address the problem of action selection as a multi-armed bandit problem, where the Upper Confidence Bound (UCB) strategy has been used. An efficient online learning procedure has also been developed for training the regression models. We have evaluated the proposed methodology on several game levels, and compared its performance with published results of all agents that participated in the 2013 and 2014 Angry Birds AI competitions. The superiority of the new method is readily deduced by inspecting the reported results.
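
As an illustration of the bandit view of action selection mentioned in the abstract, here is a minimal sketch of the standard UCB1 rule. It is not taken from the AngryBER implementation: the names `ucb1_select`, `counts` and `means`, the exploration constant `c`, and the use of plain sample means (rather than regression-model predictions) are all assumptions made for the example.

```python
import math

def ucb1_select(counts, means, c=2.0):
    """Return the index of the arm with the highest UCB1 score.

    counts[i] is how often arm i has been tried and means[i] is its
    average observed reward. The constant c is an assumed default.
    """
    # Try every arm once before trusting any confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    # Score = empirical mean + exploration bonus that shrinks with more trials.
    scores = [m + math.sqrt(c * math.log(total) / n)
              for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal trial counts the rule picks the arm with the better mean; a rarely tried arm can still win because its exploration bonus is larger.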

Machine Learning for Intelligent Agents

N. Tziortziotis
PhD Thesis Department of Computer Science & Engineering, University of Ioannina, Greece, March 2015


This dissertation studies the problem of developing intelligent agents, which are able to acquire skills in an autonomous way, simulating human behaviour. An autonomous intelligent agent acts effectively in an unknown environment, directing its activity towards achieving a specific goal based on some performance measure. Through this interaction, a rich amount of information is received, which allows the agent to perceive the consequences of its actions, identify important behavioural components, and adapt its behaviour through learning. In this direction, the present dissertation concerns the development, implementation and evaluation of machine learning techniques for building intelligent agents. Three important and very challenging tasks are considered: i) approximate reinforcement learning, where the agent’s policy is evaluated and improved through the approximation of the value function, ii) Bayesian reinforcement learning, where the reinforcement learning problem is modeled as a decision-theoretic problem, by placing a prior distribution over Markov Decision Processes (MDPs) that encodes the agent’s belief about the true environment, and iii) development of intelligent agents on games, which constitute a really challenging platform for developing machine learning methodologies, involving a number of issues that should be resolved, such as the appropriate choice of state representation, continuous action spaces, etc.

Quality Optimization of H.264/AVC Video Transmission over Noisy Environments Using a Sparse Regression Framework

K. Pandremmenou, N. Tziortziotis, S. Paluri, W. Zhang, K. Blekas, L. P. Kondi, S. Kumar
Conference Paper Visual Information Processing and Communication VI, Proceedings of SPIE-IS&T Electronic Imaging, San Francisco, CA, February 2015.


We propose the use of the Least Absolute Shrinkage and Selection Operator (LASSO) regression method in order to predict the Cumulative Mean Squared Error (CMSE) incurred by the loss of individual slices in video transmission. We extract a number of quality-relevant features from the H.264/AVC video sequences, which are given as input to the LASSO. This method not only keeps a subset of the features that have the strongest effects on video quality, but also produces accurate CMSE predictions. In particular, we study the LASSO regression through two different architectures: the Global LASSO (G.LASSO) and the Local LASSO (L.LASSO). In G.LASSO, a single regression model is trained for all slice types together, while in L.LASSO, motivated by the fact that the values of some features are closely dependent on the considered slice type, each slice type has its own regression model, in an effort to improve LASSO’s prediction capability. Based on the predicted CMSE values, we group the video slices into four priority classes. Additionally, we consider a video transmission scenario over a noisy channel, where Unequal Error Protection (UEP) is applied to all prioritized slices. The provided results demonstrate the efficiency of LASSO in estimating CMSE with high accuracy, using only a few features.
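
To show why LASSO keeps only a subset of the features, the following is a small sketch of LASSO fitted by cyclic coordinate descent with soft-thresholding. It is a generic textbook formulation, not the code used in the paper; the function names, the objective scaling (1/(2n)) and the toy data are assumptions for illustration only.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; values in [-alpha, alpha] become 0."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 coordinate by coordinate."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            # Weak features are set exactly to zero by the threshold.
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w

# Toy data: the target depends only on the first feature.
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [2.0, 2.0, -2.0, -2.0]
w_toy = lasso_coordinate_descent(X_toy, y_toy, alpha=0.5)
```

On this toy data the weight of the irrelevant second feature is driven exactly to zero, while the first weight is shrunk from 2 toward zero by the penalty.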

Cover Tree Bayesian Reinforcement Learning

N. Tziortziotis, C. Dimitrakakis, K. Blekas
Journal Paper Journal of Machine Learning Research (JMLR), (15):2313-2335, 2014


This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with a Gaussian process model, a linear model and simple least squares policy iteration.

The Reinforcement Learning Competition

C. Dimitrakakis, G. Li, N. Tziortziotis
Magazine Paper Artificial Intelligence (AI) Magazine, 2014


Reinforcement learning is one of the most general problems in artificial intelligence. It has been used to model problems in automated experiment design, control, economics, game playing, scheduling and telecommunications. The aim of the reinforcement learning competition is to encourage the development of very general learning agents for arbitrary reinforcement learning problems and to provide a test-bed for the unbiased evaluation of algorithms.

Usable ABC Reinforcement Learning

C. Dimitrakakis, N. Tziortziotis
Conference Paper Advances in Neural Information Processing Systems 27 (NIPS 2014), ABC in Montreal workshop, Montreal, Canada, December 2014

The issues with the use of Approximate Bayesian Computation in Reinforcement Learning are the following. Firstly, the model set may comprise simulators which are purely deterministic. Secondly, there is a dependence between the policy used and the data collected, which necessitates maintaining a representation of the policy used as well as the data history. Thirdly, there is the question of the statistics used. Finally, there is the problem of selecting a policy given the data observed so far. In this paper, we report some progress on using more sophisticated statistics and policy search algorithms and show that they have significant impact.

A Bayesian Ensemble Regression Framework on the Angry Birds Game

N. Tziortziotis, G. Papagiannis, K. Blekas
Conference Paper ECAI Symposium on Artificial Intelligence in Angry Birds, Prague, Czech Republic, August 2014.
Second Place in the Angry Birds AI Competition 2014.

An ensemble inference mechanism is proposed on the Angry Birds domain. It is based on an efficient tree structure for encoding and representing game screenshots, where it exploits its enhanced modeling capability. This has the advantage of establishing an informative feature space and turning the task of game playing into a regression analysis problem. To this end, we assume that each pair of object material type and bird type has its own Bayesian linear regression model. In this way, a multi-model regression framework is designed that simultaneously calculates the conditional expectations of several objects and makes a target decision through an ensemble of regression models. The learning procedure is performed according to an online estimation strategy for the model parameters. We provide comparative experimental results on several game levels that empirically illustrate the efficiency of the proposed methodology.

Play Ms. Pac-Man using an Advanced Reinforcement Learning Agent

N. Tziortziotis, K. Tziortziotis and K. Blekas
Conference Paper 8th Hellenic Conference on Artificial Intelligence (SETN 2014), Ioannina, Greece, May 2014.

Reinforcement Learning (RL) algorithms have been promising methods for designing intelligent agents in games. Although their capability of learning in real time has already been proved, the high dimensionality of state spaces in most game domains can be seen as a significant barrier. This paper studies the popular arcade video game Ms. Pac-Man and outlines an approach to deal with its large dynamical environment. Our motivation is to demonstrate that an abstract but informative state space description plays a key role in the design of efficient RL agents. Thus, we can speed up the learning process without the necessity of Q-function approximation. Several experiments were made using the multiagent MASON platform, where we measured the ability of the approach to reach optimal generic policies, which enhance its generalization abilities.

ABC Reinforcement Learning

C. Dimitrakakis, N. Tziortziotis
Conference Paper 30th International Conference on Machine Learning (ICML 2013), Atlanta, USA, June 2013, JMLR W & CP 28(3):684-692.

We introduce a simple, general framework for likelihood-free Bayesian reinforcement learning, through Approximate Bayesian Computation (ABC). The advantage is that we only require a prior distribution on a class of simulators. This is useful when a probabilistic model of the underlying process is too complex to formulate, but where detailed simulation models are available. ABC-RL allows the use of any Bayesian reinforcement learning technique in this case. It can be seen as an extension of simulation methods to both planning and inference. We experimentally demonstrate the potential of this approach in a comparison with LSPI. Finally, we introduce a theorem showing that ABC is sound.
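
The likelihood-free idea behind ABC can be sketched with plain rejection sampling: draw a simulator parameter from the prior, simulate data, and keep the draw only if a summary statistic of the simulated data is close to that of the observed data. The sketch below is a generic ABC rejection sampler, not the ABC-RL algorithm from the paper; `abc_rejection` and all of its parameters are hypothetical names, and in an RL setting the simulated data would be trajectories generated under the same policy as the observed history.

```python
import random

def abc_rejection(observed_stat, prior_sample, simulate, statistic,
                  epsilon, n_draws, rng):
    """Return prior draws whose simulated statistic is within epsilon of the observed one.

    prior_sample(rng) draws a simulator parameter, simulate(theta, rng) produces
    data from that simulator, and statistic(data) maps data to a single number.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        data = simulate(theta, rng)
        # No likelihood is evaluated; closeness of statistics stands in for it.
        if abs(statistic(data) - observed_stat) <= epsilon:
            accepted.append(theta)
    return accepted
```

The accepted draws approximate the posterior over simulators; a smaller `epsilon` gives a closer approximation at the cost of fewer accepted samples.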

Linear Bayesian Reinforcement Learning

N. Tziortziotis, C. Dimitrakakis, K. Blekas
Conference Paper 23rd International Joint Conference on Artificial Intelligence (IJCAI 2013), Beijing, China, August 2013.

This paper proposes a simple linear Bayesian approach to reinforcement learning. We show that with an appropriate basis, a Bayesian linear Gaussian model is sufficient for accurately estimating the system dynamics, and in particular when we allow for correlated noise. Policies are estimated by first sampling a transition model from the current posterior, and then performing approximate dynamic programming on the sampled model. This form of approximate Thompson sampling results in good exploration in unknown environments. The approach can also be seen as a Bayesian generalisation of least-squares policy iteration, where the empirical transition matrix is replaced with a sample from the posterior.

Resource Allocation in Visual Sensor Networks Using a Reinforcement Learning Framework

K. Pandremmenou, N. Tziortziotis, L. P. Kondi, K. Blekas
Conference Paper 18th IEEE International Conference on Digital Signal Processing (DSP), Santorini, Greece, July 2013.

In recent years, video delivery over wireless visual sensor networks (VSNs) has gained increasing attention. The lossy compression and channel errors that occur during wireless multimedia transmissions can degrade the quality of the transmitted video sequences. This paper addresses the problem of cross-layer resource allocation among the nodes of a wireless direct-sequence code division multiple access (DS-CDMA) VSN. The optimal group of pictures (GoP) length during the encoding process is also considered, based on the motion level of each video sequence. Three optimization criteria are used, each optimizing a different objective function of the video qualities of the nodes. The nodes' transmission parameters, i.e., the source coding rates, channel coding rates and power levels, can only take discrete values. In order to tackle the resulting optimization problem, a reinforcement learning (RL) strategy that promises efficient exploration and exploitation of the parameter space is employed. This makes the proposed methodology usable in large or continuous state spaces as well as in an online mode. Experimental results highlight the efficiency of the proposed method.

Model-based Reinforcement learning using online clustering

N. Tziortziotis, K. Blekas
Conference Paper 24th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2012), Piraeus, Greece, November 2012.

A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces in order to estimate an optimal policy. This study addresses this challenge by proposing a compact framework that employs an on-line clustering approach for constructing appropriate basis functions. It also performs a state-action trajectory analysis to gain valuable affinity information among clusters and estimate their transition dynamics. Value function approximation is used for policy evaluation in a least-squares temporal difference framework. The proposed method is evaluated in several simulated and real environments, where we obtained promising results.

An online kernel-based clustering approach for value function approximation

N. Tziortziotis, K. Blekas
Conference Paper 7th Hellenic Conference on Artificial Intelligence (SETN 2012), Lamia, Greece, May 2012.

Value function approximation is a critical task in solving Markov decision processes and accurately modeling reinforcement learning agents. A significant issue is how to construct efficient feature spaces from samples collected from the environment in order to obtain an optimal policy. This study addresses this challenge by proposing an on-line kernel-based clustering approach for building appropriate basis functions during the learning process. The method uses a kernel function capable of handling state-action pairs as sequentially generated by the agent. At each time step, the procedure either adds a new cluster or adjusts the winning cluster’s parameters. By considering the value function as a linear combination of the constructed basis functions, the weights are optimized in a temporal-difference framework in order to minimize the Bellman approximation error. The proposed method is evaluated in numerous known simulated environments.
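
The add-or-adjust step described in the abstract can be sketched as follows. This is only an illustration under assumptions: the Gaussian kernel, the similarity `threshold`, the running-mean centre update, and the name `online_cluster_step` are not taken from the paper, which may use a different kernel and update rule.

```python
import math

def online_cluster_step(centers, counts, x, width=1.0, threshold=0.5):
    """Process one sample x (a tuple of floats), updating the clusters in place.

    Returns the index of the cluster that absorbed x, creating a new
    cluster when no existing centre is similar enough.
    """
    def kernel(a, b):
        # Gaussian similarity in (0, 1]; equals 1 when a and b coincide.
        d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-d2 / (2.0 * width ** 2))

    if centers:
        sims = [kernel(c, x) for c in centers]
        win = max(range(len(sims)), key=sims.__getitem__)
        if sims[win] >= threshold:
            # Adjust the winning cluster: move its centre toward x (running mean).
            counts[win] += 1
            step = 1.0 / counts[win]
            centers[win] = tuple(c + step * (xi - c)
                                 for c, xi in zip(centers[win], x))
            return win
    # No sufficiently similar cluster exists: start a new one at x.
    centers.append(tuple(x))
    counts.append(1)
    return len(centers) - 1
```

Each resulting centre could then serve as the location of one basis function for a linear value function approximator.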

Value Function Approximation through Sparse Bayesian Modeling

N. Tziortziotis, K. Blekas
Conference Paper 9th European Workshop on Reinforcement Learning (EWRL-9), Athens, Greece, September 2011.

In this study we present a sparse Bayesian framework for value function approximation. The proposed method is based on the on-line construction of a dictionary of states which are collected during the exploration of the environment by the agent. A linear regression model is established for the observed partial discounted return of such dictionary states, where we employ the Relevance Vector Machine (RVM) and exploit its enhanced modeling capability due to the embedded sparsity properties. In order to speed up the optimization procedure and allow dealing with large-scale problems, an incremental strategy is adopted. A number of experiments have been conducted on both simulated and real environments, where we obtained promising results in comparison with another Bayesian approach that uses Gaussian processes.

A Bayesian Reinforcement Learning framework using Relevant Vector Machines

N. Tziortziotis, K. Blekas
Conference Paper 25th International Conference on Artificial Intelligence (AAAI-2011), San Francisco, USA, August 2011.

In this work we present an advanced Bayesian formulation of the task of control learning that employs the Relevance Vector Machines (RVM) generative model for value function evaluation. The key aspect of the proposed method is the design of the discount return as a generative linear model that constitutes a well-known probabilistic approach. This allows us to augment the model with advantageous sparse priors provided by the RVM's regression framework. We have also taken into account the significant issue of selecting the proper parameters of the kernel design matrix. Experiments have shown that our method produces improved performance in both simulated and real test environments.