โ› MineRL Competition [Part 3]: Behavioral Cloning

In this post, we study a basic approach to integrating one of the main components of this competition: the demonstrations dataset. With this dataset we can reduce the sample complexity of RL algorithms, which is essential given the speed and complexity of the simulated environment. We will apply a technique called Behavioral Cloning, which is closer to supervised learning than to RL. We will also try to improve the cloned policy by applying A2C again.
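In a nutshell, Behavioral Cloning treats the demonstrations as a supervised dataset of (observation, action) pairs and fits a policy to predict the demonstrator's action. Below is a minimal PyTorch sketch of that idea; `policy` and `demo_dataset` are hypothetical stand-ins, and the post's actual code may differ:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_bc(policy, demo_dataset, epochs=10, lr=1e-4):
    """Behavioral Cloning sketch: supervised learning on demonstrations.

    `demo_dataset` is assumed to yield (observation, action) pairs
    extracted from the demonstration data; `policy` maps observations
    to action logits (discrete actions assumed for simplicity).
    """
    loader = DataLoader(demo_dataset, batch_size=64, shuffle=True)
    optimizer = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for obs, action in loader:
            logits = policy(obs)            # predict the action distribution
            loss = loss_fn(logits, action)  # imitate the demonstrator
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return policy
```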

Read More

โ› MineRL Competition [Part 2]: A2C Navigation

In this second post, we switch from a value-based method (DQN) to a basic policy gradient method (A2C). We then try to solve the NavigateDense task again, this time in a non-flat world, using a more complex and richer action space. A policy gradient method lets us fully exploit that action space, including the continuous actions (the camera) and combining actions in more complex ways.
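The key point is that the policy can output a distribution for each part of the action space: a categorical head for the discrete buttons and a Gaussian for the continuous camera. A minimal sketch under those assumptions follows; the layer sizes and names are illustrative, not taken from the post:

```python
import torch
import torch.nn as nn

class MixedPolicy(nn.Module):
    """Policy head with both discrete and continuous outputs, plus a critic."""
    def __init__(self, feat_dim, n_discrete):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.discrete_head = nn.Linear(128, n_discrete)  # e.g. move/jump/attack
        self.camera_mu = nn.Linear(128, 2)               # camera pitch/yaw means
        self.camera_log_std = nn.Parameter(torch.zeros(2))
        self.value_head = nn.Linear(128, 1)              # critic for A2C

    def forward(self, x):
        h = self.body(x)
        disc = torch.distributions.Categorical(logits=self.discrete_head(h))
        cam = torch.distributions.Normal(self.camera_mu(h),
                                         self.camera_log_std.exp())
        return disc, cam, self.value_head(h)

# The joint log-probability of a combined action simply sums the two parts:
#   logp = disc.log_prob(a_disc) + cam.log_prob(a_cam).sum(-1)
```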

Read More

โ› MineRL Competition [Part 1]: Domain Analysis and First Simple Solution

MineRL is a recently launched NeurIPS 2019 competition whose goal is sample-efficient DRL algorithms for a very complex and hierarchical problem. I have always had a passion for Minecraft: it is a quite complete game that (in humans) stimulates strategy, exploration, and creative building. MineRL was built on top of Malmo, which hosted other competitions in the past (e.g. this one). This time, I also want to thank the organizers for the excellent job of building a simple, ready-to-use package (Malmo, by contrast, was quite complex to set up and use).

Read More

๐Ÿ Sneks [Part 1]: Single-Agent and Base Solution using DQN

Despite recent successes in the field of Deep Reinforcement Learning, like AlphaStar or OpenAI Five, there are still major problems to be tackled in a meaningful way: sample efficiency, exploration versus exploitation, life-long learning, and others. These problems are visible not only in complex environments but also in problems specifically designed to isolate a single trait, e.g. Atari's Montezuma's Revenge for the exploration-exploitation dilemma.
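The base solution named in the title is DQN, whose core update regresses Q-values toward a bootstrapped target computed with a frozen target network. A minimal sketch of that update, assuming transitions sampled from a replay buffer (all names here are illustrative, not the post's code):

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, batch, optimizer, gamma=0.99):
    """One DQN step on a sampled batch of transitions (s, a, r, s', done).

    Assumes `actions` is a LongTensor of indices and `done` is a float
    tensor (1.0 for terminal transitions, 0.0 otherwise).
    """
    obs, actions, rewards, next_obs, done = batch
    # Q-value of the action actually taken in each transition
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped target from the frozen target network
    with torch.no_grad():
        target = rewards + gamma * (1 - done) * target_net(next_obs).max(1).values
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```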

Read More

TensorFlow to PyTorch: My personal travel guide

When it comes to neural networks and optimization, there are many possible computational frameworks, each with its own particular design. The two mainstream choices today are TensorFlow and PyTorch. This post presents a series of notebooks I created during my switch from TF to PyTorch.
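To give a flavor of the difference (a simplified contrast, assuming the TF 1.x API that was current at the time of the switch): TensorFlow 1.x builds a static graph that is executed inside a session, while PyTorch evaluates tensors eagerly, define-by-run:

```python
# TensorFlow 1.x: declare a static graph, then run it in a session.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 3))
y = tf.reduce_sum(x * 2.0, axis=1)
with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))

# PyTorch: tensors are evaluated eagerly, so the same computation
# runs line by line like ordinary Python.
import torch

x = torch.tensor([[1.0, 2.0, 3.0]])
print((x * 2.0).sum(dim=1))
```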

Read More