Multi-UAV OpenAIGym

A Reinforcement Learning Environment for Multi-Service UAV-enabled Wireless Systems

This is a multi-purpose environment for autonomous UAVs that offer different communication services in a variety of application contexts (e.g., wireless mobile connectivity services, edge computing, data gathering). The environment is built on the OpenAI Gym framework: it simulates different features of operational environments and uses Reinforcement Learning to generate policies that maximize a desired performance measure. The quality of the resulting policies can be compared with a simple baseline to evaluate the system and to derive guidelines for adopting this technique in different use cases. The result is a flexible and extensible OpenAI Gym environment for generating, evaluating, and comparing policies for autonomous multi-drone systems in multi-service applications.

[Figure: 3D env. run]
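To make the Gym-style workflow above concrete, here is a minimal sketch of an environment exposing the classic `reset()` / `step(action)` interface, together with a random-policy baseline. Everything in it is hypothetical and for illustration only: the class `ToyUAVEnv`, its parameters, the coverage-count reward, and the helper `run_random_baseline` are not taken from this repository, and the class deliberately does not subclass `gym.Env` so that it runs without extra dependencies. The baseline used by the project is not specified here; a random policy is simply assumed.

```python
import random


class ToyUAVEnv:
    """Hypothetical single-UAV grid world with a Gym-like reset/step interface."""

    # Action index -> (dx, dy) move on the grid; action 4 means "hover".
    ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0), 4: (0, 0)}

    def __init__(self, width=5, height=5, users=((4, 4),),
                 footprint_radius=1, max_steps=20):
        self.width, self.height = width, height
        self.users = list(users)                  # fixed user positions (assumption)
        self.footprint_radius = footprint_radius  # coverage radius on the xy plane
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        """Put the UAV back at the origin and return the initial observation."""
        self.pos = (0, 0)
        self.steps = 0
        return self.pos

    def _covered_users(self):
        """Count the users lying inside the UAV footprint."""
        r2 = self.footprint_radius ** 2
        x, y = self.pos
        return sum((ux - x) ** 2 + (uy - y) ** 2 <= r2 for ux, uy in self.users)

    def step(self, action):
        """Apply one move and return (observation, reward, done, info)."""
        dx, dy = self.ACTIONS[action]
        # Clamp the new position so the UAV stays inside the map.
        x = min(max(self.pos[0] + dx, 0), self.width - 1)
        y = min(max(self.pos[1] + dy, 0), self.height - 1)
        self.pos = (x, y)
        self.steps += 1
        reward = self._covered_users()
        done = self.steps >= self.max_steps
        return self.pos, reward, done, {}


def run_random_baseline(env, seed=0):
    """Run one episode with uniformly random actions and return the total reward."""
    rng = random.Random(seed)
    env.reset()
    total, done = 0, False
    while not done:
        _, reward, done, _ = env.step(rng.randrange(len(env.ACTIONS)))
        total += reward
    return total
```

A learned policy could then be compared with the baseline by running both on the same environment instance and comparing their total rewards per episode.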

Customize Your Environment

The application context is a multi-agent system made up of a variable number of UAVs that can provide one or more (up to three) services to cluster(s) of users who request them. All the environment objects (e.g., obstacles, drones, grid map, users, and many others) have been created from scratch in Python. The training-related methods are implemented in a custom environment with custom methods, so that the environment is consistent with the OpenAI Gym API.

[Figure: 2D env. run]

It is possible to:

  • create different environments of different xy (2D) and z (3D) dimensions;
  • set objects of different heights and with a different distribution;
  • use a desired resolution for the 'xy plane-grid' if you want the agents to detect objects at a resolution coarser than the minimum one (the minimum resolution lets the agents detect every obstacle perfectly and is based on xy plane-squares of side 1);
  • set a variable number of UAVs and charging stations (or even no charging station);
  • create a base station (eNodeB) or not;
  • set the number of user clusters and, if desired, make users move according to a random walk;
  • set the radius of the UAVs' footprint and select a single-service or a multi-service system;
  • select either a continuous (infinite in time) or a discrete (variable in time) service request coming from users;
  • select the users' priority (i.e., either all the same or differentiated according to the user account, such as 'free user', 'premium user', . . .);
  • vary the training parameters (e.g., LR, EPSILON, DISCOUNT, etc.);
  • set any other feature you would like to include.
The safe operation of all the features described here is not guaranteed.
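
As an illustration of how the options listed above might be grouped, the following sketch collects them in a single configuration object and shows how LR, EPSILON, and DISCOUNT would enter an epsilon-greedy tabular Q-learning step. All names here (`EnvConfig`, its fields, `choose_action`, `q_update`) are hypothetical and do not reflect the repository's actual settings files, and the use of tabular Q-learning is an assumption suggested by the parameter names rather than something stated in this document.

```python
import random
from dataclasses import dataclass


@dataclass
class EnvConfig:
    """Hypothetical container for the customization options listed above."""
    area_width: int = 10            # x size of the map
    area_height: int = 10           # y size of the map
    max_altitude: int = 0           # z size; 0 is taken to mean a 2D environment
    cell_resolution: int = 1        # side of an xy grid cell (1 = finest)
    n_uavs: int = 2
    n_charging_stations: int = 0    # 0 means no charging station
    use_base_station: bool = False  # eNodeB present or not
    n_user_clusters: int = 1
    users_random_walk: bool = False
    footprint_radius: float = 2.0
    n_services: int = 1             # single-service (1) or multi-service (up to 3)
    continuous_requests: bool = True
    user_priorities: tuple = ("free user",)
    lr: float = 0.1                 # LR
    epsilon: float = 0.1            # EPSILON
    discount: float = 0.95          # DISCOUNT

    def __post_init__(self):
        if not 1 <= self.n_services <= 3:
            raise ValueError("n_services must be between 1 and 3")
        if self.cell_resolution < 1:
            raise ValueError("cell_resolution must be at least 1")
        if self.n_uavs < 1:
            raise ValueError("at least one UAV is required")

    @property
    def is_3d(self):
        return self.max_altitude > 0

    def grid_shape(self):
        """Number of xy cells per axis at the chosen resolution (rounded up)."""
        r = self.cell_resolution
        return (-(-self.area_width // r), -(-self.area_height // r))


def choose_action(q_table, state, n_actions, cfg, rng):
    """Epsilon-greedy choice: explore with probability cfg.epsilon, else exploit."""
    if rng.random() < cfg.epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_table.get((state, a), 0.0))


def q_update(q_table, state, action, reward, next_state, n_actions, cfg):
    """One tabular Q-learning update; unseen (state, action) pairs default to 0."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in range(n_actions))
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + cfg.lr * (reward + cfg.discount * best_next - old)
    return q_table[(state, action)]
```

Keeping the options in one validated object is only one possible design; its advantage is that inconsistent settings (for example, more than three services) fail early, before any training starts.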

How to use it

See section 'How to use' at https://github.com/DamianoBrunori/MultiUAV-OpenAIGy .