Deep Reinforcement Learning For Dynamic Spectrum Access

Model

In this project I use deep reinforcement learning to tackle the problem of dynamic spectrum access. As shown in the graphic at the top, there are often many agents (pairs of radios) trying to send information over a finite number of available frequency bands (the message tunnels in the graphic). Agents are assumed to always have packets to transmit. When a single agent transmits on a band, the transmission succeeds. But if multiple agents choose the same frequency band at the same time, their messages collide and the spectrum is wasted. The goal is to maximize the number of successful transmissions and to achieve fairness, meaning the difference in the number of successful transmissions between agents is minimal. Deep reinforcement learning (DRL) is a promising framework for autonomous agents to learn usage patterns in the frequency spectrum and dynamically adapt to changing environments.
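The collision rule and the fairness goal described above can be sketched in a few lines of Python. This is only an illustration of the model, not the code used in this project: the names `transmit_slot` and `fairness_gap`, the use of `-1` for "stay silent this slot", and the random baseline policy are all assumptions made for the example.

```python
import numpy as np


def transmit_slot(actions, num_bands):
    """Resolve one time slot.

    actions[i] is the band index chosen by agent i, or -1 if the agent
    stays silent (the silent action is an assumption of this sketch).
    Returns a 0/1 array: 1 where the agent was alone on its band.
    """
    actions = np.asarray(actions)
    transmitting = actions >= 0
    # Count how many agents picked each band in this slot.
    counts = np.bincount(actions[transmitting], minlength=num_bands)
    # An agent succeeds only if it transmitted and nobody shared its band.
    alone = counts[np.clip(actions, 0, num_bands - 1)] == 1
    return (transmitting & alone).astype(int)


def fairness_gap(totals):
    """Difference between the most and least successful agent (0 is perfectly fair)."""
    totals = np.asarray(totals)
    return int(totals.max() - totals.min())


# Random baseline: 3 agents share 2 bands for 1000 slots.
rng = np.random.default_rng(0)
num_agents, num_bands, num_slots = 3, 2, 1000
totals = np.zeros(num_agents, dtype=int)
for _ in range(num_slots):
    actions = rng.integers(-1, num_bands, size=num_agents)  # -1 means silent
    totals += transmit_slot(actions, num_bands)

print("successes per agent:", totals)
print("fairness gap:", fairness_gap(totals))
```

For example, `transmit_slot([0, 0, 1, -1], num_bands=2)` returns `[0, 0, 1, 0]`: the first two agents collide on band 0, the third is alone on band 1, and the fourth is silent. A learned policy would aim for more total successes than the random baseline while keeping the fairness gap small.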

Reinforcement Learning Overview

Tst

Architecture

Tst

Results

Here are some outputs from the network:

See this report for an in-depth analysis of the problem and results!

References

Here are some