Deep reinforcement learning for audio

Broader context

One of the long-term goals of my research is to create agents that can learn to listen autonomously. To achieve this, we have to move away from the supervised learning paradigm, where an “oracle” model provides a “ground truth” label that needs to be learned. Instead, we need to create agents that are incentivized to pursue their own goals. One way to achieve this is through (deep) reinforcement learning.

Current status

I have already started to explore ways in which virtual agents can learn through reinforcement learning. An existing (private) repository implements one particular environment and learning objective.

Open research questions

There are plenty of open research questions for this topic:

  • How to set up a realistic environment?
  • What are suitable goals for an RL agent?
  • How to measure performance in such a scenario?
  • How can we use those agents post-training?

References

There are practically zero references on applying reinforcement learning to audio in the way that I aim to apply it. Therefore, the following are just general references on reinforcement learning.