While deep reinforcement learning (RL) is becoming an integral part of good
decision-making in data science, it is still plagued by sample inefficiency.
This inefficiency is a challenge when applying deep RL in real-world environments where
physical interactions are expensive and can risk sys