Autonomous parallel-style on-ramp merging in human controlled traffic continues to be an existing issue for autonomous vehicle control. Existing non-learning based solutions for vehicle control rely on rules and optimization primarily. These methods have been seen to present significan