Existing theoretical studies on offline reinforcement learning (RL) mostly
consider a dataset sampled directly from the target task. In practice, however,
data often come from several heterogeneous but related sources. Motivated by
this gap, this work aims to rigorously understand o