We consider the problem of online multi-agent Nash social welfare (NSW)
maximization. While previous works of Hossain et al. [2021] and Jones et al.
[2023] study similar problems in stochastic multi-agent multi-armed bandits and
show that $\sqrt{T}$-regret is possible after $T$ rounds, their fairness
measure is the product of all agents' rewards, instead of the