Sep, 2023

有序保留的 GFlowNets

TL;DROrder-Preserving GFlowNets (OP-GFNs) are proposed to sample candidates in proportion to a learned reward function consistent with a given order, eliminating the need for a predefined scalar reward in tasks like multi-objective optimization, and it is proven to concentrate on higher hierarchy candidates, achieving state-of-the-art performance in various tasks.