Recent years have seen a surge in models predicting the scanpaths of
fixations made by humans when viewing images. However, the field lacks a
principled comparison of these models with respect to their predictive power.
In the past, models have usually been evaluated by comparing human
scanpaths to scanpaths generated by the model. Here, inste