Surgical scenes convey crucial information about the quality of surgery. Pixel-wise localization of tools and anatomical structures is the first task towards deeper surgical analysis for microscopic or endoscopic surgical views. This is typically done via fully-supervised methods which are annotation greedy and in several cases, demanding medical expertise.