Radiology reporting is a crucial part of the communication between radiologists and other medical professionals, but it can be time-consuming and error-prone. One approach to alleviate this is structured reporting, which saves time and enables a more accurate evaluation than free-text reports. However, there is limited research on automating structured reporting, and no public benchmark is available for evaluating and comparing different methods. To close this gap, we introduce Rad-ReStruct, a new benchmark dataset that provides fine-grained, hierarchically ordered annotations in the form of structured reports for X-Ray images. We model the structured reporting task as hierarchical visual question answering (VQA) and propose hi-VQA, a novel method that considers prior context in the form of previously asked questions and answers for populating a structured radiology report. Our experiments show that hi-VQA achieves competitive performance to the state-of-the-art on the medical VQA benchmark VQARad while performing best among methods without domain-specific vision-language pretraining and provides a strong baseline on Rad-ReStruct. Our work represents a significant step towards the automated population of structured radiology reports and provides a valuable first benchmark for future research in this area. We will make all annotations and our code for annotation generation, model evaluation, and training publicly available upon acceptance. Our dataset and code is available at https://github.com/ChantalMP/Rad-ReStruct.

本研究提出了一个新的基准数据集 Rad-ReStruct，通过采用分层结构化报告的形式，提供了细粒度、分层次的注释，以用于支持医学影像元数据的自动填充以及加快放射学报告的编写速度，并通过开发一种新颖的分层视觉问答模型 hi-VQA 实现进行项目分析。

Rad-ReStruct: 一种结构化放射学报告的新型VQA基准和方法