[MICCAI'24] Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation
SEI
The automated generation of imaging reports proves invaluable in alleviating the workload of radiologists. A clinically applicable report generation algorithm should demonstrate its effectiveness in producing reports that accurately describe radiology findings and attend to patient-specific indications. In this paper, we introduce a novel method, Structural Entities extraction and patient indications Incorporation (SEI), for chest X-ray report generation. Specifically, we employ a structural entities extraction (SEE) approach to eliminate presentation-style vocabulary in reports and improve the quality of factual entity sequences. This reduces the noise in the subsequent cross-modal alignment module, which aligns X-ray images with factual entity sequences in reports, thereby enhancing the precision of cross-modal alignment and further aiding the model in gradient-free retrieval of similar historical cases. Subsequently, we propose a cross-modal fusion network to integrate information from X-ray images, similar historical cases, and patient-specific indications. This process allows the text decoder to attend to discriminative features of X-ray images, assimilate historical diagnostic information from similar cases, and understand patients' examination intentions. This, in turn, helps trigger the text decoder to produce high-quality reports. Experiments conducted on MIMIC-CXR validate the superiority of SEI over state-of-the-art approaches on both natural language generation and clinical efficacy metrics.
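To make the fusion idea concrete, below is a generic sketch of a multi-source cross-attention block in PyTorch, where image features attend to similar-case and indication features. The module layout, dimensions, and head count are illustrative assumptions, not SEI's actual architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Generic sketch of fusing image, retrieved-case, and indication
    features via cross-attention (illustrative; not SEI's exact design)."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img_feats, case_feats, indication_feats):
        # Queries come from the image; keys/values come from the
        # concatenated similar-case and indication features.
        context = torch.cat([case_feats, indication_feats], dim=1)
        fused, _ = self.attn(img_feats, context, context)
        return fused

# Toy usage: batch of 2, 49 image patches, 20 case tokens, 16 indication tokens.
fusion = CrossModalFusion()
out = fusion(torch.randn(2, 49, 512), torch.randn(2, 20, 512), torch.randn(2, 16, 512))
print(out.shape)  # torch.Size([2, 49, 512])
```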
Update
- 2024-09-09: Uploaded the poster.
- 2024-09-19: Updated the repository to make it easier to use.
- 2024-09-19: Updated the generated reports for the MIMIC-CXR test set.
Requirements
- torch==2.1.2+cu118
- transformers==4.23.1
- torchvision==0.16.2+cu118
- radgraph==0.09
- Due to the specific environment of RadGraph, please refer to `knowledge_encoder/factual_serialization.py` for the environment of the structural entities extraction approach.
Checkpoints
You can download checkpoints of SEI as follows:
- For MIMIC-CXR, you can download checkpoints from Baidu Netdisk (extraction code: MK13) and Hugging Face 🤗.
MIMIC-CXR Datasets
- For MIMIC-CXR, you can download medical images from PhysioNet.
- You can download medical reports from Google Drive. Note that you can apply for access with your PhysioNet license, and a toy case is provided in `knowledge_encoder/case.json` (see the loading sketch below).
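For a quick sanity check, the following minimal sketch loads the toy annotation case. The split-to-sample-list schema assumed in the comments follows common R2Gen-style annotation files and may not match the repository's exact format.

```python
import json

# Minimal sanity check for the toy annotation case.
# NOTE: the schema assumed here (split -> list of samples) is an
# assumption based on R2Gen-style annotation files.
with open("knowledge_encoder/case.json", "r", encoding="utf-8") as f:
    ann = json.load(f)

if isinstance(ann, dict):
    for split, samples in ann.items():
        size = len(samples) if isinstance(samples, (list, dict)) else "n/a"
        print(f"{split}: {size} samples")
else:
    print(type(ann), len(ann))
```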
Reproducibility on MIMIC-CXR (SEI-1)
Structural entities extraction (SEE) approach
- Config the RadGraph environment based on `knowledge_encoder/factual_serialization.py`. The environment setting is as follows:
Basic Setup (One-time activity)
a. Clone the DYGIE++ repository from here. This repository is managed by Wadden et al., authors of the paper Entity, Relation, and Event Extraction with Contextualized Span Representations.
git clone https://github.com/dwadden/dygiepp.git
b. Navigate to the root of the repo on your system and use the following commands to set up the conda environment:
conda create --name dygiepp python=3.7
conda activate dygiepp
cd dygiepp
pip install -r requirements.txt
conda develop .  # Adds DyGIE to your PYTHONPATH
c. Activate the conda environment:
conda activate dygiepp
Notably, for our RadGraph environment, you can refer to `knowledge_encoder/radgraph_requirements.yml`.
- Configure `radgraph_path` and `ann_path` in `knowledge_encoder/see.py`. The `ann_path` corresponds to the local file path of the medical reports, and `radgraph_path` can be obtained directly from PhysioNet.
- Run `knowledge_encoder/see.py` to extract the factual entity sequence for each report (a post-processing sketch is shown after this list).
- Finally, `annotation.json` becomes `mimic_cxr_annotation_sen.json`, whose name is given by the `new_ann_file_name` variable in `see.py`.
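For intuition, here is a hypothetical sketch of how a factual entity sequence could be assembled from RadGraph-style entity annotations. The `entities`/`tokens`/`start_ix` fields follow the published RadGraph output schema, but the real logic lives in `knowledge_encoder/see.py` and may differ.

```python
# Hypothetical post-processing sketch: turn RadGraph-style entity
# annotations into a factual entity sequence, dropping
# presentation-style vocabulary (everything not marked as an entity).
def build_factual_sequence(radgraph_output: dict) -> str:
    entities = radgraph_output.get("entities", {})
    # Sort entities by token position so the sequence preserves the
    # original reading order of the report.
    ordered = sorted(entities.values(), key=lambda e: e["start_ix"])
    return " ".join(e["tokens"] for e in ordered)

# Toy example in the RadGraph output format (structure assumed).
example = {
    "text": "The lungs are clear. No pleural effusion.",
    "entities": {
        "1": {"tokens": "lungs", "label": "ANAT-DP", "start_ix": 1, "end_ix": 1},
        "2": {"tokens": "clear", "label": "OBS-DP", "start_ix": 3, "end_ix": 3},
        "3": {"tokens": "pleural effusion", "label": "OBS-DA", "start_ix": 6, "end_ix": 7},
    },
}
print(build_factual_sequence(example))  # -> "lungs clear pleural effusion"
```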
Conducting the first stage (i.e., training cross-modal alignment module)
- Run `bash pretrain_mimic_cxr.sh` to pretrain a model on the MIMIC-CXR data (note that `mimic_cxr_ann_path` is `mimic_cxr_annotation_sen.json`). A generic sketch of the contrastive alignment objective is shown below.
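The first stage aligns X-ray images with factual entity sequences. As background only, here is a generic InfoNCE-style contrastive loss in PyTorch; the temperature, dimensions, and symmetric form are illustrative assumptions, not the repository's exact objective.

```python
import torch
import torch.nn.functional as F

def info_nce(img_emb: torch.Tensor, txt_emb: torch.Tensor, tau: float = 0.07):
    """Generic image-text contrastive loss (illustrative, not SEI's exact code).

    img_emb, txt_emb: (batch, dim) embeddings of X-ray images and factual
    entity sequences; matched pairs share the same row index.
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / tau  # (batch, batch) pairwise similarities
    targets = torch.arange(img.size(0), device=img.device)
    # Symmetric loss: image-to-text and text-to-image directions.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random embeddings.
loss = info_nce(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```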
Retrieval of similar historical cases for each sample
- Config the `--load` argument in `pretrain_inference_mimic_cxr.sh`. Note that this argument points to the pre-trained model from the first stage.
- Run `bash pretrain_inference_mimic_cxr.sh` to retrieve similar historical cases for each sample, forming `mimic_cxr_annotation_sen_best_reports_keywords_20.json` (i.e., `mimic_cxr_annotation_sen.json` becomes this `*.json` file). A retrieval sketch is shown below.
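For intuition, below is a minimal sketch of gradient-free top-k retrieval over precomputed embeddings. The embedding dimension, corpus size, and k=20 are assumptions (the `_20` filename suffix suggests 20 retrieved cases), and the repository's actual retrieval code may differ.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: retrieve the top-k most similar historical cases
# for each query image via cosine similarity over embeddings produced by
# the stage-one alignment model (shapes and k are illustrative).
query_emb = F.normalize(torch.randn(4, 512), dim=-1)      # current samples
corpus_emb = F.normalize(torch.randn(1000, 512), dim=-1)  # training-set cases

sims = query_emb @ corpus_emb.t()  # (4, 1000) cosine similarities
topk = sims.topk(k=20, dim=-1)     # indices of similar historical cases
print(topk.indices.shape)          # torch.Size([4, 20])
```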
Conducting the second stage (i.e., training report generation module)
- Extract and preprocess the indication section in the radiology report (a regex sketch is shown after this list).
  a. Config `ann_path` and `report_dir` in `knowledge_encoder/preprocessing_indication_section.py`; the value of `ann_path` is `mimic_cxr_annotation_sen_best_reports_keywords_20.json`. Note that `report_dir` can be downloaded from PhysioNet.
  b. Run `knowledge_encoder/preprocessing_indication_section.py`, forming `mimic_cxr_annotation_sen_best_reports_keywords_20_all_components_with_fs_v0227.json`.
- Config the `--load` argument in `finetune_mimic_cxr.sh`. Note that this argument points to the pre-trained model from the first stage. Furthermore, `mimic_cxr_ann_path` is `mimic_cxr_annotation_sen_best_reports_keywords_20_all_components_with_fs_v0227.json`.
- Download these checkpoints. Notably, `chexbert.pth` and `radgraph` are used to calculate CE metrics, while `bert-base-uncased` and `scibert_scivocab_uncased` are the pre-trained models for the cross-modal fusion network and the text encoder. Then put these checkpoints in the same local directory (e.g., `/home/data/checkpoints`) and configure the `--ckpt_zoo_dir /home/data/checkpoints` argument in `finetune_mimic_cxr.sh`.
| Checkpoint | Variable_name | Download |
|---|---|---|
| chexbert.pth | chexbert_path | here |
| bert-base-uncased | bert_path | huggingface |
| radgraph | radgraph_path | PhysioNet |
| scibert_scivocab_uncased | scibert_path | huggingface |
- Run `bash finetune_mimic_cxr.sh` to train the model to generate reports based on similar historical cases.
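As background for the indication preprocessing step above, here is a minimal regex sketch for pulling the indication section out of a raw MIMIC-CXR report. The uppercase section-header pattern is an assumption about typical MIMIC-CXR formatting; the actual logic lives in `knowledge_encoder/preprocessing_indication_section.py`.

```python
import re

# Hypothetical sketch: extract the INDICATION section from a raw
# MIMIC-CXR report. Section headers are assumed to be uppercase labels
# followed by a colon; the next all-caps header (or end of text) closes
# the section.
SECTION_RE = re.compile(r"INDICATION:\s*(.*?)(?=\n[A-Z][A-Z ]+:|\Z)", re.DOTALL)

report = """INDICATION: ___-year-old male with cough and fever.
COMPARISON: None.
FINDINGS: The lungs are clear.
IMPRESSION: No acute cardiopulmonary process.
"""

match = SECTION_RE.search(report)
print(match.group(1).strip() if match else "")
# -> "___-year-old male with cough and fever."
```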
Test
- You must download the medical images, their corresponding reports (i.e., `mimic_cxr_annotation_sen_best_reports_keywords_20_all_components_with_fs_v0227.json`), and checkpoints (i.e., `SEI-1-finetune-model-best.pth`) from the Datasets and Checkpoints sections, respectively.
- Config the `--load` and `--mimic_cxr_ann_path` arguments in `test_mimic_cxr.sh`.
- Run `bash test_mimic_cxr.sh` to generate reports based on similar historical cases.
- Results on MIMIC-CXR are presented in the Experiments section below.
- Next, the code for this project will be streamlined.
Experiments
Main Results
Ablation Study
Citations
If you use or extend our work, please cite our MICCAI 2024 paper:
@InProceedings{liu-sei-miccai-2024,
author={Liu, Kang and Ma, Zhuoqi and Kang, Xiaolu and Zhong, Zhusi and Jiao, Zhicheng and Baird, Grayson and Bai, Harrison and Miao, Qiguang},
title={Structural Entities Extraction and Patient Indications Incorporation for Chest X-Ray Report Generation},
booktitle={Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year={2024},
publisher={Springer Nature Switzerland},
address={Cham},
pages={433--443},
isbn={978-3-031-72384-1},
doi={10.1007/978-3-031-72384-1_41}
}
Acknowledgement
- R2Gen: some code is adapted from R2Gen [1].
- R2GenCMN: some code is adapted from R2GenCMN [2].
- MGCA: some code is adapted from MGCA [3].
References
[1] Chen, Z., Song, Y., Chang, T.H., Wan, X., 2020. Generating radiology reports via memory-driven transformer, in: EMNLP, pp. 1439–1449.
[2] Chen, Z., Shen, Y., Song, Y., Wan, X., 2021. Cross-modal memory networks for radiology report generation, in: ACL, pp. 5904–5914.
[3] Wang, F., Zhou, Y., Wang, S., Vardhanabhuti, V., Yu, L., 2022. Multi-granularity cross-modal alignment for generalized medical visual representation learning, in: NeurIPS, pp. 33536–33549.