jiayangshi nielsr (HF Staff) committed
Commit 02e5b74 · 1 Parent(s): 94b3078

Improve model card and metadata (#1)

- Improve model card and metadata (f29b7757e90b8f2584228a4b1937df7ee8ae8a8b)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +49 -31
README.md CHANGED
@@ -1,14 +1,15 @@
  ---
- license: mit
  library_name: diffusers
  tags:
- - computed-tomography
- - ct-reconstruction
- - diffusion-model
- - latent-diffusion
- - inverse-problems
- - dm4ct
- - sparse-view-ct
  ---

  # Latent Diffusion Model – Synchrotron (DM4CT)
@@ -16,8 +17,8 @@ tags:
  This repository contains the pretrained **latent-space diffusion model** used in the
  **DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026)** benchmark.

- 🔗 Paper: https://openreview.net/forum?id=YE5scJekg5
- 🔗 Arxiv: https://arxiv.org/abs/2602.18589
  🔗 Codebase: https://github.com/DM4CT/DM4CT

  ---
@@ -26,60 +27,77 @@ This repository contains the pretrained **latent-space diffusion model** used in

  This model learns a **prior over CT reconstruction images in a compressed latent space** using a denoising diffusion probabilistic model (DDPM).

- Unlike the pixel diffusion model, diffusion is performed in the latent space of a pretrained autoencoder.

  - **Architecture**:
  - VQ-VAE (image encoder/decoder)
  - 2D UNet operating in latent space
  - **Input resolution (image space)**: 768 × 768
- - **Latent resolution**: (insert latent size, e.g., 192 × 192)
  - **Channels**: 1 (grayscale CT slice)
  - **Training objective**: ε-prediction (standard DDPM formulation)
  - **Noise schedule**: Linear beta schedule
  - **Training dataset**: Synchrotron dataset of rocks (Synchrotron)
  - **Intensity normalization**: Rescaled to (-1, 1)

- The diffusion model operates purely in latent space and relies on the autoencoder for encoding and decoding.
-
- This model is intended to be combined with data-consistency correction for CT reconstruction.

  ---

  ## 📊 Dataset: Synchrotron

- Source:
- https://zenodo.org/records/15420527

  Preprocessing steps:
  - Train/test split
  - Rescale reconstructed slices to (-1, 1)
  - No geometry information is embedded in the model

- The model learns an unconditional latent prior over CT slices.
-
  ---

  ## 🧠 Training Details

- - Optimizer: AdamW
- - Learning rate: 1e-4
- - Batch size: (insert your batch size)
- - Training steps: (insert number of steps)
- - Hardware: NVIDIA A100 GPU
-
- Training scripts:
- - Latent diffusion: https://github.com/DM4CT/DM4CT/blob/main/train_latent.py
- - Autoencoder training: (insert if separate)

  ---

  ## 🚀 Usage

  ```python
- from diffusers import LDMPipeline

- LDMPipeline = DiffusionPipeline.from_pretrained(
  "jiayangshi/synchrotron_latent_diffusion"
  )

- pipeline.to("cuda")
 
  ---
  library_name: diffusers
+ license: mit
+ pipeline_tag: image-to-image
  tags:
+ - computed-tomography
+ - ct-reconstruction
+ - diffusion-model
+ - latent-diffusion
+ - inverse-problems
+ - dm4ct
+ - sparse-view-ct
  ---

  # Latent Diffusion Model – Synchrotron (DM4CT)
 
  This repository contains the pretrained **latent-space diffusion model** used in the
  **DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026)** benchmark.

+ 🔗 Paper: [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
+ 🔗 Project Page: https://dm4ct.github.io/DM4CT/
  🔗 Codebase: https://github.com/DM4CT/DM4CT

  ---
 

  This model learns a **prior over CT reconstruction images in a compressed latent space** using a denoising diffusion probabilistic model (DDPM).

+ Unlike pixel-based diffusion models, diffusion is performed in the latent space of a pretrained autoencoder.

  - **Architecture**:
  - VQ-VAE (image encoder/decoder)
  - 2D UNet operating in latent space
  - **Input resolution (image space)**: 768 × 768
+ - **Latent resolution**: 192 × 192
  - **Channels**: 1 (grayscale CT slice)
  - **Training objective**: ε-prediction (standard DDPM formulation)
  - **Noise schedule**: Linear beta schedule
  - **Training dataset**: Synchrotron dataset of rocks (Synchrotron)
  - **Intensity normalization**: Rescaled to (-1, 1)

+ The diffusion model operates purely in latent space and relies on the autoencoder for encoding and decoding. This model is intended to be combined with data-consistency correction for CT reconstruction tasks.

  ---

  ## 📊 Dataset: Synchrotron

+ The model was trained on a real-world high-resolution CT dataset acquired at a high-energy synchrotron facility.
+
+ Source: https://zenodo.org/records/15420527

  Preprocessing steps:
  - Train/test split
  - Rescale reconstructed slices to (-1, 1)
  - No geometry information is embedded in the model

  ---

  ## 🧠 Training Details

+ - **Optimizer**: AdamW
+ - **Learning rate**: 1e-4
+ - **Hardware**: NVIDIA A100 GPU
+ - **Training script**: [train_latent.py](https://github.com/DM4CT/DM4CT/blob/main/train_latent.py)

  ---

  ## 🚀 Usage

+ You can load and use this model using the `diffusers` library:
+
  ```python
+ from diffusers import DiffusionPipeline
+ import torch

+ pipeline = DiffusionPipeline.from_pretrained(
  "jiayangshi/synchrotron_latent_diffusion"
  )
+ pipeline.to("cuda")
+
+ # Generate an unconditional sample from the CT prior
+ # Note: For reconstruction tasks, this model is typically used with
+ # a custom solver incorporating CT data consistency.
+ output = pipeline()
+ image = output.images[0]
+ image.save("reconstruction_prior.png")
+ ```
+
+ ---

+ ## 📝 Citation
+
+ ```bibtex
+ @inproceedings{
+ shi2026dmct,
+ title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
+ author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
+ booktitle={The Fourteenth International Conference on Learning Representations},
+ year={2026},
+ url={https://openreview.net/forum?id=YE5scJekg5}
+ }
+ ```
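
Note on the model card above: it lists a linear beta schedule and an ε-prediction objective in the standard DDPM formulation, without giving the schedule parameters. The following is a minimal sketch of what those two items mean, not the DM4CT training code. The step count (1000) and the beta range (1e-4 to 0.02) are assumed common DDPM defaults, and the function names and the four-element toy latent are hypothetical, chosen only for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced betas. The defaults are assumptions, not values from the model card."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(latent, t, alpha_bars, rng=random):
    """Forward process: z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in latent]
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noisy = [signal_scale * z + noise_scale * e for z, e in zip(latent, eps)]
    # Under epsilon-prediction, the network is trained to regress eps from (noisy, t).
    return noisy, eps


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

# Toy stand-in for a flattened latent; the real latents would come from the VQ-VAE encoder.
toy_latent = [0.5, -0.25, 0.0, 1.0]
noisy, eps = add_noise(toy_latent, t=500, alpha_bars=alpha_bars)
```

In a training loop, the loss would typically be the mean squared error between `eps` and the UNet's prediction for `noisy` at step `t`; the schedule actually used should be read from the repository's scheduler configuration rather than from this sketch.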