---
license: apache-2.0
pipeline_tag: depth-estimation
---

# Diffusion Knows Transparency (DKT)

[Paper](https://huggingface.co/papers/2512.23705) | [Project Page](https://daniellli.github.io/projects/DKT/) | [GitHub](https://github.com/Daniellli/DKT)

DKT is a foundation model for depth and normal estimation on **transparent-object**, **in-the-wild**, and **arbitrary-length** videos. It repurposes the generative priors of video diffusion models for robust, temporally coherent perception in challenging scenes that involve refraction, reflection, and transmission.

## Usage

Please refer to the [GitHub repository](https://github.com/Daniellli/DKT) for installation instructions.

The model can then be used as follows:

```python
import os

from dkt.pipelines.pipelines import DKTPipeline
from tools.common_utils import save_video

# Load the pipeline and run it on an example video.
pipe = DKTPipeline()
demo_path = 'examples/1.mp4'
prediction = pipe(demo_path)

# Write the colorized depth map to logs/demo.mp4.
save_dir = 'logs'
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, 'demo.mp4')
save_video(prediction['colored_depth_map'], output_path, fps=25)
```
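
To process several videos in one go, the same three steps (predict, pick an output path, save) can be wrapped in a small loop. The sketch below is illustrative rather than part of the DKT codebase: `plan_outputs` and `run_batch` are hypothetical helper names, the `_depth` suffix is an arbitrary choice, and the predictor and saver are passed in as arguments so the helper itself relies only on the standard library. It assumes the predictor returns a dict with a `colored_depth_map` entry and that the saver accepts `(frames, path, fps=...)`, as in the snippet above.

```python
import os
from typing import Any, Callable, Iterable, List, Tuple


def plan_outputs(video_paths: Iterable[str], save_dir: str,
                 suffix: str = "_depth") -> List[Tuple[str, str]]:
    """Pair each input video with an output path inside save_dir.

    Note: inputs that share a file name map to the same output path.
    """
    plan = []
    for path in video_paths:
        stem, _ = os.path.splitext(os.path.basename(path))
        plan.append((path, os.path.join(save_dir, f"{stem}{suffix}.mp4")))
    return plan


def run_batch(video_paths: Iterable[str], save_dir: str,
              predict: Callable[[str], Any],
              save: Callable[..., Any], fps: int = 25) -> List[str]:
    """Run predict on every video and save its colorized depth map."""
    os.makedirs(save_dir, exist_ok=True)
    written = []
    for src, dst in plan_outputs(video_paths, save_dir):
        prediction = predict(src)  # assumed to return a dict, as above
        save(prediction["colored_depth_map"], dst, fps=fps)
        written.append(dst)
    return written
```

With the repository installed, this could be called as `run_batch(['examples/1.mp4'], 'logs', DKTPipeline(), save_video)`, reusing a single pipeline instance for all inputs.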

## Citation

```bibtex
@article{dkt2025,
  title   = {Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation},
  author  = {Shaocong Xu and Songlin Wei and Qizhe Wei and Zheng Geng and Hong Li and Licheng Shen and Qianpu Sun and Shu Han and Bin Ma and Bohan Li and Chongjie Ye and Yuhang Zheng and Nan Wang and Saining Zhang and Hao Zhao},
  journal = {arXiv preprint arXiv:2512.23705},
  year    = {2025}
}
```