CogVideoX-2b CLEAR LoRA: Subtitle Removal (Supplementary)
This repository releases LoRA + expanded input-projection weights for video-to-video subtitle removal on top of zai-org/CogVideoX-2b.
Disclaimer: This is a supplementary experiment from the CLEAR project. The main paper results use Wan2.1-Control; this CogVideoX-2b variant is not expected to match that baseline. It is shared for reproducibility and comparison.
Architecture change (high level)
CogVideoX-2b is originally a text-to-video model. To condition it on the subtitled video, the input channels of the first-stage convolution (patch_embed.proj) are expanded:
- Before: patch_embed.proj: Conv2d(16 → 1920, …)
- After: patch_embed.proj: Conv2d(32 → 1920, …)
  - First 16 channels: noisy latent (inherits pretrained weights)
  - Last 16 channels: subtitle-video latent (new channels, trained)
At inference, the noisy latent and the subtitle latent are concatenated along the channel dimension before entering the transformer, matching the training setup.
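The channel expansion described above can be illustrated with a minimal PyTorch sketch. It is not the repository's training code: the helper name `expand_input_conv`, the kernel size and stride of 2, the zero initialization of the new channels, and the single-frame 4D tensors are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


def expand_input_conv(old: nn.Conv2d, extra_in: int) -> nn.Conv2d:
    """Return a copy of `old` that accepts `extra_in` additional input channels."""
    new = nn.Conv2d(
        old.in_channels + extra_in,
        old.out_channels,
        kernel_size=old.kernel_size,
        stride=old.stride,
        padding=old.padding,
        bias=old.bias is not None,
    )
    with torch.no_grad():
        # Assumption: the new channels start at zero, so the expanded layer
        # initially behaves exactly like the pretrained one.
        new.weight.zero_()
        # The first `old.in_channels` input channels inherit pretrained weights.
        new.weight[:, : old.in_channels] = old.weight
        if old.bias is not None:
            new.bias.copy_(old.bias)
    return new


# Stand-in for patch_embed.proj; kernel_size and stride of 2 are assumed here.
old_proj = nn.Conv2d(16, 1920, kernel_size=2, stride=2)
new_proj = expand_input_conv(old_proj, extra_in=16)

# Single-frame toy latents (batch, channels, height, width).
noisy_latent = torch.randn(1, 16, 8, 8)     # latent being denoised
subtitle_latent = torch.randn(1, 16, 8, 8)  # latent of the subtitled video

# Concatenate along the channel dimension: 16 + 16 = 32 input channels.
x = torch.cat([noisy_latent, subtitle_latent], dim=1)

with torch.no_grad():
    out_new = new_proj(x)
    out_old = old_proj(noisy_latent)
```

With zero-initialized new channels, `out_new` matches `out_old` before any training. This is one common way to add a conditioning input without disturbing the pretrained behavior; the released checkpoint may have been initialized differently.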
Intended use
- Research: subtitle removal / video inpainting with diffusion.
- Not for high-stakes or misleading content; users are responsible for compliance with law and platform policies.
How to use
- Download CogVideoX-2b from zai-org/CogVideoX-2b.
- Place cogvideox_2b_CLEAR_lora_checkpoint.pt locally.
- Run inference with the provided script (example):
export MODEL_PATH="/path/to/CogVideoX-2b"
export CHECKPOINT="/path/to/cogvideox_2b_CLEAR_lora_checkpoint.pt"
bash scripts/inference_cogvideox_2b.sh \
--input_video /path/to/video_with_subtitles.mp4 \
--output_dir ./output
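The two download steps can also be scripted. The sketch below uses `snapshot_download` and `hf_hub_download` from the `huggingface_hub` package. It assumes the checkpoint file is hosted in this model repository under the filename given above; the helper name `fetch_assets` and the `./weights` target directory are hypothetical.

```python
import os


def fetch_assets(target_dir: str = "./weights") -> dict:
    """Download the base model and the CLEAR checkpoint; return their local paths."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, snapshot_download

    # Fetch the full base-model repository.
    model_path = snapshot_download(
        repo_id="zai-org/CogVideoX-2b",
        local_dir=os.path.join(target_dir, "CogVideoX-2b"),
    )
    # Assumption: the checkpoint file lives in this model repository.
    checkpoint = hf_hub_download(
        repo_id="charlesw09/CLEAR-mask-free-video-subtitle-removal-CogvideoX",
        filename="cogvideox_2b_CLEAR_lora_checkpoint.pt",
        local_dir=target_dir,
    )
    return {"MODEL_PATH": model_path, "CHECKPOINT": checkpoint}
```

Calling `fetch_assets()` returns a dictionary whose two values can be exported as `MODEL_PATH` and `CHECKPOINT` before running the inference script.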