---
base_model: black-forest-labs/FLUX.1-dev
library_name: peft
pipeline_tag: text-to-image
license: apache-2.0
tags:
- flux
- lora
- diffusers
- text-rendering
- visual-text-rendering
---

# TextPecker: Flux.1-dev-TextPecker-SQPA

This model is a LoRA adapter for [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev), optimized with the **TextPecker** strategy presented in the paper [TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering](https://huggingface.co/papers/2602.20903).

TextPecker is a plug-and-play, structural-anomaly-perceptive RL strategy that improves the structural fidelity and semantic alignment of visual text rendering in text-to-image generators. This repository provides the LoRA weights trained with Flow-GRPO.

- **Repository:** https://github.com/CIawevy/TextPecker
- **Paper:** [https://arxiv.org/abs/2602.20903](https://arxiv.org/abs/2602.20903)

## Usage

This repository provides only the LoRA weights. Load the FLUX.1-dev base model first, then apply the adapter to its transformer.
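Because the adapter is applied on top of the base pipeline, the usage example in this section imports `torch`, `diffusers`, and `peft`. As a minimal sketch, the check below confirms those packages are importable before any large weights are downloaded. The helper name `missing_packages` is hypothetical and is not part of the TextPecker repository.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Package list taken from the imports of the usage example.
    missing = missing_packages(["torch", "diffusers", "peft"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

The check only looks at whether a module can be located; it does not verify versions or GPU availability.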
```python
import os

import torch
from diffusers import FluxPipeline
from peft import PeftModel

# Force the pure-Python protobuf implementation, as in the reference inference code.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


def load_model(model_path, lora_path=None, device="cuda"):
    """Load the FLUX pipeline and optionally wrap its transformer with LoRA weights."""
    pipe = FluxPipeline.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,
    ).to(device)

    # Show a compact per-timestep progress bar.
    pipe.set_progress_bar_config(
        position=1,
        disable=False,
        leave=False,
        desc="Timestep",
        dynamic_ncols=True,
    )

    if lora_path is not None:
        # lora_path may be a local directory or a Hub repo ID, so it is not
        # gated on os.path.exists (that check would silently skip a Hub ID).
        pipe.transformer = PeftModel.from_pretrained(pipe.transformer, lora_path)
        pipe.transformer.eval()  # inference mode
        print(f"Loaded LoRA weights from: {lora_path}")

    return pipe


model_id = "black-forest-labs/FLUX.1-dev"
lora_ckpt_path = "CIawevy/Flux.1-dev-TextPecker-SQPA"  # Hub repo ID or local adapter directory
device = "cuda"

# Inference settings from the reference code
negative_prompt = " "  # in recent diffusers releases, used only when true_cfg_scale > 1
width, height = 1024, 1024
num_inference_steps = 50
guidance_scale = 3.5
max_sequence_length = 512  # maximum prompt length passed to the T5 text encoder

# Load the FLUX pipeline with the TextPecker LoRA applied
pipe = load_model(model_id, lora_ckpt_path, device=device)

prompt = (
    'a weathered cave explorers journal page, with the phrase "TextPecker" '
    "prominently written in faded ink, surrounded by sketches of ancient "
    "ruins and cryptic symbols, under a dim, mystical light."
)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    max_sequence_length=max_sequence_length,
    generator=torch.Generator(device=device).manual_seed(42),  # fixed seed for reproducibility
).images[0]

image.save("TextPecker_flux_demo.png")
print("Image saved as: TextPecker_flux_demo.png")
```

## Citation

If you find TextPecker useful in your research or work, please cite the original paper:

```bibtex
@article{zhu2026TextPecker,
  title   = {TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering},
  author  = {Zhu, Hanshen and Liu, Yuliang and Wu, Xuecheng and Wang, An-Lan and Feng, Hao and Yang, Dingkang and Feng, Chao and Huang, Can and Tang, Jingqun and Bai, Xiang},
  journal = {arXiv preprint arXiv:2602.20903},
  year    = {2026}
}
```