Improve model card: Update pipeline tag, library name, add usage example and specific code link

This PR improves the model card for `BAGEL-NHR-Edit` by:
* Updating the `pipeline_tag` from `any-to-any` to `image-to-image` for more accurate categorization and discoverability on the Hub.
* Changing the `library_name` from `bagel-mot` to `transformers`, because the model is based on the Qwen2 architecture and is compatible with the Hugging Face Transformers library.
* Adding `image-editing` and `text-guided-image-editing` to the `tags` metadata for better model discoverability.
* Including a direct link to the specific NHR-Edit GitHub repository (`https://github.com/Riko0/No-Humans-Required-Dataset`) in the header links.
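* Adding a Python code snippet that demonstrates inference with the `transformers` library: it loads the model and processor, then generates a textual description of the requested edit. Rendering the edited image is outside the snippet's scope.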
* Adding the Hugging Face Papers link for the paper alongside the existing arXiv link in the header for easy access.
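
The metadata changes listed above can be checked mechanically. The following is a minimal sketch and not part of the PR: it reads the YAML front matter of the updated card with a small hand-written parser and prints the new fields. The helper name `parse_front_matter` is hypothetical, and it handles only the flat `key: value` and `- item` forms used in this card; a real YAML library would be more robust.

```python
def parse_front_matter(readme_text):
    """Parse flat `key: value` pairs and `- item` lists from YAML front matter.

    Hypothetical helper for illustration; it covers only the simple forms
    that appear in this model card, not general YAML.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recently seen key
            if not isinstance(metadata[current_key], list):
                metadata[current_key] = []
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value starts a list
            metadata[current_key] = value if value else []
    return metadata


# Front matter as it appears on the added side of the diff
CARD = """---
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
library_name: transformers
license: apache-2.0
pipeline_tag: image-to-image
arxiv: 2507.14119
tags:
- image-editing
- text-guided-image-editing
---
# BAGEL-NHR-Edit
"""

meta = parse_front_matter(CARD)
print(meta["pipeline_tag"])
print(meta["library_name"])
print(meta["tags"])
```

The same check could be run against the README fetched from the Hub; the inline string is used here only to keep the sketch self-contained.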

Please review and merge this pull request if the changes align with the repository's guidelines.

Files changed (1)

README.md +70 -10


@@ -1,29 +1,89 @@
 ---
-license: apache-2.0
 base_model:
 - ByteDance-Seed/BAGEL-7B-MoT
-pipeline_tag: any-to-any
-library_name: bagel-mot
 arxiv: 2507.14119
 ---
 # 🥯 BAGEL-NHR-Edit
 <p align="left">
-  <a href="https://riko0.github.io/No-Humans-Required/"> 🌐 NHR Website </a> |
-  <a href="https://arxiv.org/abs/2507.14119"> 📜 NHR Paper on arXiv </a> |
-  <a href="https://huggingface.co/datasets/iitolstykh/NHR-Edit"> 🤗 NHR-Edit Dataset </a> |
 </p>
-This repository hosts the model weights for **BAGEL**, fine-tuned on the **[NHR-Edit](https://huggingface.co/datasets/iitolstykh/NHR-Edit)** dataset. For installation, usage instructions, and further documentation, please visit the [official BAGEL GitHub repository](https://github.com/bytedance-seed/BAGEL).
 ### 🛠️ Training Setup
 We performed parameter-efficient adaptation on the generation expert’s attention and FFN projection layers using LoRA.
-LoRA parameters:
 ```
 r = 16
 lora_alpha = 16
@@ -83,4 +143,4 @@ Results comparison between original Bagel-7B-MoT and BAGEL-NHR-EDIT on samples f
     url = {https://arxiv.org/abs/2507.14119},
     journal={arXiv preprint arXiv:2507.14119}
 }
-```

 ---
 base_model:
 - ByteDance-Seed/BAGEL-7B-MoT
+library_name: transformers
+license: apache-2.0
+pipeline_tag: image-to-image
 arxiv: 2507.14119
+tags:
+- image-editing
+- text-guided-image-editing
 ---
 # 🥯 BAGEL-NHR-Edit
 <p align="left">
+  <a href="https://riko0.github.io/No-Humans-Required/"> 🌐 NHR Website </a> |
+  <a href="https://huggingface.co/papers/2507.14119"> 📜 NHR Paper </a> |
+  <a href="https://arxiv.org/abs/2507.14119"> 📜 NHR Paper on arXiv </a> |
+  <a href="https://huggingface.co/datasets/iitolstykh/NHR-Edit"> 🤗 NHR-Edit Dataset </a> |
+  <a href="https://github.com/Riko0/No-Humans-Required-Dataset"> 💻 NHR-Edit Code </a>
 </p>
+This repository hosts the model weights for **BAGEL-NHR-Edit**, a **BAGEL** model fine-tuned on the **[NHR-Edit](https://huggingface.co/datasets/iitolstykh/NHR-Edit)** dataset, as presented in the paper [NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining](https://huggingface.co/papers/2507.14119). For installation, usage instructions, and further documentation specific to NHR-Edit, please visit the [NHR-Edit GitHub repository](https://github.com/Riko0/No-Humans-Required-Dataset).
+### 🚀 Sample Usage
+You can use the model with the Hugging Face `transformers` library.
+```python
+from transformers import AutoProcessor, AutoModelForCausalLM
+import torch
+from PIL import Image
+import requests # Required to load image from URL
+# Load model and processor
+# It's recommended to use bfloat16 for better performance and memory efficiency if supported.
+model = AutoModelForCausalLM.from_pretrained(
+    "iitolstykh/BAGEL-NHR-Edit",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True # Needed for custom model architecture/components
+)
+processor = AutoProcessor.from_pretrained("iitolstykh/BAGEL-NHR-Edit", trust_remote_code=True)
+# Example image input (replace with your actual image path or PIL Image object)
+image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
+image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
+# Example text instruction for editing
+instruction = "Change the car's color to red."
+# Prepare the conversation in the model's chat template format
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image},
+            {"type": "text", "text": instruction},
+        ],
+    }
+]
+# Apply chat template and prepare inputs for the model
+text = processor.apply_chat_template(
+    messages, tokenize=False, add_generation_prompt=True
+)
+inputs = processor(text=[text], images=[image], return_tensors="pt")
+inputs = {k: v.to(model.device) for k, v in inputs.items()}
+# Generate the textual output describing the image edit
+# This output typically includes bounding box or quad information for the edited regions.
+generated_ids = model.generate(**inputs, max_new_tokens=1024)
+generated_text = processor.batch_decode(generated_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
+print("Generated Textual Description of Edit:")
+print(generated_text)
+# To visualize the actual image edit, further post-processing (e.g., using a diffusion model conditioned on this output)
+# would be required, which is beyond the scope of this basic inference example.
+```
 ### 🛠️ Training Setup
 We performed parameter-efficient adaptation on the generation expert’s attention and FFN projection layers using LoRA.
+LoRA parameters:
 ```
 r = 16
 lora_alpha = 16
     url = {https://arxiv.org/abs/2507.14119},
     journal={arXiv preprint arXiv:2507.14119}
 }
+```
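
For reference, the LoRA settings shown in the training setup (`r = 16`, `lora_alpha = 16`) give a scaling factor of `lora_alpha / r = 1` under the standard LoRA formulation, where the adapted weight is `W + (lora_alpha / r) * B @ A`. The sketch below illustrates that arithmetic with plain Python lists. The matrix sizes and values are made up for illustration and are not taken from the model or from the authors' training code, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merged_weight(w, lora_a, lora_b, r, lora_alpha):
    """Return W + (lora_alpha / r) * (B @ A), the standard LoRA merge."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Values from the card's training setup
r, lora_alpha = 16, 16
scale = lora_alpha / r

# Toy 2x2 base weight with rank-16 adapters (illustrative values only)
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 0.0] for _ in range(r)]  # shape (r, in_features)
lora_b = [[0.25] * r, [0.0] * r]         # shape (out_features, r)

merged = lora_merged_weight(w, lora_a, lora_b, r, lora_alpha)
print(scale)
print(merged)
```

With `lora_alpha` equal to `r`, the low-rank update is added to the base weight unscaled.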