Improve model card: Update pipeline tag, library name, add usage example and specific code link
This PR improves the model card for `BAGEL-NHR-Edit` by:
* Updating the `pipeline_tag` from `any-to-any` to `image-to-image` for more accurate categorization and discoverability on the Hub.
* Setting the `library_name` to `transformers` as the model is compatible with the Hugging Face Transformers library (based on Qwen2 architecture).
* Adding `image-editing` and `text-guided-image-editing` to the `tags` metadata for better model discoverability.
* Including a direct link to the specific NHR-Edit GitHub repository (`https://github.com/Riko0/No-Humans-Required-Dataset`) in the header links.
* Adding a comprehensive Python code snippet demonstrating how to use the model for inference with the `transformers` library.
* Adding the Hugging Face Papers link for the paper alongside the existing arXiv link in the header for easy access.
Please review and merge this pull request if the changes align with the repository's guidelines.
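The metadata changes listed above can be checked mechanically before merging. The following is a minimal sketch and not part of the PR: `parse_front_matter` and `check_metadata` are hypothetical helpers, the parser handles only the flat YAML subset used in this card's header (scalars and simple lists), and the embedded card text is copied from the new side of the diff.

```python
# Hypothetical review helper: parse the card's YAML front matter and
# confirm the fields this PR claims to set. Not a general YAML parser.

CARD_TEXT = """\
---
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
library_name: transformers
license: apache-2.0
pipeline_tag: image-to-image
arxiv: 2507.14119
tags:
- image-editing
- text-guided-image-editing
---

# BAGEL-NHR-Edit
"""


def parse_front_matter(card_text):
    """Return the header fields as a dict of strings and lists of strings."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key of the list currently being filled, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of the front matter block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_key = None
        else:
            meta[key] = []  # a bare "key:" starts a list
            current_key = key
    return meta


def check_metadata(meta):
    """Return a list of mismatches against the values named in the PR description."""
    problems = []
    expected_scalars = {
        "pipeline_tag": "image-to-image",
        "library_name": "transformers",
    }
    for key, value in expected_scalars.items():
        if meta.get(key) != value:
            problems.append(f"{key}: expected {value!r}, found {meta.get(key)!r}")
    for tag in ("image-editing", "text-guided-image-editing"):
        if tag not in meta.get("tags", []):
            problems.append(f"missing tag {tag!r}")
    return problems


meta = parse_front_matter(CARD_TEXT)
problems = check_metadata(meta)
```

An empty `problems` list means the header matches the description. This only confirms that the fields are present; it does not verify that the model actually loads with the declared library.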
@@ -1,29 +1,89 @@
 ---
-license: apache-2.0
 base_model:
 - ByteDance-Seed/BAGEL-7B-MoT
-
-
 arxiv: 2507.14119
 ---

-
 # 🥯 BAGEL-NHR-Edit

 <p align="left">
-<a href="https://riko0.github.io/No-Humans-Required/"> NHR Website </a> |
-<a href="https://
-<a href="https://
 </p>

-This repository hosts the model weights for **BAGEL**, fine-tuned on the **[NHR-Edit](https://huggingface.co/datasets/iitolstykh/NHR-Edit)** dataset. For installation, usage instructions, and further documentation, please visit the [


 ### 🛠️ Training Setup

 We performed parameter-efficient adaptation on the generation expert’s attention and FFN projection layers using LoRA.

-LoRA parameters:
 ```
 r = 16
 lora_alpha = 16
@@ -83,4 +143,4 @@ Results comparison between original Bagel-7B-MoT and BAGEL-NHR-EDIT on samples f
 url = {https://arxiv.org/abs/2507.14119},
 journal={arXiv preprint arXiv:2507.14119}
 }
-```
@@ -1,29 +1,89 @@
 ---
 base_model:
 - ByteDance-Seed/BAGEL-7B-MoT
+library_name: transformers
+license: apache-2.0
+pipeline_tag: image-to-image
 arxiv: 2507.14119
+tags:
+- image-editing
+- text-guided-image-editing
 ---

 # 🥯 BAGEL-NHR-Edit

 <p align="left">
+<a href="https://riko0.github.io/No-Humans-Required/"> NHR Website </a> |
+<a href="https://huggingface.co/papers/2507.14119"> NHR Paper </a> |
+<a href="https://arxiv.org/abs/2507.14119"> NHR Paper on arXiv </a> |
+<a href="https://huggingface.co/datasets/iitolstykh/NHR-Edit"> 🤗 NHR-Edit Dataset </a> |
+<a href="https://github.com/Riko0/No-Humans-Required-Dataset"> 💻 NHR-Edit Code </a>
 </p>

+This repository hosts the model weights for **BAGEL-NHR-Edit**, a **BAGEL** model fine-tuned on the **[NHR-Edit](https://huggingface.co/datasets/iitolstykh/NHR-Edit)** dataset, as presented in the paper [NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining](https://huggingface.co/papers/2507.14119). For installation, usage instructions, and further documentation specific to NHR-Edit, please visit the [NHR-Edit GitHub repository](https://github.com/Riko0/No-Humans-Required-Dataset).
+
+### Sample Usage
+
+You can use the model with the Hugging Face `transformers` library.
+
+```python
+from transformers import AutoProcessor, AutoModelForCausalLM
+import torch
+from PIL import Image
+import requests # Required to load image from URL
+
+# Load model and processor
+# It's recommended to use bfloat16 for better performance and memory efficiency if supported.
+model = AutoModelForCausalLM.from_pretrained(
+    "iitolstykh/BAGEL-NHR-Edit",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True # Needed for custom model architecture/components
+)
+processor = AutoProcessor.from_pretrained("iitolstykh/BAGEL-NHR-Edit")
+
+# Example image input (replace with your actual image path or PIL Image object)
+image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
+image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
+
+# Example text instruction for editing
+instruction = "Change the car's color to red."
+
+# Prepare the conversation in the model's chat template format
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image},
+            {"type": "text", "text": instruction},
+        ],
+    }
+]
+
+# Apply chat template and prepare inputs for the model
+text = processor.apply_chat_template(
+    messages, tokenize=False, add_generation_prompt=True
+)
+inputs = processor(text=[text], images=[image], return_tensors="pt")
+inputs = {k: v.to(model.device) for k, v in inputs.items()}

+# Generate the textual output describing the image edit
+# This output typically includes bounding box or quad information for the edited regions.
+generated_ids = model.generate(**inputs, max_new_tokens=1024)
+generated_text = processor.batch_decode(generated_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
+
+print("Generated Textual Description of Edit:")
+print(generated_text)
+
+# To visualize the actual image edit, further post-processing (e.g., using a diffusion model conditioned on this output)
+# would be required, which is beyond the scope of this basic inference example.
+```

 ### 🛠️ Training Setup

 We performed parameter-efficient adaptation on the generation expert’s attention and FFN projection layers using LoRA.

+LoRA parameters:
 ```
 r = 16
 lora_alpha = 16
@@ -83,4 +143,4 @@ Results comparison between original Bagel-7B-MoT and BAGEL-NHR-EDIT on samples f
 url = {https://arxiv.org/abs/2507.14119},
 journal={arXiv preprint arXiv:2507.14119}
 }
+```
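For reviewers less familiar with the LoRA parameters quoted in the Training Setup hunk (`r = 16`, `lora_alpha = 16`), the sketch below shows what they imply under the standard LoRA formulation, where the low-rank update is scaled by `lora_alpha / r`. It is an illustration only, not the authors' training code: the helper names and the tiny 2x2 matrices are hypothetical, and it assumes the standard scaling rather than any variant the training setup may have used.

```python
# Illustrative sketch of the standard LoRA forward pass:
#   y = W x + (lora_alpha / r) * B (A x)
# Plain Python lists are used so the example has no dependencies.


def lora_scaling(r, lora_alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Apply the frozen weight W plus the scaled low-rank update B A."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = lora_scaling(r, lora_alpha)
    return [b + scale * u for b, u in zip(base, update)]


# With the values from the model card the update is applied unscaled.
card_scale = lora_scaling(r=16, lora_alpha=16)

# Toy example with rank 2 and the same alpha/r ratio of 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
A = [[1.0, 0.0], [0.0, 1.0]]   # down-projection (r x d_in)
B = [[0.5, 0.0], [0.0, 0.5]]   # up-projection (d_out x r)
x = [2.0, 4.0]
y = lora_forward(W, A, B, x, r=2, lora_alpha=2)
```

Because `lora_alpha` equals `r`, the learned update `B A x` is added to the base output without rescaling; choosing a larger `lora_alpha` at the same rank would amplify it proportionally.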