Image-Text-to-Text
Transformers
Safetensors
internvl
vision-action
inverse-dynamics-model
embodied-ai
game-ai
conversational
Instructions to use open-world-agents/Generalist-IDM-1B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use open-world-agents/Generalist-IDM-1B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model="open-world-agents/Generalist-IDM-1B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    pipe(text=messages)

    # Load model directly
    from transformers import AutoProcessor, AutoModelForMultimodalLM

    processor = AutoProcessor.from_pretrained("open-world-agents/Generalist-IDM-1B")
    model = AutoModelForMultimodalLM.from_pretrained("open-world-agents/Generalist-IDM-1B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    # Decode only the newly generated tokens, skipping the prompt.
    print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use open-world-agents/Generalist-IDM-1B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "open-world-agents/Generalist-IDM-1B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "open-world-agents/Generalist-IDM-1B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker
docker model run hf.co/open-world-agents/Generalist-IDM-1B
- SGLang
How to use open-world-agents/Generalist-IDM-1B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "open-world-agents/Generalist-IDM-1B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "open-world-agents/Generalist-IDM-1B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "open-world-agents/Generalist-IDM-1B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "open-world-agents/Generalist-IDM-1B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
- Docker Model Runner
How to use open-world-agents/Generalist-IDM-1B with Docker Model Runner:
docker model run hf.co/open-world-agents/Generalist-IDM-1B
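The curl calls above all post the same OpenAI-style chat payload; in these commands only the port differs (8000 for vLLM, 30000 for SGLang). As a rough illustration, the same request could be built and sent from Python with only the standard library. The helper names `build_chat_payload` and `post_chat` are hypothetical, and the reply is returned as raw parsed JSON because its exact shape depends on the server.

```python
import json
import urllib.request

MODEL_ID = "open-world-agents/Generalist-IDM-1B"


def build_chat_payload(prompt: str, image_url: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl examples send."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(base_url: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST the payload to <base_url>/v1/chat/completions and parse the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server from one of the commands above already running, `post_chat("http://localhost:8000", build_chat_payload("Describe this image in one sentence.", image_url))` would target vLLM, and the same call with port 30000 would target SGLang.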
{
  "</box>": 151673,
  "</img>": 151666,
  "</quad>": 151669,
  "</ref>": 151671,
  "</tool_call>": 151658,
  "<0>": 151675,
  "<1>": 151676,
  "<2>": 151677,
  "<3>": 151678,
  "<4>": 151679,
  "<5>": 151680,
  "<6>": 151681,
  "<7>": 151682,
  "<8>": 151683,
  "<9>": 151684,
  "<EVENT_END>": 151685,
  "<EVENT_START>": 151686,
  "<IMG_CONTEXT>": 151667,
  "<KEYBOARD>": 151687,
  "<MB_0>": 151688,
  "<MB_10>": 151689,
  "<MB_11>": 151690,
  "<MB_12>": 151691,
  "<MB_13>": 151692,
  "<MB_14>": 151693,
  "<MB_15>": 151694,
  "<MB_1>": 151695,
  "<MB_2>": 151696,
  "<MB_3>": 151697,
  "<MB_4>": 151698,
  "<MB_5>": 151699,
  "<MB_6>": 151700,
  "<MB_7>": 151701,
  "<MB_8>": 151702,
  "<MB_9>": 151703,
  "<MOUSE>": 151704,
  "<SCREEN>": 151705,
  "<SIGN_MINUS>": 151706,
  "<SIGN_PLUS>": 151707,
  "<VK_0>": 151708,
  "<VK_100>": 151709,
  "<VK_101>": 151710,
  "<VK_102>": 151711,
  "<VK_103>": 151712,
  "<VK_104>": 151713,
  "<VK_105>": 151714,
  "<VK_106>": 151715,
  "<VK_107>": 151716,
  "<VK_108>": 151717,
  "<VK_109>": 151718,
  "<VK_10>": 151719,
  "<VK_110>": 151720,
  "<VK_111>": 151721,
  "<VK_112>": 151722,
  "<VK_113>": 151723,
  "<VK_114>": 151724,
  "<VK_115>": 151725,
  "<VK_116>": 151726,
  "<VK_117>": 151727,
  "<VK_118>": 151728,
  "<VK_119>": 151729,
  "<VK_11>": 151730,
  "<VK_120>": 151731,
  "<VK_121>": 151732,
  "<VK_122>": 151733,
  "<VK_123>": 151734,
  "<VK_124>": 151735,
  "<VK_125>": 151736,
  "<VK_126>": 151737,
  "<VK_127>": 151738,
  "<VK_128>": 151739,
  "<VK_129>": 151740,
  "<VK_12>": 151741,
  "<VK_130>": 151742,
  "<VK_131>": 151743,
  "<VK_132>": 151744,
  "<VK_133>": 151745,
  "<VK_134>": 151746,
  "<VK_135>": 151747,
  "<VK_136>": 151748,
  "<VK_137>": 151749,
  "<VK_138>": 151750,
  "<VK_139>": 151751,
  "<VK_13>": 151752,
  "<VK_140>": 151753,
  "<VK_141>": 151754,
  "<VK_142>": 151755,
  "<VK_143>": 151756,
  "<VK_144>": 151757,
  "<VK_145>": 151758,
  "<VK_146>": 151759,
  "<VK_147>": 151760,
  "<VK_148>": 151761,
  "<VK_149>": 151762,
  "<VK_14>": 151763,
  "<VK_150>": 151764,
  "<VK_151>": 151765,
  "<VK_152>": 151766,
  "<VK_153>": 151767,
  "<VK_154>": 151768,
  "<VK_155>": 151769,
  "<VK_156>": 151770,
  "<VK_157>": 151771,
  "<VK_158>": 151772,
  "<VK_159>": 151773,
  "<VK_15>": 151774,
  "<VK_160>": 151775,
  "<VK_161>": 151776,
  "<VK_162>": 151777,
  "<VK_163>": 151778,
  "<VK_164>": 151779,
  "<VK_165>": 151780,
  "<VK_166>": 151781,
  "<VK_167>": 151782,
  "<VK_168>": 151783,
  "<VK_169>": 151784,
  "<VK_16>": 151785,
  "<VK_170>": 151786,
  "<VK_171>": 151787,
  "<VK_172>": 151788,
  "<VK_173>": 151789,
  "<VK_174>": 151790,
  "<VK_175>": 151791,
  "<VK_176>": 151792,
  "<VK_177>": 151793,
  "<VK_178>": 151794,
  "<VK_179>": 151795,
  "<VK_17>": 151796,
  "<VK_180>": 151797,
  "<VK_181>": 151798,
  "<VK_182>": 151799,
  "<VK_183>": 151800,
  "<VK_184>": 151801,
  "<VK_185>": 151802,
  "<VK_186>": 151803,
  "<VK_187>": 151804,
  "<VK_188>": 151805,
  "<VK_189>": 151806,
  "<VK_18>": 151807,
  "<VK_190>": 151808,
  "<VK_191>": 151809,
  "<VK_192>": 151810,
  "<VK_193>": 151811,
  "<VK_194>": 151812,
  "<VK_195>": 151813,
  "<VK_196>": 151814,
  "<VK_197>": 151815,
  "<VK_198>": 151816,
  "<VK_199>": 151817,
  "<VK_19>": 151818,
  "<VK_1>": 151819,
  "<VK_200>": 151820,
  "<VK_201>": 151821,
  "<VK_202>": 151822,
  "<VK_203>": 151823,
  "<VK_204>": 151824,
  "<VK_205>": 151825,
  "<VK_206>": 151826,
  "<VK_207>": 151827,
  "<VK_208>": 151828,
  "<VK_209>": 151829,
  "<VK_20>": 151830,
  "<VK_210>": 151831,
  "<VK_211>": 151832,
  "<VK_212>": 151833,
  "<VK_213>": 151834,
  "<VK_214>": 151835,
  "<VK_215>": 151836,
  "<VK_216>": 151837,
  "<VK_217>": 151838,
  "<VK_218>": 151839,
  "<VK_219>": 151840,
  "<VK_21>": 151841,
  "<VK_220>": 151842,
  "<VK_221>": 151843,
  "<VK_222>": 151844,
  "<VK_223>": 151845,
  "<VK_224>": 151846,
  "<VK_225>": 151847,
  "<VK_226>": 151848,
  "<VK_227>": 151849,
  "<VK_228>": 151850,
  "<VK_229>": 151851,
  "<VK_22>": 151852,
  "<VK_230>": 151853,
  "<VK_231>": 151854,
  "<VK_232>": 151855,
  "<VK_233>": 151856,
  "<VK_234>": 151857,
  "<VK_235>": 151858,
  "<VK_236>": 151859,
  "<VK_237>": 151860,
  "<VK_238>": 151861,
  "<VK_239>": 151862,
  "<VK_23>": 151863,
  "<VK_240>": 151864,
  "<VK_241>": 151865,
  "<VK_242>": 151866,
  "<VK_243>": 151867,
  "<VK_244>": 151868,
  "<VK_245>": 151869,
  "<VK_246>": 151870,
  "<VK_247>": 151871,
  "<VK_248>": 151872,
  "<VK_249>": 151873,
  "<VK_24>": 151874,
  "<VK_250>": 151875,
  "<VK_251>": 151876,
  "<VK_252>": 151877,
  "<VK_253>": 151878,
  "<VK_254>": 151879,
  "<VK_255>": 151880,
  "<VK_25>": 151881,
  "<VK_26>": 151882,
  "<VK_27>": 151883,
  "<VK_28>": 151884,
  "<VK_29>": 151885,
  "<VK_2>": 151886,
  "<VK_30>": 151887,
  "<VK_31>": 151888,
  "<VK_32>": 151889,
  "<VK_33>": 151890,
  "<VK_34>": 151891,
  "<VK_35>": 151892,
  "<VK_36>": 151893,
  "<VK_37>": 151894,
  "<VK_38>": 151895,
  "<VK_39>": 151896,
  "<VK_3>": 151897,
  "<VK_40>": 151898,
  "<VK_41>": 151899,
  "<VK_42>": 151900,
  "<VK_43>": 151901,
  "<VK_44>": 151902,
  "<VK_45>": 151903,
  "<VK_46>": 151904,
  "<VK_47>": 151905,
  "<VK_48>": 151906,
  "<VK_49>": 151907,
  "<VK_4>": 151908,
  "<VK_50>": 151909,
  "<VK_51>": 151910,
  "<VK_52>": 151911,
  "<VK_53>": 151912,
  "<VK_54>": 151913,
  "<VK_55>": 151914,
  "<VK_56>": 151915,
  "<VK_57>": 151916,
  "<VK_58>": 151917,
  "<VK_59>": 151918,
  "<VK_5>": 151919,
  "<VK_60>": 151920,
  "<VK_61>": 151921,
  "<VK_62>": 151922,
  "<VK_63>": 151923,
  "<VK_64>": 151924,
  "<VK_65>": 151925,
  "<VK_66>": 151926,
  "<VK_67>": 151927,
  "<VK_68>": 151928,
  "<VK_69>": 151929,
  "<VK_6>": 151930,
  "<VK_70>": 151931,
  "<VK_71>": 151932,
  "<VK_72>": 151933,
  "<VK_73>": 151934,
  "<VK_74>": 151935,
  "<VK_75>": 151936,
  "<VK_76>": 151937,
  "<VK_77>": 151938,
  "<VK_78>": 151939,
  "<VK_79>": 151940,
  "<VK_7>": 151941,
  "<VK_80>": 151942,
  "<VK_81>": 151943,
  "<VK_82>": 151944,
  "<VK_83>": 151945,
  "<VK_84>": 151946,
  "<VK_85>": 151947,
  "<VK_86>": 151948,
  "<VK_87>": 151949,
  "<VK_88>": 151950,
  "<VK_89>": 151951,
  "<VK_8>": 151952,
  "<VK_90>": 151953,
  "<VK_91>": 151954,
  "<VK_92>": 151955,
  "<VK_93>": 151956,
  "<VK_94>": 151957,
  "<VK_95>": 151958,
  "<VK_96>": 151959,
  "<VK_97>": 151960,
  "<VK_98>": 151961,
  "<VK_99>": 151962,
  "<VK_9>": 151963,
  "<box>": 151672,
  "<img>": 151665,
  "<press>": 151964,
  "<quad>": 151668,
  "<ref>": 151670,
  "<release>": 151965,
  "<tool_call>": 151657,
  "<video>": 151674,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
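Judging only by their names, many of the tokens in the mapping above fall into numbered families: `<VK_n>` for n from 0 to 255, `<MB_n>` for n from 0 to 15, and the single digits `<0>` to `<9>`. The sketch below groups token names by family and extracts the numeric suffix. What each family means (for example, that `<VK_n>` stands for a virtual-key code n) is an assumption and is not stated in the file. `classify_token` and the `SAMPLE` subset are hypothetical helpers for illustration; the IDs in `SAMPLE` are copied from the mapping.

```python
import re
from collections import Counter
from typing import Optional, Tuple

# A small subset of the mapping above, copied verbatim.
SAMPLE = {
    "<EVENT_START>": 151686,
    "<EVENT_END>": 151685,
    "<KEYBOARD>": 151687,
    "<MOUSE>": 151704,
    "<VK_0>": 151708,
    "<VK_65>": 151925,
    "<VK_255>": 151880,
    "<MB_0>": 151688,
    "<MB_15>": 151694,
    "<0>": 151675,
    "<9>": 151684,
    "<press>": 151964,
    "<release>": 151965,
    "<|im_end|>": 151645,
}

# Matches "<VK_12>", "<MB_3>", or a bare number such as "<7>".
_NUMBERED = re.compile(r"^<(?:(VK|MB)_)?(\d+)>$")


def classify_token(token: str) -> Tuple[str, Optional[int]]:
    """Return (family, number) for numbered tokens, else ("other", None)."""
    match = _NUMBERED.match(token)
    if match is None:
        return ("other", None)
    family = match.group(1) or "digit"  # no prefix means a bare digit token
    return (family, int(match.group(2)))


# Count how many sample tokens fall into each family.
family_counts = Counter(classify_token(name)[0] for name in SAMPLE)
```

Applied to the full file instead of `SAMPLE`, the same grouping would show how much of the added vocabulary is taken up by each numbered family.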