YuukiAsuna/Vintern-1B-v2-ViTable-docvqa

Document Question Answering

image-feature-extraction


YuukiAsuna committed on Nov 17, 2024

Commit f8b2073 · verified · 1 parent: 49d69f3

Update README.md

Files changed (1)

README.md +38 -3


@@ -1,3 +1,38 @@
----
-license: mit
----

+---
+license: mit
+datasets:
+- YuukiAsuna/VietnameseTableVQA
+language:
+- vi
+base_model:
+- 5CD-AI/Vintern-1B-v2
+pipeline_tag: document-question-answering
+library_name: transformers
+---
+# Model Card for Vintern-1B-v2-ViTable-docvqa
+<!-- Provide a quick summary of what the model is/does. -->
+Vintern-1B-v2-ViTable-docvqa is a fine-tuned version of the 5CD-AI/Vintern-1B-v2 multimodal model for Vietnamese document visual question answering (DocVQA) on table data.
+## Benchmarks
+To be developed later
+## Quickstart
+To be developed later
+**Citation:**
+```bibtex
+@misc{doan2024vintern1befficientmultimodallarge,
+      title={Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese},
+      author={Khang T. Doan and Bao G. Huynh and Dung T. Hoang and Thuc D. Pham and Nhat H. Pham and Quan T. M. Nguyen and Bang Q. Vo and Suong N. Hoang},
+      year={2024},
+      eprint={2408.12480},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2408.12480},
+}
+```
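
The metadata block this commit adds at the top of README.md is plain YAML front matter, so it can be inspected programmatically. The following is only a minimal sketch: `README_HEAD` is copied from the added lines of the diff above, and `parse_front_matter` is a hypothetical helper written for this illustration that handles just the simple subset used here (scalar `key: value` pairs and `- item` lists). It is not a general YAML parser and not part of any library.

```python
# Front matter copied from the "+" lines of the diff above.
README_HEAD = """\
---
license: mit
datasets:
- YuukiAsuna/VietnameseTableVQA
language:
- vi
base_model:
- 5CD-AI/Vintern-1B-v2
pipeline_tag: document-question-answering
library_name: transformers
---
"""


def parse_front_matter(text):
    """Hypothetical helper: return the metadata between the first two
    '---' lines as a dict. Supports only scalars and flat lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key currently collecting list items, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter: end of the metadata block
        if not stripped:
            continue
        if stripped.startswith("- "):
            # List item: attach to the most recent key that had no value.
            if current_key is not None:
                meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    return meta


metadata = parse_front_matter(README_HEAD)
print(metadata["base_model"])  # ['5CD-AI/Vintern-1B-v2']
```

For anything beyond this flat structure, a real YAML parser would be the safer choice.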