Instructions to use aisingapore/Gemma-SEA-LION-v4-27B-IT with libraries and local apps.

Libraries

How to use aisingapore/Gemma-SEA-LION-v4-27B-IT with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="aisingapore/Gemma-SEA-LION-v4-27B-IT")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("aisingapore/Gemma-SEA-LION-v4-27B-IT")
model = AutoModelForMultimodalLM.from_pretrained("aisingapore/Gemma-SEA-LION-v4-27B-IT")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use aisingapore/Gemma-SEA-LION-v4-27B-IT with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "aisingapore/Gemma-SEA-LION-v4-27B-IT"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
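
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000`, and the helper names `build_chat_request` and `ask` are illustrative, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       model="aisingapore/Gemma-SEA-LION-v4-27B-IT",
                       base_url="http://localhost:8000"):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt to a running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` should return the model's reply as a string. Passing a different `base_url` points the same helper at any other OpenAI-compatible server.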

Use Docker

docker model run hf.co/aisingapore/Gemma-SEA-LION-v4-27B-IT

SGLang

How to use aisingapore/Gemma-SEA-LION-v4-27B-IT with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "aisingapore/Gemma-SEA-LION-v4-27B-IT" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "aisingapore/Gemma-SEA-LION-v4-27B-IT" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aisingapore/Gemma-SEA-LION-v4-27B-IT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use aisingapore/Gemma-SEA-LION-v4-27B-IT with Docker Model Runner:
```
docker model run hf.co/aisingapore/Gemma-SEA-LION-v4-27B-IT
```

SAnocha committed on Aug 19, 2025

Commit 15bb0cd (verified) · 1 Parent(s): 9c418e9

Update README

Files changed (1)

README.md +49 -57

README.md CHANGED

@@ -21,18 +21,18 @@ license: gemma
 base_model_relation: finetune
 ---
-*Gemma-SEA-LION-v4-27B (Base Model) Last updated: 2025-08-18*
 ---
-# Model Card for Gemma-SEA-LION-v4-27B
 <!-- Provide a quick summary of what the model is/does. -->
 **SEA-LION** is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned
 for the Southeast Asia (SEA) region.
-Gemma-SEA-LION-v4-27B is a multilingual model which has undergone continued pre-training on
 approximately **500B** tokens across 11 SEA languages: Bahasa Indonesia, Burmese, Chinese, English,
 Khmer, Lao, Malay, Tagalog, Tamil, Thai and Vietnamese.
@@ -46,7 +46,7 @@ Khmer, Lao, Malay, Tagalog, Tamil, Thai and Vietnamese.
 SEA-LION stands for *Southeast Asian Languages In One Network*.
 We performed continued pre-training in English and SEA languages on Gemma 3 27B IT,
-a decoder model using the Gemma 3 architecture, to create Gemma-SEA-LION-v4-27B.
 For tokenization, the model employs the default tokenizer used in Gemma 3 27B IT.
@@ -58,7 +58,7 @@ For tokenization, the model employs the default tokenizer used in Gemma 3 27B IT
 - **Context length:** 128k
 - **Language(s) (NLP):**  Bahasa Indonesia, Burmese, Chinese, English, Khmer, Lao, Malay, Tagalog, Tamil, Thai and Vietnamese
 - **License:** [Gemma Terms of Use](https://ai.google.dev/gemma/terms)
-- **Finetuned from model:** [Gemma-3-27B-IT](https://huggingface.co/google/gemma-3-27b-it)
 ### Model Sources
@@ -92,7 +92,7 @@ due to the potential inconsistencies.
 **Limitations**
-In terms of vision capability, Gemma-SEA-LION-v4-27B has been trained and fine-tuned exclusively on the text back-end.
 As a result, its vision capabilities are expected to be comparable to those of Gemma 3 IT 27B,
 and may not exhibit significant improvements or differences in this area. [🤗 google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it )
@@ -110,7 +110,7 @@ import torch
 pipe = pipeline(
     "text-generation",
-    model="aisingapore/Gemma-SEA-LION-v4-27B",
     device="cuda",
     torch_dtype=torch.bfloat16
 )
@@ -143,47 +143,17 @@ The dataset comprises Bahasa Indonesia, Burmese, Chinese, English, Khmer, Lao, M
 Thai and Vietnamese languages, collected from a mixture of sources including web data, code, open-source datasets,
 and synthetically generated datasets, amounting to a total of 500 billion tokens.
-The 500 billion tokens are sampled from a much larger pool of 1 trillion tokens from open-sourced datasets with the optimal datamix shown below determined by our experiments.
-| Language                         | Dataset Name             | Total Tokens (B) | Percentage (%) | Total percentage (%) |
-|-----------------------------------|-------------------------|------------------|----------------|---------------------|
-| Code                              | StarCoder (OLMo 2 Version) | 50B             | 10             | 10                  |
-| EN                                | Fineweb-Edu             | 80B              | 16             | 40                  |
-|                                  | DCLM-OLMo2-HQ           | 80B              | 16             |                     |
-|                                  | Non-CC-EN               | 40B              | 8              |                     |
-| ZH                                | SEA-LION Pile v1        | 13.5B            | 2.7            | 9                   |
-|                                  | Fineweb2                | 13.5B            | 2.7            |                     |
-|                                  | Fineweb2-HQ             | 4.5B             | 0.9            |                     |
-| VI                                | SEA-LION Pile v1        | 4.25B            | 0.85           | 8.5                 |
-|                                  | SEA-LION Pile v2        | 12.75B           | 2.55           |                     |
-|                                  | Fineweb2                | 8.5B             | 1.7            |                     |
-|                                  | Non-CC-VI               | 17B              | 3.4            |                     |
-| ID                                | SEA-LION Pile v1        | 5.66B            | 1.13           | 8.5                 |
-|                                  | SEA-LION Pile v2        | 17B              | 3.4            |                     |
-|                                  | Fineweb2                | 11.33B           | 2.27           |                     |
-|                                  | Non-CC-ID               | 8.5B             | 1.7            |                     |
-| TH                                | SEA-LION Pile v1        | 3.035B           | 0.61           | 8.5                 |
-|                                  | SEA-LION Pile v2        | 9.107B           | 1.82           |                     |
-|                                  | Fineweb2                | 3.035B           | 0.61           |                     |
-|                                  | WangChanBERTa           | 3.035B           | 0.61           |                     |
-|                                  | Dolmav1                 | 3.035B           | 0.61           |                     |
-|                                  | Non-CC-TH               | 21.25B           | 4.25           |                     |
-| TL, TA, MS, KM, LO and MY         | ALL_LANG                | 77.5B            | 15.5           | 15.5                |
 Note:
 - All token counts are counted using Gemma 3 tokenizer.
 - Pre-training was conducted with batches of 8k token lengths.
-- SEA-Pile v1 is processed from Common Crawl WET, which is published [here](https://huggingface.co/datasets/aisingapore/sea-lion-pile).
 The main proportion is from mC4 dataset (corpus [link](https://huggingface.co/datasets/bertin-project/mc4-sampling)).
 The cutoff date of this version is September 2020.
-- SEA-Pile v2 is processed from Common Crawl WARC from October 2020 to April 2024.
 - Tamil news is sourced with permission from [Seithi](https://seithi.mediacorp.sg/)
@@ -194,16 +164,17 @@ The cutoff date of this version is September 2020.
 #### Training Hyperparameters
-- **Training regime:**
-| Hyperparameter    | Gemma-SEA-LION-v4-27B |
-|-------------------|-----------------------|
-| Precision         | bfloat16              |
-| Optimizer         | decoupled_adamw       |
-| Scheduler         | CosineAnnealing       |
-| Learning Rate     | 4.00E-08              |
-| Global Batch Size | 1024                  |
  <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
@@ -217,11 +188,11 @@ The cutoff date of this version is September 2020.
 <!-- This should link to a Dataset Card if possible. -->
-We evaluated Gemma-SEA-LION-v4-27B on general language capabilities.
 **Testing Data**
-General NLP Behaviour
 For the evaluation of general language capabilities, we employed the SEA-HELM evaluation benchmark
 across a variety of tasks. These tasks include Question Answering (QA), Sentiment Analysis (Sentiment),
@@ -229,26 +200,47 @@ Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>E
 Abstractive Summarisation (Abssum), Causal Reasoning (Causal), Natural Language Inference (NLI),
 and linguistic diagnostics (LINDSEA).
 #### Factors
 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-Our evaluations were set based on task. For all tasks, the model is expected to provide an answer tag
-from which the answer is automatically extracted. For tasks where options are provided,
-the answer should comprise one of the pre-defined options. The scores for each task is normalised to account
-for baseline performance due to random chance.
 #### Metrics
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-The evaluation was done **five-shot** with native prompts on a sample of 100-1000 instances for each dataset.
 ### Results
-For details on Gemma-SEA-LION-v4-27B performance, please refer to the SEA-HELM leaderboard, [Leaderboard results on SEA-HELM](https://leaderboard.sea-lion.ai/).
 #### Summary

 base_model_relation: finetune
 ---
+*Gemma-SEA-LION-v4-27B-IT (IT Model) Last updated: 2025-08-18*
 ---
+# Model Card for Gemma-SEA-LION-v4-27B-IT
 <!-- Provide a quick summary of what the model is/does. -->
 **SEA-LION** is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned
 for the Southeast Asia (SEA) region.
+Gemma-SEA-LION-v4-27B-IT is a multilingual model which has undergone continued pre-training on
 approximately **500B** tokens across 11 SEA languages: Bahasa Indonesia, Burmese, Chinese, English,
 Khmer, Lao, Malay, Tagalog, Tamil, Thai and Vietnamese.
 SEA-LION stands for *Southeast Asian Languages In One Network*.
 We performed continued pre-training in English and SEA languages on Gemma 3 27B IT,
+a decoder model using the Gemma 3 architecture, to create Gemma-SEA-LION-v4-27B-IT.
 For tokenization, the model employs the default tokenizer used in Gemma 3 27B IT.
 - **Context length:** 128k
 - **Language(s) (NLP):**  Bahasa Indonesia, Burmese, Chinese, English, Khmer, Lao, Malay, Tagalog, Tamil, Thai and Vietnamese
 - **License:** [Gemma Terms of Use](https://ai.google.dev/gemma/terms)
+- **Finetuned from model:** Gemma-SEA-LION-v4-27B
 ### Model Sources
 **Limitations**
+In terms of vision capability, Gemma-SEA-LION-v4-27B-IT has been trained and fine-tuned exclusively on the text back-end.
 As a result, its vision capabilities are expected to be comparable to those of Gemma 3 IT 27B,
 and may not exhibit significant improvements or differences in this area. [🤗 google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it )
 pipe = pipeline(
     "text-generation",
+    model="aisingapore/Gemma-SEA-LION-v4-27B-IT",
     device="cuda",
     torch_dtype=torch.bfloat16
 )
 Thai and Vietnamese languages, collected from a mixture of sources including web data, code, open-source datasets,
 and synthetically generated datasets, amounting to a total of 500 billion tokens.
 Note:
 - All token counts are counted using Gemma 3 tokenizer.
 - Pre-training was conducted with batches of 8k token lengths.
+- SEA-LION Pile v1 is processed from Common Crawl WET, which is published [here](https://huggingface.co/datasets/aisingapore/sea-lion-pile).
 The main proportion is from mC4 dataset (corpus [link](https://huggingface.co/datasets/bertin-project/mc4-sampling)).
 The cutoff date of this version is September 2020.
+- SEA-LION Pile v2 is processed from Common Crawl WARC from October 2020 to April 2024.
 - Tamil news is sourced with permission from [Seithi](https://seithi.mediacorp.sg/)
 #### Training Hyperparameters
+- **Training regime:** We perform post-training using a variety of Reinforcement Learning (RL) methods.
+The instruction fine-tuning dataset combines our SEA-Instruct, Infinity-Instruct,
+and OpenMath-Instruct 2 with open-source datasets such as
+nvidia/Llama-Nemotron-Post-Training-Dataset (RL set) and zwhe99/DeepMath-103K.
+Prompt sampling is guided by a gradient-based analysis process.
+Our post-training workflow consists of multiple stages: instruction fine-tuning,
+model merging, online RL for both instruction following and math using DRGPPO,
+and on-policy alignment via APO. For alignment, rejected-chosen pairs are generated
+from the target model, with the “chosen” responses obtained by rewriting and improving upon
+the *rejected* outputs.
  <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 <!-- This should link to a Dataset Card if possible. -->
+We evaluated Gemma-SEA-LION-v4-27B-IT on both general language capabilities and instruction-following capabilities.
 **Testing Data**
+General
 For the evaluation of general language capabilities, we employed the SEA-HELM evaluation benchmark
 across a variety of tasks. These tasks include Question Answering (QA), Sentiment Analysis (Sentiment),
 Abstractive Summarisation (Abssum), Causal Reasoning (Causal), Natural Language Inference (NLI),
 and linguistic diagnostics (LINDSEA).
+Instruction-following
+We evaluated the models on instruction-following capabilities with two datasets,
+SEA-IFEval (based on IFEval) and SEA-MTBench (based on MT-Bench).
+The two datasets were originally in English, the linguists and native speakers
+in the team worked together to filter, localise and translate the datasets
+into the respective target languages to ensure that the examples remained reasonable,
+meaningful and natural.
 #### Factors
 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+For instruction-following tasks, our evaluations were organised based on each specific task.
+SEA-IFEval (more languages)
+SEA-IFEval evaluates a model's ability to adhere to constraints provided in the prompt,
+for example beginning a response with a specific word/phrase or answering with
+a certain number of sections. Additionally, accuracy is normalised by the proportion of responses
+in the correct language (if the model performs the task correctly but responds in the wrong language,
+it is judged to have failed the task).
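
One way to read the SEA-IFEval scoring rule above is that a response only counts as correct when it both satisfies the constraint and is written in the target language. The sketch below illustrates that reading; the `JudgedResponse` type and the function name are hypothetical, and the actual SEA-HELM implementation may compute the normalisation differently.

```python
from dataclasses import dataclass


@dataclass
class JudgedResponse:
    follows_instruction: bool  # did the response satisfy the prompt's constraint?
    in_target_language: bool   # was the response written in the expected language?


def language_normalised_accuracy(responses):
    """Fraction of responses that meet the constraint AND use the target language."""
    if not responses:
        return 0.0
    passed = sum(
        r.follows_instruction and r.in_target_language for r in responses
    )
    return passed / len(responses)
```

Under this reading, a response that follows the instruction perfectly but answers in the wrong language contributes nothing to the score.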
+SEA-MTBench
+SEA-MTBench evaluates a model's ability to engage in multi-turn (2 turns) conversations and
+respond in ways that align with human needs. We use gpt-4-1106-preview as the judge model and
+compare against gpt-3.5-turbo-0125 as the baseline model. The metric used is the weighted win rate
+against the baseline model (i.e. average win rate across each category: Math, Reasoning, STEM, Humanities, Roleplay, Writing, Extraction).
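
The weighted win rate described above averages per-category win rates, so each category counts equally regardless of how many prompts it contains. The following is a rough sketch of that calculation. The function name and the `(category, won)` input shape are assumptions made for illustration, and ties are not modelled because the text does not say how they are scored.

```python
from collections import defaultdict


def weighted_win_rate(judgements):
    """Average of per-category win rates.

    judgements: iterable of (category, won) pairs, where won is True when the
    judge preferred the evaluated model over the baseline.
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for category, won in judgements:
        totals[category] += 1
        wins[category] += int(won)
    if not totals:
        return 0.0
    per_category = [wins[c] / totals[c] for c in totals]
    return sum(per_category) / len(per_category)
```

For example, one win out of two in Math and one win out of one in Writing gives (0.5 + 1.0) / 2 = 0.75, whereas pooling all three comparisons would give 2/3.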
 #### Metrics
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+The evaluation was done **zero-shot** with native prompts on a sample of 100-1000 instances for each dataset.
 ### Results
+For details on Gemma-SEA-LION-v4-27B-IT performance, please refer to the SEA-HELM leaderboard, [Leaderboard results on SEA-HELM](https://leaderboard.sea-lion.ai/).
 #### Summary