---
license: apache-2.0
base_model: LiquidAI/LFM2.5-1.2B-Instruct
tags:
- recruitment
- cv-matching
- keyword-extraction
- resume-screening
- hr-tech
- sft
- dpo
- lora
- unsloth
datasets:
- custom
language:
- en
pipeline_tag: text-generation
model-index:
- name: LFM2.5-1.2B-MOAT
  results:
  - task:
      type: text-generation
      name: CV-JD Assessment
    metrics:
    - name: Score MAE
      type: mae
      value: 6.82
    - name: JSON Parse Rate
      type: accuracy
      value: 99.9
    - name: Verdict Accuracy
      type: accuracy
      value: 76.8
    - name: Score Bias
      type: custom
      value: 1.53
---

# LFM2.5-1.2B-MOAT

**M**ulti-task **O**ptimized **A**ssessment **T**ool: a fine-tuned [LiquidAI/LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) model for recruitment AI. It handles two tasks with a single model:

1. **CV-JD Assessment**: match scoring plus qualitative analysis
2. **Keyword Extraction**: structured keyword extraction from job descriptions and CVs

## Training

- **Base model**: LiquidAI/LFM2.5-1.2B-Instruct (1.2B params, hybrid short-convolution + attention architecture)
- **Stage 1 (multi-task SFT)**: 39,641 examples (19,588 assessments + 20,053 keywords), LoRA r=32/α=64, 1 epoch, LR=5e-5
- **Stage 2 (targeted DPO)**: 2,374 filtered problematic pairs (absolute score difference ≥ 5 pts), LoRA r=16/α=32, beta=0.2, LR=5e-6
- **Hardware**: NVIDIA RTX 5080 16GB, total training time ~3.5 hours
- **Training data**: Gemini-generated assessments and keyword extractions across tech, healthcare, finance, and blue collar domains

## Performance

### CV-JD Assessment (4,898 held-out samples)

| Metric | V1 Baseline | MOAT V2 | Target |
|--------|------------|---------|--------|
| JSON Parse Rate | 97.0% | **99.9%** | ≥95% |
| Score MAE | 13.1 pts | **6.82 pts** | <8 |
| Score Bias | -13.0 pts | **+1.53 pts** | ~0 |
| Verdict Accuracy | 50.0% | **76.8%** | >60% |
| Within 5 pts | n/a | **51.4%** | n/a |
| Within 10 pts | n/a | **77.5%** | n/a |
| Median Absolute Error | n/a | **4.90 pts** | n/a |

### Keyword Extraction (10 diverse samples across domains)

| Field | Result |
|-------|--------|
| JSON Parse Rate | 100% |
| Schema Complete | 100% |
| Experience Years | 100% |
| Domain | 90% |
| Education | 80% |
| Seniority | 80% |
| Skills (avg F1) | 0.58 |

Skills F1 varies by domain: it is higher for white collar roles (0.74-0.84) than for blue collar and healthcare roles (0.33-0.58). The model extracts correct skills but sometimes at a different granularity than the reference labels.

## Usage with vLLM

```python
from vllm import LLM, SamplingParams

model = LLM(
    model="GazTrab/LFM2.5-1.2B-MOAT",
    max_model_len=4096,
    gpu_memory_utilization=0.85,
    dtype="bfloat16",
    trust_remote_code=True,
    max_num_seqs=64,
)
tokenizer = model.get_tokenizer()

sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_tokens=2048,
)

# Build prompt using chat template.
# SYSTEM_PROMPT and USER_PROMPT are the task prompts from the "Task Prompts" section.
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = model.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

### Important Notes

- **max_model_len=4096**: the model was trained with this context length
- **temperature=0.1, top_p=0.1**: low temperature for consistent structured output
- **trust_remote_code=True**: required for the LFM2.5 architecture (hybrid short-convolution + attention)
- With `max_model_len=4096` and `max_tokens=2048`, prompts exceeding ~2048 tokens should be truncated to leave room for generation
- The model outputs raw JSON, without markdown fences

## Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "GazTrab/LFM2.5-1.2B-MOAT"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# SYSTEM_PROMPT and USER_PROMPT are the task prompts from the "Task Prompts" section.
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_PROMPT},
]
# return_dict=True yields input_ids and attention_mask together.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.1,
    top_p=0.1,
    top_k=50,
    repetition_penalty=1.05,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
print(response)
```

## Task Prompts

### Task 1: CV-JD Assessment

**System prompt:**

```
You are an expert recruitment AI that analyzes CV-JD compatibility.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "match_score": ,
  "executive_summary": "<2-3 sentence overview>",
  "strengths": ["", "", ...],
  "gaps": ["", "", ...],
  "recommendation": "Interview|Consider|Not recommended",
  "verdict": "STRONG_MATCH|GOOD_MATCH|MODERATE_MATCH|WEAK_MATCH|NOT_SUITABLE"
}

Guidelines:
- Be specific and quantified in strengths/gaps (e.g., "5/7 required skills", "3 years below requirement")
- Reference actual skills from the JD and CV
- Verdict must align with match_score brackets
- Keep strengths and gaps to 2-4 items each
```

**User prompt format:**

```
Analyze the following CV against the Job Description and provide a structured assessment.

=== JOB DESCRIPTION ===
{jd_text}

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:
```

**Verdict-to-score mapping:**

| Verdict | Score Range |
|---------|-------------|
| STRONG_MATCH | 85-100 |
| GOOD_MATCH | 70-84 |
| MODERATE_MATCH | 50-69 |
| WEAK_MATCH | 30-49 |
| NOT_SUITABLE | 0-29 |

### Task 2: Keyword Extraction

**System prompt:**

```
You are an expert recruitment AI that extracts structured keywords from documents.
You MUST respond with valid JSON only. No additional text before or after the JSON.

Output schema:
{
  "skills": ["", "", ...],
  "experience_years": ,
  "education": "",
  "certifications": ["", "", ...],
  "domain": "<2-4 word domain>",
  "seniority": ""
}

Guidelines:
- Extract only explicitly stated skills, not inferred ones
- For CVs: infer experience_years from work history dates
- For JDs: use the stated requirement, or 0 if not specified
- Skills should be lowercase
- Keep domain to 2-4 words
```

**User prompt format (for JDs):**

```
Extract structured keywords from the following Job Description.

=== JOB DESCRIPTION ===
{jd_text}

Respond with JSON only:
```

**User prompt format (for CVs):**

```
Extract structured keywords from the following CV/Resume.

=== CANDIDATE CV ===
{cv_text}

Respond with JSON only:
```

## Limitations

- **Low-score bias**: Scores in the 0-20 range tend to be overestimated by ~8 points (the model struggles to score below ~17)
- **Blue collar granularity**: Keyword extraction for trade/blue collar roles sometimes outputs overly verbose skill descriptions
- **Training data domains**: Primarily trained on tech, healthcare, and finance; the model generalizes to other domains but with slightly lower quality
- **Context length**: Long CVs or JDs may need truncation to stay within the 2048-token prompt budget

## Citation

```bibtex
@misc{gaztrab2026moat,
  title={LFM2.5-1.2B-MOAT: Multi-task Optimized Assessment Tool for Recruitment},
  author={GazTrab},
  year={2026},
  url={https://huggingface.co/GazTrab/LFM2.5-1.2B-MOAT}
}
```
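
## Example: Validating Assessment Output

The Task 1 system prompt asks that the verdict align with the `match_score` brackets, so downstream code may want to verify this after parsing. The following is a minimal sketch using only the Python standard library. The names `VERDICT_RANGES`, `verdict_for_score`, and `parse_assessment` are hypothetical helpers introduced here for illustration and are not shipped with the model; the score brackets are copied from the verdict-to-score mapping table.

```python
import json

# Score brackets copied from the verdict-to-score mapping table.
VERDICT_RANGES = {
    "STRONG_MATCH": (85, 100),
    "GOOD_MATCH": (70, 84),
    "MODERATE_MATCH": (50, 69),
    "WEAK_MATCH": (30, 49),
    "NOT_SUITABLE": (0, 29),
}


def verdict_for_score(score):
    """Return the verdict whose bracket contains an integer score, or None."""
    for verdict, (low, high) in VERDICT_RANGES.items():
        if low <= score <= high:
            return verdict
    return None


def parse_assessment(raw_text):
    """Parse raw model output and collect verdict/score inconsistencies.

    Returns (assessment, issues). Raises json.JSONDecodeError when the text
    is not valid JSON and ValueError when the JSON is not an object.
    """
    assessment = json.loads(raw_text.strip())
    if not isinstance(assessment, dict):
        raise ValueError("expected a JSON object at the top level")

    issues = []
    score = assessment.get("match_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0 <= score <= 100:
        issues.append(f"match_score missing or out of range: {score!r}")
    else:
        expected = verdict_for_score(round(score))
        actual = assessment.get("verdict")
        if actual != expected:
            issues.append(
                f"verdict {actual!r} does not match score {score} "
                f"(expected {expected!r})"
            )
    return assessment, issues


if __name__ == "__main__":
    # Hand-written sample, not real model output.
    sample = '{"match_score": 72, "verdict": "STRONG_MATCH"}'
    _, problems = parse_assessment(sample)
    print(problems)
```

A caller could log the returned issues, or overwrite the verdict with the one implied by the score; which policy is appropriate depends on the application.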