About
“Qwen (Ben) Franklin” is a set of local LoRA experiments that turn compact Qwen-family models into a Benjamin Franklin conversational persona. The adapters range from fast 1.7B prototypes through 4B cleanup/factual variants to the newer 7B Qwen2.5 experiments, which are the largest Franklin LoRAs proven trainable on this machine.
The key lesson: the 7B models feel more coherent, but stubborn factual corrections such as the Craven Street bones story still benefit from retrieval or prompt-context. Some 4B variants contain stronger factual phrasing but suffer offline tool_call-tag regressions.
Performance table
| Model | Base family | GB | r | Score | Flags | Card |
|---|---|---|---|---|---|---|
| qwen2.5-7b-ben-franklin-v1-lite-r4-qv | Qwen2.5 7B Instruct 4-bit | 0.06 | 4 | 30 | {'base_identity_leak': 1, 'overdenial': 1} | card |
| qwen2.5-7b-ben-franklin-v2-coherence-r4-qv | Qwen2.5 7B Instruct 4-bit | 0.06 | 4 | 33 | {'base_identity_leak': 1} | card |
| qwen2.5-7b-ben-franklin-v3-factual-coherence-r4-qv | Qwen2.5 7B Instruct 4-bit | 0.06 | 4 | 35 | {'base_identity_leak': 1} | card |
| qwen2.5-7b-ben-franklin-v3c-factual-r8-qv-from-base | Qwen2.5 7B Instruct 4-bit | 0.08 | 8 | 28 | {'base_identity_leak': 1, 'overdenial': 1} | card |
| qwen3-4b-instruct-2507-ben-franklin-v1-lora | Qwen3 4B Instruct 4-bit | 1.34 | 16 | card | ||
| qwen3-4b-instruct-2507-ben-franklin-v2-chatml-lora | Qwen3 4B Instruct 4-bit | 2.19 | 32 | card | ||
| qwen3-4b-instruct-2507-ben-franklin-v3-chatml-completions-lora | Qwen3 4B Instruct 4-bit | 2.19 | 32 | card | ||
| qwen3-4b-instruct-2507-ben-franklin-v4-toolcall-clean-lora | Qwen3 4B Instruct 4-bit | 1.81 | 32 | card | ||
| qwen3-4b-instruct-2507-ben-franklin-v5-english-lock-lora | Qwen3 4B Instruct 4-bit | 1.81 | 32 | -92 | {'tool_call': 15, 'continuity_miss': 1} | card |
| qwen3-1.7b-ben-franklin-identity-reinforced-lora | Qwen3 1.7B 4-bit | 0.99 | 32 | card | ||
| qwen3-1.7b-ben-franklin-openai-expanded-lora | Qwen3 1.7B 4-bit | 1.2 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-lora | Qwen3 1.7B 4-bit | 0.99 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v2-lora | Qwen3 1.7B 4-bit | 0.78 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v3-negative-identity-lora | Qwen3 1.7B 4-bit | 0.78 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v4-balanced-lora | Qwen3 1.7B 4-bit | 0.78 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v5-ood-fixed-lora | Qwen3 1.7B 4-bit | 0.78 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v6-1-ood-fixed-lora | Qwen3 1.7B 4-bit | 0.99 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v6-contrastive-lora | Qwen3 1.7B 4-bit | 1.62 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v7-natural-dialogue-lora | Qwen3 1.7B 4-bit | 1.2 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v8-minimal-thought-lora | Qwen3 1.7B 4-bit | 0.99 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v9-factual-dialogue-lora | Qwen3 1.7B 4-bit | 1.62 | 32 | card | ||
| qwen3-1.7b-ben-franklin-thinking-v9-from-v2-factual-dialogue-lora | Qwen3 1.7B 4-bit | 1.62 | 32 | card |
Data mix overview
Training data included persona SFT, OpenAI-expanded Franklin dialogue, thinking/identity reinforcement, OOD corrections, natural dialogue repair, tool-call cleanup, English-lock cleanup, 7B ChatML answer-clean rows, and targeted coherence/factual repair rows.
| Dataset | Rows |
|---|---|
| franklin_7b_coherence_repair_v2.jsonl | 287 |
| franklin_7b_factual_coherence_repair_v3.jsonl | 308 |
| franklin_identity_reinforcement.jsonl | 212 |
| franklin_negative_identity_thinking.jsonl | 829 |
| franklin_persona_openai_expanded.jsonl | 891 |
| franklin_persona_sft.jsonl | 288 |
| franklin_qwen3_4b_answer_only.jsonl | 3253 |
| franklin_qwen3_4b_english_lock_cleanup.jsonl | 816 |
| franklin_qwen3_4b_toolcall_cleanup.jsonl | 920 |
| franklin_qwen3_8b_chatml_answer_clean.jsonl | 2400 |
| franklin_thinking_sft.jsonl | 1259 |
| franklin_thinking_strong_reinforcement.jsonl | 575 |
| franklin_v4_general_balanced_thinking.jsonl | 825 |
| franklin_v5_out_of_domain_correction.jsonl | 684 |
| franklin_v6_1_targeted_ood_fix.jsonl | 870 |
| franklin_v6_contrastive_thinking.jsonl | 599 |
| franklin_v6_contrastive_thinking_final.jsonl | 2684 |
| franklin_v6_contrastive_thinking_weighted.jsonl | 1584 |
| franklin_v7_natural_dialogue_repair.jsonl | 1830 |
| franklin_v8_minimal_thought_repair.jsonl | 1040 |
| franklin_v9_factual_dialogue.jsonl | 3019 |
How these models might be useful
- Local historical-character chatbots with a warmer Franklin voice.
- RAG-backed educational demos where retrieved facts ground the persona.
- Comparative LoRA experiments on identity persistence, tool-call cleanup, English-only steering, and natural dialogue.
- Small-footprint offline agents that offer practical advice in a civic/philosophical style.
- Training-data research: the folder preserves adapter artifacts, configs, model cards, and benchmark links for future ablations.
Browse models
qwen2.5-7b-ben-franklin-v1-lite-r4-qv
Family: Qwen2.5 7B Instruct 4-bit · Size: 0.06 GB · LoRA: r=4 alpha=8 · model card
Benchmark: 30 · Flags: {'base_identity_leak': 1, 'overdenial': 1}
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Benchmark score: 30 with flags {'base_identity_leak': 1, 'overdenial': 1}.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen2.5-7b-ben-franklin-v1-lite-r4-qv
qwen2.5-7b-ben-franklin-v2-coherence-r4-qv
Family: Qwen2.5 7B Instruct 4-bit · Size: 0.06 GB · LoRA: r=4 alpha=8 · model card
Benchmark: 33 · Flags: {'base_identity_leak': 1}
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Best modest improvement over 7B v1 for coherence and reduced over-denial in the broad benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_coherence_repair_v2.jsonl: 287
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen2.5-7b-ben-franklin-v2-coherence-r4-qv
qwen2.5-7b-ben-franklin-v3-factual-coherence-r4-qv
Family: Qwen2.5 7B Instruct 4-bit · Size: 0.06 GB · LoRA: r=4 alpha=8 · model card
Benchmark: 35 · Flags: {'base_identity_leak': 1}
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Best numeric coherence benchmark score among evaluated 7B adapters.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
- Despite the name, not a clean factual fix: Craven Street answer still hallucinated.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_coherence_repair_v2.jsonl: 287
franklin_7b_factual_coherence_repair_v3.jsonl: 308
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen2.5-7b-ben-franklin-v3-factual-coherence-r4-qv
qwen2.5-7b-ben-franklin-v3c-factual-r8-qv-from-base
Family: Qwen2.5 7B Instruct 4-bit · Size: 0.08 GB · LoRA: r=8 alpha=16 · model card
Benchmark: 28 · Flags: {'base_identity_leak': 1, 'overdenial': 1}
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Proves r=8 q/v LoRA from clean 7B base can train on this 8GB GPU.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
- Regressed versus v1/v2/v3 in the coherence benchmark.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_factual_coherence_repair_v3.jsonl: 308
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen2.5-7b-ben-franklin-v3c-factual-r8-qv-from-base
qwen3-4b-instruct-2507-ben-franklin-v1-lora
Family: Qwen3 4B Instruct 4-bit · Size: 1.34 GB · LoRA: r=16 alpha=32 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Middle-size family: more capable than 1.7B while still comfortable on 8GB VRAM.
- Several variants target ChatML/completion formatting, tool-call cleanup, and English-lock behavior.
Weaknesses
- Some later 4B adapters, especially v5, know targeted facts but emit visible tool_call tags offline.
- Can leak base-model identity or policy/meta phrasing depending on prompt path.
Data mix:
franklin_qwen3_4b_answer_only.jsonl: 3253
Compute: Inference/training: comfortable on RTX 3070 8GB in 4-bit. Full-module LoRA at r=16-32 was used historically; expect several GB VRAM and slower but practical training.
./adapters/qwen3-4b-instruct-2507-ben-franklin-v1-lora
qwen3-4b-instruct-2507-ben-franklin-v2-chatml-lora
Family: Qwen3 4B Instruct 4-bit · Size: 2.19 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Middle-size family: more capable than 1.7B while still comfortable on 8GB VRAM.
- Several variants target ChatML/completion formatting, tool-call cleanup, and English-lock behavior.
Weaknesses
- Some later 4B adapters, especially v5, know targeted facts but emit visible tool_call tags offline.
- Can leak base-model identity or policy/meta phrasing depending on prompt path.
Data mix:
franklin_qwen3_4b_answer_only.jsonl: 3253
Compute: Inference/training: comfortable on RTX 3070 8GB in 4-bit. Full-module LoRA at r=16-32 was used historically; expect several GB VRAM and slower but practical training.
./adapters/qwen3-4b-instruct-2507-ben-franklin-v2-chatml-lora
qwen3-4b-instruct-2507-ben-franklin-v3-chatml-completions-lora
Family: Qwen3 4B Instruct 4-bit · Size: 2.19 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Middle-size family: more capable than 1.7B while still comfortable on 8GB VRAM.
- Several variants target ChatML/completion formatting, tool-call cleanup, and English-lock behavior.
Weaknesses
- Some later 4B adapters, especially v5, know targeted facts but emit visible tool_call tags offline.
- Can leak base-model identity or policy/meta phrasing depending on prompt path.
Data mix:
franklin_qwen3_4b_answer_only.jsonl: 3253
Compute: Inference/training: comfortable on RTX 3070 8GB in 4-bit. Full-module LoRA at r=16-32 was used historically; expect several GB VRAM and slower but practical training.
./adapters/qwen3-4b-instruct-2507-ben-franklin-v3-chatml-completions-lora
qwen3-4b-instruct-2507-ben-franklin-v4-toolcall-clean-lora
Family: Qwen3 4B Instruct 4-bit · Size: 1.81 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Middle-size family: more capable than 1.7B while still comfortable on 8GB VRAM.
- Several variants target ChatML/completion formatting, tool-call cleanup, and English-lock behavior.
- Targeted cleanup of visible tool_call artifacts.
Weaknesses
- Some later 4B adapters, especially v5, know targeted facts but emit visible tool_call tags offline.
- Can leak base-model identity or policy/meta phrasing depending on prompt path.
Data mix:
franklin_qwen3_4b_toolcall_cleanup.jsonl: 920
Compute: Inference/training: comfortable on RTX 3070 8GB in 4-bit. Full-module LoRA at r=16-32 was used historically; expect several GB VRAM and slower but practical training.
./adapters/qwen3-4b-instruct-2507-ben-franklin-v4-toolcall-clean-lora
qwen3-4b-instruct-2507-ben-franklin-v5-english-lock-lora
Family: Qwen3 4B Instruct 4-bit · Size: 1.81 GB · LoRA: r=32 alpha=64 · model card
Benchmark: -92 · Flags: {'tool_call': 15, 'continuity_miss': 1}
Strengths
- Middle-size family: more capable than 1.7B while still comfortable on 8GB VRAM.
- Several variants target ChatML/completion formatting, tool-call cleanup, and English-lock behavior.
- Contains useful cleaned English/factual phrasing, including better Craven/Hewson material.
- Benchmark score: -92 with flags {'tool_call': 15, 'continuity_miss': 1}.
Weaknesses
- Some later 4B adapters, especially v5, know targeted facts but emit visible tool_call tags offline.
- Can leak base-model identity or policy/meta phrasing depending on prompt path.
- Offline benchmark showed severe visible tool_call tag regression.
Data mix:
franklin_qwen3_4b_english_lock_cleanup.jsonl: 816
Compute: Inference/training: comfortable on RTX 3070 8GB in 4-bit. Full-module LoRA at r=16-32 was used historically; expect several GB VRAM and slower but practical training.
./adapters/qwen3-4b-instruct-2507-ben-franklin-v5-english-lock-lora
qwen3-1.7b-ben-franklin-identity-reinforced-lora
Family: Qwen3 1.7B 4-bit · Size: 0.99 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-identity-reinforced-lora
qwen3-1.7b-ben-franklin-openai-expanded-lora
Family: Qwen3 1.7B 4-bit · Size: 1.2 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-openai-expanded-lora
qwen3-1.7b-ben-franklin-thinking-lora
Family: Qwen3 1.7B 4-bit · Size: 0.99 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-lora
qwen3-1.7b-ben-franklin-thinking-v2-lora
Family: Qwen3 1.7B 4-bit · Size: 0.78 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_coherence_repair_v2.jsonl: 287
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v2-lora
qwen3-1.7b-ben-franklin-thinking-v3-negative-identity-lora
Family: Qwen3 1.7B 4-bit · Size: 0.78 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_factual_coherence_repair_v3.jsonl: 308
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v3-negative-identity-lora
qwen3-1.7b-ben-franklin-thinking-v4-balanced-lora
Family: Qwen3 1.7B 4-bit · Size: 0.78 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v4-balanced-lora
qwen3-1.7b-ben-franklin-thinking-v5-ood-fixed-lora
Family: Qwen3 1.7B 4-bit · Size: 0.78 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v5-ood-fixed-lora
qwen3-1.7b-ben-franklin-thinking-v6-1-ood-fixed-lora
Family: Qwen3 1.7B 4-bit · Size: 0.99 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v6-1-ood-fixed-lora
qwen3-1.7b-ben-franklin-thinking-v6-contrastive-lora
Family: Qwen3 1.7B 4-bit · Size: 1.62 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v6-contrastive-lora
qwen3-1.7b-ben-franklin-thinking-v7-natural-dialogue-lora
Family: Qwen3 1.7B 4-bit · Size: 1.2 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Focused on more natural short conversational replies.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v7-natural-dialogue-lora
qwen3-1.7b-ben-franklin-thinking-v8-minimal-thought-lora
Family: Qwen3 1.7B 4-bit · Size: 0.99 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Focused on reducing visible thought/over-reasoning style.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v8-minimal-thought-lora
qwen3-1.7b-ben-franklin-thinking-v9-factual-dialogue-lora
Family: Qwen3 1.7B 4-bit · Size: 1.62 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Focused on factual dialogue and hard Franklin biography prompts.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_factual_coherence_repair_v3.jsonl: 308
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v9-factual-dialogue-lora
qwen3-1.7b-ben-franklin-thinking-v9-from-v2-factual-dialogue-lora
Family: Qwen3 1.7B 4-bit · Size: 1.62 GB · LoRA: r=32 alpha=64 · model card
Benchmark: · Flags: not benchmarked
Strengths
- Largest Benjamin Franklin LoRA family proven trainable on this RTX 3070 8GB machine.
- Best base reasoning/coherence among the local Franklin adapters.
- Good short-turn continuity in the coherence benchmark.
- Focused on factual dialogue and hard Franklin biography prompts.
Weaknesses
- Qwen2.5 7B base is already highly steerable, so improvements over the prompted base are modest.
- q/v-only LoRA is weak for implanting stubborn factual corrections.
- Craven Street/Hewson factuality remains unreliable unless retrieval/prompt context is supplied.
Data mix:
franklin_qwen3_8b_chatml_answer_clean.jsonl: 2400
franklin_7b_coherence_repair_v2.jsonl: 287
franklin_7b_factual_coherence_repair_v3.jsonl: 308
Compute: Inference: RTX 3070 8GB works in 4-bit with the adapter; expect roughly 6-7GB VRAM. Training proven only with q_proj/v_proj, r=4 or r=8, max_seq_length=512, batch_size=1, gradient_accumulation=16; full-module LoRA is not recommended on 8GB.
./adapters/qwen3-1.7b-ben-franklin-thinking-v9-from-v2-factual-dialogue-lora