--- library_name: pytorch license: apache-2.0 datasets: - huuuyeah/meetingbank tags: - pytorch - transformer - meeting-summarization - custom-code - attention-variant --- # Run Sliding Gqa Custom PyTorch Transformer checkpoint trained on MeetingBank for meeting summarization research. This repository is part of the [`transformer-lab`](https://huggingface.co/collections/Pradheep1647/transformer-lab-6a07fe3185f5728e217997e0) collection. ## Model Details | Field | Value | |---|---| | Repository | `Pradheep1647/run_sliding_gqa-meetingbank-bs8-e20-fp32-19` | | Attention | `sliding_gqa` | | Dataset | `meetingbank` | | Layers | `6` | | Hidden size | `512` | | Heads | `8` | | Batch size | `8` | | Epochs | `20` | | Precision | `fp32` | | Checkpoint | `meeting_model19.pt` | ## Training Loss ![Training loss](loss_curve.svg) Raw curve data is available in [`loss_curve.csv`](loss_curve.csv). ## Available Models | Variant | Repository | |---|---| | `gqa_rope` | [`Pradheep1647/run_gqa_rope-meetingbank-bs8-e20-fp32-19`](https://huggingface.co/Pradheep1647/run_gqa_rope-meetingbank-bs8-e20-fp32-19) | | `mqa` | [`Pradheep1647/run_mqa-meetingbank-bs8-e20-fp32-19`](https://huggingface.co/Pradheep1647/run_mqa-meetingbank-bs8-e20-fp32-19) | | `gqa` | [`Pradheep1647/run_gqa-meetingbank-bs8-e20-fp32-19`](https://huggingface.co/Pradheep1647/run_gqa-meetingbank-bs8-e20-fp32-19) | | `mha` | [`Pradheep1647/run_mha-meetingbank-bs8-e20-fp32-19`](https://huggingface.co/Pradheep1647/run_mha-meetingbank-bs8-e20-fp32-19) | | `sliding_gqa` | [`Pradheep1647/run_sliding_gqa-meetingbank-bs8-e20-fp32-19`](https://huggingface.co/Pradheep1647/run_sliding_gqa-meetingbank-bs8-e20-fp32-19) | ## Files | File | Purpose | |---|---| | `meeting_model19.pt` | PyTorch checkpoint containing `model_state_dict`, optimizer states, epoch, and global step. | | `config.json` | Training and architecture config converted from the Hydra run config. | | `tokenizer.json` | MeetingBank transcript tokenizer alias for source inputs. | | `transcript_tokenizer.json` | Explicit MeetingBank transcript tokenizer. | | `summary_tokenizer.json` | MeetingBank summary tokenizer for target text. | | `loss_curve.csv` | TensorBoard `train/loss` scalar export. | | `loss_curve.svg` | Static training-loss plot generated from `loss_curve.csv`. | ## Usage These checkpoints are from a custom PyTorch codebase, not a `transformers.AutoModel` checkpoint. Use the repo-native builder to instantiate the architecture, then load the checkpoint state dict. ```python from pathlib import Path import torch from huggingface_hub import hf_hub_download from omegaconf import OmegaConf import src # registers components from src.model.builder import build_transformer repo_id = "Pradheep1647/run_sliding_gqa-meetingbank-bs8-e20-fp32-19" config_path = hf_hub_download(repo_id=repo_id, filename="config.json") checkpoint_path = hf_hub_download(repo_id=repo_id, filename="meeting_model19.pt") cfg = OmegaConf.load(config_path) model = build_transformer(cfg) state = torch.load(checkpoint_path, map_location="cpu") model.load_state_dict(state["model_state_dict"]) model.eval() print(f"Loaded {repo_id} from {Path(checkpoint_path).name}") ``` ## Notes - This is a research checkpoint for comparing attention variants under the same MeetingBank setup. - The config and tokenizers are included so future runs can reproduce the architecture and preprocessing assumptions. - Use `config.json` as the source of truth for architecture parameters.