Telco Troubleshooting Agentic Challenge β€” Track A submission

code.zip in this repository is the reproducible submission for Track A of the Telco Troubleshooting Agentic Challenge (Phase 2). It implements a single-shot Qwen3-32B agent over a rule-first / LLM-residual pipeline.

  • Base model: Qwen/Qwen3-32B (no fine-tuning, no LoRA adapters).
  • Inference stack: vLLM (OpenAI-compatible endpoint) β€” no internet access required at runtime.
  • Throughput: ~9 s / scenario on qwen/qwen3-32b, ~1 k tokens / scenario.

Contents

File Description
code.zip Self-contained submission archive. Extract and run per the embedded README.md.

The archive layout follows the competition's SUBMISSION.md requirements:

code.zip
β”œβ”€β”€ README.md            # how to install + run + expected outputs
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ run.py / run.sh      # submission entry point
β”œβ”€β”€ models/deploy.sh     # vLLM deployment command for Qwen3-32B
β”œβ”€β”€ .env.example
β”œβ”€β”€ src/telco_agent/     # agent code (classifier, physics, prompts, resolver, …)
└── scripts/             # legacy dev helpers

How to reproduce

unzip code.zip -d telco_track_a
cd telco_track_a

# 1. Install deps
pip install -r requirements.txt

# 2. Deploy the base model (vLLM, local GPU host)
bash models/deploy.sh        # default: Qwen3-32B at http://localhost:8001

# 3. Run the agent (writes result/ with traces.json, results.csv, runtime.json, README.md)
bash run.sh --input /path/to/scenarios.json --output result

result/ will contain the three artefacts required by the rubric:

  • results.csv β€” scenario_id,prediction (one row per problem).
  • traces.json β€” every LLM completion produced.
  • runtime.json β€” per-problem wall time in seconds, written by the runtime_logger decorator reproduced verbatim from SUBMISSION.md.

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ 1 β€” Extract drive-test KPIs, signaling counts,   β”‚
β”‚     cell configs. Rule cascade β†’ fault kind.     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 2 β€” Pre-filter options to 1-4 IDs.               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 3 β€” Pre-compute per-candidate dB gain delta      β”‚
β”‚     (matches the simulator's gain_pattern).      β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 4 β€” Single-shot Qwen3 call:                      β”‚
β”‚     system="5G RAN engineer, output \\boxed{IDs}"β”‚
β”‚     user=facts + candidates + per-kind guidance  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 5 β€” Parse \\boxed{...} β†’ final answer.           β”‚
β”‚     Fallback to rule-based prior on parse fail.  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The classifier produces 11 fault kinds (pdcch, multi_pingpong, multi_az_txinc, add_neighbor, thld_dec, a3_dec, multi_tilt_txdec, tilt_up, test_server, insufficient, unknown) and applies a focused resolution procedure per kind, including hedging strategies for ambiguous single-answer cases.

Why rule-first + LLM-residual?

  • Reliability β€” single-shot \boxed{IDs} over pre-filtered candidates is dramatically more reliable than multi-turn tool-calling loops on this scale of problem (past winners greenpark12345 and vaderyang used the same pattern).
  • Reproducibility β€” the rule engine narrows the answer space to 1–4 candidates, making the LLM job near-deterministic at temperature=0.1.
  • Cost β€” ~1 k tokens / scenario vs 5–10Γ— that for tool-calling loops.
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for soul0101/telco-track-a-submission

Base model

Qwen/Qwen3-32B
Finetuned
(506)
this model