Spaces: QSBench/Noise_Detection

QSBench committed on Apr 2

Commit 0553f08 (verified), 1 parent: 94dd65f

Update GUIDE.md

Files changed (1)

GUIDE.md +113 -41

@@ -1,65 +1,137 @@
-You can explore datasets, visualize circuit QASM, and train a classical ML model to predict the noise type.
 ---
-## 🔎 Explorer Tab
-The **Explorer** tab provides a preview of the dataset:
-1. **Dataset Dropdown** – Select one of the datasets:
-   - Core (Clean)
-   - Depolarizing Noise
-   - Amplitude Damping
-   - Hardware-aware Noise
-2. **Split Dropdown** – Select the data split (`train`, `test`, etc.).
-3. **Preview Table** – Shows the first 10 circuits in the split.
-4. **Raw QASM** – Original QASM for the selected circuit.
-5. **Transpiled QASM** – QASM after transpilation, if available.
-6. **Info Box** – Displays dataset name and other info.
-7. **Summary Box** – Shows number of rows in the dataset.
 ---
-## 🧠 Classification Tab
-The **Classification** tab allows you to train a Random Forest classifier on the selected features.
-1. **Input Features** – Select numeric features derived from the circuit:
-   - Adjacency features (density, degree mean, etc.)
-   - QASM features (length, gate counts, measure count, etc.)
-2. **Test Split** – Fraction of data used for testing (default `0.2`).
-3. **Trees (n_estimators)** – Number of trees in the Random Forest.
-4. **Max Depth** – Maximum tree depth. Critical parameter; increasing it may cause runtime issues.
-5. **Random Seed** – Seed for reproducibility.
-Click **Train & Evaluate** to:
-- Fit the classifier
-- Compute metrics:
-  - Accuracy
-  - Macro F1
-  - Weighted F1
-- Show confusion matrix
-- Show top 10 feature importances
-> ⚠️ Note: Max depth is the most influential hyperparameter. Setting it too high may crash the Space. Start with lower values.
 ---
-## 📊 Output
-After training, you will see:
-1. **Confusion Matrix** – True vs predicted labels.
-2. **Feature Importance** – Most relevant features for classification.
-3. **Metrics** – Overall classification performance.
 ---
-## 🔗 Links
-- [QSBench Website](https://qsbench.github.io)
-- [Hugging Face Datasets](https://huggingface.co/QSBench)
-- [GitHub Repository](https://github.com/QSBench)

+# 🌌 QSBench: Noise Classification Guide
+Welcome to the **QSBench Noise Classification Hub**.
+This tool demonstrates how machine learning can distinguish different **noise conditions** in quantum circuits using only structural and topological features, without running expensive simulations.
+---
+## ⚠️ Important: Demo Dataset Notice
+This Space uses **demo shards** of the QSBench datasets.
+- **Limited size**: The dataset is intentionally reduced for fast loading and demonstration.
+- **Impact**: Model performance may be unstable or noisy, especially on the minority class.
+- **Goal**: Showcase how circuit structure correlates with noise type — not achieve production-level accuracy.
 ---
+## 🧠 1. What is Being Predicted?
+The model performs **multi-class classification** into four noise conditions:
+### Classes
+- **`clean`** — Ideal circuit without noise
+- **`depolarizing`** — Uniform depolarizing noise
+- **`amplitude_damping`** — Energy relaxation / amplitude damping
+- **`hardware_aware`** — Realistic hardware-aware noise after transpilation
+The task is to predict the **noise_label** from circuit features only.
 ---
+## 🧩 2. How the Model “Sees” a Circuit
+The model does **not** simulate quantum states or noise channels.
+Instead, it relies on **structural proxies**:
+### 🔹 Topology Features
+- `adj_density` — How densely qubits are connected
+- `adj_degree_mean` — Average qubit connectivity
+- `adj_degree_std` — Variability in connectivity
+→ These reflect the **interaction graph** and entanglement potential.
+### 🔹 Gate Structure
+- `total_gates`
+- `single_qubit_gates`
+- `two_qubit_gates`
+- `cx_count` (or similar two-qubit counts)
+→ Two-qubit gates strongly influence noise sensitivity.
+### 🔹 Complexity Metrics
+- `depth`
+- `gate_entropy`
+→ Capture how “deep” and “structured” the circuit is.
+### 🔹 QASM-derived Signals
+- `qasm_length`
+- `qasm_line_count`
+- `qasm_gate_keyword_count`
+→ Lightweight text-based proxies for circuit complexity.
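
The guide names these features but does not give their formulas, so the following is only a sketch under assumed definitions: density as the fraction of possible qubit pairs joined by at least one two-qubit gate, degree statistics over that interaction graph, and gate counts parsed from OpenQASM 2 text. The helper `structural_features`, the regular expression, and the sample circuit are hypothetical and are not taken from the QSBench code. It also assumes a single quantum register, so a qubit is identified by its index alone.

```python
import re
from statistics import mean, pstdev

# Assumed definitions: the guide names these features but not their formulas.
GATE_LINE = re.compile(r"^\s*([a-z][a-z0-9_]*)\s*(?:\([^)]*\))?\s+(.+?)\s*;\s*$")
NOT_GATES = {"include", "qreg", "creg", "barrier", "measure", "reset"}


def structural_features(qasm: str) -> dict:
    """Derive simple structural features from an OpenQASM 2 string.

    Assumes one quantum register, so a qubit is identified by its index.
    """
    declared = sum(int(n) for n in re.findall(r"qreg\s+\w+\[(\d+)\]", qasm))
    edges, seen = set(), set()
    total = single = two = cx = 0
    for line in qasm.splitlines():
        match = GATE_LINE.match(line)
        if not match or match.group(1) in NOT_GATES:
            continue  # header, declaration, measurement, or blank line
        name, operands = match.groups()
        qubits = [int(i) for i in re.findall(r"\[(\d+)\]", operands)]
        seen.update(qubits)
        total += 1
        if len(qubits) == 1:
            single += 1
        elif len(qubits) == 2:
            two += 1
            cx += int(name == "cx")
            edges.add(tuple(sorted(qubits)))  # undirected interaction edge
    n_qubits = max(declared, max(seen, default=-1) + 1)
    degrees = [0] * n_qubits
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    max_edges = n_qubits * (n_qubits - 1) / 2
    return {
        "adj_density": len(edges) / max_edges if max_edges else 0.0,
        "adj_degree_mean": mean(degrees) if degrees else 0.0,
        "adj_degree_std": pstdev(degrees) if degrees else 0.0,
        "total_gates": total,
        "single_qubit_gates": single,
        "two_qubit_gates": two,
        "cx_count": cx,
        "qasm_length": len(qasm),
        "qasm_line_count": len(qasm.splitlines()),
    }


# Illustrative three-qubit circuit (not from the QSBench datasets).
SAMPLE = """
OPENQASM 2.0;
include "qelib1.inc";
qreg q[3];
creg c[3];
h q[0];
cx q[0],q[1];
cx q[1],q[2];
rz(0.5) q[2];
measure q[0] -> c[0];
""".strip()

features = structural_features(SAMPLE)
```

For this sample the interaction graph is the path 0-1-2, so two of the three possible qubit pairs are connected and the degrees are 1, 2, 1.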
+---
+## 🤖 3. Model Overview
+The system uses:
+### HistGradientBoostingClassifier
+- Fast and accurate gradient boosting on tabular data
+- Handles non-linear relationships well
+- Supports `class_weight="balanced"` to deal with class imbalance
+**Pipeline includes:**
+- Median imputation for missing values
+- Standard scaling
+- Gradient boosting classifier
+---
+## 📊 4. Understanding the Results
+After clicking **"Train & Evaluate"**, you get:
+### A. Confusion Matrix
+Shows how often each true noise type is predicted correctly or confused with others.
+### B. Correct vs Incorrect
+Simple histogram of how many predictions were correct versus incorrect.
+### C. Top-10 Feature Importances
+Highlights which circuit features contribute most to distinguishing noise types.
+Typical strong signals:
+- `cx_count` / two-qubit gate counts
+- Topology features (`adj_density`, `adj_degree_*`)
+- `depth` and complexity metrics
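
As a rough illustration of outputs A and B, the sketch below builds a confusion matrix and the correct/incorrect tally in plain Python. The label order, the helper `confusion_matrix`, and the ten example predictions are made up for this example; they are not results from the Space.

```python
LABELS = ["clean", "depolarizing", "amplitude_damping", "hardware_aware"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Made-up predictions for ten circuits (illustrative only).
y_true = ["clean", "clean",
          "depolarizing", "depolarizing", "depolarizing",
          "amplitude_damping", "amplitude_damping",
          "hardware_aware", "hardware_aware", "hardware_aware"]
y_pred = ["clean", "depolarizing",
          "depolarizing", "depolarizing", "amplitude_damping",
          "amplitude_damping", "amplitude_damping",
          "hardware_aware", "hardware_aware", "clean"]

matrix = confusion_matrix(y_true, y_pred, LABELS)

# The diagonal holds the correct predictions; everything else is an error.
correct = sum(matrix[i][i] for i in range(len(LABELS)))
incorrect = len(y_true) - correct
```

Each off-diagonal cell names a specific confusion: here, `matrix[0][1]` counts `clean` circuits predicted as `depolarizing`.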
+---
+## 📉 5. Metrics Explained
+- **Accuracy** — Overall fraction of correctly classified circuits
+- **Macro F1** — Average F1-score per class (treats all classes equally — sensitive to minority class `clean`)
+- **Weighted F1** — F1-score weighted by class support
+- **Per-class Precision / Recall / F1** — Detailed view, especially important for the underrepresented `clean` class
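
To make the difference between the two averages concrete, here is a small plain-Python sketch using the standard definition F1 = 2·TP / (2·TP + FP + FN). The helper `per_class_f1` and the ten labels are hypothetical, and only two of the four classes are used to keep the example short.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Made-up, imbalanced example: 2 clean circuits, 8 depolarizing ones.
y_true = ["clean"] * 2 + ["depolarizing"] * 8
y_pred = ["clean", "depolarizing"] + ["depolarizing"] * 8

f1 = per_class_f1(y_true, y_pred)
support = Counter(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro_f1 = sum(f1.values()) / len(f1)  # every class counts equally
weighted_f1 = sum(f1[c] * support[c] for c in f1) / len(y_true)  # by support
```

One missed `clean` circuit barely moves accuracy or weighted F1, but it pulls macro F1 down noticeably, which is why macro F1 is the number to watch for the minority class.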
+---
+## 🧪 6. Experimentation Tips
+Try the following to better understand the model:
+- **Focus on `clean` class** — select features carefully and observe how `class_weight="balanced"` helps
+- Remove strong features (e.g. `cx_count`) → see performance drop
+- Use only topology features → isolate structural effect
+- Increase **Trees** (`max_iter`) to 300–500 for more stable predictions
+- Adjust **Max depth** and **Test split** to check robustness
+- Compare results with and without `class_weight`
 ---
+## 🔬 7. Key Insight
+> Noise type is not invisible — it leaves detectable fingerprints in circuit structure.
+Even without expensive noisy simulation, features like gate counts, connectivity, and depth can carry enough signal to classify the underlying noise condition.
+This demonstrates the power of **structure-aware** quantum machine learning.
 ---
+## 🔗 8. Project Resources
+- 🤗 **Hugging Face**: [https://huggingface.co/QSBench](https://huggingface.co/QSBench)
+- 💻 **GitHub**: [https://github.com/QSBench](https://github.com/QSBench)
+- 🌐 **Website**: [https://qsbench.github.io](https://qsbench.github.io)