Spaces: kimtaeyeong1229 / AI-Study-Roadmap

kimtaeyeong1229 committed on Feb 21

Commit d4bad9d (verified) · 1 Parent(s): 9eddf4d

Upload views/dl_lab.py with huggingface_hub

views/dl_lab.py +203 -0

views/dl_lab.py ADDED

	@@ -0,0 +1,203 @@

+"""DL Lab page: PyTorch MLP training with live progress."""
+import streamlit as st
+import plotly.graph_objects as go
+from plotly.subplots import make_subplots
+import plotly.express as px
+import numpy as np
+import pandas as pd
+from utils.data import get_train_test_data
+from utils.models import train_mlp, build_sklearn_model, train_sklearn_model, XGBOOST_AVAILABLE
+NEEDS_SCALING = {'SVM (RBF)', 'KNN', 'Logistic Regression'}
+def show():
+    st.title("Deep Learning Lab: PyTorch MLP")
+    st.markdown("Configure the network architecture and training parameters yourself, then watch the training progress epoch by epoch in real time.")
+    # Concept explanation
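+    with st.expander("Concepts: from the perceptron to the MLP"):
+        st.markdown("""
+### Perceptron (Rosenblatt, 1957)
+An early artificial neuron modeled on the biological neuron:
+$$z = w_1x_1 + w_2x_2 + \\cdots + w_nx_n + b = w^Tx + b$$
+$$\\text{output} = \\sigma(z)$$
+where $\\sigma$ is the activation function (a step function in Rosenblatt's original perceptron).
+**Limitation**: cannot solve the XOR problem (the classes are not linearly separable)
+### MLP: stacking perceptrons into multiple layers
+```
+Input (8) → Hidden 1 (64, ReLU) → Hidden 2 (32, ReLU) → Output (1, Sigmoid)
+```
+### Training process (backpropagation)
+1. **Forward pass**: input → output → compute the loss
+2. **Backward pass**: loss → compute gradients (differentiation) → update the weights
+   $$w \\leftarrow w - \\eta \\cdot \\frac{\\partial L}{\\partial w}$$
+### Activation functions and regularization layers
+| Name | Definition | Use |
+|------|------|------|
+| **ReLU** | $\\max(0, x)$ | Default for hidden layers |
+| **Sigmoid** | $\\frac{1}{1+e^{-x}}$ | Output layer for binary classification |
+| **BatchNorm** | Batch normalization | Stabilizes training |
+| **Dropout** | Randomly drops neurons | Reduces overfitting |
+""")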
+    st.markdown("---")
+    st.subheader("Hyperparameter settings")
+    col1, col2 = st.columns(2)
+    with col1:
+        st.markdown("**Network architecture**")
+        h1 = st.slider("Hidden layer 1 size", 16, 256, 64, step=16)
+        h2 = st.slider("Hidden layer 2 size", 8, 128, 32, step=8)
+        add_h3 = st.checkbox("Add hidden layer 3", value=False)
+        h3 = st.slider("Hidden layer 3 size", 8, 64, 16, step=8) if add_h3 else None
+        dropout = st.slider("Dropout rate", 0.0, 0.7, 0.3, step=0.05)
+    with col2:
+        st.markdown("**Training settings**")
+        epochs = st.slider("Epochs", 10, 200, 100, step=10)
+        lr = st.select_slider(
+            "Learning rate",
+            options=[0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05],
+            value=0.001,
+        )
+        batch_size = st.select_slider("Batch Size", options=[16, 32, 64, 128], value=32)
+    hidden_dims = [h1, h2] + ([h3] if h3 else [])
+    # Model architecture preview
+    arch_str = f"Input (8) → {' → '.join(str(h) for h in hidden_dims)} → Output (1)"
+    st.info(f"**Network architecture**: {arch_str}")
+    # Weight matrices only: bias vectors and any BatchNorm parameters are not counted.
+    total_params = 8 * hidden_dims[0]
+    for i in range(len(hidden_dims) - 1):
+        total_params += hidden_dims[i] * hidden_dims[i + 1]
+    total_params += hidden_dims[-1] * 1
+    st.caption(f"Estimated parameter count (linear-layer weights only, biases excluded): ~{total_params:,}")
+    st.markdown("---")
+    if st.button("Start training", type="primary", use_container_width=True):
+        X_train, X_test, y_train, y_test, X_tr_sc, X_te_sc, _ = get_train_test_data()
+        # Live progress placeholders
+        progress_bar = st.progress(0, text="Starting training...")
+        col_loss, col_acc = st.columns(2)
+        loss_placeholder = col_loss.empty()
+        acc_placeholder = col_acc.empty()
+        metrics_placeholder = st.empty()
+        history = {
+            'epoch': [],
+            'train_loss': [], 'test_loss': [],
+            'train_acc': [], 'test_acc': [],
+        }
+        def progress_callback(epoch, total, tr_loss, tr_acc, te_loss, te_acc):
+            history['epoch'].append(epoch)
+            history['train_loss'].append(tr_loss)
+            history['test_loss'].append(te_loss)
+            history['train_acc'].append(tr_acc)
+            history['test_acc'].append(te_acc)
+            progress_bar.progress(epoch / total, text=f"Epoch {epoch}/{total}")
+            # Loss chart
+            fig_l = go.Figure()
+            fig_l.add_trace(go.Scatter(x=history['epoch'], y=history['train_loss'],
+                                       name='Train loss', line=dict(color='#2ecc71')))
+            fig_l.add_trace(go.Scatter(x=history['epoch'], y=history['test_loss'],
+                                       name='Test loss', line=dict(color='#e74c3c')))
+            fig_l.update_layout(title='Loss (BCE)', xaxis_title='Epoch',
+                                 yaxis_title='Loss', height=280, margin=dict(t=40, b=30))
+            loss_placeholder.plotly_chart(fig_l, use_container_width=True)
+            # Accuracy chart
+            fig_a = go.Figure()
+            fig_a.add_trace(go.Scatter(x=history['epoch'], y=history['train_acc'],
+                                       name='Train accuracy', line=dict(color='#2ecc71')))
+            fig_a.add_trace(go.Scatter(x=history['epoch'], y=history['test_acc'],
+                                       name='Test accuracy', line=dict(color='#e74c3c')))
+            fig_a.update_layout(title='Accuracy', xaxis_title='Epoch',
+                                 yaxis_title='Accuracy', height=280, margin=dict(t=40, b=30))
+            acc_placeholder.plotly_chart(fig_a, use_container_width=True)
+            if epoch % 10 == 0 or epoch == total:
+                metrics_placeholder.markdown(
+                    f"**Epoch {epoch}/{total}** | "
+                    f"Train Loss: `{tr_loss:.4f}` Acc: `{tr_acc*100:.1f}%` | "
+                    f"Test Loss: `{te_loss:.4f}` Acc: `{te_acc*100:.1f}%`"
+                )
+        result = train_mlp(
+            X_tr_sc, X_te_sc, y_train, y_test,
+            hidden_dims=hidden_dims,
+            epochs=epochs,
+            lr=lr,
+            batch_size=batch_size,
+            dropout=dropout,
+            progress_callback=progress_callback,
+        )
+        progress_bar.empty()
+        # Final metrics
+        st.markdown("---")
+        st.subheader("Final results")
+        c1, c2, c3 = st.columns(3)
+        c1.metric("Final test accuracy", f"{result['final_acc']*100:.2f}%")
+        c2.metric("Best test accuracy", f"{max(result['test_accs'])*100:.2f}%",
+                  f"Epoch {result['test_accs'].index(max(result['test_accs']))+1}")
+        c3.metric("Convergence", "Converged" if abs(result['test_accs'][-1] - result['test_accs'][-10]) < 0.01
+                  else "Not converged", help="Test accuracy changed by less than 1 percentage point over the last 10 epochs")
+        # Confusion matrix
+        st.subheader("Confusion matrix")
+        cm = result['confusion_matrix']
+        fig_cm = px.imshow(
+            cm, text_auto=True,
+            x=['Predicted: died', 'Predicted: survived'],
+            y=['Actual: died', 'Actual: survived'],
+            color_continuous_scale='Blues',
+            title='MLP (PyTorch): confusion matrix',
+        )
+        fig_cm.update_layout(coloraxis_showscale=False)
+        st.plotly_chart(fig_cm, use_container_width=True)
+        # Compare with ML models
+        st.markdown("---")
+        st.subheader("Performance comparison with ML models")
+        compare_algos = ['Logistic Regression', 'Random Forest', 'Gradient Boosting']
+        if XGBOOST_AVAILABLE:
+            compare_algos.append('XGBoost')
+        cmp_results = {'MLP (PyTorch)': result['final_acc']}
+        for a in compare_algos:
+            use_sc = a in NEEDS_SCALING
+            X_tr = X_tr_sc if use_sc else X_train.values
+            X_te = X_te_sc if use_sc else X_test.values
+            m = build_sklearn_model(a, {})
+            r = train_sklearn_model(m, X_tr, X_te, y_train, y_test)
+            cmp_results[a] = r['accuracy']
+        cmp_df = pd.DataFrame([
+            {'Model': k, 'Accuracy': v} for k, v in sorted(cmp_results.items(), key=lambda x: -x[1])
+        ])
+        fig_bar = px.bar(
+            cmp_df, x='Accuracy', y='Model', orientation='h',
+            text_auto='.3f',
+            color='Accuracy', color_continuous_scale='RdYlGn',
+            title='MLP vs. ML models',
+            range_x=[0.6, 0.95],
+        )
+        fig_bar.update_layout(coloraxis_showscale=False)
+        st.plotly_chart(fig_bar, use_container_width=True)
+        st.success(f"MLP training complete! Final test accuracy: **{result['final_acc']*100:.2f}%**")
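
Note on the parameter estimate in `show()`: the running total multiplies adjacent layer sizes, so it counts weight matrices only. The sketch below shows how the count changes once bias vectors are included. It assumes every layer is a plain fully connected layer with one bias per output unit; the real architecture lives in `train_mlp` in `utils/models.py`, which is not part of this commit, so any BatchNorm parameters it may add are not modeled here. `count_linear_params` is a hypothetical helper name, not something defined in the repository.

```python
def count_linear_params(input_dim, hidden_dims, output_dim=1, include_bias=True):
    """Count parameters of a stack of fully connected layers.

    Hypothetical helper: assumes one weight matrix (fan_in x fan_out) and,
    optionally, one bias vector (fan_out) per layer, and nothing else.
    """
    dims = [input_dim, *hidden_dims, output_dim]
    total = 0
    # Walk over consecutive (fan_in, fan_out) pairs: 8->64, 64->32, 32->1, ...
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        total += fan_in * fan_out      # weight matrix
        if include_bias:
            total += fan_out           # bias vector
    return total


# Default sliders in the page: 8 inputs, hidden layers of 64 and 32, 1 output.
weights_only = count_linear_params(8, [64, 32], include_bias=False)
with_bias = count_linear_params(8, [64, 32])
print(weights_only, with_bias)  # 2592 2689
```

With the default slider values the weights-only figure matches what the caption in `show()` reports (2,592), and adding biases raises it by 97, one per hidden and output unit.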