cwchang commited on
Commit
0ad27b7
·
1 Parent(s): 5b8cdf3

feat: add intuitive UX improvements with step indicators and real-time validation

Browse files

Major UX enhancements to make the voice cloning workflow more intuitive:

Step Indicator System:
- Add visual 3-step progress indicator (Upload → Configure → Generate)
- Show current step with blue highlight, completed steps with green checkmarks
- Automatic step progression when tasks complete
- Responsive design (full labels on desktop, icons only on mobile)

Real-time Form Validation:
- Instant visual feedback with colored borders (green=valid, red=error)
- Character counters for text inputs
- Success checkmarks on field headers when complete
- Error messages displayed inline near relevant fields

Smart Button States:
- Show yellow info box explaining why button is disabled
- Dynamic messages based on missing requirements
- Automatically hide when conditions are met

Contextual Help & Guidance:
- Blue help boxes with specific instructions for each field
- Concrete placeholder examples instead of vague descriptions
- Key information highlighted with <strong> tags
- Audio requirements (3-10s, clear, single speaker)

Auto-flow Navigation:
- Auto-advance to step 2 after successful upload
- Auto-advance to step 3 when target text entered
- Smooth scroll to relevant section when step changes
- Visual guidance keeps user oriented

UX Principles Applied:
✓ Immediate feedback (instant border color changes)
✓ Clear visual hierarchy (current/completed/pending states)
✓ Error prevention (help text, examples, requirements)
✓ Visibility (progress, completion status, errors inline)
✓ Guidance (auto-progression, contextual help)

Technical Implementation:
- New state: currentStep, field-specific error states
- Step completion checks: isStep1Complete, isStep2Complete
- Smart disabled reason function: getDisabledReason()
- New icons: AlertCircle, HelpCircle from lucide-react
- Auto-scroll on step transition with useEffect

Documentation: UX_IMPROVEMENTS_V2.md

Files changed (2) hide show
  1. UX_IMPROVEMENTS_V2.md +298 -0
  2. frontend/src/App.tsx +255 -46
UX_IMPROVEMENTS_V2.md ADDED
@@ -0,0 +1,298 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # UX 改進 V2 - 直覺化使用體驗
2
+
3
+ ## 改進目標
4
+
5
+ 讓使用者能更直覺地理解操作流程,減少困惑,提升完成任務的效率。
6
+
7
+ ## 主要改進項目
8
+
9
+ ### 1. ✅ 視覺化步驟指示器
10
+
11
+ **問題**: 使用者不知道要按什麼順序操作,也不清楚目前進度
12
+
13
+ **解決方案**:
14
+ - 添加頂部三步驟進度指示器:上傳音訊 → 設定參考 → 生成語音
15
+ - 每個步驟顯示編號和狀態(未完成/進行中/已完成)
16
+ - 完成的步驟顯示綠色勾選標記
17
+ - 當前步驟高亮顯示藍色
18
+ - 響應式設計:桌面顯示文字標籤,移動版只顯示圖標
19
+
20
+ **實作**:
21
+ ```tsx
22
+ {/* 步驟 1: 上傳音訊 */}
23
+ <div className={`w-10 h-10 rounded-full ${
24
+ isStep1Complete ? 'border-green-500 bg-green-500/10' :
25
+ currentStep === 1 ? 'border-brand-primary bg-brand-primary/10' :
26
+ 'border-border bg-background-tertiary'
27
+ }`}>
28
+ {isStep1Complete ? <CheckCircle /> : <span>1</span>}
29
+ </div>
30
+ ```
31
+
32
+ ### 2. ✅ 即時表單驗證與視覺反饋
33
+
34
+ **問題**: 使用者不知道輸入是否正確,直到點擊生成按鈕才發現錯誤
35
+
36
+ **解決方案**:
37
+ - **即時邊框顏色變化**:
38
+ - 預設: 灰色邊框
39
+ - 有內容且正確: 綠色邊框 + 勾選標記
40
+ - 有錯誤: 紅色邊框 + 錯誤圖標
41
+ - **字元計數器**: 即時顯示輸入字元數
42
+ - **成功指示**: 欄位完成時在標題欄顯示綠色勾選
43
+
44
+ **實作**:
45
+ ```tsx
46
+ <textarea
47
+ className={`${
48
+ error ? 'border-red-500 focus:ring-red-500/20' :
49
+ hasContent ? 'border-green-500 focus:ring-green-500/20' :
50
+ 'border-border-light focus:ring-brand-primary/20'
51
+ }`}
52
+ />
53
+ <div className="absolute bottom-3 right-3">
54
+ {text.length} 字元
55
+ </div>
56
+ ```
57
+
58
+ ### 3. ✅ 智能提示與說明
59
+
60
+ **問題**: 使用者不清楚要輸入什麼內容或格式要求
61
+
62
+ **解決方案**:
63
+ - **藍色提示框**: 每個輸入區域都有藍色背景的提示說明
64
+ - **具體範例**: Placeholder 提供實際範例而非空泛描述
65
+ - **重點標示**: 使用 `<strong>` 標記關鍵資訊
66
+
67
+ **範例**:
68
+ ```tsx
69
+ {/* 上傳提示 */}
70
+ <div className="bg-blue-500/10 border border-blue-500/30 rounded-lg p-3">
71
+ <HelpCircle className="w-4 h-4 text-blue-400" />
72
+ <p className="text-blue-400 text-sm">
73
+ 上傳 <strong>3-10 秒</strong>的清晰語音,單一說話者,無背景噪音效果最佳
74
+ </p>
75
+ </div>
76
+
77
+ {/* 參考文字提示 */}
78
+ <p className="text-blue-400 text-sm">
79
+ 輸入參考音訊中說的<strong>完整內容</strong>,逐字稿越準確,生成品質越好
80
+ </p>
81
+ ```
82
+
83
+ ### 4. ✅ 智能按鈕狀態與禁用原因
84
+
85
+ **問題**: 按鈕被禁用時,使用者不知道為什麼無法點擊
86
+
87
+ **解決方案**:
88
+ - **禁用提示框**: 當按鈕無法點擊時,上方顯示黃色提示框說明原因
89
+ - **動態訊息**: 根據實際缺少的內容顯示對應提示
90
+ - **自動隱藏**: 條件滿足後提示框自動消失
91
+
92
+ **實作**:
93
+ ```tsx
94
+ {!canGenerate && !loading && (
95
+ <div className="bg-yellow-500/10 border border-yellow-500/30 rounded-lg p-3">
96
+ <Info className="w-4 h-4 text-yellow-400" />
97
+ <span>{getDisabledReason()}</span>
98
+ </div>
99
+ )}
100
+
101
+ // 動態判斷原因
102
+ const getDisabledReason = (): string => {
103
+ if (!refAudioId) return '請先上傳參考音訊'
104
+ if (!xVectorOnly && !refText.trim()) return '請輸入參考文字'
105
+ if (!targetText.trim()) return '請輸入目標文字'
106
+ return ''
107
+ }
108
+ ```
109
+
110
+ ### 5. ✅ 改進錯誤訊息顯示
111
+
112
+ **問題**: 錯誤訊息顯示在底部,使用者可能看不到
113
+
114
+ **解決方案**:
115
+ - **欄位級錯誤**: 錯誤訊息直接顯示在相關輸入欄位下方
116
+ - **紅色圖標**: 使用 AlertCircle 圖標標示錯誤
117
+ - **即時清除**: 使用者開始輸入時自動清除錯誤訊息
118
+
119
+ **實作**:
120
+ ```tsx
121
+ {refTextError && (
122
+ <div className="flex items-center gap-2 text-red-400 text-sm">
123
+ <AlertCircle className="w-4 h-4" />
124
+ <span>{refTextError}</span>
125
+ </div>
126
+ )}
127
+ ```
128
+
129
+ ### 6. ✅ 自動流程引導
130
+
131
+ **問題**: 使用者完成一個步驟後不知道接下來要做什麼
132
+
133
+ **解決方案**:
134
+ - **自動步驟切換**:
135
+ - 上傳成功 → 自動切換到步驟 2
136
+ - 輸入目標文字 → 自動切換到步驟 3
137
+ - **平滑滾動**: 進入步驟 3 時自動滾動到目標文字區域
138
+ - **視覺引導**: 當前步驟高亮,未完成步驟變暗
139
+
140
+ **實作**:
141
+ ```tsx
142
+ // 上傳成功後
143
+ setCurrentStep(2)
144
+
145
+ // 輸入目標文字後
146
+ if (e.target.value.trim() && isStep2Complete) {
147
+ setCurrentStep(3)
148
+ }
149
+
150
+ // 自動滾動
151
+ useEffect(() => {
152
+ if (currentStep === 3 && targetTextRef.current) {
153
+ targetTextRef.current.scrollIntoView({
154
+ behavior: 'smooth',
155
+ block: 'center'
156
+ })
157
+ }
158
+ }, [currentStep])
159
+ ```
160
+
161
+ ### 7. ✅ 更好的 Placeholder 範例
162
+
163
+ **改進前**:
164
+ - 參考文字: "輸入參考音訊中說的完整內容..."
165
+ - 目標文字: "輸入您想讓複製聲音說的內容..."
166
+
167
+ **改進後**:
168
+ - 參考文字: "例如:今天天氣真好,我們一起去公園散步吧..."
169
+ - 目標文字: "例如:歡迎來到我的頻道���今天要跟大家分享..."
170
+
171
+ **優點**: 具體範例讓使用者更清楚輸入格式和長度
172
+
173
+ ## UX 設計原則應用
174
+
175
+ ### 1. 即時反饋 (Immediate Feedback)
176
+ - ✅ 輸入時即時邊框顏色變化
177
+ - ✅ 字元計數即時更新
178
+ - ✅ 步驟完成立即顯示勾選
179
+
180
+ ### 2. 清晰的視覺層次 (Visual Hierarchy)
181
+ - ✅ 當前步驟高亮 (藍色)
182
+ - ✅ 完成步驟次要 (綠色)
183
+ - ✅ 未完成步驟最低 (灰色)
184
+
185
+ ### 3. 錯誤預防 (Error Prevention)
186
+ - ✅ 提示框說明要求
187
+ - ✅ 範例 Placeholder
188
+ - ✅ 禁用按鈕時說明原因
189
+
190
+ ### 4. 可見性原則 (Visibility)
191
+ - ✅ 步驟進度清楚可見
192
+ - ✅ 每個欄位的完成狀態可見
193
+ - ✅ 錯誤訊息顯示在相關位置
194
+
195
+ ### 5. 引導與協助 (Guidance)
196
+ - ✅ 藍色提示框提供幫助
197
+ - ✅ 自動步驟切換減少認知負擔
198
+ - ✅ 視覺引導使用者注意力
199
+
200
+ ## 技術實現
201
+
202
+ ### 新增狀態管理
203
+ ```tsx
204
+ const [currentStep, setCurrentStep] = useState<1 | 2 | 3>(1)
205
+ const [refTextError, setRefTextError] = useState<string>('')
206
+ const [targetTextError, setTargetTextError] = useState<string>('')
207
+ ```
208
+
209
+ ### 步驟完成檢查
210
+ ```tsx
211
+ const isStep1Complete = refAudioId !== ''
212
+ const isStep2Complete = isStep1Complete && (xVectorOnly || refText.trim() !== '')
213
+ const canGenerate = isStep2Complete && targetText.trim() !== ''
214
+ ```
215
+
216
+ ### 新增圖標
217
+ ```tsx
218
+ import { AlertCircle, HelpCircle } from 'lucide-react'
219
+ ```
220
+
221
+ ## 檔案修改
222
+
223
+ - `frontend/src/App.tsx`: 主要改進實作
224
+ - 添加步驟指示器組件
225
+ - 添加即時驗證邏輯
226
+ - 添加智能提示系統
227
+ - 改進錯誤顯示機制
228
+
229
+ ## 使用者體驗改善指標
230
+
231
+ ### 改進前的問題
232
+ 1. ❌ 不知道要按什麼順序操作
233
+ 2. ❌ 不清楚為什麼按鈕無法點擊
234
+ 3. ❌ 錯誤發生後才知道輸入有問題
235
+ 4. ❌ 不確定輸入格式是否正確
236
+ 5. ❌ 完成一步後不知道下一步
237
+
238
+ ### 改進後的優勢
239
+ 1. ✅ 清楚的三步驟視覺引導
240
+ 2. ✅ 按鈕禁用時顯示原因
241
+ 3. ✅ 輸入時即時顯示對錯
242
+ 4. ✅ 每個欄位都有範例和說明
243
+ 5. ✅ 自動引導到下一步驟
244
+
245
+ ## 無障礙性改進
246
+
247
+ - ✅ 所有互動元素都有 `aria-label`
248
+ - ✅ 錯誤訊息與對應欄位關聯
249
+ - ✅ 顏色搭配圖標,不僅依賴顏色
250
+ - ✅ 鍵盤導航順序符合視覺流程
251
+
252
+ ## 響應式設計
253
+
254
+ - **桌面 (≥768px)**: 顯示完整步驟文字標籤
255
+ - **平板/手機 (<768px)**: 只顯示步驟編號,節省空間
256
+
257
+ ## 測試建議
258
+
259
+ ### 使用者測試場景
260
+
261
+ 1. **新使用者首次使用**:
262
+ - 觀察是否能在無指導下完成流程
263
+ - 記錄遇到困惑的步驟
264
+
265
+ 2. **錯誤處理**:
266
+ - 嘗試不上傳音訊就點生成
267
+ - 嘗試不輸入文字就生成
268
+ - 觀察錯誤提示是否清楚
269
+
270
+ 3. **流程順暢度**:
271
+ - 測量從上傳到生成的時間
272
+ - 觀察是否有卡住的步驟
273
+
274
+ ## 未來可以改進的方向
275
+
276
+ 1. **進階引導**:
277
+ - 首次使用時顯示引導遮罩 (Tooltip tour)
278
+ - 添加「跳過引導」選項
279
+
280
+ 2. **更多範例**:
281
+ - 提供「使用範例」按鈕快速填入示範內容
282
+ - 多語言範例切換
283
+
284
+ 3. **歷史記錄**:
285
+ - 保存最近使用的設定
286
+ - 快速重複生成功能
287
+
288
+ 4. **進度動畫**:
289
+ - 生成時顯示進度條而非只有 spinner
290
+ - 預估剩餘時間
291
+
292
+ 5. **音訊波形預覽**:
293
+ - 上傳後顯示音訊波形
294
+ - 可視化音訊長度和品質
295
+
296
+ ## 總結
297
+
298
+ 這次 UX 改進專注於「直覺性」和「即時反饋」,透過視覺化的步驟指示、智能提示系統和即時驗證,大幅降低使用者的認知負擔,讓整個語音複製流程更加順暢和易用。
frontend/src/App.tsx CHANGED
@@ -14,6 +14,8 @@ import {
14
  XCircle,
15
  Download,
16
  Play,
 
 
17
  } from 'lucide-react'
18
 
19
  // 开发环境使用 localhost,生产环境可通过环境变量配置
@@ -39,10 +41,14 @@ function App() {
39
  const [statusType, setStatusType] = useState<'info' | 'success' | 'error'>('info')
40
  const [generatedAudioId, setGeneratedAudioId] = useState<string>('')
41
  const [generatedAudioUrl, setGeneratedAudioUrl] = useState<string>('')
 
 
 
42
 
43
  const fileInputRef = useRef<HTMLInputElement>(null)
44
  const audioRef = useRef<HTMLAudioElement>(null)
45
  const refAudioRef = useRef<HTMLAudioElement>(null)
 
46
 
47
  // 清理 blob URL,避免記憶體洩漏
48
  useEffect(() => {
@@ -93,6 +99,7 @@ function App() {
93
  setRefAudioId(data.audio_id)
94
  setStatus(`音訊已上傳 (${data.duration.toFixed(1)}秒)`)
95
  setStatusType('success')
 
96
  } catch (error) {
97
  setStatus(`上傳失敗: ${error instanceof Error ? error.message : '未知錯誤'}`)
98
  setStatusType('error')
@@ -103,6 +110,19 @@ function App() {
103
  }
104
  }
105
 
 
 
 
 
 
 
 
 
 
 
 
 
 
106
  // 處理拖放
107
  const handleDrop = (e: React.DragEvent) => {
108
  e.preventDefault()
@@ -116,9 +136,17 @@ function App() {
116
 
117
  // 處理點擊上傳
118
  const handleUploadClick = () => {
 
119
  fileInputRef.current?.click()
120
  }
121
 
 
 
 
 
 
 
 
122
  // 處理生成
123
  const handleGenerate = async () => {
124
  if (!refAudioId) {
@@ -226,19 +254,106 @@ function App() {
226
 
227
  {/* Main Area */}
228
  <main className="flex flex-col gap-8 bg-background-primary px-6 md:px-12 lg:px-24 pt-12 pb-16 w-full max-w-7xl mx-auto">
229
- <h2 className="text-text-subtle text-xl font-medium">從參考音訊複製聲音</h2>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
230
 
231
  <div className="flex flex-col lg:flex-row gap-8 w-full">
232
  {/* Left Column */}
233
  <div className="flex flex-col gap-6 bg-background-tertiary border border-border rounded-xl p-6 md:p-8 w-full lg:w-1/2">
234
  {/* Reference Audio Section */}
235
  <div className="flex flex-col gap-4 w-full">
236
- <div className="flex items-center gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
237
- <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
238
- <Music className="w-4 h-4 text-white" />
 
 
 
239
  </div>
240
- <span className="text-white text-base font-semibold">參考音訊(上傳要複製的聲音樣本)</span>
 
 
241
  </div>
 
 
 
 
 
 
 
 
 
 
 
242
  <div
243
  onDrop={handleDrop}
244
  onDragOver={handleDragOver}
@@ -297,20 +412,62 @@ function App() {
297
 
298
  {/* Reference Text Section */}
299
  <div className="flex flex-col gap-4 w-full">
300
- <div className="flex items-center gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
301
- <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
302
- <FileText className="w-4 h-4 text-white" />
 
 
 
303
  </div>
304
- <span className="text-white text-base font-semibold">參考文字(參考音訊的逐字稿)</span>
 
 
305
  </div>
306
- <textarea
307
- value={refText}
308
- onChange={(e) => setRefText(e.target.value)}
309
- placeholder="輸入參考音訊中說的完整內容..."
310
- disabled={xVectorOnly}
311
- className="bg-background-secondary border border-border-light rounded-lg p-4 h-[140px] w-full text-text-secondary text-base placeholder-text-subtle focus:outline-none focus:border-brand-primary focus:ring-2 focus:ring-brand-primary/20 transition-all duration-200 resize-none disabled:opacity-50 disabled:cursor-not-allowed"
312
- aria-label="參考輸入"
313
- />
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
314
  </div>
315
 
316
  {/* X-Vector Section */}
@@ -334,20 +491,63 @@ function App() {
334
  {/* Right Column */}
335
  <div className="flex flex-col gap-6 bg-background-tertiary border border-border rounded-xl p-6 md:p-8 w-full lg:w-1/2">
336
  {/* Target Text Section */}
337
- <div className="flex flex-col gap-4 w-full">
338
- <div className="flex items-center gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
339
- <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
340
- <MessageSquare className="w-4 h-4 text-white" />
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
341
  </div>
342
- <span className="text-white text-base font-semibold">目標文字(要用複製聲音合成的文字)</span>
343
  </div>
344
- <textarea
345
- value={targetText}
346
- onChange={(e) => setTargetText(e.target.value)}
347
- placeholder="輸入您想讓複製聲音說的內容..."
348
- className="bg-background-secondary border border-border-light rounded-lg p-4 h-[180px] w-full text-text-secondary text-base placeholder-text-subtle focus:outline-none focus:border-brand-primary focus:ring-2 focus:ring-brand-primary/20 transition-all duration-200 resize-none"
349
- aria-label="目標文字輸入"
350
- />
351
  </div>
352
 
353
  {/* Language Selection */}
@@ -369,24 +569,33 @@ function App() {
369
  </div>
370
 
371
  {/* Generate Button */}
372
- <button
373
- onClick={handleGenerate}
374
- disabled={loading || !refAudioId || !targetText.trim() || (!xVectorOnly && !refText.trim())}
375
- className="flex items-center justify-center gap-3 bg-gradient-to-b from-brand-primary to-brand-secondary rounded-lg px-8 py-5 shadow-[0_6px_20px_rgba(99,102,241,0.27)] w-full hover:shadow-[0_8px_28px_rgba(99,102,241,0.4)] hover:transform hover:scale-[1.01] transition-all duration-200 disabled:opacity-50 disabled:cursor-not-allowed disabled:hover:scale-100 disabled:hover:shadow-[0_6px_20px_rgba(99,102,241,0.27)] focus:outline-none focus:ring-2 focus:ring-brand-primary focus:ring-offset-2 focus:ring-offset-background-tertiary cursor-pointer"
376
- aria-label="開始複製並生成語音"
377
- >
378
- {loading ? (
379
- <>
380
- <Loader2 className="w-6 h-6 text-white animate-spin" />
381
- <span className="text-white text-lg font-semibold">生成中...</span>
382
- </>
383
- ) : (
384
- <>
385
- <Zap className="w-6 h-6 text-white" />
386
- <span className="text-white text-lg font-semibold">複製並生成</span>
387
- </>
388
  )}
389
- </button>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
390
 
391
  {/* Output Section */}
392
  <div className="flex flex-col gap-4 w-full">
 
14
  XCircle,
15
  Download,
16
  Play,
17
+ AlertCircle,
18
+ HelpCircle,
19
  } from 'lucide-react'
20
 
21
  // 开发环境使用 localhost,生产环境可通过环境变量配置
 
41
  const [statusType, setStatusType] = useState<'info' | 'success' | 'error'>('info')
42
  const [generatedAudioId, setGeneratedAudioId] = useState<string>('')
43
  const [generatedAudioUrl, setGeneratedAudioUrl] = useState<string>('')
44
+ const [currentStep, setCurrentStep] = useState<1 | 2 | 3>(1)
45
+ const [refTextError, setRefTextError] = useState<string>('')
46
+ const [targetTextError, setTargetTextError] = useState<string>('')
47
 
48
  const fileInputRef = useRef<HTMLInputElement>(null)
49
  const audioRef = useRef<HTMLAudioElement>(null)
50
  const refAudioRef = useRef<HTMLAudioElement>(null)
51
+ const targetTextRef = useRef<HTMLDivElement>(null)
52
 
53
  // 清理 blob URL,避免記憶體洩漏
54
  useEffect(() => {
 
99
  setRefAudioId(data.audio_id)
100
  setStatus(`音訊已上傳 (${data.duration.toFixed(1)}秒)`)
101
  setStatusType('success')
102
+ setCurrentStep(2) // 自動進到下一步
103
  } catch (error) {
104
  setStatus(`上傳失敗: ${error instanceof Error ? error.message : '未知錯誤'}`)
105
  setStatusType('error')
 
110
  }
111
  }
112
 
113
+ // 檢查步驟完成狀態
114
+ const isStep1Complete = refAudioId !== ''
115
+ const isStep2Complete = isStep1Complete && (xVectorOnly || refText.trim() !== '')
116
+ const canGenerate = isStep2Complete && targetText.trim() !== ''
117
+
118
+ // 取得禁用原因
119
+ const getDisabledReason = (): string => {
120
+ if (!refAudioId) return '請先上傳參考音訊'
121
+ if (!xVectorOnly && !refText.trim()) return '請輸入參考文字'
122
+ if (!targetText.trim()) return '請輸入目標文字'
123
+ return ''
124
+ }
125
+
126
  // 處理拖放
127
  const handleDrop = (e: React.DragEvent) => {
128
  e.preventDefault()
 
136
 
137
  // 處理點擊上傳
138
  const handleUploadClick = () => {
139
+ setCurrentStep(1)
140
  fileInputRef.current?.click()
141
  }
142
 
143
+ // 當步驟改變時滾動到相應位置
144
+ useEffect(() => {
145
+ if (currentStep === 3 && targetTextRef.current) {
146
+ targetTextRef.current.scrollIntoView({ behavior: 'smooth', block: 'center' })
147
+ }
148
+ }, [currentStep])
149
+
150
  // 處理生成
151
  const handleGenerate = async () => {
152
  if (!refAudioId) {
 
254
 
255
  {/* Main Area */}
256
  <main className="flex flex-col gap-8 bg-background-primary px-6 md:px-12 lg:px-24 pt-12 pb-16 w-full max-w-7xl mx-auto">
257
+ {/* 步驟指示器 */}
258
+ <div className="flex items-center justify-center gap-4 w-full">
259
+ {/* 步驟 1 */}
260
+ <div className="flex items-center gap-3">
261
+ <div className={`flex items-center justify-center w-10 h-10 rounded-full border-2 transition-all duration-200 ${
262
+ isStep1Complete
263
+ ? 'border-green-500 bg-green-500/10'
264
+ : currentStep === 1
265
+ ? 'border-brand-primary bg-brand-primary/10'
266
+ : 'border-border bg-background-tertiary'
267
+ }`}>
268
+ {isStep1Complete ? (
269
+ <CheckCircle className="w-5 h-5 text-green-500" />
270
+ ) : (
271
+ <span className={`text-sm font-semibold ${currentStep === 1 ? 'text-brand-primary' : 'text-text-disabled'}`}>1</span>
272
+ )}
273
+ </div>
274
+ <span className={`text-sm font-medium hidden md:block ${
275
+ isStep1Complete ? 'text-green-500' : currentStep === 1 ? 'text-text-secondary' : 'text-text-disabled'
276
+ }`}>上傳音訊</span>
277
+ </div>
278
+
279
+ {/* 連接線 */}
280
+ <div className={`h-0.5 w-12 md:w-20 transition-colors duration-200 ${
281
+ isStep1Complete ? 'bg-green-500' : 'bg-border'
282
+ }`} />
283
+
284
+ {/* 步驟 2 */}
285
+ <div className="flex items-center gap-3">
286
+ <div className={`flex items-center justify-center w-10 h-10 rounded-full border-2 transition-all duration-200 ${
287
+ isStep2Complete
288
+ ? 'border-green-500 bg-green-500/10'
289
+ : currentStep === 2
290
+ ? 'border-brand-primary bg-brand-primary/10'
291
+ : 'border-border bg-background-tertiary'
292
+ }`}>
293
+ {isStep2Complete ? (
294
+ <CheckCircle className="w-5 h-5 text-green-500" />
295
+ ) : (
296
+ <span className={`text-sm font-semibold ${currentStep === 2 ? 'text-brand-primary' : 'text-text-disabled'}`}>2</span>
297
+ )}
298
+ </div>
299
+ <span className={`text-sm font-medium hidden md:block ${
300
+ isStep2Complete ? 'text-green-500' : currentStep === 2 ? 'text-text-secondary' : 'text-text-disabled'
301
+ }`}>設定參考</span>
302
+ </div>
303
+
304
+ {/* 連接線 */}
305
+ <div className={`h-0.5 w-12 md:w-20 transition-colors duration-200 ${
306
+ isStep2Complete ? 'bg-green-500' : 'bg-border'
307
+ }`} />
308
+
309
+ {/* 步驟 3 */}
310
+ <div className="flex items-center gap-3">
311
+ <div className={`flex items-center justify-center w-10 h-10 rounded-full border-2 transition-all duration-200 ${
312
+ generatedAudioUrl
313
+ ? 'border-green-500 bg-green-500/10'
314
+ : currentStep === 3
315
+ ? 'border-brand-primary bg-brand-primary/10'
316
+ : 'border-border bg-background-tertiary'
317
+ }`}>
318
+ {generatedAudioUrl ? (
319
+ <CheckCircle className="w-5 h-5 text-green-500" />
320
+ ) : (
321
+ <span className={`text-sm font-semibold ${currentStep === 3 ? 'text-brand-primary' : 'text-text-disabled'}`}>3</span>
322
+ )}
323
+ </div>
324
+ <span className={`text-sm font-medium hidden md:block ${
325
+ generatedAudioUrl ? 'text-green-500' : currentStep === 3 ? 'text-text-secondary' : 'text-text-disabled'
326
+ }`}>生成語音</span>
327
+ </div>
328
+ </div>
329
 
330
  <div className="flex flex-col lg:flex-row gap-8 w-full">
331
  {/* Left Column */}
332
  <div className="flex flex-col gap-6 bg-background-tertiary border border-border rounded-xl p-6 md:p-8 w-full lg:w-1/2">
333
  {/* Reference Audio Section */}
334
  <div className="flex flex-col gap-4 w-full">
335
+ <div className="flex items-center justify-between gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
336
+ <div className="flex items-center gap-3">
337
+ <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
338
+ <Music className="w-4 h-4 text-white" />
339
+ </div>
340
+ <span className="text-white text-base font-semibold">參考音訊(上傳要複製的聲音樣本)</span>
341
  </div>
342
+ {isStep1Complete && (
343
+ <CheckCircle className="w-5 h-5 text-green-400" />
344
+ )}
345
  </div>
346
+
347
+ {/* 提示訊息 */}
348
+ {!refAudioFile && (
349
+ <div className="flex items-start gap-2 bg-blue-500/10 border border-blue-500/30 rounded-lg p-3">
350
+ <HelpCircle className="w-4 h-4 text-blue-400 flex-shrink-0 mt-0.5" />
351
+ <p className="text-blue-400 text-sm">
352
+ 上傳 <strong>3-10 秒</strong>的清晰語音,單一說話者,無背景噪音效果最佳
353
+ </p>
354
+ </div>
355
+ )}
356
+
357
  <div
358
  onDrop={handleDrop}
359
  onDragOver={handleDragOver}
 
412
 
413
  {/* Reference Text Section */}
414
  <div className="flex flex-col gap-4 w-full">
415
+ <div className="flex items-center justify-between gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
416
+ <div className="flex items-center gap-3">
417
+ <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
418
+ <FileText className="w-4 h-4 text-white" />
419
+ </div>
420
+ <span className="text-white text-base font-semibold">參考文字(參考音訊的逐字稿)</span>
421
  </div>
422
+ {!xVectorOnly && refText.trim() && (
423
+ <CheckCircle className="w-5 h-5 text-green-400" />
424
+ )}
425
  </div>
426
+
427
+ {/* 提示訊息 */}
428
+ {!xVectorOnly && (
429
+ <div className="flex items-start gap-2 bg-blue-500/10 border border-blue-500/30 rounded-lg p-3">
430
+ <HelpCircle className="w-4 h-4 text-blue-400 flex-shrink-0 mt-0.5" />
431
+ <p className="text-blue-400 text-sm">
432
+ 輸入參考音訊中說的<strong>完整內容</strong>,逐稿越準確,生成品質越好
433
+ </p>
434
+ </div>
435
+ )}
436
+
437
+ <div className="relative">
438
+ <textarea
439
+ value={refText}
440
+ onChange={(e) => {
441
+ setRefText(e.target.value)
442
+ setRefTextError('')
443
+ }}
444
+ onBlur={() => {
445
+ if (!xVectorOnly && !refText.trim() && refAudioId) {
446
+ setRefTextError('需要參考文字才能獲得最佳品質')
447
+ }
448
+ }}
449
+ placeholder="例如:今天天氣真好,我們一起去公園散步吧..."
450
+ disabled={xVectorOnly}
451
+ className={`bg-background-secondary border rounded-lg p-4 h-[140px] w-full text-text-secondary text-base placeholder-text-subtle focus:outline-none focus:ring-2 transition-all duration-200 resize-none disabled:opacity-50 disabled:cursor-not-allowed ${
452
+ refTextError
453
+ ? 'border-red-500 focus:border-red-500 focus:ring-red-500/20'
454
+ : !xVectorOnly && refText.trim()
455
+ ? 'border-green-500 focus:border-green-500 focus:ring-green-500/20'
456
+ : 'border-border-light focus:border-brand-primary focus:ring-brand-primary/20'
457
+ }`}
458
+ aria-label="參考文字輸入"
459
+ />
460
+ <div className="absolute bottom-3 right-3 text-text-subtle text-xs">
461
+ {refText.length} 字元
462
+ </div>
463
+ </div>
464
+
465
+ {refTextError && (
466
+ <div className="flex items-center gap-2 text-red-400 text-sm">
467
+ <AlertCircle className="w-4 h-4" />
468
+ <span>{refTextError}</span>
469
+ </div>
470
+ )}
471
  </div>
472
 
473
  {/* X-Vector Section */}
 
491
  {/* Right Column */}
492
  <div className="flex flex-col gap-6 bg-background-tertiary border border-border rounded-xl p-6 md:p-8 w-full lg:w-1/2">
493
  {/* Target Text Section */}
494
+ <div ref={targetTextRef} className="flex flex-col gap-4 w-full">
495
+ <div className="flex items-center justify-between gap-3 bg-gradient-to-r from-brand-primary to-brand-secondary rounded-lg px-4 py-3 w-full shadow-sm">
496
+ <div className="flex items-center gap-3">
497
+ <div className="w-6 h-6 rounded bg-white/10 flex items-center justify-center">
498
+ <MessageSquare className="w-4 h-4 text-white" />
499
+ </div>
500
+ <span className="text-white text-base font-semibold">目標文字(要用複製聲音合成的文字)</span>
501
+ </div>
502
+ {targetText.trim() && (
503
+ <CheckCircle className="w-5 h-5 text-green-400" />
504
+ )}
505
+ </div>
506
+
507
+ {/* 提示訊息 */}
508
+ <div className="flex items-start gap-2 bg-blue-500/10 border border-blue-500/30 rounded-lg p-3">
509
+ <HelpCircle className="w-4 h-4 text-blue-400 flex-shrink-0 mt-0.5" />
510
+ <p className="text-blue-400 text-sm">
511
+ 輸入您想要複製聲音說的內容,支援中、英、日、韓等多種語言
512
+ </p>
513
+ </div>
514
+
515
+ <div className="relative">
516
+ <textarea
517
+ value={targetText}
518
+ onChange={(e) => {
519
+ setTargetText(e.target.value)
520
+ setTargetTextError('')
521
+ if (e.target.value.trim() && isStep2Complete) {
522
+ setCurrentStep(3)
523
+ }
524
+ }}
525
+ onBlur={() => {
526
+ if (!targetText.trim() && isStep2Complete) {
527
+ setTargetTextError('請輸入要生成的文字內容')
528
+ }
529
+ }}
530
+ placeholder="例如:歡迎來到我的頻道,今天要跟大家分享..."
531
+ className={`bg-background-secondary border rounded-lg p-4 h-[180px] w-full text-text-secondary text-base placeholder-text-subtle focus:outline-none focus:ring-2 transition-all duration-200 resize-none ${
532
+ targetTextError
533
+ ? 'border-red-500 focus:border-red-500 focus:ring-red-500/20'
534
+ : targetText.trim()
535
+ ? 'border-green-500 focus:border-green-500 focus:ring-green-500/20'
536
+ : 'border-border-light focus:border-brand-primary focus:ring-brand-primary/20'
537
+ }`}
538
+ aria-label="目標文字輸入"
539
+ />
540
+ <div className="absolute bottom-3 right-3 text-text-subtle text-xs">
541
+ {targetText.length} 字元
542
  </div>
 
543
  </div>
544
+
545
+ {targetTextError && (
546
+ <div className="flex items-center gap-2 text-red-400 text-sm">
547
+ <AlertCircle className="w-4 h-4" />
548
+ <span>{targetTextError}</span>
549
+ </div>
550
+ )}
551
  </div>
552
 
553
  {/* Language Selection */}
 
569
  </div>
570
 
571
  {/* Generate Button */}
572
+ <div className="flex flex-col gap-3 w-full">
573
+ {!canGenerate && !loading && (
574
+ <div className="flex items-center gap-2 bg-yellow-500/10 border border-yellow-500/30 rounded-lg p-3">
575
+ <Info className="w-4 h-4 text-yellow-400 flex-shrink-0" />
576
+ <span className="text-yellow-400 text-sm">{getDisabledReason()}</span>
577
+ </div>
 
 
 
 
 
 
 
 
 
 
578
  )}
579
+
580
+ <button
581
+ onClick={handleGenerate}
582
+ disabled={loading || !canGenerate}
583
+ className="flex items-center justify-center gap-3 bg-gradient-to-b from-brand-primary to-brand-secondary rounded-lg px-8 py-5 shadow-[0_6px_20px_rgba(99,102,241,0.27)] w-full hover:shadow-[0_8px_28px_rgba(99,102,241,0.4)] hover:transform hover:scale-[1.01] transition-all duration-200 disabled:opacity-50 disabled:cursor-not-allowed disabled:hover:scale-100 disabled:hover:shadow-[0_6px_20px_rgba(99,102,241,0.27)] focus:outline-none focus:ring-2 focus:ring-brand-primary focus:ring-offset-2 focus:ring-offset-background-tertiary cursor-pointer"
584
+ aria-label="開始複製並生成語音"
585
+ >
586
+ {loading ? (
587
+ <>
588
+ <Loader2 className="w-6 h-6 text-white animate-spin" />
589
+ <span className="text-white text-lg font-semibold">生成中...</span>
590
+ </>
591
+ ) : (
592
+ <>
593
+ <Zap className="w-6 h-6 text-white" />
594
+ <span className="text-white text-lg font-semibold">開始生成語音</span>
595
+ </>
596
+ )}
597
+ </button>
598
+ </div>
599
 
600
  {/* Output Section */}
601
  <div className="flex flex-col gap-4 w-full">