Fix torch_dtype deprecation warning and add GPU guidance
- Replace deprecated torch_dtype with dtype parameter
- Add GPU availability check with device selection
- Use float32 on CPU, float16 on GPU for optimal performance
- Add hardware requirements note in README
- Provide instructions for enabling GPU in Space Settings
- Add device and dtype logging for debugging
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
README.md
CHANGED
@@ -23,3 +23,11 @@ This Space demonstrates Japanese to English translation using the ALMA-7B-Ja-V2
 
 **Usage:**
 Paste your Japanese dialogue text and click Translate, or try one of the example conversations to see the translation quality.
+
+**⚠️ Hardware Note:**
+This Space requires GPU hardware for optimal performance. The model will run on CPU but translation will be very slow (1-2 minutes per request).
+
+To enable GPU:
+1. Go to Space Settings
+2. Select a GPU tier (T4 Small recommended - $0.60/hour)
+3. Restart the Space
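The hardware note ties into the dtype choice in this commit: float32 weights take twice the memory of float16 weights. The sketch below is a rough back-of-the-envelope estimate only. The parameter count of about 7 billion is an assumption inferred from the "7B" in the model name, the helper name `weight_memory_gb` is hypothetical, and the figures cover weights alone, not activations or the generation cache.

```python
# Rough weight-memory estimate for a given parameter count and dtype.
# ASSUMPTION: ~7e9 parameters, inferred only from the "7B" in the model name.
# The estimate covers weights only (no activations, no generation cache).

BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(num_params: float, dtype_name: str) -> float:
    """Return approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype_name] / 1e9


if __name__ == "__main__":
    assumed_params = 7e9  # hypothetical round figure, not a measured count
    for name in ("float32", "float16"):
        gb = weight_memory_gb(assumed_params, name)
        print(f"{name}: ~{gb:.0f} GB of weights")
```

Under that assumption the weights come to roughly 28 GB in float32 and 14 GB in float16, so the CPU path needs substantially more RAM than the GPU path needs VRAM.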
app.py
CHANGED
@@ -11,13 +11,25 @@ print(f"Loading {MODEL_NAME}...")
 
 try:
     tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+
+    # Check if GPU is available
+    if torch.cuda.is_available():
+        print(f"GPU detected: {torch.cuda.get_device_name(0)}")
+        device = "cuda"
+        dtype = torch.float16
+    else:
+        print("No GPU detected, using CPU (inference will be slow)")
+        device = "cpu"
+        dtype = torch.float32
+
     model = AutoModelForCausalLM.from_pretrained(
         MODEL_NAME,
-
+        dtype=dtype,  # Use 'dtype' instead of deprecated 'torch_dtype'
         device_map="auto",
         low_cpu_mem_usage=True
     )
     print(f"Model loaded successfully on device: {model.device}")
+    print(f"Model dtype: {model.dtype}")
 except Exception as e:
     print(f"Error loading model: {e}")
     raise
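The selection logic added in this hunk can be isolated into a small function, which makes the branch easy to test without a GPU. This is a minimal sketch, not the Space's actual code: the function name `select_device_and_dtype` is hypothetical, and it returns dtype names as strings so that it runs even where `torch` is not installed. Within this hunk the `device` variable is only assigned, while placement is left to `device_map="auto"`, so the sketch treats `device` as informational.

```python
from typing import Tuple


def select_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Mirror the diff's branch: float16 on GPU, float32 on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


if __name__ == "__main__":
    # torch is optional here; without it we fall back to the CPU branch.
    try:
        import torch
        cuda_available = torch.cuda.is_available()
    except ImportError:
        cuda_available = False

    device, dtype_name = select_device_and_dtype(cuda_available)
    print(f"device={device}, dtype={dtype_name}")
```

With `torch` installed, `getattr(torch, dtype_name)` turns the returned name into the dtype object that the load call in the diff expects.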