Space: fluidapex/hyperglyph_alma0-7b-j-test
fluidapex Claude committed on Oct 17, 2025

Commit 5855c36 (1 parent: dfc0de2)

Fix torch_dtype deprecation warning and add GPU guidance

- Replace deprecated torch_dtype with dtype parameter
- Add GPU availability check with device selection
- Use float32 on CPU, float16 on GPU for optimal performance
- Add hardware requirements note in README
- Provide instructions for enabling GPU in Space Settings
- Add device and dtype logging for debugging

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (2)

README.md +8 -0
app.py +13 -1

README.md CHANGED

@@ -23,3 +23,11 @@ This Space demonstrates Japanese to English translation using the ALMA-7B-Ja-V2
 **Usage:**
 Paste your Japanese dialogue text and click Translate, or try one of the example conversations to see the translation quality.

+**⚠️ Hardware Note:**
+This Space requires GPU hardware for optimal performance. The model will run on CPU but translation will be very slow (1-2 minutes per request).
+To enable GPU:
+1. Go to Space Settings
+2. Select a GPU tier (T4 Small recommended - $0.60/hour)
+3. Restart the Space

app.py CHANGED

@@ -11,13 +11,25 @@ print(f"Loading {MODEL_NAME}...")
 try:
     tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+    # Check if GPU is available
+    if torch.cuda.is_available():
+        print(f"GPU detected: {torch.cuda.get_device_name(0)}")
+        device = "cuda"
+        dtype = torch.float16
+    else:
+        print("No GPU detected, using CPU (inference will be slow)")
+        device = "cpu"
+        dtype = torch.float32
     model = AutoModelForCausalLM.from_pretrained(
         MODEL_NAME,
-        torch_dtype=torch.float16,
+        dtype=dtype,  # Use 'dtype' instead of deprecated 'torch_dtype'
         device_map="auto",
         low_cpu_mem_usage=True
     )
     print(f"Model loaded successfully on device: {model.device}")
+    print(f"Model dtype: {model.dtype}")
 except Exception as e:
     print(f"Error loading model: {e}")
     raise
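
Review note: in the hunk shown, `device` is assigned in both branches but never passed to `from_pretrained`; placement is still left to `device_map="auto"`. Below is a minimal sketch of one way to make the selection explicit and testable without a GPU. The helper names (`select_device_and_dtype`, `detect_cuda`, `build_load_kwargs`) are hypothetical and not part of this commit, and the dtype is kept as a string name only so the sketch imports without `torch` installed.

```python
import importlib.util


def select_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Apply the commit's rule: float16 on GPU, float32 on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def detect_cuda() -> bool:
    """Return True only when torch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def build_load_kwargs(cuda_available: bool) -> dict:
    """Hypothetical helper: keyword arguments for from_pretrained.

    Assumption: on CPU the whole model is pinned with {"": "cpu"} instead
    of "auto", so the chosen device is actually used.
    """
    device, dtype_name = select_device_and_dtype(cuda_available)
    return {
        "dtype": dtype_name,  # name of a torch dtype, resolved by the caller
        "device_map": "auto" if device == "cuda" else {"": "cpu"},
        "low_cpu_mem_usage": True,
    }
```

In `app.py` this could be used as `kwargs = build_load_kwargs(detect_cuda())`, resolving the dtype with `kwargs["dtype"] = getattr(torch, kwargs["dtype"])` before calling `AutoModelForCausalLM.from_pretrained(MODEL_NAME, **kwargs)`. Whether the keyword should be `dtype` or `torch_dtype` depends on the installed transformers version, so treat that name as an assumption to check against the Space's pinned requirements.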