fluidapex Claude committed on
Commit 5855c36 · 1 Parent(s): dfc0de2

Fix torch_dtype deprecation warning and add GPU guidance

- Replace deprecated torch_dtype with dtype parameter
- Add GPU availability check with device selection
- Use float32 on CPU, float16 on GPU for optimal performance
- Add hardware requirements note in README
- Provide instructions for enabling GPU in Space Settings
- Add device and dtype logging for debugging

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (2)
  1. README.md +8 -0
  2. app.py +13 -1
README.md CHANGED
@@ -23,3 +23,11 @@ This Space demonstrates Japanese to English translation using the ALMA-7B-Ja-V2
 
 **Usage:**
 Paste your Japanese dialogue text and click Translate, or try one of the example conversations to see the translation quality.
+
+**⚠️ Hardware Note:**
+This Space requires GPU hardware for optimal performance. The model will run on CPU but translation will be very slow (1-2 minutes per request).
+
+To enable GPU:
+1. Go to Space Settings
+2. Select a GPU tier (T4 Small recommended - $0.60/hour)
+3. Restart the Space
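
For context on the hardware note, the choice between float16 and float32 roughly doubles the memory needed for the model weights. The sketch below is a back-of-the-envelope estimate only. It assumes about 7 billion parameters (inferred from the "7B" in the model name, not stated in this commit), and the helper name `weight_memory_gib` is hypothetical. It ignores activations, the KV cache, and framework overhead.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float16": 2, "float32": 4}


def weight_memory_gib(num_params: float, dtype_name: str) -> float:
    """Estimate weight storage in GiB for a model with num_params parameters."""
    return num_params * BYTES_PER_PARAM[dtype_name] / 2**30


# Assumption: roughly 7e9 parameters, taken from the "7B" in the model name.
ASSUMED_PARAMS = 7e9
fp16_gib = weight_memory_gib(ASSUMED_PARAMS, "float16")
fp32_gib = weight_memory_gib(ASSUMED_PARAMS, "float32")
print(f"float16: {fp16_gib:.1f} GiB, float32: {fp32_gib:.1f} GiB")
```

Under that assumption the weights alone come to about 13 GiB in float16 and about 26 GiB in float32, which is one reason the CPU path is both slower and more memory-hungry.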
app.py CHANGED
@@ -11,13 +11,25 @@ print(f"Loading {MODEL_NAME}...")
 
 try:
     tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+
+    # Check if GPU is available
+    if torch.cuda.is_available():
+        print(f"GPU detected: {torch.cuda.get_device_name(0)}")
+        device = "cuda"
+        dtype = torch.float16
+    else:
+        print("No GPU detected, using CPU (inference will be slow)")
+        device = "cpu"
+        dtype = torch.float32
+
     model = AutoModelForCausalLM.from_pretrained(
         MODEL_NAME,
-        torch_dtype=torch.float16,
+        dtype=dtype,  # Use 'dtype' instead of deprecated 'torch_dtype'
         device_map="auto",
         low_cpu_mem_usage=True
     )
     print(f"Model loaded successfully on device: {model.device}")
+    print(f"Model dtype: {model.dtype}")
 except Exception as e:
     print(f"Error loading model: {e}")
     raise
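
The device and dtype branch added in this hunk can be isolated into a small pure function, which makes the selection rule testable without a GPU or a model download. This is a minimal sketch, not the code in `app.py`: the names `LoadConfig` and `select_load_config` are hypothetical, and dtypes are carried as strings so the block does not need `torch` installed. Note that in the hunk as shown, `device` is assigned but never passed to `from_pretrained`; placement is left to `device_map="auto"`.

```python
from typing import NamedTuple


class LoadConfig(NamedTuple):
    """Hypothetical container for the values chosen at startup."""

    device: str      # "cuda" or "cpu"; informational, as in the diff
    dtype_name: str  # name of the torch dtype attribute, e.g. "float16"


def select_load_config(cuda_available: bool) -> LoadConfig:
    """Mirror the branch in the diff: float16 on GPU, float32 on CPU."""
    if cuda_available:
        return LoadConfig(device="cuda", dtype_name="float16")
    return LoadConfig(device="cpu", dtype_name="float32")


# Exercise both branches without needing real hardware.
for available in (True, False):
    config = select_load_config(available)
    print(f"cuda_available={available} -> {config.device}, {config.dtype_name}")
```

In an application like this one, the flag would presumably come from `torch.cuda.is_available()` and the dtype object from `getattr(torch, config.dtype_name)`, with the result passed as the `dtype` argument shown in the diff.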