Update usage example with infinity

#13
by michaelfeil - opened Nov 12, 2024
base: refs/heads/main ← from: refs/pr/13
michaelfeil
Nov 12, 2024
docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
michaelf34/infinity:0.0.68 \
v2 --model-id Alibaba-NLP/gte-base-en-v1.5 --revision "main" --dtype bfloat16 --batch-size 32 --device cuda --engine torch --port 7997
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO     2024-11-12 23:40:58,030 infinity_emb INFO:        infinity_server.py:89
         Creating 1engines:                                                     
         engines=['Alibaba-NLP/gte-base-en-v1.5']                               
INFO     2024-11-12 23:40:58,035 infinity_emb INFO: Anonymized   telemetry.py:30
         telemetry can be disabled via environment variable                     
         `DO_NOT_TRACK=1`.                                                      
INFO     2024-11-12 23:40:58,042 infinity_emb INFO:           select_model.py:64
         model=`Alibaba-NLP/gte-base-en-v1.5` selected, using                   
         engine=`torch` and device=`cuda`                                       
INFO     2024-11-12 23:41:00,320                      SentenceTransformer.py:216
         sentence_transformers.SentenceTransformer                              
         INFO: Load pretrained SentenceTransformer:                             
         Alibaba-NLP/gte-base-en-v1.5                                           
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- configuration.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- modeling.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


INFO     2024-11-12 23:43:33,218 infinity_emb INFO: Adding    acceleration.py:56
         optimizations via Huggingface optimum.                                 
The class `optimum.bettertransformers.transformation.BetterTransformer` is deprecated and will be removed in a future release.
WARNING  2024-11-12 23:43:33,220 infinity_emb WARNING:        acceleration.py:67
         BetterTransformer is not available for model: <class                   
         'transformers_modules.Alibaba-NLP.new-impl.40ced75c3                   
         017eb27626c9d4ea981bde21a2662f4.modeling.NewModel'>                    
         Continue without bettertransformer modeling code.                      
INFO     2024-11-12 23:43:33,469 infinity_emb INFO: Getting   select_model.py:97
         timings for batch_size=32 and avg tokens per                           
         sentence=1                                                             
                 3.29     ms tokenization                                       
                 6.17     ms inference                                          
                 0.14     ms post-processing                                    
                 9.60     ms total                                              
         embeddings/sec: 3332.34                                                
INFO     2024-11-12 23:43:33,674 infinity_emb INFO: Getting  select_model.py:103
         timings for batch_size=32 and avg tokens per                           
         sentence=512                                                           
                 16.20    ms tokenization                                       
                 71.20    ms inference                                          
                 0.21     ms post-processing                                    
                 87.61    ms total                                              
         embeddings/sec: 365.26
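
Once the log above reports the warm-up timings, the server should be ready to accept requests. The following is a minimal client sketch, not part of the PR itself. It assumes the container is reachable at `http://localhost:7997` (the port mapped in the docker command) and that infinity serves an OpenAI-style `POST /embeddings` route at the root path; the helper names `build_embedding_request` and `embed` are hypothetical and used only for illustration.

```python
import json
import urllib.request

# Assumptions (not confirmed by the PR): the server started by the docker
# command listens on localhost:7997 and exposes an OpenAI-style
# POST /embeddings route without a URL prefix.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Alibaba-NLP/gte-base-en-v1.5"  # same value as --model-id above


def build_embedding_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the assumed route."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=30.0):
    """Send the request and return one embedding vector per input text."""
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses keep each vector under data[i]["embedding"].
    return [item["embedding"] for item in body["data"]]
```

With the container running, `embed(["what is the capital of China?"])` would return a list containing a single vector. If the route or response shape differs in your infinity version, adjust `build_embedding_request` and the parsing in `embed` accordingly.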
Update usage example with infinity (3d61b474)
thenlper changed pull request status to merged Nov 15, 2024