mirror of
https://github.com/bentoml/OpenLLM.git
synced 2026-02-23 02:07:52 -05:00
* feat(vllm): bump to 0.2.2
* chore: update changelog
* chore: move up to CUDA 12.1
* fix: remove auto-gptq installation since the builder image doesn't have access to GPU
* fix: update containerization warning
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
7 lines
106 B
YAML
service: 'api_server.py:svc'
include:
  - 'api_server.py'
python:
  packages:
    - openllm[vllm]>=0.4.15
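This YAML appears to be a BentoML `bentofile.yaml`: it points BentoML at the `svc` service object in `api_server.py`, includes that file in the Bento, and pins `openllm[vllm]` as a pip dependency. Assuming a standard BentoML setup (and that `api_server.py` defines `svc` in the working directory), the Bento could be built and served roughly like this:

```shell
# Build the Bento described by bentofile.yaml in the current directory.
bentoml build

# Serve the service directly during development
# (api_server.py:svc matches the `service:` entry above).
bentoml serve api_server.py:svc
```

The exact file name (`bentofile.yaml`) and project layout are assumptions; only the `service`, `include`, and `python.packages` keys are taken from the file itself.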