mirror of
https://github.com/bentoml/OpenLLM.git
synced 2026-02-23 02:07:52 -05:00
* feat(vllm): bump to 0.2.2
* chore: update changelog
* chore: move up to CUDA 12.1
* fix: remove auto-gptq installation since the builder image doesn't have access to GPU
* fix: update containerization warning
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
7 lines
106 B
YAML
service: 'api_server.py:svc'
include:
  - 'api_server.py'
python:
  packages:
    - openllm[vllm]>=0.4.15
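This YAML appears to be a BentoML `bentofile.yaml`: it points BentoML at the `svc` service object in `api_server.py`, includes that file in the Bento, and pins `openllm[vllm]` as a pip dependency. Assuming a standard BentoML setup (and that `api_server.py` defines `svc` in the working directory), the Bento could be built and served roughly like this:

```shell
# Build the Bento described by bentofile.yaml in the current directory.
bentoml build

# Serve the service directly during development
# (api_server.py:svc matches the `service:` entry above).
bentoml serve api_server.py:svc
```

The exact file name (`bentofile.yaml`) and project layout are assumptions; only the `service`, `include`, and `python.packages` keys are taken from the file itself.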