OpenLLM

mirror of https://github.com/bentoml/OpenLLM.git synced 2026-01-23 15:01:32 -05:00

Files

Aaron Pham 6f724416c0 perf: build quantization and better transformer behaviour (#28)

Fixes quantization_config and low_cpu_mem_usage so that they are available on the PyTorch implementation only

See changelog for more details on #28

2023-06-17 08:56:14 -04:00

feat: quantization (#27)

2023-06-16 18:10:50 -04:00

2023-06-16 00:19:01 -04:00

__init__.py

tests: fastpath (#17)

2023-06-12 14:18:26 -04:00

test_configuration.py

2023-06-17 08:56:14 -04:00