mirror of https://github.com/bentoml/OpenLLM.git
synced 2025-12-23 23:57:46 -05:00
feat: add citation (#103)
CITATION.cff (new file, 65 lines)
@@ -0,0 +1,65 @@
cff-version: 1.2.0
title: 'OpenLLM: Operating LLMs in production'
message: >-
  If you use this software, please cite it using these
  metadata.
type: software
authors:
  - given-names: Aaron
    family-names: Pham
    email: aarnphm@bentoml.com
    orcid: 'https://orcid.org/0009-0008-3180-5115'
  - given-names: Chaoyu
    family-names: Yang
    email: chaoyu@bentoml.com
  - given-names: Sean
    family-names: Sheng
    email: ssheng@bentoml.com
  - given-names: Shenyang
    family-names: Zhao
    email: larme@bentoml.com
  - given-names: Sauyon
    family-names: Lee
    email: sauyon@bentoml.com
  - given-names: Bo
    family-names: Jiang
    email: jiang@bentoml.com
  - given-names: Fog
    family-names: Dong
    email: fog@bentoml.com
  - given-names: Xipeng
    family-names: Guan
    email: xipeng@bentoml.com
  - given-names: Frost
    family-names: Ming
    email: frost@bentoml.com
repository-code: 'https://github.com/bentoml/OpenLLM'
url: 'https://bentoml.com/'
abstract: >-
  OpenLLM is an open platform for operating large language
  models (LLMs) in production. With OpenLLM, you can run
  inference with any open-source large language model,
  deploy to the cloud or on-premises, and build powerful AI
  apps. It has built-in support for a wide range of
  open-source LLMs and model runtimes, including StableLM,
  Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more.
  OpenLLM serves LLMs over a RESTful API or gRPC with a
  single command, and lets you query models via a web UI,
  CLI, our Python/JavaScript client, or any HTTP client. It
  provides first-class support for LangChain, BentoML, and
  Hugging Face, so you can easily create your own AI apps
  by composing LLMs with other models and services. Last
  but not least, it automatically generates OCI-compatible
  container images for LLM servers, or deploys them as
  serverless endpoints via BentoCloud.
keywords:
  - MLOps
  - LLMOps
  - LLM
  - Infrastructure
  - Transformers
  - LLM Serving
  - Model Serving
  - Serverless Deployment
license: Apache-2.0
date-released: '2023-06-13'
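A quick sanity check for a `CITATION.cff` like the one above is to validate it against the CFF 1.2.0 schema; a minimal sketch using the third-party `cffconvert` tool (an illustration, not something this commit adds):

```bash
# Install the third-party cffconvert tool (not part of OpenLLM)
pip install cffconvert

# Validate ./CITATION.cff against the Citation File Format schema
cffconvert --validate
```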
README.md (34 lines changed)
@@ -327,7 +327,8 @@ OPENLLM_FLAN_T5_FRAMEWORK=tf openllm start flan-t5
### Fine-tuning support (Experimental)

-One can serve OpenLLM models with any PEFT-compatible layers with `--adapter-id`:
+One can serve OpenLLM models with any PEFT-compatible layers with
+`--adapter-id`:

```bash
openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6-7b-quotes
```
@@ -345,21 +346,26 @@ To use multiple adapters, use the following format:
```bash
openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6.7b-lora --adapter-id aarnphm/opt-6.7b-lora:french_lora
```

-By default, the first adapter-id will be the default Lora layer, but optionally users can change what Lora layer to use for inference via `/v1/adapters`:
+By default, the first adapter-id will be the default LoRA layer, but users can
+optionally change which LoRA layer to use for inference via `/v1/adapters`:

```bash
curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "vn_lora"}'
```
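Putting the two pieces together, here is a minimal sketch of switching to the `french_lora` adapter registered above and then querying the server (the `openllm query` invocation is illustrative; its flags vary across OpenLLM versions):

```bash
# Activate the French LoRA adapter registered at startup
curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "french_lora"}'

# Query the running server with the newly active adapter
# (command shape is illustrative and version-dependent)
openllm query --endpoint http://localhost:3000 'Translate "good morning" into French'
```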
-Note that for multiple adapter-name and adapter-id, it is recommended to update to use the default adapter before sending the inference, to avoid any performance degradation
+Note that when using multiple adapter names and IDs, it is recommended to
+switch back to the default adapter before sending inference requests, to avoid
+any performance degradation.

-To include this into the Bento, one can also provide a `--adapter-id` into `openllm build`:
+To include this in the Bento, one can also pass a `--adapter-id` to
+`openllm build`:

```bash
openllm build opt --model-id facebook/opt-6.7b --adapter-id ...
```
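The Bento produced by `openllm build` can then be packaged as a container image using BentoML's standard `bentoml containerize` command; a minimal sketch, assuming an illustrative Bento tag:

```bash
# Package the Bento generated above as an OCI-compatible container image
# (replace the tag with the one printed by `openllm build`)
bentoml containerize opt-service:latest
```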
-> **Note**: We will gradually roll out support for fine-tuning all models. Currently, only OPT has fully adapters support.
+> **Note**: We will gradually roll out support for fine-tuning all models.
+> Currently, only OPT has full adapter support.
### Integrating a New Model
@@ -582,3 +588,19 @@ capabilities or have any questions, don't hesitate to reach out in our
Check out our
[Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md)
if you wish to contribute to OpenLLM's codebase.

## 📔 Citation

If you use OpenLLM in your research, we provide a [citation](./CITATION.cff) to use:

```bibtex
@software{Pham_OpenLLM_Operating_LLMs_2023,
  author = {Pham, Aaron and Yang, Chaoyu and Sheng, Sean and Zhao, Shenyang and Lee, Sauyon and Jiang, Bo and Dong, Fog and Guan, Xipeng and Ming, Frost},
  license = {Apache-2.0},
  month = jun,
  title = {{OpenLLM: Operating LLMs in production}},
  url = {https://github.com/bentoml/OpenLLM},
  year = {2023}
}
```
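GitHub renders `CITATION.cff` through the repository's "Cite this repository" button, and the entry above matches that rendering. A BibTeX entry can also be generated locally with the third-party `cffconvert` tool mentioned earlier (illustrative; its output may differ slightly from GitHub's):

```bash
# Produce a BibTeX entry from CITATION.cff locally
# (output may differ slightly from GitHub's rendering)
cffconvert -f bibtex
```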