Added a “Troubleshooting & Tips” section to the GCP Vertex documentation.
This section provides guidance for self-hosted users on common issues
they may encounter when setting up Google Vertex AI integration in Khoj.
Topics covered include permissions, region compatibility, prompt size
limits, API key testing, and secure key management with environment
variables. The goal is to improve the onboarding experience and reduce
setup errors for contributors and self-hosters using Vertex AI models
like Claude and Gemini.
Signed-off-by: brightally6@gmail.com
- Specify the E2B API key and template to use via environment variables
- Try to load and use the e2b library when the E2B API key is set (see the sketch after this list)
- Fall back to the Terrarium sandbox otherwise
- Enable more Python packages, like rdkit, in the E2B sandbox via a custom E2B template
- Use the async E2B sandbox
- Parallelize file I/O with the sandbox
- Add documentation on how to enable E2B as the code sandbox instead of Terrarium
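A minimal sketch of this selection logic, assuming the `E2B_API_KEY` and `E2B_TEMPLATE` environment variable names and a local Terrarium endpoint; the exact names and URL used by Khoj may differ.

```python
import os


async def run_code(code: str):
    if os.getenv("E2B_API_KEY"):
        # Use the async E2B sandbox when an API key is configured.
        from e2b_code_interpreter import AsyncSandbox

        sandbox = await AsyncSandbox.create(template=os.getenv("E2B_TEMPLATE", "base"))
        try:
            return await sandbox.run_code(code)
        finally:
            await sandbox.kill()

    # Otherwise fall back to a local Terrarium sandbox over HTTP (URL assumed).
    import aiohttp

    async with aiohttp.ClientSession() as session:
        async with session.post("http://localhost:8080", json={"code": code}) as response:
            return await response.json()
```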
- Set KHOJ_ALLOWED_DOMAIN to the domain Khoj is accessible on from the
host machine. This can be the internal IP or domain of the host
machine. It is the address your load balancer/reverse proxy uses to
reach Khoj. For example, if the load balancer service is in the khoj
docker network, KHOJ_ALLOWED_DOMAIN will be `server` (i.e. the service name).
- Set KHOJ_DOMAIN to your externally accessible domain or IP to avoid
CSRF trusted origin or unset cookie issues when trying to access the
Khoj admin panel. A minimal sketch of the expected wiring follows.
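The sketch below shows how these variables could feed Django's host and CSRF settings; the actual wiring inside Khoj may differ, and the default values are placeholders.

```python
import os

# Externally accessible domain or IP of your Khoj instance (placeholder default).
KHOJ_DOMAIN = os.getenv("KHOJ_DOMAIN", "khoj.example.com")
# Internal domain or IP your load balancer/reverse proxy uses to reach Khoj.
KHOJ_ALLOWED_DOMAIN = os.getenv("KHOJ_ALLOWED_DOMAIN", KHOJ_DOMAIN)

ALLOWED_HOSTS = [KHOJ_ALLOWED_DOMAIN, KHOJ_DOMAIN, "localhost", "127.0.0.1"]
# Without a correct external domain here, the admin panel hits CSRF trusted
# origin or unset cookie errors.
CSRF_TRUSTED_ORIGINS = [f"https://{KHOJ_DOMAIN}", f"http://{KHOJ_DOMAIN}"]
```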
Resolves #1114
This change adds the ability to use OpenAI, Azure OpenAI or any embedding model exposed behind an OpenAI-compatible API (like Ollama, LiteLLM, vLLM etc.).
Khoj previously only supported HuggingFace embedding models running locally on device or via the HuggingFace inference API endpoint. This change allows using commercial embedding models to index your content with Khoj.
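For illustration, a minimal sketch of calling an OpenAI-compatible embeddings endpoint; the base URL, key and model name are placeholders for whichever provider you configure.

```python
import os

from openai import OpenAI

client = OpenAI(
    # Local servers like Ollama accept any dummy key.
    api_key=os.getenv("OPENAI_API_KEY", "not-needed-locally"),
    base_url=os.getenv("OPENAI_API_BASE", "http://localhost:11434/v1"),  # e.g. Ollama
)

response = client.embeddings.create(
    model="nomic-embed-text",  # example embedding model served by the endpoint
    input=["Index this note with Khoj"],
)
print(len(response.data[0].embedding))
```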
This allows online search to work out of the box again
for self-hosting users, as no auth or API key setup is required.
Docker users do not need to change anything in their setup flow.
Direct installers can set up SearXNG locally or use public instances if
they do not want to use any of the other providers (like Jina or Serper).
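As an illustration, a minimal sketch of querying a SearXNG instance's JSON search API; the instance URL is a placeholder and the instance must have the json output format enabled.

```python
import requests

SEARXNG_URL = "http://localhost:8080"  # local instance or a public one

def online_search(query: str) -> list[str]:
    response = requests.get(
        f"{SEARXNG_URL}/search",
        params={"q": query, "format": "json"},
        timeout=10,
    )
    response.raise_for_status()
    return [result["url"] for result in response.json().get("results", [])]

print(online_search("self-hosting khoj"))
```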
Resolves #749. Resolves #990
- The previous name was incorrectly plural even though it defined only a single model
- Rename the chat model table field to `name`
- Update documentation
- Update referencing functions and variables to match the new name
* Rename OpenAIProcessorConversationConfig to the more apt AiModelAPI
The DB model name had drifted from what it is used for: a general
chat API provider config that supports providers like Anthropic and
Google chat models in addition to OpenAI-based chat models.
This change renames the DB model and updates the docs to remove this
confusion.
The AiModelAPI name covers most use-cases, including chat, STT, image generation etc.
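A hedged sketch of the Django migration these renames imply; the app label, migration dependency and old names (`ChatModelOptions`, `chat_model`) are assumptions, not confirmed by this change log.

```python
from django.db import migrations


class Migration(migrations.Migration):
    # App label and previous migration are placeholders.
    dependencies = [("database", "0001_initial")]

    operations = [
        # Assumed old name; the plural-looking model held a single chat model.
        migrations.RenameModel(old_name="ChatModelOptions", new_name="ChatModel"),
        # The field that stored the model identifier becomes simply `name`.
        migrations.RenameField(model_name="chatmodel", old_name="chat_model", new_name="name"),
        # The provider config also covers Anthropic and Google chat APIs, so the
        # OpenAI-specific name was misleading.
        migrations.RenameModel(old_name="OpenAIProcessorConversationConfig", new_name="AiModelAPI"),
    ]
```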
- Integrate with Ollama or other OpenAI-compatible APIs by simply
setting the `OPENAI_API_BASE` environment variable in docker-compose etc.
- Update docs on integrating with Ollama and other OpenAI proxies on first run
- Auto-populate all chat models supported by OpenAI-compatible APIs (sketched after this list)
- Auto-set vision enabled for all commercial models
- Minor
  - Add the HuggingFace cache to the khoj_models volume. This is where
    chat models and (now) sentence transformer models are stored by default
  - Reduce verbosity of the web app's yarn install. Otherwise it hits the
    docker log size limit and stops showing remaining logs after the web app install
  - Suggest `ollama pull <model_name>` to start it in the background
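A minimal sketch of the auto-population step, assuming the standard OpenAI Python client; the endpoint and key defaults are placeholders for Ollama or another OpenAI-compatible proxy.

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY", "ollama"),  # a dummy key works for Ollama
    base_url=os.getenv("OPENAI_API_BASE", "http://localhost:11434/v1"),
)

# List every chat model the OpenAI-compatible endpoint exposes, e.g. to
# populate Khoj's chat model table on first run.
for model in client.models.list():
    print(model.id)
```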
This was previously required, but now it is only useful for more
advanced settings, not typical for self-hosting users.
With recent updates, the user's selected chat model is used for both
Khoj's train of thought and response. This makes it easy to
switch your preferred chat model directly from the user settings
page and not have to update this in the admin panel as well.
Reflect these code changes in the docs by removing the unnecessary
step for self-hosted users to create a server chat setting when using
an OpenAI proxy service like Ollama, LiteLLM etc.