---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

## OpenAI

To use OpenAI's LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

### Function Calling

To enable [function calling](https://platform.openai.com/docs/guides/function-calling) in your application using embedchain and OpenAI, you need to pass the functions to the `OpenAILlm` class as an array. Here are several ways to achieve that:

```python
import os

from pydantic import BaseModel, Field, field_validator

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


class QA(BaseModel):
    """A question and answer pair."""

    question: str = Field(..., description="The question.", example="What is a mountain?")
    answer: str = Field(..., description="The answer.", example="A mountain is a hill.")
    person_who_is_asking: str = Field(
        ..., description="The person who is asking the question.", example="John"
    )

    @field_validator("question")
    def question_must_end_with_a_question_mark(cls, v):
        """Validate that the question ends with a question mark."""
        if not v.endswith("?"):
            raise ValueError("question must end with a question mark")
        return v

    @field_validator("answer")
    def answer_must_end_with_a_period(cls, v):
        """Validate that the answer ends with a period."""
        if not v.endswith("."):
            raise ValueError("answer must end with a period")
        return v


llm = OpenAILlm(config=None, functions=[QA])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```

```python
import os

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

json_schema = {
    "name": "get_qa",
    "description": "A question and answer pair and the user who is asking the question.",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {"type": "string", "description": "The question."},
            "answer": {"type": "string", "description": "The answer."},
            "person_who_is_asking": {
                "type": "string",
                "description": "The person who is asking the question.",
            },
        },
        "required": ["question", "answer", "person_who_is_asking"],
    },
}

llm = OpenAILlm(config=None, functions=[json_schema])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```
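The examples above pass `config=None`. If you also want to pin generation parameters while using function calling, you may be able to supply an explicit config object instead. A minimal sketch, assuming `OpenAILlm` accepts `BaseLlmConfig` from `embedchain.config` (not shown elsewhere on this page); the `get_city` schema is purely illustrative:

```python
import os

from embedchain import App
from embedchain.config import BaseLlmConfig
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

# Illustrative schema, in the same format as the json_schema example above.
json_schema = {
    "name": "get_city",
    "description": "Extract the city a user mentions.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "The city."}},
        "required": ["city"],
    },
}

# Assumption: OpenAILlm accepts embedchain's BaseLlmConfig in place of config=None.
config = BaseLlmConfig(temperature=0.2, max_tokens=500)
llm = OpenAILlm(config=config, functions=[json_schema])
app = App(llm=llm)

result = app.query("I am flying to Paris next week.")
print(result)
```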
```python
import os

import requests

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


def find_info_of_pokemon(pokemon: str):
    """Find the information of the given pokemon.

    Args:
        pokemon: The pokemon.
    """
    req = requests.get(f"https://pokeapi.co/api/v2/pokemon/{pokemon}")
    if req.status_code == 404:
        raise ValueError("pokemon not found")
    return req.json()


llm = OpenAILlm(config=None, functions=[find_info_of_pokemon])
app = App(llm=llm)

result = app.query("Tell me more about the pokemon pikachu.")
print(result)
```

## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

## Azure OpenAI

To use Azure OpenAI models, you have to set the Azure OpenAI related environment variables, as shown in the code block below:

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

## Anthropic

To use Anthropic's models, please set the `ANTHROPIC_API_KEY`, which you can find on their [Account Settings Page](https://console.anthropic.com/account/keys).

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```
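Once loaded, the app is used exactly like in the OpenAI example at the top of this page. A short sketch; the Wikipedia URL is just an illustrative source, and note that this config does not set an embedder, so the default embedder and its API key may also be required:

```python
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

# Illustrative source; any supported data type works here.
app.add("https://en.wikipedia.org/wiki/Anthropic")
print(app.query("Who founded Anthropic?"))
```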
## Cohere

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` as an environment variable; you can find it on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

## Together

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` as an environment variable; you can find it on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```
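Because the config above sets `stream: true`, `app.query` returns a generator rather than a string, just like in the Google AI example earlier on this page. A minimal consumption sketch (the source URL is illustrative):

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")
app.add("https://en.wikipedia.org/wiki/Llama")  # illustrative source

response = app.query("What is a llama?")
for chunk in response:  # with stream: true, response yields text chunks
    print(chunk, end="", flush=True)
```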
## vLLM

Set up vLLM by following the instructions given in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

## GPT4ALL

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet is required. You can use it with Embedchain using the following code:

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```
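Since both the LLM and the embedder run locally with this config, you can build a fully offline pipeline. A sketch, assuming your embedchain version supports adding raw text via `data_type="text"`:

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")

# No network access needed: raw text in, local embeddings, local generation.
app.add("Elephants are the largest living land animals.", data_type="text")
print(app.query("What are the largest land animals?"))
```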
## JinaChat

First, set `JINACHAT_API_KEY` as an environment variable; you can obtain the key from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

## Hugging Face

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set `HUGGINGFACE_ACCESS_TOKEN` as an environment variable; you can obtain the token from [their platform](https://huggingface.co/settings/tokens).

Once you have the token, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

### Custom Endpoints

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above. Then, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
```

If your endpoint requires additional parameters, you can pass them in the `model_kwargs` field:

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
    model_kwargs:
      max_new_tokens: 100
      temperature: 0.5
```

Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set `REPLICATE_API_TOKEN` as an environment variable; you can obtain the token from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

## Vertex AI

Set up Google Cloud Platform application default credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once the setup is done, use the following code to create an app using Vertex AI as the provider:

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```
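Whichever provider you pick, the settings resolved from `config.yaml` are available at runtime on `app.llm.config` (the Google AI example above already reads `app.llm.config.stream`). A quick sanity check might look like this; the `model` and `temperature` attribute names are assumed to mirror the yaml keys:

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")

# Assumed attributes, mirroring the yaml keys used throughout this page.
print(app.llm.config.model)
print(app.llm.config.temperature)
```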
## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` to authenticate the API with AWS. You can find these in your [AWS Console](https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1#/users).

### Usage

```python main.py
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

The model arguments are different for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
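As an illustration of provider-specific arguments, Anthropic models on Bedrock use different parameter names than the Titan example above (e.g. `max_tokens_to_sample`). The sketch below also assumes that `App.from_config` accepts an inline `config` dict in your embedchain version; treat both as assumptions to verify against the Bedrock documentation:

```python
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"

# Same structure as config.yaml, passed inline. The model id and the
# Anthropic-specific kwargs (max_tokens_to_sample) are illustrative
# assumptions; check the Bedrock provider docs for your model.
config = {
    "llm": {
        "provider": "aws_bedrock",
        "config": {
            "model": "anthropic.claude-v2",
            "model_kwargs": {
                "temperature": 0.5,
                "max_tokens_to_sample": 1000,
            },
        },
    }
}

app = App.from_config(config=config)
```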