---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

## OpenAI

To use OpenAI's LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

### Function Calling

To enable [function calling](https://platform.openai.com/docs/guides/function-calling) in your application using embedchain and OpenAI, you need to pass the functions to the `OpenAILlm` class as an array. Here are several ways to achieve that:

```python
import os

from pydantic import BaseModel, Field, field_validator

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


class QA(BaseModel):
    """A question and answer pair."""

    question: str = Field(..., description="The question.", example="What is a mountain?")
    answer: str = Field(..., description="The answer.", example="A mountain is a hill.")
    person_who_is_asking: str = Field(
        ..., description="The person who is asking the question.", example="John"
    )

    @field_validator("question")
    def question_must_end_with_a_question_mark(cls, v):
        """Validate that the question ends with a question mark."""
        if not v.endswith("?"):
            raise ValueError("question must end with a question mark")
        return v

    @field_validator("answer")
    def answer_must_end_with_a_period(cls, v):
        """Validate that the answer ends with a period."""
        if not v.endswith("."):
            raise ValueError("answer must end with a period")
        return v


llm = OpenAILlm(config=None, functions=[QA])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```

```python
import os

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

json_schema = {
    "name": "get_qa",
    "description": "A question and answer pair and the user who is asking the question.",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {"type": "string", "description": "The question."},
            "answer": {"type": "string", "description": "The answer."},
            "person_who_is_asking": {
                "type": "string",
                "description": "The person who is asking the question.",
            },
        },
        "required": ["question", "answer", "person_who_is_asking"],
    },
}

llm = OpenAILlm(config=None, functions=[json_schema])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```
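The examples above pass `config=None`. If you also want to pin generation parameters while using function calling, you may be able to supply an explicit config object instead. A minimal sketch, assuming `OpenAILlm` accepts `BaseLlmConfig` from `embedchain.config` (not shown elsewhere on this page); the `get_city` schema is purely illustrative:

```python
import os

from embedchain import App
from embedchain.config import BaseLlmConfig
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

# Illustrative schema, in the same format as the json_schema example above.
json_schema = {
    "name": "get_city",
    "description": "Extract the city a user mentions.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "The city."}},
        "required": ["city"],
    },
}

# Assumption: OpenAILlm accepts embedchain's BaseLlmConfig in place of config=None.
config = BaseLlmConfig(temperature=0.2, max_tokens=500)
llm = OpenAILlm(config=config, functions=[json_schema])
app = App(llm=llm)

result = app.query("I am flying to Paris next week.")
print(result)
```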
```python
import os

import requests

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


def find_info_of_pokemon(pokemon: str):
    """Find the information of the given pokemon.

    Args:
        pokemon: The pokemon.
    """
    req = requests.get(f"https://pokeapi.co/api/v2/pokemon/{pokemon}")
    if req.status_code == 404:
        raise ValueError("pokemon not found")
    return req.json()


llm = OpenAILlm(config=None, functions=[find_info_of_pokemon])
app = App(llm=llm)

result = app.query("Tell me more about the pokemon pikachu.")
print(result)
```

## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

## Azure OpenAI

To use Azure OpenAI models, you have to set the Azure OpenAI related environment variables, as shown in the code block below:

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

## Anthropic

To use Anthropic's models, please set the `ANTHROPIC_API_KEY`, which you can find on their [Account Settings Page](https://console.anthropic.com/account/keys).

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```
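Once loaded, the app is used exactly like in the OpenAI example at the top of this page. A short sketch; the Wikipedia URL is just an illustrative source, and note that this config does not set an embedder, so the default embedder and its API key may also be required:

```python
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

# Illustrative source; any supported data type works here.
app.add("https://en.wikipedia.org/wiki/Anthropic")
print(app.query("Who founded Anthropic?"))
```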
## Cohere

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` as an environment variable; you can find it on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

## Together

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` as an environment variable; you can find it on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```
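Because the config above sets `stream: true`, `app.query` returns a generator rather than a string, just like in the Google AI example earlier on this page. A minimal consumption sketch (the source URL is illustrative):

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")
app.add("https://en.wikipedia.org/wiki/Llama")  # illustrative source

response = app.query("What is a llama?")
for chunk in response:  # with stream: true, response yields text chunks
    print(chunk, end="", flush=True)
```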
## vLLM

Set up vLLM by following the instructions given in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

## GPT4ALL

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet is required. You can use it with Embedchain using the following code:

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```
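Since both the LLM and the embedder run locally with this config, you can build a fully offline pipeline. A sketch, assuming your embedchain version supports adding raw text via `data_type="text"`:

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")

# No network access needed: raw text in, local embeddings, local generation.
app.add("Elephants are the largest living land animals.", data_type="text")
print(app.query("What are the largest land animals?"))
```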
## JinaChat

First, set `JINACHAT_API_KEY` as an environment variable; you can obtain the key from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

## Hugging Face

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set `HUGGINGFACE_ACCESS_TOKEN` as an environment variable; you can obtain the token from [their platform](https://huggingface.co/settings/tokens).

Once you have the token, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

### Custom Endpoints

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above. Then, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
```

If your endpoint requires additional parameters, you can pass them in the `model_kwargs` field:

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
    model_kwargs:
      max_new_tokens: 100
      temperature: 0.5
```

Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set `REPLICATE_API_TOKEN` as an environment variable; you can obtain the token from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

## Vertex AI

Set up Google Cloud Platform application default credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once the setup is done, use the following code to create an app using Vertex AI as the provider:

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```
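Whichever provider you pick, the settings resolved from `config.yaml` are available at runtime on `app.llm.config` (the Google AI example above already reads `app.llm.config.stream`). A quick sanity check might look like this; the `model` and `temperature` attribute names are assumed to mirror the yaml keys:

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")

# Assumed attributes, mirroring the yaml keys used throughout this page.
print(app.llm.config.model)
print(app.llm.config.temperature)
```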
## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` to authenticate the API with AWS. You can find these in your [AWS Console](https://us-east-1.console.aws.amazon.com/iam/home?region=us-east-1#/users).

### Usage

```python main.py
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

The model arguments are different for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
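As an illustration of provider-specific arguments, Anthropic models on Bedrock use different parameter names than the Titan example above (e.g. `max_tokens_to_sample`). The sketch below also assumes that `App.from_config` accepts an inline `config` dict in your embedchain version; treat both as assumptions to verify against the Bedrock documentation:

```python
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"

# Same structure as config.yaml, passed inline. The model id and the
# Anthropic-specific kwargs (max_tokens_to_sample) are illustrative
# assumptions; check the Bedrock provider docs for your model.
config = {
    "llm": {
        "provider": "aws_bedrock",
        "config": {
            "model": "anthropic.claude-v2",
            "model_kwargs": {
                "temperature": 0.5,
                "max_tokens_to_sample": 1000,
            },
        },
    }
}

app = App.from_config(config=config)
```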