- ---
- title: 🤖 Large language models (LLMs)
- ---
- ## Overview
- Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.
- <CardGroup cols={4}>
- <Card title="OpenAI" href="#openai"></Card>
- <Card title="Google AI" href="#google-ai"></Card>
- <Card title="Azure OpenAI" href="#azure-openai"></Card>
- <Card title="Anthropic" href="#anthropic"></Card>
- <Card title="Cohere" href="#cohere"></Card>
- <Card title="Together" href="#together"></Card>
- <Card title="Ollama" href="#ollama"></Card>
- <Card title="vLLM" href="#vllm"></Card>
- <Card title="Clarifai" href="#clarifai"></Card>
- <Card title="GPT4All" href="#gpt4all"></Card>
- <Card title="JinaChat" href="#jinachat"></Card>
- <Card title="Hugging Face" href="#hugging-face"></Card>
- <Card title="Llama2" href="#llama2"></Card>
- <Card title="Vertex AI" href="#vertex-ai"></Card>
- <Card title="Mistral AI" href="#mistral-ai"></Card>
- <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
- <Card title="Groq" href="#groq"></Card>
- <Card title="NVIDIA AI" href="#nvidia-ai"></Card>
- </CardGroup>
- ## OpenAI
- To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).
- Once you have obtained the key, you can use it like this:
- ```python
- import os
- from embedchain import App
- os.environ['OPENAI_API_KEY'] = 'xxx'
- app = App()
- app.add("https://en.wikipedia.org/wiki/OpenAI")
- app.query("What is OpenAI?")
- ```
- If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ['OPENAI_API_KEY'] = 'xxx'
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: openai
-   config:
-     model: 'gpt-3.5-turbo'
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
-     stream: false
- ```
- </CodeGroup>
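The same settings can also be expressed as a Python dictionary and passed through the `config=` argument of `App.from_config`, as other sections of this page do. The sketch below only builds that dictionary; `build_openai_config` is a hypothetical helper introduced here for illustration, not part of Embedchain.

```python
def build_openai_config(model="gpt-3.5-turbo", temperature=0.5,
                        max_tokens=1000, top_p=1, stream=False):
    """Return a config dictionary mirroring the YAML file above.

    Hypothetical helper for illustration; the keys follow the YAML example.
    """
    return {
        "llm": {
            "provider": "openai",
            "config": {
                "model": model,
                "temperature": temperature,
                "max_tokens": max_tokens,
                "top_p": top_p,
                "stream": stream,
            },
        }
    }


# Override a single parameter and keep the remaining defaults.
config = build_openai_config(temperature=0.2)
print(config["llm"]["config"])

# To create the app (requires embedchain and OPENAI_API_KEY to be set):
# from embedchain import App
# app = App.from_config(config=config)
```

Building the dictionary in code can be convenient when parameters are chosen at runtime rather than kept in a file.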
- ### Function Calling
- Embedchain supports OpenAI [Function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [Langchain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).
- <Accordion title="Pydantic Model">
- ```python
- from pydantic import BaseModel, Field
- class multiply(BaseModel):
-     """Multiply two integers together."""
-     a: int = Field(..., description="First integer")
-     b: int = Field(..., description="Second integer")
- ```
- </Accordion>
- <Accordion title="Python function">
- ```python
- def multiply(a: int, b: int) -> int:
-     """Multiply two integers together.
-     Args:
-         a: First integer
-         b: Second integer
-     """
-     return a * b
- ```
- </Accordion>
- <Accordion title="OpenAI tool dictionary">
- ```python
- multiply = {
-     "type": "function",
-     "function": {
-         "name": "multiply",
-         "description": "Multiply two integers together.",
-         "parameters": {
-             "type": "object",
-             "properties": {
-                 "a": {
-                     "description": "First integer",
-                     "type": "integer"
-                 },
-                 "b": {
-                     "description": "Second integer",
-                     "type": "integer"
-                 }
-             },
-             "required": [
-                 "a",
-                 "b"
-             ]
-         }
-     }
- }
- ```
- </Accordion>
- With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.
- ```python
- import os
- from embedchain import App
- from embedchain.llm.openai import OpenAILlm
- os.environ["OPENAI_API_KEY"] = "sk-xxx"
- llm = OpenAILlm(tools=multiply)
- app = App(llm=llm)
- result = app.query("What is the result of 125 multiplied by fifteen?")
- ```
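To make the relationship between the three input forms more concrete, here is a minimal sketch of how a plain Python function could be mapped onto the tool-dictionary shape shown above. It relies only on the standard library; `function_to_tool` and the type mapping are hypothetical names for illustration, and this is not how Embedchain or LangChain performs the conversion internally. Per-parameter descriptions are omitted for brevity.

```python
import inspect

# Partial, illustrative mapping from Python annotations to JSON Schema types.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_tool(func):
    """Build an OpenAI-style tool dictionary from a function signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to "string" for annotations not in the mapping.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            # Use the first docstring line as the description.
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def multiply(a: int, b: int) -> int:
    """Multiply two integers together."""
    return a * b


tool = function_to_tool(multiply)
print(tool["function"]["parameters"])
```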
- ## Google AI
- To use a Google AI model, set the `GOOGLE_API_KEY` environment variable. You can obtain the API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["GOOGLE_API_KEY"] = "xxx"
- app = App.from_config(config_path="config.yaml")
- app.add("https://www.forbes.com/profile/elon-musk")
- response = app.query("What is the net worth of Elon Musk?")
- if app.llm.config.stream:  # if stream is enabled, response is a generator
-     for chunk in response:
-         print(chunk)
- else:
-     print(response)
- ```
- ```yaml config.yaml
- llm:
-   provider: google
-   config:
-     model: gemini-pro
-     max_tokens: 1000
-     temperature: 0.5
-     top_p: 1
-     stream: false
- embedder:
-   provider: google
-   config:
-     model: 'models/embedding-001'
-     task_type: "retrieval_document"
-     title: "Embeddings for Embedchain"
- ```
- </CodeGroup>
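Because the response is a plain string when `stream` is false and a generator of chunks when it is true, calling code often wants a single entry point for both cases. The following is a minimal sketch under that assumption; `collect_response` and `fake_stream` are hypothetical names used only for this illustration and do not call Embedchain.

```python
from typing import Iterable, Union


def collect_response(response: Union[str, Iterable[str]]) -> str:
    """Return the full answer text for streamed and non-streamed responses."""
    if isinstance(response, str):
        # Non-streaming: the answer is already complete.
        return response
    # Streaming: concatenate the chunks in the order they arrive.
    return "".join(response)


def fake_stream():
    """Stand-in for a streamed response; yields two chunks."""
    yield "Hello, "
    yield "world"


print(collect_response("plain answer"))
print(collect_response(fake_stream()))
```

Checking the type of the response avoids having to look up `app.llm.config.stream` at every call site.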
- ## Azure OpenAI
- To use an Azure OpenAI model, set the Azure OpenAI environment variables shown in the code block below:
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["OPENAI_API_TYPE"] = "azure"
- os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com/"
- os.environ["AZURE_OPENAI_KEY"] = "xxx"
- os.environ["OPENAI_API_VERSION"] = "xxx"
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: azure_openai
-   config:
-     model: gpt-3.5-turbo
-     deployment_name: your_llm_deployment_name
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
-     stream: false
- embedder:
-   provider: azure_openai
-   config:
-     model: text-embedding-ada-002
-     deployment_name: your_embedding_model_deployment_name
- ```
- </CodeGroup>
- You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).
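Since the Azure setup depends on several environment variables at once, it can help to check for missing ones before creating the app. This is a small standard-library sketch; `missing_env_vars` is a hypothetical helper, and the variable names are the ones used in the example above.

```python
import os

# Environment variables used by the Azure OpenAI example above.
AZURE_ENV_VARS = [
    "OPENAI_API_TYPE",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
    "OPENAI_API_VERSION",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {"OPENAI_API_TYPE": "azure", "AZURE_OPENAI_KEY": "xxx"}
missing = missing_env_vars(AZURE_ENV_VARS, example_env)
print(missing)
```

In a real script you would call `missing_env_vars(AZURE_ENV_VARS)` without the second argument and stop early if the returned list is not empty.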
- ## Anthropic
- To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable. You can find the key on their [Account Settings Page](https://console.anthropic.com/account/keys).
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["ANTHROPIC_API_KEY"] = "xxx"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: anthropic
-   config:
-     model: 'claude-instant-1'
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
-     stream: false
- ```
- </CodeGroup>
- ## Cohere
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[cohere]'
- ```
- Set the `COHERE_API_KEY` environment variable. You can find the key on their [Account settings page](https://dashboard.cohere.com/api-keys).
- Once you have the API key, you are all set to use it with Embedchain.
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["COHERE_API_KEY"] = "xxx"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: cohere
-   config:
-     model: large
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
- ```
- </CodeGroup>
- ## Together
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[together]'
- ```
- Set the `TOGETHER_API_KEY` environment variable. You can find the key on their [Account settings page](https://api.together.xyz/settings/api-keys).
- Once you have the API key, you are all set to use it with Embedchain.
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["TOGETHER_API_KEY"] = "xxx"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: together
-   config:
-     model: togethercomputer/RedPajama-INCITE-7B-Base
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
- ```
- </CodeGroup>
- ## Ollama
- Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.
- <CodeGroup>
- ```python main.py
- import os
- os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"
- from embedchain import App
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: ollama
-   config:
-     model: 'llama2'
-     temperature: 0.5
-     top_p: 1
-     stream: true
-     base_url: 'http://localhost:11434'
- embedder:
-   provider: ollama
-   config:
-     model: znbang/bge:small-en-v1.5-q8_0
-     base_url: http://localhost:11434
- ```
- </CodeGroup>
- ## vLLM
- Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: vllm
-   config:
-     model: 'meta-llama/Llama-2-70b-hf'
-     temperature: 0.5
-     top_p: 1
-     top_k: 10
-     stream: true
-     trust_remote_code: true
- ```
- </CodeGroup>
- ## Clarifai
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[clarifai]'
- ```
- Set the `CLARIFAI_PAT` environment variable. You can find your PAT on the [security page](https://clarifai.com/settings/security). Optionally, you can also pass the PAT as a parameter to the LLM/Embedder class.
- Now you are all set to explore Embedchain.
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["CLARIFAI_PAT"] = "XXX"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- #Now let's add some data.
- app.add("https://www.forbes.com/profile/elon-musk")
- #Query the app
- response = app.query("what college degrees does elon musk have?")
- ```
- Head to [Clarifai Platform](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22use_cases%22%2C%22value%22%3A%5B%22llm%22%5D%7D%5D) to browse various State-of-the-Art LLM models for your use case.
- For passing model inference parameters use `model_kwargs` argument in the config file. Also you can use `api_key` argument to pass `CLARIFAI_PAT` in the config.
- ```yaml config.yaml
- llm:
-   provider: clarifai
-   config:
-     model: "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct"
-     model_kwargs:
-       temperature: 0.5
-       max_tokens: 1000
- embedder:
-   provider: clarifai
-   config:
-     model: "https://clarifai.com/clarifai/main/models/BAAI-bge-base-en-v15"
- ```
- </CodeGroup>
- ## GPT4All
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[opensource]'
- ```
- GPT4All is a free-to-use, locally running, privacy-aware chatbot. It requires neither a GPU nor an internet connection. You can use it with Embedchain using the following code:
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: gpt4all
-   config:
-     model: 'orca-mini-3b-gguf2-q4_0.gguf'
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
-     stream: false
- embedder:
-   provider: gpt4all
- ```
- </CodeGroup>
- ## JinaChat
- First, set the `JINACHAT_API_KEY` environment variable. You can obtain the key from [their platform](https://chat.jina.ai/api).
- Once you have the key, load the app using the config yaml file:
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["JINACHAT_API_KEY"] = "xxx"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: jina
-   config:
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
-     stream: false
- ```
- </CodeGroup>
- ## Hugging Face
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[huggingface-hub]'
- ```
- First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable. You can obtain the token from [their platform](https://huggingface.co/settings/tokens).
- You can load LLMs from Hugging Face in three ways:
- - [Hugging Face Hub](#hugging-face-hub)
- - [Hugging Face Local Pipelines](#hugging-face-local-pipelines)
- - [Hugging Face Inference Endpoint](#hugging-face-inference-endpoint)
- ### Hugging Face Hub
- To load the model from Hugging Face Hub, use the following code:
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"
- config = {
-     "app": {"config": {"id": "my-app"}},
-     "llm": {
-         "provider": "huggingface",
-         "config": {
-             "model": "bigscience/bloom-1b7",
-             "top_p": 0.5,
-             "max_length": 200,
-             "temperature": 0.1,
-         },
-     },
- }
- app = App.from_config(config=config)
- ```
- </CodeGroup>
- ### Hugging Face Local Pipelines
- To run a locally downloaded Hugging Face model, use the following code:
- <CodeGroup>
- ```python main.py
- from embedchain import App
- config = {
-     "app": {"config": {"id": "my-app"}},
-     "llm": {
-         "provider": "huggingface",
-         "config": {
-             "model": "Trendyol/Trendyol-LLM-7b-chat-v0.1",
-             "local": True,  # Necessary if you want to run model locally
-             "top_p": 0.5,
-             "max_tokens": 1000,
-             "temperature": 0.1,
-         },
-     }
- }
- app = App.from_config(config=config)
- ```
- </CodeGroup>
- ### Hugging Face Inference Endpoint
- You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.
- Then, load the app using a config dictionary:
- <CodeGroup>
- ```python main.py
- from embedchain import App
- config = {
-     "app": {"config": {"id": "my-app"}},
-     "llm": {
-         "provider": "huggingface",
-         "config": {
-             "endpoint": "https://api-inference.huggingface.co/models/gpt2",
-             "model_params": {"temperature": 0.1, "max_new_tokens": 100}
-         },
-     },
- }
- app = App.from_config(config=config)
- ```
- </CodeGroup>
- Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].
- See langchain's [hugging face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) for more information.
- ## Llama2
- Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable. You can obtain the token from [their platform](https://replicate.com/account/api-tokens).
- Once you have the token, load the app using the config yaml file:
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["REPLICATE_API_TOKEN"] = "xxx"
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: llama2
-   config:
-     model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 0.5
-     stream: false
- ```
- </CodeGroup>
- ## Vertex AI
- Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once setup is done, use the following code to create an app using Vertex AI as the provider:
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load llm configuration from config.yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: vertexai
-   config:
-     model: 'chat-bison'
-     temperature: 0.5
-     top_p: 0.5
- ```
- </CodeGroup>
- ## Mistral AI
- Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).
- <CodeGroup>
-
- ```python main.py
- import os
- from embedchain import App
- os.environ["MISTRAL_API_KEY"] = "xxx"
- app = App.from_config(config_path="config.yaml")
- app.add("https://www.forbes.com/profile/elon-musk")
- response = app.query("what is the net worth of Elon Musk?")
- # As of January 16, 2024, Elon Musk's net worth is $225.4 billion.
- response = app.chat("which companies does elon own?")
- # Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.
- response = app.chat("what question did I ask you already?")
- # You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
- ```
-
- ```yaml config.yaml
- llm:
-   provider: mistralai
-   config:
-     model: mistral-tiny
-     temperature: 0.5
-     max_tokens: 1000
-     top_p: 1
- embedder:
-   provider: mistralai
-   config:
-     model: mistral-embed
- ```
- </CodeGroup>
- ## AWS Bedrock
- ### Setup
- - Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- - You will also need to authenticate the `boto3` client using one of the methods in the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- - You can optionally export an `AWS_REGION`.
- ### Usage
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
- os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
- os.environ["AWS_REGION"] = "us-west-2"
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- llm:
-   provider: aws_bedrock
-   config:
-     model: amazon.titan-text-express-v1
-     # check notes below for model_kwargs
-     model_kwargs:
-       temperature: 0.5
-       topP: 1
-       maxTokenCount: 1000
- ```
- </CodeGroup>
- <br />
- <Note>
- The model arguments differ for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
- </Note>
- <br />
- ## Groq
- [Groq](https://groq.com/) is the creator of the world's first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.
- ### Usage
- To use LLMs from Groq, go to their [platform](https://console.groq.com/keys) and get an API key.
- Set the API key as the `GROQ_API_KEY` environment variable, or pass it in your app configuration as shown in the example below.
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- # Set your API key here or pass as the environment variable
- groq_api_key = "gsk_xxxx"
- config = {
-     "llm": {
-         "provider": "groq",
-         "config": {
-             "model": "mixtral-8x7b-32768",
-             "api_key": groq_api_key,
-             "stream": True
-         }
-     }
- }
- app = App.from_config(config=config)
- # Add your data source here
- app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
- app.query("Write a poem about Embedchain")
- # In the realm of data, vast and wide,
- # Embedchain stands with knowledge as its guide.
- # A platform open, for all to try,
- # Building bots that can truly fly.
- # With REST API, data in reach,
- # Deployment a breeze, as easy as a speech.
- # Updating data sources, anytime, anyday,
- # Embedchain's power, never sway.
- # A knowledge base, an assistant so grand,
- # Connecting to platforms, near and far.
- # Discord, WhatsApp, Slack, and more,
- # Embedchain's potential, never a bore.
- ```
- </CodeGroup>
- ## NVIDIA AI
- [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) let you quickly use NVIDIA's AI models, such as Mixtral 8x7B and Llama 2, through NVIDIA's API. These models are available in the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), fully optimized and ready to use on NVIDIA's AI platform. They are designed for high speed and easy customization, ensuring smooth performance on any accelerated setup.
- ### Usage
- In order to use LLMs from NVIDIA AI, create an account on [NVIDIA NGC Service](https://catalog.ngc.nvidia.com/).
- Generate an API key from their dashboard and set it as the `NVIDIA_API_KEY` environment variable. Note that the `NVIDIA_API_KEY` will start with `nvapi-`.
- Below is an example of how to use an LLM and an embedding model from NVIDIA AI:
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ['NVIDIA_API_KEY'] = 'nvapi-xxxx'
- config = {
-     "app": {
-         "config": {
-             "id": "my-app",
-         },
-     },
-     "llm": {
-         "provider": "nvidia",
-         "config": {
-             "model": "nemotron_steerlm_8b",
-         },
-     },
-     "embedder": {
-         "provider": "nvidia",
-         "config": {
-             "model": "nvolveqa_40k",
-             "vector_dimension": 1024,
-         },
-     },
- }
- app = App.from_config(config=config)
- app.add("https://www.forbes.com/profile/elon-musk")
- answer = app.query("What is the net worth of Elon Musk today?")
- # Answer: The net worth of Elon Musk is subject to fluctuations based on the market value of his holdings in various companies.
- # As of March 1, 2024, his net worth is estimated to be approximately $210 billion. However, this figure can change rapidly due to stock market fluctuations and other factors.
- # Additionally, his net worth may include other assets such as real estate and art, which are not reflected in his stock portfolio.
- ```
- </CodeGroup>
- ## Token Usage
- You can get the cost of a query by setting `token_usage` to `true` in the config file. The response then includes the token details: `prompt_tokens`, `completion_tokens`, `total_tokens`, `total_cost`, and `cost_currency`.
- The paid LLMs that support token usage are:
- - OpenAI
- - Vertex AI
- - Anthropic
- - Cohere
- - Together
- - Groq
- - Mistral AI
- - NVIDIA AI
- Here is an example of how to use token usage:
- <CodeGroup>
-
- ```python main.py
- import os
- from embedchain import App
- os.environ["OPENAI_API_KEY"] = "xxx"
- app = App.from_config(config_path="config.yaml")
- app.add("https://www.forbes.com/profile/elon-musk")
- response = app.query("what is the net worth of Elon Musk?")
- # {'answer': 'Elon Musk's net worth is $209.9 billion as of 6/9/24.',
- # 'usage': {'prompt_tokens': 1228,
- # 'completion_tokens': 21,
- # 'total_tokens': 1249,
- # 'total_cost': 0.001884,
- # 'cost_currency': 'USD'}
- # }
- response = app.chat("Which companies did Elon Musk found?")
- # {'answer': 'Elon Musk founded six companies, including Tesla, which is an electric car maker, SpaceX, a rocket producer, and the Boring Company, a tunneling startup.',
- # 'usage': {'prompt_tokens': 1616,
- # 'completion_tokens': 34,
- # 'total_tokens': 1650,
- # 'total_cost': 0.002492,
- # 'cost_currency': 'USD'}
- # }
- ```
-
- ```yaml config.yaml
- llm:
-   provider: openai
-   config:
-     model: gpt-3.5-turbo
-     temperature: 0.5
-     max_tokens: 1000
-     token_usage: true
- ```
- </CodeGroup>
- If a model is missing and you'd like to add it to `model_prices_and_context_window.json`, please feel free to open a PR.
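When several queries are issued in one session, the per-response usage details can be added up to get a running total. The sketch below assumes each response is a dictionary with the `answer` and `usage` keys shown in the example output above; `summarize_usage` is a hypothetical helper, and the sample data simply reuses the numbers from that output.

```python
def summarize_usage(responses):
    """Sum token counts and cost across responses that carry a 'usage' dict."""
    totals = {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0,
        "total_cost": 0.0,
    }
    for response in responses:
        usage = response.get("usage", {})
        for key in totals:
            totals[key] += usage.get(key, 0)
    # Round the cost to avoid floating-point noise in the sum.
    totals["total_cost"] = round(totals["total_cost"], 6)
    return totals


# Sample data copied from the example output above (answers shortened).
responses = [
    {"answer": "...", "usage": {"prompt_tokens": 1228, "completion_tokens": 21,
                                "total_tokens": 1249, "total_cost": 0.001884,
                                "cost_currency": "USD"}},
    {"answer": "...", "usage": {"prompt_tokens": 1616, "completion_tokens": 34,
                                "total_tokens": 1650, "total_cost": 0.002492,
                                "cost_currency": "USD"}},
]

summary = summarize_usage(responses)
print(summary)
```

The helper ignores `cost_currency` and assumes all responses are billed in the same currency.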
- <br />
- <Snippet file="missing-llm-tip.mdx" />