---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="Clarifai" href="#clarifai"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
  <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
  <Card title="Groq" href="#groq"></Card>
  <Card title="NVIDIA AI" href="#nvidia-ai"></Card>
</CardGroup>

## OpenAI

To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

### Function Calling

Embedchain supports OpenAI [function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [LangChain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).

<Accordion title="Pydantic Model">
```python
from pydantic import BaseModel, Field

class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
```
</Accordion>

<Accordion title="Python function">
```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b
```
</Accordion>

<Accordion title="OpenAI tool dictionary">
```python
multiply = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers together.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "description": "First integer",
                    "type": "integer"
                },
                "b": {
                    "description": "Second integer",
                    "type": "integer"
                }
            },
            "required": ["a", "b"]
        }
    }
}
```
</Accordion>

With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.

```python
import os

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

llm = OpenAILlm(tools=multiply)
app = App(llm=llm)

result = app.query("What is the result of 125 multiplied by fifteen?")
```

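The query provides the arguments the model extracted for the function rather than running it. If you then want to execute the function yourself, here is a minimal, version-dependent sketch: it assumes `result` comes back as a dict like `{"a": 125, "b": 15}`, which may differ across embedchain versions, so inspect the return value first.

```python
# Hypothetical follow-up: run the Python `multiply` function defined above with
# the arguments the LLM extracted. The exact shape of `result` depends on your
# embedchain version, so verify it before relying on this.
if isinstance(result, dict) and {"a", "b"} <= set(result):
    print(multiply(a=result["a"], b=result["b"]))  # expected: 1875
else:
    print(result)  # otherwise, inspect the raw return value
```
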
## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from the [Google Maker Suite](https://makersuite.google.com/app/apikey).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

</CodeGroup>

## Azure OpenAI

To use the Azure OpenAI model, set the Azure OpenAI-related environment variables as shown in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com/"
os.environ["AZURE_OPENAI_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY`, which you can find on their [Account Settings Page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Cohere

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` environment variable, which you can find on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Together

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` environment variable, which you can find on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.

<CodeGroup>

```python main.py
import os
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
    base_url: 'http://localhost:11434'

embedder:
  provider: ollama
  config:
    model: znbang/bge:small-en-v1.5-q8_0
    base_url: http://localhost:11434
```

</CodeGroup>

## vLLM

Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>

## Clarifai

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[clarifai]'
```

Set the `CLARIFAI_PAT` environment variable, which you can find on the [security page](https://clarifai.com/settings/security). Optionally, you can also pass the PAT key as a parameter to the LLM/Embedder class.

Now you are all set to explore Embedchain.

Head to the [Clarifai Platform](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22use_cases%22%2C%22value%22%3A%5B%22llm%22%5D%7D%5D) to browse various state-of-the-art LLM models for your use case.

To pass model inference parameters, use the `model_kwargs` argument in the config file. You can also use the `api_key` argument to pass the `CLARIFAI_PAT` in the config, as shown in the sketch after this code group.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["CLARIFAI_PAT"] = "XXX"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")

# Now let's add some data.
app.add("https://www.forbes.com/profile/elon-musk")

# Query the app
response = app.query("what college degrees does elon musk have?")
```

```yaml config.yaml
llm:
  provider: clarifai
  config:
    model: "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct"
    model_kwargs:
      temperature: 0.5
      max_tokens: 1000

embedder:
  provider: clarifai
  config:
    model: "https://clarifai.com/clarifai/main/models/BAAI-bge-base-en-v15"
```

</CodeGroup>

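As mentioned above, the PAT can also be passed directly through the config instead of the environment. Here is a minimal sketch using the `api_key` argument; `xxx` is a placeholder, not a real key.

```python
from embedchain import App

# A minimal sketch: passing the Clarifai PAT via the `api_key` config argument
# instead of the CLARIFAI_PAT environment variable. "xxx" is a placeholder.
config = {
    "llm": {
        "provider": "clarifai",
        "config": {
            "model": "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct",
            "api_key": "xxx",
            "model_kwargs": {"temperature": 0.5, "max_tokens": 1000},
        },
    },
    "embedder": {
        "provider": "clarifai",
        "config": {
            "model": "https://clarifai.com/clarifai/main/models/BAAI-bge-base-en-v15",
            "api_key": "xxx",
        },
    },
}

app = App.from_config(config=config)
```
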
## GPT4All

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet required. You can use it with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>

## JinaChat

First, set the `JINACHAT_API_KEY` environment variable, which you can obtain from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Hugging Face

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable, which you can obtain from [their platform](https://huggingface.co/settings/tokens).

You can load LLMs from Hugging Face in three ways:

- [Hugging Face Hub](#hugging-face-hub)
- [Hugging Face Local Pipelines](#hugging-face-local-pipelines)
- [Hugging Face Inference Endpoint](#hugging-face-inference-endpoint)

### Hugging Face Hub

To load the model from Hugging Face Hub, use the following code:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "bigscience/bloom-1b7",
            "top_p": 0.5,
            "max_length": 200,
            "temperature": 0.1,
        },
    },
}

app = App.from_config(config=config)
```

</CodeGroup>

### Hugging Face Local Pipelines

If you want to load a locally downloaded model from Hugging Face, you can do so using the code below:

<CodeGroup>

```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "Trendyol/Trendyol-LLM-7b-chat-v0.1",
            "local": True,  # Necessary if you want to run the model locally
            "top_p": 0.5,
            "max_tokens": 1000,
            "temperature": 0.1,
        },
    },
}

app = App.from_config(config=config)
```

</CodeGroup>

### Hugging Face Inference Endpoint

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using the config yaml file:

<CodeGroup>

```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "endpoint": "https://api-inference.huggingface.co/models/gpt2",
            "model_params": {"temperature": 0.1, "max_new_tokens": 100},
        },
    },
}

app = App.from_config(config=config)
```

</CodeGroup>

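The config above targets the hosted Inference API URL for `gpt2`. For a dedicated Inference Endpoint, you would instead point `endpoint` at the URL from your endpoint's dashboard; the URL below is a hypothetical placeholder:

```python
from embedchain import App

# Hypothetical: a dedicated Inference Endpoint URL copied from your endpoint
# dashboard, in place of the hosted Inference API URL used above.
config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "endpoint": "https://xxxx.us-east-1.aws.endpoints.huggingface.cloud",
            "model_params": {"temperature": 0.1, "max_new_tokens": 100},
        },
    },
}

app = App.from_config(config=config)
```
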
Currently, only `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable, which you can obtain from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>

## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once the setup is done, use the following code to create an app with Vertex AI as the provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

</CodeGroup>

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```

</CodeGroup>

## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need to authenticate the `boto3` client using a method in the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- You can optionally export an `AWS_REGION`.

### Usage

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ["AWS_REGION"] = "us-west-2"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check the note below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

</CodeGroup>

<br />

<Note>
The model arguments are different for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
</Note>

<br />

## Groq

[Groq](https://groq.com/) is the creator of the world's first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.

### Usage

In order to use LLMs from Groq, go to their [platform](https://console.groq.com/keys) and get the API key.

Set the API key as the `GROQ_API_KEY` environment variable or pass it in your app configuration, as shown in the example below.

<CodeGroup>

```python main.py
import os
from embedchain import App

# Set your API key here or pass as the environment variable
groq_api_key = "gsk_xxxx"

config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",
            "api_key": groq_api_key,
            "stream": True
        }
    }
}

app = App.from_config(config=config)

# Add your data source here
app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
app.query("Write a poem about Embedchain")

# In the realm of data, vast and wide,
# Embedchain stands with knowledge as its guide.
# A platform open, for all to try,
# Building bots that can truly fly.
# With REST API, data in reach,
# Deployment a breeze, as easy as a speech.
# Updating data sources, anytime, anyday,
# Embedchain's power, never sway.
# A knowledge base, an assistant so grand,
# Connecting to platforms, near and far.
# Discord, WhatsApp, Slack, and more,
# Embedchain's potential, never a bore.
```

</CodeGroup>

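For the environment-variable route mentioned above, here is a minimal sketch. It assumes the Groq provider falls back to `GROQ_API_KEY` when `api_key` is omitted from the config; `gsk_xxxx` is a placeholder.

```python
import os
from embedchain import App

# Placeholder key; with GROQ_API_KEY set, `api_key` can be omitted from the config.
os.environ["GROQ_API_KEY"] = "gsk_xxxx"

config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",
        }
    }
}

app = App.from_config(config=config)
```
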
## NVIDIA AI

[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) let you quickly use NVIDIA's AI models, such as Mixtral 8x7B and Llama 2, through an API. These models are available in the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), fully optimized and ready to use on NVIDIA's AI platform. They are designed for high speed and easy customization, ensuring smooth performance on any accelerated setup.

### Usage

In order to use LLMs from NVIDIA AI, create an account on the [NVIDIA NGC Service](https://catalog.ngc.nvidia.com/).

Generate an API key from their dashboard and set it as the `NVIDIA_API_KEY` environment variable. Note that the `NVIDIA_API_KEY` will start with `nvapi-`.

Below is an example of how to use an LLM and an embedding model from NVIDIA AI:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['NVIDIA_API_KEY'] = 'nvapi-xxxx'

config = {
    "app": {
        "config": {
            "id": "my-app",
        },
    },
    "llm": {
        "provider": "nvidia",
        "config": {
            "model": "nemotron_steerlm_8b",
        },
    },
    "embedder": {
        "provider": "nvidia",
        "config": {
            "model": "nvolveqa_40k",
            "vector_dimension": 1024,
        },
    },
}

app = App.from_config(config=config)

app.add("https://www.forbes.com/profile/elon-musk")
answer = app.query("What is the net worth of Elon Musk today?")
# Answer: The net worth of Elon Musk is subject to fluctuations based on the market value of his holdings in various companies.
# As of March 1, 2024, his net worth is estimated to be approximately $210 billion. However, this figure can change rapidly due to stock market fluctuations and other factors.
# Additionally, his net worth may include other assets such as real estate and art, which are not reflected in his stock portfolio.
```

</CodeGroup>

## Token Usage

You can get the cost of a query by setting `token_usage` to `True` in the config file. This will return the token details: `prompt_tokens`, `completion_tokens`, `total_tokens`, `total_cost`, and `cost_currency`.

The paid LLMs that support token usage are:

- OpenAI
- Vertex AI
- Anthropic
- Cohere
- Together
- Groq
- Mistral AI
- NVIDIA AI

Here is an example of how to use token usage:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# {'answer': "Elon Musk's net worth is $209.9 billion as of 6/9/24.",
#  'usage': {'prompt_tokens': 1228,
#            'completion_tokens': 21,
#            'total_tokens': 1249,
#            'total_cost': 0.001884,
#            'cost_currency': 'USD'}
# }

response = app.chat("Which companies did Elon Musk found?")
# {'answer': 'Elon Musk founded six companies, including Tesla, which is an electric car maker, SpaceX, a rocket producer, and the Boring Company, a tunneling startup.',
#  'usage': {'prompt_tokens': 1616,
#            'completion_tokens': 34,
#            'total_tokens': 1650,
#            'total_cost': 0.002492,
#            'cost_currency': 'USD'}
# }
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: gpt-3.5-turbo
    temperature: 0.5
    max_tokens: 1000
    token_usage: true
```

</CodeGroup>

If a model is missing and you'd like to add it to `model_prices_and_context_window.json`, please feel free to open a PR.

<br />

<Snippet file="missing-llm-tip.mdx" />