---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
  <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
  <Card title="Groq" href="#groq"></Card>
</CardGroup>

## OpenAI

To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

To configure the LLM's parameters, load the app from a [YAML config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

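
The same settings can also be expressed as a Python dictionary, which is the form the Hugging Face examples on this page pass to `App.from_config(config=...)`. The sketch below is a minimal illustration, assuming the dictionary accepts the same keys as the YAML file. `build_openai_llm_config` and `load_app` are hypothetical helper names, not part of Embedchain.

```python
def build_openai_llm_config(model="gpt-3.5-turbo", temperature=0.5,
                            max_tokens=1000, top_p=1, stream=False):
    """Return a config dict mirroring the config.yaml shown above."""
    return {
        "llm": {
            "provider": "openai",
            "config": {
                "model": model,
                "temperature": temperature,
                "max_tokens": max_tokens,
                "top_p": top_p,
                "stream": stream,
            },
        }
    }


def load_app(config):
    """Create an app from the dict (hypothetical wrapper, needs embedchain)."""
    # Imported lazily so the config can be built and inspected
    # without Embedchain installed.
    from embedchain import App
    return App.from_config(config=config)


config = build_openai_llm_config(temperature=0.2)
print(config["llm"]["config"])
```

Building the dictionary in code makes it easy to vary one setting per run, for example the temperature, without editing the YAML file.
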
### Function Calling

Embedchain supports OpenAI [Function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [Langchain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).

<Accordion title="Pydantic Model">

```python
from pydantic import BaseModel, Field

class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
```

</Accordion>

<Accordion title="Python function">

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b
```

</Accordion>

<Accordion title="OpenAI tool dictionary">

```python
multiply = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers together.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "description": "First integer",
                    "type": "integer"
                },
                "b": {
                    "description": "Second integer",
                    "type": "integer"
                }
            },
            "required": [
                "a",
                "b"
            ]
        }
    }
}
```

</Accordion>

With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.

```python
import os
from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

llm = OpenAILlm(tools=multiply)
app = App(llm=llm)

result = app.query("What is the result of 125 multiplied by fifteen?")
```

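
The three input forms above describe the same function. As a rough illustration of how they relate, the sketch below derives an OpenAI-style tool dictionary from a plain Python function using only the standard library. `function_to_tool` and its type mapping are hypothetical; Embedchain and Langchain perform their own conversion, which may differ in detail.

```python
import inspect

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_tool(func):
    """Build an OpenAI-style tool dictionary from a function signature."""
    doc = inspect.getdoc(func) or ""
    description = doc.splitlines()[0] if doc else ""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def multiply(a: int, b: int) -> int:
    """Multiply two integers together."""
    return a * b


tool = function_to_tool(multiply)
print(tool["function"]["parameters"])
```
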
## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream: # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

</CodeGroup>

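
As the example above notes, `app.query` returns a generator of chunks when `stream` is enabled and a plain string otherwise. A small helper can hide that difference from calling code. This is a sketch only: `collect_response` is a hypothetical name, and it assumes each streamed chunk is a string.

```python
def collect_response(response, on_chunk=None):
    """Return the full answer text for a streamed or non-streamed response."""
    if isinstance(response, str):
        return response
    parts = []
    for chunk in response:
        if on_chunk is not None:
            on_chunk(chunk)  # e.g. print each chunk as it arrives
        parts.append(chunk)
    return "".join(parts)


# Stand-ins for app.query(...) results, so the sketch runs without an API key.
print(collect_response("Non-streamed answer"))
print(collect_response(iter(["Streamed ", "answer"])))
```
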
## Azure OpenAI

To use Azure OpenAI models, set the Azure OpenAI environment variables shown in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

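
Because the Azure setup relies on four environment variables, a missing one is easy to overlook. The following sketch checks them before the app is created. `missing_env_vars` is a hypothetical helper, and the variable list is taken from the code block above.

```python
import os

AZURE_OPENAI_ENV_VARS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(AZURE_OPENAI_ENV_VARS)
if missing:
    print("Set these variables before creating the app:", ", ".join(missing))
```
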
## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable. You can find the key on their [Account Settings Page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Cohere

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` environment variable. You can find the key on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Together

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` environment variable. You can find the key on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama

<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```

</CodeGroup>

## vLLM

Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>

## GPT4All

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot that requires neither a GPU nor an internet connection. You can use it with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>

## JinaChat

First, set the `JINACHAT_API_KEY` environment variable. You can obtain the key from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config YAML file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Hugging Face

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable. You can obtain the token from [their platform](https://huggingface.co/settings/tokens).

You can load LLMs from Hugging Face in three ways:

- [Hugging Face Hub](#hugging-face-hub)
- [Hugging Face Local Pipelines](#hugging-face-local-pipelines)
- [Hugging Face Inference Endpoint](#hugging-face-inference-endpoint)

### Hugging Face Hub

To load the model from Hugging Face Hub, use the following code:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "bigscience/bloom-1b7",
            "top_p": 0.5,
            "max_length": 200,
            "temperature": 0.1,
        },
    },
}

app = App.from_config(config=config)
```

</CodeGroup>

### Hugging Face Local Pipelines

To run a model that you have downloaded locally from Hugging Face, use the following code:

<CodeGroup>

```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "Trendyol/Trendyol-LLM-7b-chat-v0.1",
            "local": True,  # Necessary if you want to run model locally
            "top_p": 0.5,
            "max_tokens": 1000,
            "temperature": 0.1,
        },
    }
}
app = App.from_config(config=config)
```

</CodeGroup>

### Hugging Face Inference Endpoint

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using a config dictionary:

<CodeGroup>

```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "endpoint": "https://api-inference.huggingface.co/models/gpt2",
            "model_params": {"temperature": 0.1, "max_new_tokens": 100}
        },
    },
}
app = App.from_config(config=config)
```

</CodeGroup>

Only the `text-generation` and `text2text-generation` tasks are currently supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See Langchain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

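
The three loading modes differ only in a few keys of the `llm` config. The sketch below gathers them in one hypothetical helper, `huggingface_llm_config`, using only the keys that appear in the examples above. It illustrates the pattern and is not an Embedchain API.

```python
def huggingface_llm_config(model=None, endpoint=None, local=False, **params):
    """Build an `llm` config for the Hub, a local pipeline, or an endpoint."""
    if (model is None) == (endpoint is None):
        raise ValueError("pass exactly one of `model` or `endpoint`")
    if endpoint is not None:
        if local:
            raise ValueError("`local` only applies when `model` is given")
        # Inference Endpoint: generation settings go under `model_params`.
        config = {"endpoint": endpoint, "model_params": dict(params)}
    else:
        # Hub or local pipeline: settings sit next to the model name.
        config = {"model": model, **params}
        if local:
            config["local"] = True
    return {"llm": {"provider": "huggingface", "config": config}}


hub_config = huggingface_llm_config(
    model="bigscience/bloom-1b7", top_p=0.5, temperature=0.1
)
local_config = huggingface_llm_config(
    model="Trendyol/Trendyol-LLM-7b-chat-v0.1", local=True
)
endpoint_config = huggingface_llm_config(
    endpoint="https://api-inference.huggingface.co/models/gpt2",
    max_new_tokens=100,
)
print(hub_config, local_config, endpoint_config, sep="\n")
```
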
## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable, which you can obtain from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config YAML file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>

## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once setup is done, use the following code to create an app using Vertex AI as the provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

</CodeGroup>

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
embedder:
  provider: mistralai
  config:
    model: mistral-embed
```

</CodeGroup>

## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need to authenticate the `boto3` client using one of the methods in the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- You can optionally export an `AWS_REGION`.

### Usage

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ["AWS_REGION"] = "us-west-2"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

</CodeGroup>

<br />

<Note>
  The model arguments differ for each model provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
</Note>

<br />

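
To make the note above concrete, the sketch below maps a common set of settings onto provider-specific `model_kwargs`. The Titan key names come from the config above. The Anthropic key names (`top_p`, `max_tokens_to_sample`) are an assumption based on Anthropic's text-completion parameters on Bedrock and should be checked against the AWS documentation for your model. `bedrock_model_kwargs` is a hypothetical helper.

```python
# Per-provider names for the same three settings.
# "amazon" keys match the config.yaml above; "anthropic" keys are an assumption.
KWARG_NAMES = {
    "amazon": {
        "temperature": "temperature",
        "top_p": "topP",
        "max_tokens": "maxTokenCount",
    },
    "anthropic": {
        "temperature": "temperature",
        "top_p": "top_p",
        "max_tokens": "max_tokens_to_sample",
    },
}


def bedrock_model_kwargs(model_id, temperature=0.5, top_p=1, max_tokens=1000):
    """Translate generic settings into model_kwargs for a Bedrock model ID."""
    # The provider is the part before the first dot,
    # e.g. "amazon" in "amazon.titan-text-express-v1".
    provider = model_id.split(".", 1)[0]
    try:
        names = KWARG_NAMES[provider]
    except KeyError:
        raise ValueError(f"no kwarg mapping known for provider {provider!r}") from None
    values = {"temperature": temperature, "top_p": top_p, "max_tokens": max_tokens}
    return {names[key]: value for key, value in values.items()}


print(bedrock_model_kwargs("amazon.titan-text-express-v1"))
```
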
## Groq

[Groq](https://groq.com/) is the creator of the world's first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.

### Usage

To use LLMs from Groq, go to their [platform](https://console.groq.com/keys) and get the API key.

Set the API key as the `GROQ_API_KEY` environment variable, or pass it in your app configuration as shown in the example below.

<CodeGroup>

```python main.py
import os
from embedchain import App

# Set your API key here or pass as the environment variable
groq_api_key = "gsk_xxxx"

config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",
            "api_key": groq_api_key,
            "stream": True
        }
    }
}

app = App.from_config(config=config)
# Add your data source here
app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
app.query("Write a poem about Embedchain")

# In the realm of data, vast and wide,
# Embedchain stands with knowledge as its guide.
# A platform open, for all to try,
# Building bots that can truly fly.

# With REST API, data in reach,
# Deployment a breeze, as easy as a speech.
# Updating data sources, anytime, anyday,
# Embedchain's power, never sway.

# A knowledge base, an assistant so grand,
# Connecting to platforms, near and far.
# Discord, WhatsApp, Slack, and more,
# Embedchain's potential, never a bore.
```

</CodeGroup>

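
The two ways of supplying the key can be combined in code. The sketch below prefers an explicitly passed key and falls back to the `GROQ_API_KEY` environment variable. `groq_config` is a hypothetical helper built around the config keys shown above.

```python
import os


def groq_config(api_key=None, model="mixtral-8x7b-32768", stream=True, environ=None):
    """Build a Groq `llm` config, reading the key from the environment if needed."""
    environ = os.environ if environ is None else environ
    api_key = api_key or environ.get("GROQ_API_KEY")
    if not api_key:
        raise ValueError("pass api_key or set the GROQ_API_KEY environment variable")
    return {
        "llm": {
            "provider": "groq",
            "config": {"model": model, "api_key": api_key, "stream": stream},
        }
    }


config = groq_config(api_key="gsk_xxxx")
print(config["llm"]["config"]["model"])
```

The resulting dictionary can then be passed to `App.from_config(config=config)` as in the example above.
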
<br />

<Snippet file="missing-llm-tip.mdx" />