---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
  <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
  <Card title="Groq" href="#groq"></Card>
</CardGroup>

## OpenAI

To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

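The same settings can also be assembled in Python instead of a YAML file. The sketch below is illustrative only: `build_llm_config` is a hypothetical helper, not part of Embedchain, and it assumes the resulting dictionary can be passed as `App.from_config(config=config)`, the keyword used in the Groq example further down this page.

```python
from typing import Any


def build_llm_config(provider: str, **params: Any) -> dict:
    """Return a nested dict shaped like the `llm` section of config.yaml.

    Hypothetical helper for illustration; it is not part of Embedchain.
    """
    return {"llm": {"provider": provider, "config": dict(params)}}


config = build_llm_config(
    "openai",
    model="gpt-3.5-turbo",
    temperature=0.5,
    max_tokens=1000,
    top_p=1,
    stream=False,
)

# Assumption: this dict could be handed to App.from_config(config=config),
# mirroring the dictionary-based Groq example on this page.
print(config["llm"]["config"]["model"])
```

Keeping the configuration in a dictionary can be convenient when values such as the model name are chosen at runtime.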
### Function Calling

Embedchain supports OpenAI [Function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [Langchain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).

<Accordion title="Pydantic Model">

```python
from pydantic import BaseModel, Field

class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
```

</Accordion>

<Accordion title="Python function">

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b
```

</Accordion>

<Accordion title="OpenAI tool dictionary">

```python
multiply = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers together.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "description": "First integer",
                    "type": "integer"
                },
                "b": {
                    "description": "Second integer",
                    "type": "integer"
                }
            },
            "required": [
                "a",
                "b"
            ]
        }
    }
}
```

</Accordion>

With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.

```python
import os
from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

# `multiply` is any one of the three definitions shown above
llm = OpenAILlm(tools=multiply)
app = App(llm=llm)

result = app.query("What is the result of 125 multiplied by fifteen?")
```

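The three input forms above describe the same schema. As a rough illustration of how they relate, the sketch below derives the tool dictionary from a typed Python function using only the standard library. `function_to_tool` and `JSON_TYPES` are hypothetical names introduced here; this is not how Embedchain or Langchain perform the conversion internally.

```python
import inspect

# Map Python annotations to JSON Schema type names (only a few are covered).
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_tool(func, arg_descriptions):
    """Build an OpenAI-style tool dictionary from a typed Python function.

    Hypothetical helper for illustration only.
    """
    properties = {}
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            "description": arg_descriptions.get(name, ""),
            "type": JSON_TYPES[param.annotation],
        }
    # The first docstring line doubles as the tool description.
    doc = inspect.getdoc(func) or ""
    description = doc.splitlines()[0] if doc else ""
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
            },
        },
    }


def multiply(a: int, b: int) -> int:
    """Multiply two integers together."""
    return a * b


tool = function_to_tool(multiply, {"a": "First integer", "b": "Second integer"})
print(tool["function"]["parameters"]["required"])
```

The resulting `tool` has the same structure as the dictionary in the "OpenAI tool dictionary" accordion.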
## Google AI

To use a Google AI model, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

</CodeGroup>

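Because the response is a plain string when `stream` is false and a generator when it is true, calling code may want one path that handles both. The helper below is a minimal sketch under the assumption that streamed chunks are strings; `collect_response` and `fake_stream` are hypothetical names used only for this illustration.

```python
from typing import Iterable, Union


def collect_response(response: Union[str, Iterable[str]]) -> str:
    """Return the full answer text for a streamed or non-streamed response."""
    if isinstance(response, str):
        # Non-streaming case: the answer is already complete.
        return response
    chunks = []
    for chunk in response:
        # Streaming case: show each chunk as it arrives and keep a copy.
        print(chunk, end="", flush=True)
        chunks.append(chunk)
    print()
    return "".join(chunks)


def fake_stream():
    """Stand-in for a streamed response, used only to demonstrate the helper."""
    yield "Hello, "
    yield "world"


answer = collect_response(fake_stream())
```

With this in place, the `if app.llm.config.stream` branch in `main.py` could be reduced to a single call such as `collect_response(response)`.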
## Azure OpenAI

To use an Azure OpenAI model, you have to set the Azure OpenAI environment variables shown in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

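Since this provider depends on four environment variables, it can help to check them before creating the app. The snippet below is a small sketch using only the standard library; `missing_env_vars` is a hypothetical helper, not an Embedchain function.

```python
import os

REQUIRED_AZURE_VARS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {"OPENAI_API_TYPE": "azure", "OPENAI_API_KEY": "xxx"}
print(missing_env_vars(REQUIRED_AZURE_VARS, example_env))
# prints ['OPENAI_API_BASE', 'OPENAI_API_VERSION']
```

Calling `missing_env_vars(REQUIRED_AZURE_VARS)` without a mapping checks the real process environment.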
## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable. You can find the key on their [Account Settings Page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Cohere

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` environment variable. You can find the key on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Together

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` environment variable. You can find the key on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Ollama

Set up Ollama by following the instructions in [its GitHub repository](https://github.com/jmorganca/ollama).

<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```

</CodeGroup>

## vLLM

Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>

## GPT4All

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. It requires neither a GPU nor an internet connection. You can use it with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>

## JinaChat

First, set the `JINACHAT_API_KEY` environment variable. You can obtain the key from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Hugging Face

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable. You can obtain a token from [their platform](https://huggingface.co/settings/tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>

### Custom Endpoints

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
```

</CodeGroup>

If your endpoint requires additional parameters, you can pass them in the `model_kwargs` field:

```yaml
llm:
  provider: huggingface
  config:
    endpoint: <YOUR_ENDPOINT_URL_HERE>
    model_kwargs:
      max_new_tokens: 100
      temperature: 0.5
```

Only the `text-generation` and `text2text-generation` tasks are currently supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See Langchain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable. You can obtain a token from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>

## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once setup is done, use the following code to create an app using Vertex AI as the provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

</CodeGroup>

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```

</CodeGroup>

## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need to authenticate the `boto3` client using one of the methods in the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- You can optionally export an `AWS_REGION`.

### Usage

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ["AWS_REGION"] = "us-west-2"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

</CodeGroup>

<br />

<Note>
  The model arguments differ for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
</Note>

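To make the note above concrete, the sketch below translates generic sampling parameters into provider-specific `model_kwargs` keys. The Titan key names are taken from the config on this page. The Claude key names are an assumption based on the older text-completion parameter naming and should be verified against the Bedrock documentation. `KWARG_NAMES` and `bedrock_model_kwargs` are hypothetical names, not part of Embedchain or `boto3`.

```python
# Generic parameter name -> provider-specific key inside model_kwargs.
# Titan keys come from the config above; the Claude keys are an assumption
# and should be checked against the Bedrock documentation before use.
KWARG_NAMES = {
    "amazon.titan": {
        "temperature": "temperature",
        "top_p": "topP",
        "max_tokens": "maxTokenCount",
    },
    "anthropic.claude": {
        "temperature": "temperature",
        "top_p": "top_p",
        "max_tokens": "max_tokens_to_sample",
    },
}


def bedrock_model_kwargs(model_id, **params):
    """Translate generic sampling parameters into model_kwargs for a Bedrock model ID."""
    for prefix, names in KWARG_NAMES.items():
        if model_id.startswith(prefix):
            return {names[key]: value for key, value in params.items()}
    raise ValueError(f"No parameter mapping known for {model_id!r}")


kwargs = bedrock_model_kwargs(
    "amazon.titan-text-express-v1", temperature=0.5, top_p=1, max_tokens=1000
)
print(kwargs)
```

For the Titan model used above, the result matches the `model_kwargs` block in `config.yaml`.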
<br />

## Groq

[Groq](https://groq.com/) is the creator of the world's first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.

### Usage

To use LLMs from Groq, go to their [platform](https://console.groq.com/keys) and get an API key.

Set the API key as the `GROQ_API_KEY` environment variable, or pass it in your app configuration as shown in the example below.

<CodeGroup>

```python main.py
import os
from embedchain import App

# Set your API key here or pass as the environment variable
groq_api_key = "gsk_xxxx"

config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",
            "api_key": groq_api_key,
            "stream": True
        }
    }
}

app = App.from_config(config=config)
# Add your data source here
app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
app.query("Write a poem about Embedchain")

# In the realm of data, vast and wide,
# Embedchain stands with knowledge as its guide.
# A platform open, for all to try,
# Building bots that can truly fly.

# With REST API, data in reach,
# Deployment a breeze, as easy as a speech.
# Updating data sources, anytime, anyday,
# Embedchain's power, never sway.

# A knowledge base, an assistant so grand,
# Connecting to platforms, near and far.
# Discord, WhatsApp, Slack, and more,
# Embedchain's potential, never a bore.
```

</CodeGroup>

<br />

<Snippet file="missing-llm-tip.mdx" />