---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.
<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
  <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
</CardGroup>
## OpenAI

To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os

from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

If you are looking to configure the different parameters of the LLM, you can do so by loading the app using a [yaml config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
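With `stream: false` (as above), `query` returns a plain string. If you enable streaming in the config, `query` returns a generator instead, following the same pattern shown in the Google AI example below; a minimal sketch:

```python
response = app.query("What is OpenAI?")

if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk, end="")
else:
    print(response)
```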
### Function Calling

Embedchain supports OpenAI [Function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [Langchain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).

<Accordion title="Pydantic Model">

```python
from pydantic import BaseModel, Field


class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
```

</Accordion>
<Accordion title="Python function">

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b
```

</Accordion>
<Accordion title="OpenAI tool dictionary">

```python
multiply = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers together.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"description": "First integer", "type": "integer"},
                "b": {"description": "Second integer", "type": "integer"},
            },
            "required": ["a", "b"],
        },
    },
}
```

</Accordion>
With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.

```python
import os

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

llm = OpenAILlm(tools=multiply)
app = App(llm=llm)

result = app.query("What is the result of 125 multiplied by fifteen?")
```
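Depending on your embedchain version, `result` may carry the tool call chosen by the model (the function name and its arguments) rather than a final natural-language answer. A purely hypothetical sketch of executing the call yourself, assuming `multiply` is the plain Python function from above and the arguments arrive as a dict:

```python
# Hypothetical: the exact shape of `result` varies across embedchain versions.
args = {"a": 125, "b": 15}  # e.g. parsed out of `result`
print(multiply(a=args["a"], b=args["b"]))  # -> 1875
```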
## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream:  # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

</CodeGroup>
## Azure OpenAI

To use Azure OpenAI models, you have to set the Azure OpenAI-related environment variables, as shown in the code block below:

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).
## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable, which you can find on their [Account Settings page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
## Cohere

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` environment variable, which you can find on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>
## Together

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` environment variable, which you can find on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>
## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.

<CodeGroup>

```python main.py
import os

from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```

</CodeGroup>
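Note that the config above enables streaming. Assuming the Ollama provider follows the same streaming interface as the Google AI example earlier, `query` returns a generator rather than a string; a minimal sketch of consuming it:

```python
response = app.query("What is the capital of France?")

for chunk in response:  # stream: true makes this a generator
    print(chunk, end="")
```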
## vLLM

Setup vLLM by following instructions given in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

<CodeGroup>

```python main.py
import os

from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>
## GPT4All

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet connection is required. You can use it with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>
## JinaChat

First, set the `JINACHAT_API_KEY` environment variable, which you can obtain from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
## Hugging Face

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable, which you can obtain from [their platform](https://huggingface.co/settings/tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>
### Custom Endpoints

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
```

</CodeGroup>

If your endpoint requires additional parameters, you can pass them in the `model_kwargs` field:

```yaml
llm:
  provider: huggingface
  config:
    endpoint: <YOUR_ENDPOINT_URL_HERE>
    model_kwargs:
      max_new_tokens: 100
      temperature: 0.5
```

Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) docs for more information.
## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable, which you can obtain from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>
## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once that is done, use the following code to create an app with Vertex AI as the provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

</CodeGroup>
## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("Which companies does Elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("What question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```

</CodeGroup>
## AWS Bedrock

### Setup

- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need to authenticate the `boto3` client using a method from the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- You can optionally export an `AWS_REGION`.

### Usage

<CodeGroup>

```python main.py
import os

from embedchain import App

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ["AWS_REGION"] = "us-west-2"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```

</CodeGroup>

<br />

<Note>
The model arguments are different for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
</Note>
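For instance, Anthropic models on Bedrock take differently named arguments than the Amazon Titan model above. A sketch of what that config might look like (treat the exact parameter names as an assumption and verify them against the Bedrock docs for your model):

```yaml
llm:
  provider: aws_bedrock
  config:
    model: anthropic.claude-v2
    model_kwargs:
      temperature: 0.5
      top_p: 1
      max_tokens_to_sample: 1000  # Anthropic's token-limit kwarg; verify for your model
```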
<br />

<Snippet file="missing-llm-tip.mdx" />