---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.
<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
</CardGroup>
## OpenAI

To use OpenAI LLM models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```
To configure the different parameters of the LLM, load the app using a [YAML config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
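Once the app is loaded, you can add data and query it as usual. A minimal sketch, assuming the `config.yaml` above; note that if you set `stream: true`, the response comes back as a generator instead of a string:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App.from_config(config_path="config.yaml")
app.add("https://en.wikipedia.org/wiki/OpenAI")

response = app.query("What is OpenAI?")
if app.llm.config.stream:  # with stream: true, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```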
### Function Calling

To enable [function calling](https://platform.openai.com/docs/guides/function-calling) in your application using Embedchain and OpenAI, pass your functions into the `OpenAILlm` class as a list. Here are several ways in which you can achieve that:
<Accordion title="Using Pydantic Models">

```python
import os

from pydantic import BaseModel, Field, field_validator

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


class QA(BaseModel):
    """A question and answer pair."""

    question: str = Field(
        ..., description="The question.", example="What is a mountain?"
    )
    answer: str = Field(
        ..., description="The answer.", example="A mountain is a hill."
    )
    person_who_is_asking: str = Field(
        ..., description="The person who is asking the question.", example="John"
    )

    @field_validator("question")
    def question_must_end_with_a_question_mark(cls, v):
        """Validate that the question ends with a question mark."""
        if not v.endswith("?"):
            raise ValueError("question must end with a question mark")
        return v

    @field_validator("answer")
    def answer_must_end_with_a_period(cls, v):
        """Validate that the answer ends with a period."""
        if not v.endswith("."):
            raise ValueError("answer must end with a period")
        return v


llm = OpenAILlm(config=None, functions=[QA])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```

</Accordion>
<Accordion title="Using OpenAI JSON schema">

```python
import os

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

json_schema = {
    "name": "get_qa",
    "description": "A question and answer pair and the user who is asking the question.",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {"type": "string", "description": "The question."},
            "answer": {"type": "string", "description": "The answer."},
            "person_who_is_asking": {
                "type": "string",
                "description": "The person who is asking the question.",
            },
        },
        "required": ["question", "answer", "person_who_is_asking"],
    },
}

llm = OpenAILlm(config=None, functions=[json_schema])
app = App(llm=llm)

result = app.query("Hey I am Sid. What is a mountain? A mountain is a hill.")
print(result)
```

</Accordion>
<Accordion title="Using actual Python functions">

```python
import os

import requests

from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"


def find_info_of_pokemon(pokemon: str):
    """Find the information of the given pokemon.

    Args:
        pokemon: The pokemon.
    """
    req = requests.get(f"https://pokeapi.co/api/v2/pokemon/{pokemon}")
    if req.status_code == 404:
        raise ValueError("pokemon not found")
    return req.json()


llm = OpenAILlm(config=None, functions=[find_info_of_pokemon])
app = App(llm=llm)

result = app.query("Tell me more about the pokemon pikachu.")
print(result)
```

</Accordion>
## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream: # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```

</CodeGroup>
## Azure OpenAI

To use Azure OpenAI models, set the Azure OpenAI-related environment variables as shown in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-3.5-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```

</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).
## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable, which you can find on their [Account Settings page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
## Cohere

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set the `COHERE_API_KEY` environment variable, which you can find on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>
## Together

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set the `TOGETHER_API_KEY` environment variable, which you can find on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>
## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama.
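Once the Ollama server is running, pull the model you plan to use locally. A minimal sketch, assuming the `llama2` model referenced in the config below:

```bash
ollama pull llama2
```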
<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```

</CodeGroup>
## vLLM

Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).
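On a typical Linux machine with a supported GPU, installation is usually a single pip command (a sketch; see their docs for platform-specific requirements):

```bash
pip install vllm
```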
<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>
## GPT4All

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet connection is required. You can use it with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>
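Because both the LLM and the embedder run locally here, no API key is needed. A minimal sketch of querying over plain text, assuming the `config.yaml` above and Embedchain's `text` data type:

```python
from embedchain import App

app = App.from_config(config_path="config.yaml")

# everything runs on your machine; no API key is required
app.add("GPT4All runs large language models on consumer hardware.", data_type="text")
print(app.query("Where does GPT4All run models?"))
```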
## JinaChat

First, set the `JINACHAT_API_KEY` environment variable, which you can obtain from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>
## Hugging Face

Install the related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable, which you can obtain from [their platform](https://huggingface.co/settings/tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>
### Custom Endpoints

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    endpoint: https://api-inference.huggingface.co/models/gpt2 # replace with your personal endpoint
```

</CodeGroup>
If your endpoint requires additional parameters, you can pass them in the `model_kwargs` field:

```yaml
llm:
  provider: huggingface
  config:
    endpoint: <YOUR_ENDPOINT_URL_HERE>
    model_kwargs:
      max_new_tokens: 100
      temperature: 0.5
```
Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.
## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable, which you can obtain from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```

</CodeGroup>
## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once the setup is done, use the following code to create an app with Vertex AI as the provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```

</CodeGroup>
## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1

embedder:
  provider: mistralai
  config:
    model: mistral-embed
```

</CodeGroup>
<br />

<Snippet file="missing-llm-tip.mdx" />