---
title: 🗄️ Vector databases
---

## Overview

To use a vector database with Embedchain, configure it in the YAML configuration file. Examples for each supported database are provided below:

<CardGroup cols={4}>
  <Card title="ChromaDB" href="#chromadb"></Card>
  <Card title="Elasticsearch" href="#elasticsearch"></Card>
  <Card title="OpenSearch" href="#opensearch"></Card>
  <Card title="Zilliz" href="#zilliz"></Card>
  <Card title="LanceDB" href="#lancedb"></Card>
  <Card title="Pinecone" href="#pinecone"></Card>
  <Card title="Qdrant" href="#qdrant"></Card>
  <Card title="Weaviate" href="#weaviate"></Card>
</CardGroup>

## ChromaDB

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load chroma configuration from yaml file
app = App.from_config(config_path="config1.yaml")
```

```yaml config1.yaml
vectordb:
  provider: chroma
  config:
    collection_name: 'my-collection'
    dir: db
    allow_reset: true
```

```yaml config2.yaml
vectordb:
  provider: chroma
  config:
    collection_name: 'my-collection'
    host: localhost
    port: 5200
    allow_reset: true
```

</CodeGroup>
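
The two ChromaDB configurations differ only in where the data lives: `config1.yaml` sets `dir`, while `config2.yaml` sets `host` and `port`. The sketch below makes that distinction explicit. It assumes that `dir` means local on-disk storage and that `host` plus `port` mean a running Chroma server; this is our reading of the examples, not Embedchain's own code. The `chroma_mode` helper and its strict "both or neither" check are hypothetical.

```python
def chroma_mode(config):
    """Classify a `vectordb.config` mapping as 'remote' or 'local'.

    Hypothetical helper for illustration only; not part of Embedchain.
    """
    has_host = "host" in config
    has_port = "port" in config
    # Assumption: a half-specified server address is treated as a mistake.
    if has_host != has_port:
        raise ValueError("set both 'host' and 'port', or neither")
    return "remote" if has_host else "local"


# The `config` mappings from config1.yaml and config2.yaml, written as dicts.
config1 = {"collection_name": "my-collection", "dir": "db", "allow_reset": True}
config2 = {
    "collection_name": "my-collection",
    "host": "localhost",
    "port": 5200,
    "allow_reset": True,
}

print(chroma_mode(config1))  # local
print(chroma_mode(config2))  # remote
```

To try the second configuration, point `config_path` in `main.py` at `config2.yaml`.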

## Elasticsearch

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[elasticsearch]'
```

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load elasticsearch configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: elasticsearch
  config:
    collection_name: 'es-index'
    es_url: http://localhost:9200
    http_auth:
      - admin
      - admin
    api_key: xxx
    verify_certs: false
```

</CodeGroup>

## OpenSearch

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensearch]'
```

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load opensearch configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: opensearch
  config:
    collection_name: 'my-app'
    opensearch_url: 'https://localhost:9200'
    http_auth:
      - admin
      - admin
    vector_dimension: 1536
    use_ssl: false
    verify_certs: false
```

</CodeGroup>

## Zilliz

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[milvus]'
```

Set the Zilliz environment variables `ZILLIZ_CLOUD_URI` and `ZILLIZ_CLOUD_TOKEN`, which you can find on the [Zilliz cloud platform](https://cloud.zilliz.com/).

<CodeGroup>

```python main.py
import os
from embedchain import Pipeline as App

os.environ['ZILLIZ_CLOUD_URI'] = 'https://xxx.zillizcloud.com'
os.environ['ZILLIZ_CLOUD_TOKEN'] = 'xxx'

# load zilliz configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: zilliz
  config:
    collection_name: 'zilliz_app'
    uri: https://xxxx.api.gcp-region.zillizcloud.com
    token: xxx
    vector_dim: 1536
    metric_type: L2
```

</CodeGroup>

## LanceDB

_Coming soon_

## Pinecone

Install Pinecone-related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[pinecone]'
```

To use Pinecone as a vector database, set the environment variables `PINECONE_API_KEY` and `PINECONE_ENV`, which you can find on the [Pinecone dashboard](https://app.pinecone.io/).

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load pinecone configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    collection_name: my-pinecone-index
```

</CodeGroup>

## Qdrant

To use Qdrant as a vector database, set the environment variables `QDRANT_URL` and `QDRANT_API_KEY`, which you can find on the [Qdrant dashboard](https://cloud.qdrant.io/).

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load qdrant configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: qdrant
  config:
    collection_name: my_qdrant_index
```

</CodeGroup>

## Weaviate

To use Weaviate as a vector database, set the environment variables `WEAVIATE_ENDPOINT` and `WEAVIATE_API_KEY`, which you can find on the [Weaviate dashboard](https://console.weaviate.cloud/dashboard).

<CodeGroup>

```python main.py
from embedchain import Pipeline as App

# load weaviate configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: weaviate
  config:
    collection_name: my_weaviate_index
```

</CodeGroup>
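
Several providers on this page read their credentials from environment variables, so it can help to check that they are set before calling `App.from_config`. The following is a minimal sketch using only the standard library. The variable names come from the sections above; the `REQUIRED_ENV` table and the `missing_env` helper are hypothetical and not part of Embedchain.

```python
import os

# Environment variables named on this page, keyed by `vectordb.provider`.
REQUIRED_ENV = {
    "zilliz": ("ZILLIZ_CLOUD_URI", "ZILLIZ_CLOUD_TOKEN"),
    "pinecone": ("PINECONE_API_KEY", "PINECONE_ENV"),
    "qdrant": ("QDRANT_URL", "QDRANT_API_KEY"),
    "weaviate": ("WEAVIATE_ENDPOINT", "WEAVIATE_API_KEY"),
}


def missing_env(provider, environ=None):
    """Return the required variables that are unset or empty for a provider.

    Hypothetical helper; providers without an entry need no variables here.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV.get(provider, ()) if not environ.get(name)]


# Example with an explicit mapping instead of the real process environment.
example_env = {"QDRANT_URL": "xxx", "QDRANT_API_KEY": ""}
print(missing_env("qdrant", example_env))  # ['QDRANT_API_KEY']
```

Calling `missing_env("pinecone")` with no second argument checks the real process environment, so a non-empty result can be reported before the application is created.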

<Snippet file="missing-vector-db-tip.mdx" />