- ---
- title: 🗄️ Vector databases
- ---
- ## Overview
- To use a vector database with Embedchain, configure it in the YAML configuration file. Examples for each supported database are provided below:
- <CardGroup cols={4}>
- <Card title="ChromaDB" href="#chromadb"></Card>
- <Card title="Elasticsearch" href="#elasticsearch"></Card>
- <Card title="OpenSearch" href="#opensearch"></Card>
- <Card title="Zilliz" href="#zilliz"></Card>
- <Card title="LanceDB" href="#lancedb"></Card>
- <Card title="Pinecone" href="#pinecone"></Card>
- <Card title="Qdrant" href="#qdrant"></Card>
- <Card title="Weaviate" href="#weaviate"></Card>
- </CardGroup>
- ## ChromaDB
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load chroma configuration from yaml file
- app = App.from_config(config_path="config1.yaml")
- ```
- ```yaml config1.yaml
- vectordb:
-   provider: chroma
-   config:
-     collection_name: 'my-collection'
-     dir: db
-     allow_reset: true
- ```
- ```yaml config2.yaml
- vectordb:
-   provider: chroma
-   config:
-     collection_name: 'my-collection'
-     host: localhost
-     port: 5200
-     allow_reset: true
- ```
- </CodeGroup>
- ## Elasticsearch
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[elasticsearch]'
- ```
- <Note>
- You can configure the Elasticsearch connection by providing either `es_url` or `cloud_id`. If you are using the Elasticsearch Service on Elastic Cloud, you can find the `cloud_id` on the [Elastic Cloud dashboard](https://cloud.elastic.co/deployments).
- </Note>
- You can authenticate the connection to Elasticsearch by providing one of `basic_auth`, `api_key`, or `bearer_auth`.
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load elasticsearch configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: elasticsearch
-   config:
-     collection_name: 'es-index'
-     cloud_id: 'deployment-name:xxxx'
-     basic_auth:
-       - elastic
-       - <your_password>
-     verify_certs: false
- ```
- </CodeGroup>
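The connection and authentication options described above can be checked before the YAML file is handed to Embedchain. The following is a minimal sketch, not part of Embedchain: `summarize_es_config` is a hypothetical helper, and the rule it applies (exactly one connection key, at most one authentication key) is an assumption drawn from the wording of this page. Embedchain itself may accept other combinations.

```python
# Hypothetical pre-flight check for the `config` mapping of an
# Elasticsearch `vectordb` entry. Not an Embedchain API.
CONNECTION_KEYS = ("es_url", "cloud_id")
AUTH_KEYS = ("basic_auth", "api_key", "bearer_auth")


def summarize_es_config(config):
    """Return (connection_key, auth_key_or_None) for the given config mapping.

    Raises ValueError when the connection target is missing or ambiguous,
    or when more than one authentication method is supplied (an assumed rule).
    """
    connection = [key for key in CONNECTION_KEYS if config.get(key)]
    auth = [key for key in AUTH_KEYS if config.get(key)]
    if len(connection) != 1:
        raise ValueError(
            f"expected exactly one of {CONNECTION_KEYS}, got {connection}"
        )
    if len(auth) > 1:
        raise ValueError(f"expected at most one of {AUTH_KEYS}, got {auth}")
    return connection[0], (auth[0] if auth else None)


# Mirrors the `config` section of the config.yaml shown above.
example = {
    "collection_name": "es-index",
    "cloud_id": "deployment-name:xxxx",
    "basic_auth": ["elastic", "<your_password>"],
    "verify_certs": False,
}

print(summarize_es_config(example))  # ('cloud_id', 'basic_auth')
```

A check like this reports a missing `es_url`/`cloud_id` as a short error message before any connection attempt is made.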
- ## OpenSearch
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[opensearch]'
- ```
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load opensearch configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: opensearch
-   config:
-     collection_name: 'my-app'
-     opensearch_url: 'https://localhost:9200'
-     http_auth:
-       - admin
-       - admin
-     vector_dimension: 1536
-     use_ssl: false
-     verify_certs: false
- ```
- </CodeGroup>
- ## Zilliz
- Install related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[milvus]'
- ```
- Set the Zilliz environment variables `ZILLIZ_CLOUD_URI` and `ZILLIZ_CLOUD_TOKEN`, which you can find on the [Zilliz Cloud platform](https://cloud.zilliz.com/).
- <CodeGroup>
- ```python main.py
- import os
- from embedchain import App
- os.environ['ZILLIZ_CLOUD_URI'] = 'https://xxx.zillizcloud.com'
- os.environ['ZILLIZ_CLOUD_TOKEN'] = 'xxx'
- # load zilliz configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: zilliz
-   config:
-     collection_name: 'zilliz_app'
-     uri: https://xxxx.api.gcp-region.zillizcloud.com
-     token: xxx
-     vector_dim: 1536
-     metric_type: L2
- ```
- </CodeGroup>
- ## LanceDB
- _Coming soon_
- ## Pinecone
- Install the Pinecone-related dependencies using the following command:
- ```bash
- pip install --upgrade 'embedchain[pinecone]'
- ```
- To use Pinecone as a vector database, set the environment variables `PINECONE_API_KEY` and `PINECONE_ENV`, which you can find on the [Pinecone dashboard](https://app.pinecone.io/).
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load pinecone configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: pinecone
-   config:
-     metric: cosine
-     vector_dimension: 1536
-     collection_name: my-pinecone-index
- ```
- </CodeGroup>
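Because the Pinecone credentials come from the environment rather than from the YAML file, a small check before loading the app can make a missing variable easier to spot. The sketch below uses only the standard library. `missing_env_vars` is a hypothetical helper, not an Embedchain function, and the same pattern could be applied to the variables of any other provider that reads its credentials from the environment.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment.

    `environ` defaults to os.environ; passing a plain dict is handy in tests.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Variable names taken from the Pinecone instructions above.
REQUIRED = ("PINECONE_API_KEY", "PINECONE_ENV")

missing = missing_env_vars(REQUIRED)
if missing:
    print("Set these before loading the app: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running the check before `App.from_config(...)` keeps the credential problem separate from any error raised while the app is being constructed.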
- ## Qdrant
- To use Qdrant as a vector database, set the environment variables `QDRANT_URL` and `QDRANT_API_KEY`, which you can find on the [Qdrant dashboard](https://cloud.qdrant.io/).
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load qdrant configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: qdrant
-   config:
-     collection_name: my_qdrant_index
- ```
- </CodeGroup>
- ## Weaviate
- To use Weaviate as a vector database, set the environment variables `WEAVIATE_ENDPOINT` and `WEAVIATE_API_KEY`, which you can find on the [Weaviate dashboard](https://console.weaviate.cloud/dashboard).
- <CodeGroup>
- ```python main.py
- from embedchain import App
- # load weaviate configuration from yaml file
- app = App.from_config(config_path="config.yaml")
- ```
- ```yaml config.yaml
- vectordb:
-   provider: weaviate
-   config:
-     collection_name: my_weaviate_index
- ```
- </CodeGroup>
- <Snippet file="missing-vector-db-tip.mdx" />