---
title: 🗄️ Vector databases
---

## Overview

To use a vector database with Embedchain, configure it in the YAML configuration file. Examples for each supported database are provided below:

## ChromaDB

```python main.py
from embedchain import App

# load chroma configuration from yaml file
app = App.from_config(config_path="config1.yaml")
```

```yaml config1.yaml
vectordb:
  provider: chroma
  config:
    collection_name: 'my-collection'
    dir: db
    allow_reset: true
```

```yaml config2.yaml
vectordb:
  provider: chroma
  config:
    collection_name: 'my-collection'
    host: localhost
    port: 5200
    allow_reset: true
```

## Elasticsearch

Install the required dependencies using the following command:

```bash
pip install --upgrade 'embedchain[elasticsearch]'
```

You can configure the Elasticsearch connection by providing either `es_url` or `cloud_id`. If you are using the Elasticsearch Service on Elastic Cloud, you can find the `cloud_id` on the [Elastic Cloud dashboard](https://cloud.elastic.co/deployments).

You can authorize the connection to Elasticsearch by providing one of `basic_auth`, `api_key`, or `bearer_auth`.
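
The connection and authorization options above can be checked before they are written to the YAML file. The following is a minimal sketch, not part of Embedchain: `check_es_config` is a hypothetical helper, and it assumes that a missing authorization key should be reported as a problem, which may be too strict for an unsecured local cluster.

```python
from __future__ import annotations

# Hypothetical helper, not part of Embedchain. The key names come from the
# Elasticsearch options described in this section.
CONNECTION_KEYS = ("es_url", "cloud_id")
AUTH_KEYS = ("basic_auth", "api_key", "bearer_auth")


def check_es_config(config: dict) -> list[str]:
    """Return a list of problems found in an Elasticsearch `config` mapping.

    An empty list means both checks passed.
    """
    problems = []
    # At least one way to reach the cluster must be given.
    if not any(config.get(key) for key in CONNECTION_KEYS):
        problems.append("set either es_url or cloud_id")
    # Assumption: treat missing credentials as a problem.
    if not any(config.get(key) for key in AUTH_KEYS):
        problems.append("set one of basic_auth, api_key, or bearer_auth")
    return problems
```

Calling `check_es_config` on the mapping under `vectordb.config` returns an empty list when a connection key and an authorization key are both present.
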
```python main.py
from embedchain import App

# load elasticsearch configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: elasticsearch
  config:
    collection_name: 'es-index'
    cloud_id: 'deployment-name:xxxx'
    basic_auth:
      - elastic
      - <your_password>
    verify_certs: false
```

## OpenSearch

Install the required dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensearch]'
```

```python main.py
from embedchain import App

# load opensearch configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: opensearch
  config:
    collection_name: 'my-app'
    opensearch_url: 'https://localhost:9200'
    http_auth:
      - admin
      - admin
    vector_dimension: 1536
    use_ssl: false
    verify_certs: false
```

## Zilliz

Install the required dependencies using the following command:

```bash
pip install --upgrade 'embedchain[milvus]'
```

Set the Zilliz environment variables `ZILLIZ_CLOUD_URI` and `ZILLIZ_CLOUD_TOKEN`, which you can find on the [Zilliz cloud platform](https://cloud.zilliz.com/).

```python main.py
import os
from embedchain import App

os.environ['ZILLIZ_CLOUD_URI'] = 'https://xxx.zillizcloud.com'
os.environ['ZILLIZ_CLOUD_TOKEN'] = 'xxx'

# load zilliz configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: zilliz
  config:
    collection_name: 'zilliz_app'
    uri: https://xxxx.api.gcp-region.zillizcloud.com
    token: xxx
    vector_dim: 1536
    metric_type: L2
```

## LanceDB

_Coming soon_

## Pinecone

Install the Pinecone dependencies using the following command:

```bash
pip install --upgrade 'embedchain[pinecone]'
```

To use Pinecone as a vector database, set the environment variable `PINECONE_API_KEY`, which you can find on the [Pinecone dashboard](https://app.pinecone.io/).
```python main.py
from embedchain import App

# load pinecone configuration from yaml file
app = App.from_config(config_path="pod_config.yaml")
# or
app = App.from_config(config_path="serverless_config.yaml")
```

```yaml pod_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    collection_name: my-pinecone-index
    pod_config:
      environment: gcp-starter
      metadata_config:
        indexed:
          - "url"
          - "hash"
```

```yaml serverless_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    collection_name: my-pinecone-index
    serverless_config:
      cloud: aws
      region: us-west-2
```

You can find more information about Pinecone index configuration in the [Pinecone documentation](https://docs.pinecone.io/docs/manage-indexes#create-a-pod-based-index).

You can optionally provide `index_name` as a config parameter in the YAML file to specify the index name. If it is not provided, the index name will be `{collection_name}-{vector_dimension}`.

## Qdrant

To use Qdrant as a vector database, set the environment variables `QDRANT_URL` and `QDRANT_API_KEY`, which you can find on the [Qdrant dashboard](https://cloud.qdrant.io/).

```python main.py
from embedchain import App

# load qdrant configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: qdrant
  config:
    collection_name: my_qdrant_index
```

## Weaviate

To use Weaviate as a vector database, set the environment variables `WEAVIATE_ENDPOINT` and `WEAVIATE_API_KEY`, which you can find on the [Weaviate dashboard](https://console.weaviate.cloud/dashboard).

```python main.py
from embedchain import App

# load weaviate configuration from yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
vectordb:
  provider: weaviate
  config:
    collection_name: my_weaviate_index
```
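
Several providers on this page (Zilliz, Pinecone, Qdrant, and Weaviate) read credentials from environment variables, so a missing variable usually only surfaces when the app is created. The sketch below shows one way to check for them up front. The variable names are the ones listed in the sections above; `REQUIRED_ENV` and `missing_env_vars` are hypothetical names and are not part of Embedchain.

```python
import os

# Environment variable names as documented on this page.
# The mapping and the helper below are illustrative, not an Embedchain API.
REQUIRED_ENV = {
    "zilliz": ("ZILLIZ_CLOUD_URI", "ZILLIZ_CLOUD_TOKEN"),
    "pinecone": ("PINECONE_API_KEY",),
    "qdrant": ("QDRANT_URL", "QDRANT_API_KEY"),
    "weaviate": ("WEAVIATE_ENDPOINT", "WEAVIATE_API_KEY"),
}


def missing_env_vars(provider, environ=None):
    """Return the documented variables for `provider` that are unset or empty.

    Providers without an entry in REQUIRED_ENV yield an empty list.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV.get(provider, ()) if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars("qdrant")
    if missing:
        print("Set these variables before loading config.yaml:", ", ".join(missing))
```

Running the check before `App.from_config(config_path="config.yaml")` gives a clearer message than a connection failure from the database client.
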