123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118 |
- ---
- title: '💾 Vector Database'
- ---
- We support `Chroma`, `Elasticsearch` and `OpenSearch` as vector databases.
- `Chroma` is used as a default database.
- ## Elasticsearch
- ### Minimal Example
- In order to use `Elasticsearch` as vector database we need to use App type `CustomApp`.
- 1. Set the environment variables in a `.env` file.
- ```
- OPENAI_API_KEY=sk-SECRETKEY
- ELASTICSEARCH_API_KEY=SECRETKEY==
- ELASTICSEARCH_URL=https://secret-domain.europe-west3.gcp.cloud.es.io:443
- ```
- Please note that the key needs certain privileges. For testing you can just toggle off `restrict privileges` under `/app/management/security/api_keys/` in your web interface.
- 2. Load the app
- ```python
- from embedchain import CustomApp
- from embedchain.embedder.openai import OpenAIEmbedder
- from embedchain.llm.openai import OpenAILlm
- from embedchain.vectordb.elasticsearch import ElasticsearchDB
- es_app = CustomApp(
- llm=OpenAILlm(),
- embedder=OpenAIEmbedder(),
- db=ElasticsearchDB(),
- )
- ```
- ### More custom settings
- You can get a URL for elasticsearch in the cloud, or run it locally.
- The following example shows you how to configure embedchain to work with a locally running elasticsearch.
- Instead of using an API key, we use http login credentials. The localhost url can be defined in .env or in the config.
- ```python
- import os
- from embedchain import CustomApp
- from embedchain.config import CustomAppConfig, ElasticsearchDBConfig
- from embedchain.embedder.openai import OpenAIEmbedder
- from embedchain.llm.openai import OpenAILlm
- from embedchain.vectordb.elasticsearch import ElasticsearchDB
- es_config = ElasticsearchDBConfig(
- # elasticsearch url or list of nodes url with different hosts and ports.
- es_url='https://localhost:9200',
- # pass named parameters supported by Python Elasticsearch client
- http_auth=("elastic", "secret"),
- ca_certs="~/binaries/elasticsearch-8.7.0/config/certs/http_ca.crt" # your cert path
- # verify_certs=False # Alternative, if you aren't using certs
- ) # pass named parameters supported by elasticsearch-py
- es_app = CustomApp(
- config=CustomAppConfig(log_level="INFO"),
- llm=OpenAILlm(),
- embedder=OpenAIEmbedder(),
- db=ElasticsearchDB(config=es_config),
- )
- ```
- 3. This should log your connection details to the console.
- 4. Alternatively to a URL, you `ElasticsearchDBConfig` accepts `es_url` as a list of nodes url with different hosts and ports.
- 5. Additionally we can pass named parameters supported by Python Elasticsearch client.
- ## OpenSearch 🔍
- To use OpenSearch as a vector database with a CustomApp, follow these simple steps:
- 1. Set the `OPENAI_API_KEY` environment variable:
- ```
- OPENAI_API_KEY=sk-xxxx
- ```
- 2. Define the OpenSearch configuration in your Python code:
- ```python
- from embedchain import CustomApp
- from embedchain.config import OpenSearchDBConfig
- from embedchain.embedder.openai import OpenAIEmbedder
- from embedchain.llm.openai import OpenAILlm
- from embedchain.vectordb.opensearch import OpenSearchDB
- opensearch_url = "https://localhost:9200"
- http_auth = ("username", "password")
- db_config = OpenSearchDBConfig(
- opensearch_url=opensearch_url,
- http_auth=http_auth,
- collection_name="embedchain-app",
- use_ssl=True,
- timeout=30,
- )
- db = OpenSearchDB(config=db_config)
- ```
- 2. Instantiate the app and add data:
- ```python
- app = CustomApp(llm=OpenAILlm(), embedder=OpenAIEmbedder(), db=db)
- app.add("https://en.wikipedia.org/wiki/Elon_Musk")
- app.add("https://www.forbes.com/profile/elon-musk")
- app.add("https://www.britannica.com/biography/Elon-Musk")
- ```
- 3. You're all set! Start querying using the following command:
- ```python
- app.query("What is the net worth of Elon Musk?")
- ```
|