---
title: '💾 Vector Database'
---

We support two vector databases: `Chroma` and `Elasticsearch`.
`Chroma` is the default.

## Elasticsearch

### Minimal Example

To use `Elasticsearch` as the vector database, use the `CustomApp` app type.

1. Set the environment variables in a `.env` file.
```
OPENAI_API_KEY=sk-SECRETKEY
ELASTICSEARCH_API_KEY=SECRETKEY==
ELASTICSEARCH_URL=https://secret-domain.europe-west3.gcp.cloud.es.io:443
```
Note that the API key needs certain privileges. For testing, you can toggle off `restrict privileges` under `/app/management/security/api_keys/` in your web interface.

2. Load the app.
```python
from embedchain import CustomApp
from embedchain.embedder.openai import OpenAiEmbedder
from embedchain.llm.openai import OpenAILlm
from embedchain.vectordb.elasticsearch import ElasticsearchDB

es_app = CustomApp(
    llm=OpenAILlm(),
    embedder=OpenAiEmbedder(),
    db=ElasticsearchDB(),
)
```
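
Before constructing the app, it can help to confirm that the variables from step 1 are visible to the Python process. The following is a minimal sketch using only the standard library; it assumes the `.env` values have already been loaded into the environment, and the helper name `missing_env_vars` is hypothetical, not part of embedchain.

```python
import os

# Variable names taken from the `.env` example in step 1.
REQUIRED_VARS = ("OPENAI_API_KEY", "ELASTICSEARCH_API_KEY", "ELASTICSEARCH_URL")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running this check first turns a missing key into a readable message instead of a connection or authentication error later on.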

### More custom settings

You can use an Elasticsearch URL in the cloud or run Elasticsearch locally.

The following example shows how to configure embedchain to work with a locally running Elasticsearch instance.
Instead of an API key, it uses HTTP login credentials. The localhost URL can be defined in `.env` or in the config.
```python
import os

from embedchain import CustomApp
from embedchain.config import CustomAppConfig, ElasticsearchDBConfig
from embedchain.embedder.openai import OpenAiEmbedder
from embedchain.llm.openai import OpenAILlm
from embedchain.vectordb.elasticsearch import ElasticsearchDB

es_config = ElasticsearchDBConfig(
    # Elasticsearch URL, or a list of node URLs with different hosts and ports.
    es_url="https://localhost:9200",
    # Pass named parameters supported by the Python Elasticsearch client (elasticsearch-py).
    http_auth=("elastic", "secret"),
    # Your cert path; expanduser resolves "~", which Python does not expand on its own.
    ca_certs=os.path.expanduser("~/binaries/elasticsearch-8.7.0/config/certs/http_ca.crt"),
    # verify_certs=False,  # Alternative, if you aren't using certs
)

es_app = CustomApp(
    config=CustomAppConfig(log_level="INFO"),
    llm=OpenAILlm(),
    embedder=OpenAiEmbedder(),
    db=ElasticsearchDB(config=es_config),
)
```
3. This should log your connection details to the console.
4. Instead of a single URL, `ElasticsearchDBConfig` also accepts `es_url` as a list of node URLs with different hosts and ports.
5. Additionally, you can pass named parameters supported by the Python Elasticsearch client.
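
As an illustration of the list form of `es_url`, the sketch below assembles node URLs from host and port pairs. The helper `build_node_urls` and the node names are hypothetical and only serve the example; the commented-out line shows how the result might then be handed to `ElasticsearchDBConfig`, assuming the same credentials as in the example above.

```python
def build_node_urls(nodes, scheme="https"):
    """Turn (host, port) pairs into a list of URL strings.

    Hypothetical helper for illustration; not part of embedchain.
    """
    return [f"{scheme}://{host}:{port}" for host, port in nodes]


# Hypothetical three-node cluster; replace with your own hosts and ports.
es_urls = build_node_urls(
    [("es-node-1", 9200), ("es-node-2", 9200), ("es-node-3", 9201)]
)
print(es_urls)

# The list could then be passed in place of a single URL, for example:
# es_config = ElasticsearchDBConfig(es_url=es_urls, http_auth=("elastic", "secret"))
```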