---
title: '🔍 Query configurations'
---

## AppConfig

| option | description | type | default |
|---|---|---|---|
| log_level | log level | string | WARNING |
| embedding_fn | embedding function | chromadb.utils.embedding_functions | \{text-embedding-ada-002\} |
| db | vector database (experimental) | BaseVectorDB | ChromaDB |
| collection_name | initial collection name for the database | string | embedchain_store |
| collect_metrics | collect anonymous telemetry data to improve embedchain | boolean | true |

## AddConfig

|option|description|type|default|
|---|---|---|---|
|chunker|chunker config|ChunkerConfig|Depends on the `data_type`; see [ChunkerConfig](#chunker-config)|
|loader|loader config|LoaderConfig|None|

`ChunkerConfig` is passed to `AddConfig` through the `chunker` option, like so:
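
```python
from embedchain import App
from embedchain.config import AddConfig, ChunkerConfig  # assumed import path

app = App()
chunker_config = ChunkerConfig(chunk_size=100)  # override the default chunk size
add_config = AddConfig(chunker=chunker_config)
app.add_local("text", "lorem ipsum", config=add_config)
```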

### ChunkerConfig

|option|description|type|default|
|---|---|---|---|
|chunk_size|Maximum size of chunks to return|int|Depends on `data_type`; see the table below|
|chunk_overlap|Overlap in characters between chunks|int|Depends on `data_type`; see the table below|
|length_function|Function that measures the length of a chunk|typing.Callable|Depends on `data_type`; see the table below|

Default values of the chunker config parameters for each `data_type`:

|data_type|chunk_size|chunk_overlap|length_function|
|---|---|---|---|
|docx|1000|0|len|
|text|300|0|len|
|qna_pair|300|0|len|
|web_page|500|0|len|
|pdf_file|1000|0|len|
|youtube_video|2000|0|len|
|docs_site|500|50|len|
|notion|300|0|len|
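
To make the interaction of `chunk_size`, `chunk_overlap`, and `length_function` more concrete, here is a minimal sketch in plain Python. It is not embedchain's actual chunker: `DEFAULTS` simply mirrors the table above, and `chunk_text` is a hypothetical helper that uses a naive sliding window.

```python
from typing import Callable, Dict, List, Tuple

# Defaults copied from the table above: data_type -> (chunk_size, chunk_overlap).
DEFAULTS: Dict[str, Tuple[int, int]] = {
    "docx": (1000, 0),
    "text": (300, 0),
    "qna_pair": (300, 0),
    "web_page": (500, 0),
    "pdf_file": (1000, 0),
    "youtube_video": (2000, 0),
    "docs_site": (500, 50),
    "notion": (300, 0),
}


def chunk_text(
    text: str,
    chunk_size: int,
    chunk_overlap: int = 0,
    length_function: Callable[[str], int] = len,
) -> List[str]:
    """Hypothetical sliding-window chunker, for illustration only."""
    if chunk_size <= 0 or chunk_overlap < 0:
        raise ValueError("need chunk_size > 0 and chunk_overlap >= 0")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        # Grow the window while length_function still accepts the chunk.
        end = start + 1
        while end < len(text) and length_function(text[start:end + 1]) <= chunk_size:
            end += 1
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Re-read the last chunk_overlap characters, but always move forward.
        start = max(end - chunk_overlap, start + 1)
    return chunks


size, overlap = DEFAULTS["docs_site"]
chunks = chunk_text("x" * 1200, size, overlap)
print([len(c) for c in chunks])  # [500, 500, 300]
```

With the `docs_site` defaults, each chunk after the first repeats the last 50 characters of the previous one, which is why 1200 characters yield three chunks.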

### LoaderConfig

_coming soon_

## QueryConfig

|option|description|type|default|
|---|---|---|---|
|number_documents|Absolute number of documents to pull from the database as context.|int|1|
|template|Custom template for the prompt. If history is used with query, $history has to be included as well.|Template|Template("Use the following pieces of context to answer the query at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. \$context Query: \$query Helpful Answer:")|
|model|Name of the model used.|string|depends on app type|
|temperature|Controls the randomness of the model's output. Higher values (closer to 1) make output more random, lower values make it more deterministic.|float|0|
|max_tokens|Controls how many tokens are used. Exact implementation (whether it counts prompt and/or response) depends on the model.|int|1000|
|top_p|Controls the diversity of words. Higher values (closer to 1) make word selection more diverse, lower values make words less diverse.|float|1|
|history|Include conversation history from your client or database.|any (recommendation: list[str])|None|
|stream|Controls whether the response is streamed back to the user.|bool|False|
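
As a rough illustration of how a `template` value gets filled in, the sketch below assumes that `Template` refers to Python's standard `string.Template` (the table does not state the import). The context, history, and query strings are made up, and the line breaks inside the templates are also an assumption, since the table cell flattens whitespace.

```python
from string import Template

# Default prompt from the table above (line breaks are assumed).
DEFAULT_TEMPLATE = Template(
    "Use the following pieces of context to answer the query at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n"
    "$context\n"
    "Query: $query\n"
    "Helpful Answer:"
)

# A custom template that also carries history: $history has to be present
# when history is used with a query.
CUSTOM_TEMPLATE = Template(
    "Answer using only the context below.\n"
    "Context: $context\n"
    "History: $history\n"
    "Query: $query\n"
    "Answer:"
)

# Hypothetical values standing in for retrieved documents and user input.
prompt = CUSTOM_TEMPLATE.substitute(
    context="Paris is the capital of France.",
    history="User asked about European capitals.",
    query="What is the capital of France?",
)
print(prompt)
```

`substitute` raises `KeyError` when a placeholder has no value, which makes a missing `$history` or `$context` easy to spot while experimenting with a custom template.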

## ChatConfig

All options for query and...

_coming soon_

The `history` config option is not supported, because conversation history is handled automatically.