---
title: '🔍 search'
---

`.search()` retrieves the most relevant context by running a semantic search across your data sources for a given query. Refer to the function signature below:

### Parameters

<ParamField path="query" type="str">
    The question or query text to search for.
</ParamField>
<ParamField path="num_documents" type="int" optional>
    Number of relevant documents to fetch. Defaults to `3`.
</ParamField>
<ParamField path="where" type="dict" optional>
    Key-value pairs used to filter results by metadata.
</ParamField>
<ParamField path="raw_filter" type="dict" optional>
    A raw filter query written in the native syntax of your vector database.
    Currently, the `raw_filter` param is supported only for the Pinecone vector database.
</ParamField>

### Returns

<ResponseField name="answer" type="dict">
    Returns a list of dictionaries, each containing a relevant chunk and its source information.
</ResponseField>

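To make the return value more concrete, here is a minimal sketch of walking over a result list. It assumes each dictionary holds the chunk text under a `context` key and the source details under a `metadata` key with a `url` entry. Those key names, the `summarize` helper, and the sample data are assumptions made for illustration, so inspect the actual output of `.search()` in your installed version before relying on them.

```python
# Hypothetical sample shaped like a `.search()` result. The key names
# ("context", "metadata", "url") are assumptions, not confirmed by this page.
sample_results = [
    {"context": "First chunk of text returned by the search.", "metadata": {"url": "https://example.com/a"}},
    {"context": "Second chunk of text returned by the search.", "metadata": {"url": "https://example.com/b"}},
]


def summarize(results, max_chars=40):
    """Return one 'source: snippet' line per result (hypothetical helper)."""
    lines = []
    for item in results:
        # Fall back to placeholders when a key is missing
        source = item.get("metadata", {}).get("url", "unknown source")
        snippet = item.get("context", "")[:max_chars]
        lines.append(f"{source}: {snippet}")
    return lines


for line in summarize(sample_results):
    print(line)
```

The `.get()` fallbacks keep the loop from failing if a result lacks one of the assumed keys.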
## Usage

### Basic

The following example shows how to use the search API:

```python Code example
from embedchain import App

app = App()
app.add("https://www.forbes.com/profile/elon-musk")

# Fetch the two most relevant chunks for the question
context = app.search("What is the net worth of Elon?", num_documents=2)
print(context)
```

### Advanced

#### Metadata filtering using `where` params

The following advanced example uses the `search()` API with metadata filtering on a Pinecone database:

```python
import os

from embedchain import App

os.environ["PINECONE_API_KEY"] = "xxx"  # replace with your Pinecone API key

config = {
    "vectordb": {
        "provider": "pinecone",
        "config": {
            "metric": "dotproduct",
            "vector_dimension": 1536,
            "index_name": "ec-test",
            "serverless_config": {"cloud": "aws", "region": "us-west-2"},
        },
    }
}

app = App.from_config(config=config)

# Attach metadata to each source so it can be filtered on later
app.add("https://www.forbes.com/profile/bill-gates", metadata={"type": "forbes", "person": "gates"})
app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"type": "wiki", "person": "gates"})

# Only chunks whose metadata has person == "gates" are considered
results = app.search("What is the net worth of Bill Gates?", where={"person": "gates"})
print("Num of search results: ", len(results))
```

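Conceptually, a `where` filter narrows the candidates to chunks whose metadata matches every key-value pair you pass. The sketch below mimics that behavior in plain Python on hypothetical in-memory records. It assumes exact-match semantics, the `matches_where` and `filter_records` helpers are made up for illustration, and it is not how Embedchain or Pinecone implement filtering internally.

```python
def matches_where(metadata, where):
    """True if every key in `where` is present in `metadata` with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in where.items())


def filter_records(records, where):
    """Return the ids of records whose metadata satisfies `where`."""
    return [r["id"] for r in records if matches_where(r["metadata"], where)]


# Hypothetical records carrying the same metadata as the example above
records = [
    {"id": "forbes-page", "metadata": {"type": "forbes", "person": "gates"}},
    {"id": "wiki-page", "metadata": {"type": "wiki", "person": "gates"}},
]

print(filter_records(records, {"person": "gates"}))
print(filter_records(records, {"person": "gates", "type": "wiki"}))
```

In this sketch, adding a second key to `where` can only shrink the matched set, because all pairs must match at once.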
#### Metadata filtering using `raw_filter` params

The following example filters on metadata by passing a raw filter query in the syntax used by the Pinecone vector database:

```python
import os

from embedchain import App

os.environ["PINECONE_API_KEY"] = "xxx"  # replace with your Pinecone API key

config = {
    "vectordb": {
        "provider": "pinecone",
        "config": {
            "metric": "dotproduct",
            "vector_dimension": 1536,
            "index_name": "ec-test",
            "serverless_config": {"cloud": "aws", "region": "us-west-2"},
        },
    }
}

app = App.from_config(config=config)

# Numeric metadata such as `year` can be compared with range operators
app.add("https://www.forbes.com/profile/bill-gates", metadata={"year": 2022, "person": "gates"})
app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"year": 2024, "person": "gates"})

print("Filter with person: gates and year > 2023")
# Both conditions inside `$and` must hold for a chunk to match
raw_filter = {"$and": [{"person": "gates"}, {"year": {"$gt": 2023}}]}
results = app.search("What is the net worth of Bill Gates?", raw_filter=raw_filter)
print("Num of search results: ", len(results))
```
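To show how a filter such as `{"$and": [...]}` selects records, here is a rough sketch that evaluates a small subset of Pinecone-style operators (`$and`, `$or`, and the six comparison operators) against a single metadata dictionary. The `matches` function is a hypothetical helper written for illustration. Treating a missing field as a non-match is a simplifying assumption, and the real evaluation happens inside Pinecone rather than in your Python process.

```python
import operator

# Comparison operators covered by this sketch
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata, raw_filter):
    """Evaluate a subset of a Pinecone-style filter against one metadata dict."""
    for key, condition in raw_filter.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Field with operators, e.g. {"year": {"$gt": 2023}}
            if key not in metadata:
                return False  # assumption: a missing field never matches
            for op, expected in condition.items():
                if not COMPARATORS[op](metadata[key], expected):
                    return False
        else:
            # A bare value is shorthand for equality, e.g. {"person": "gates"}
            if key not in metadata or metadata[key] != condition:
                return False
    return True


raw_filter = {"$and": [{"person": "gates"}, {"year": {"$gt": 2023}}]}
print(matches({"year": 2022, "person": "gates"}, raw_filter))
print(matches({"year": 2024, "person": "gates"}, raw_filter))
```

With the metadata from the example above, only the record added with `year` set to `2024` satisfies both conditions.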