123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111 |
- ---
- title: '🔍 search'
- ---
- `.search()` enables you to uncover the most pertinent context by performing a semantic search across your data sources based on a given query. Refer to the function signature below:
- ### Parameters
- <ParamField path="query" type="str">
- Question
- </ParamField>
- <ParamField path="num_documents" type="int" optional>
- Number of relevant documents to fetch. Defaults to `3`
- </ParamField>
- <ParamField path="where" type="dict" optional>
- Key value pair for metadata filtering.
- </ParamField>
- <ParamField path="raw_filter" type="dict" optional>
- Pass raw filter query based on your vector database.
- Currently, `raw_filter` param is only supported for Pinecone vector database.
- </ParamField>
- ### Returns
- <ResponseField name="answer" type="dict">
- Return list of dictionaries that contain the relevant chunk and their source information.
- </ResponseField>
- ## Usage
- ### Basic
- Refer to the following example on how to use the search api:
- ```python Code example
- from embedchain import App
- app = App()
- app.add("https://www.forbes.com/profile/elon-musk")
- context = app.search("What is the net worth of Elon?", num_documents=2)
- print(context)
- ```
- ### Advanced
- #### Metadata filtering using `where` params
- Here is an advanced example of `search()` API with metadata filtering on pinecone database:
- ```python
- import os
- from embedchain import App
- os.environ["PINECONE_API_KEY"] = "xxx"
- config = {
- "vectordb": {
- "provider": "pinecone",
- "config": {
- "metric": "dotproduct",
- "vector_dimension": 1536,
- "index_name": "ec-test",
- "serverless_config": {"cloud": "aws", "region": "us-west-2"},
- },
- }
- }
- app = App.from_config(config=config)
- app.add("https://www.forbes.com/profile/bill-gates", metadata={"type": "forbes", "person": "gates"})
- app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"type": "wiki", "person": "gates"})
- results = app.search("What is the net worth of Bill Gates?", where={"person": "gates"})
- print("Num of search results: ", len(results))
- ```
- #### Metadata filtering using `raw_filter` params
- Following is an example of metadata filtering by passing the raw filter query that pinecone vector database follows:
- ```python
- import os
- from embedchain import App
- os.environ["PINECONE_API_KEY"] = "xxx"
- config = {
- "vectordb": {
- "provider": "pinecone",
- "config": {
- "metric": "dotproduct",
- "vector_dimension": 1536,
- "index_name": "ec-test",
- "serverless_config": {"cloud": "aws", "region": "us-west-2"},
- },
- }
- }
- app = App.from_config(config=config)
- app.add("https://www.forbes.com/profile/bill-gates", metadata={"year": 2022, "person": "gates"})
- app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"year": 2024, "person": "gates"})
- print("Filter with person: gates and year > 2023")
- raw_filter = {"$and": [{"person": "gates"}, {"year": {"$gt": 2023}}]}
- results = app.search("What is the net worth of Bill Gates?", raw_filter=raw_filter)
- print("Num of search results: ", len(results))
- ```
|