Semantic search, which interprets the intent and contextual meaning behind a query rather than matching keywords alone, is another popular use case of RAG. It is applied across many domains:

- **Information Retrieval**: Enhances search accuracy in databases and websites
- **E-commerce**: Improves product discovery in online shopping
- **Customer Support**: Powers smarter chatbots for effective responses
- **Content Discovery**: Aids in finding relevant media content
- **Knowledge Management**: Streamlines document and data retrieval in enterprises
- **Healthcare**: Facilitates medical research and literature search
- **Legal Research**: Assists in legal document and case law search
- **Academic Research**: Aids in academic paper discovery
- **Language Processing**: Enables multilingual search capabilities

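
To make the idea of matching on meaning concrete, here is a minimal sketch of the ranking step behind semantic search: texts are represented as vectors, and candidates are ordered by their cosine similarity to the query vector. The titles and three-dimensional vectors below are made up for illustration. Real embeddings come from a model and have far more dimensions, and this is not a description of Embedchain's internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for this example only.
documents = {
    "How to upgrade to Next.js 14": [0.9, 0.1, 0.2],
    "Chocolate cake recipe": [0.1, 0.9, 0.1],
    "Next.js release notes": [0.8, 0.2, 0.3],
}

# Hypothetical embedding of a query such as "what is new in Next.js?"
query_vector = [0.85, 0.15, 0.25]

# Rank document titles from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)

for title in ranked:
    print(title)
```

The two Next.js entries rank above the recipe even though the query shares no exact phrase with them, which is the behavior keyword matching alone would not give you.
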

Embedchain offers a simple yet customizable `search()` API for semantic search. The example in the next section shows how to use it.

## Example: Semantic Search over Next.js Website + Forum

### Step 1: Set Up Your RAG Pipeline

First, let's create your RAG pipeline. Open your Python environment and enter:

```python Create pipeline
from embedchain import Pipeline as App

# Create a pipeline with Embedchain's default configuration
app = App()
```

This initializes your application.

### Step 2: Populate Your Pipeline with Data

Now, let's add data to your pipeline. We'll include the Next.js website and documentation, along with the Next.js forum:

```python Ingest data sources
# Add Next.js website and docs
app.add("https://nextjs.org/sitemap.xml", data_type="sitemap")

# Add Next.js forum data
app.add("https://nextjs-forum.com/sitemap.xml", data_type="sitemap")
```

This step incorporates over **15K pages** from the Next.js website and forum into your pipeline. For more data source options, check the [Embedchain data sources overview](/components/data-sources/overview).

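
With this many pages, one unreachable sitemap should not abort the whole ingestion. The sketch below wraps the `app.add(..., data_type="sitemap")` call from the step above in a loop that records failures instead of stopping. The helper name `add_sitemaps` is hypothetical and not part of Embedchain, and which exceptions `add()` raises is an assumption, so the sketch catches `Exception` broadly.

```python
def add_sitemaps(app, sitemap_urls):
    """Add each sitemap to the pipeline and report which ones failed.

    `app` is any object exposing add(source, data_type=...), such as the
    pipeline created in Step 1. This helper is illustrative, not an
    Embedchain API.
    """
    succeeded = []
    failed = []
    for url in sitemap_urls:
        try:
            app.add(url, data_type="sitemap")
            succeeded.append(url)
        except Exception as exc:  # assumption: failure modes are not documented here
            failed.append((url, str(exc)))
    return succeeded, failed
```

You would call it as `add_sitemaps(app, ["https://nextjs.org/sitemap.xml", "https://nextjs-forum.com/sitemap.xml"])` and inspect the second return value to retry anything that failed.
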
### Step 3: Local Testing of Your Pipeline

Test the pipeline on your local machine:

```python Search App
app.search("Summarize the features of Next.js 14?")
# Returns a list of matching document chunks, for example:
[
    {
        'context': 'Next.js 14 | Next.jsBack to BlogThursday, October 26th 2023Next.js 14Posted byLee Robinson@leeerobTim Neutkens@timneutkensAs we announced at Next.js Conf, Next.js 14 is our most focused release with: Turbopack: 5,000 tests passing for App & Pages Router 53% faster local server startup 94% faster code updates with Fast Refresh Server Actions (Stable): Progressively enhanced mutations Integrated with caching & revalidating Simple function calls, or works natively with forms Partial Prerendering',
        'metadata': {
            'source': 'https://nextjs.org/blog/next-14',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    },
    {
        'context': 'Next.js 13.3 | Next.jsBack to BlogThursday, April 6th 2023Next.js 13.3Posted byDelba de Oliveira@delba_oliveiraTim Neutkens@timneutkensNext.js 13.3 adds popular community-requested features, including: File-Based Metadata API: Dynamically generate sitemaps, robots, favicons, and more. Dynamic Open Graph Images: Generate OG images using JSX, HTML, and CSS. Static Export for App Router: Static / Single-Page Application (SPA) support for Server Components. Parallel Routes and Interception: Advanced',
        'metadata': {
            'source': 'https://nextjs.org/blog/next-13-3',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    },
    {
        'context': 'Upgrading: Version 14 | Next.js MenuUsing App RouterFeatures available in /appApp Router.UpgradingVersion 14Version 14 Upgrading from 13 to 14 To update to Next.js version 14, run the following command using your preferred package manager: Terminalnpm i next@latest react@latest react-dom@latest eslint-config-next@latest Terminalyarn add next@latest react@latest react-dom@latest eslint-config-next@latest Terminalpnpm up next react react-dom eslint-config-next -latest Terminalbun add next@latest',
        'metadata': {
            'source': 'https://nextjs.org/docs/app/building-your-application/upgrading/version-14',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    }
]
```

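Each result is a dictionary with a `context` string (the matched chunk) and a `metadata` dictionary. The `source` key in `metadata` contains the URL of the document that the chunk came from.
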
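
Because several chunks can come from the same page, a common next step is to reduce the results to a list of distinct source URLs, for example to render "read more" links. The sketch below assumes only the result shape shown above (a list of dictionaries with `context` and `metadata.source`). The helper name `unique_sources` and the abbreviated sample data are illustrative, not part of Embedchain.

```python
def unique_sources(results):
    """Return source URLs in first-seen order, skipping duplicates and
    entries that have no `metadata.source`."""
    seen = set()
    sources = []
    for item in results:
        source = item.get("metadata", {}).get("source")
        if source and source not in seen:
            seen.add(source)
            sources.append(source)
    return sources


# Abbreviated stand-in for the output of app.search(...); contexts are shortened.
sample_results = [
    {"context": "Next.js 14 overview chunk",
     "metadata": {"source": "https://nextjs.org/blog/next-14"}},
    {"context": "Another chunk from the same post",
     "metadata": {"source": "https://nextjs.org/blog/next-14"}},
    {"context": "Next.js 13.3 chunk",
     "metadata": {"source": "https://nextjs.org/blog/next-13-3"}},
    {"context": "Chunk without metadata"},
]

for url in unique_sources(sample_results):
    print(url)
```

In your own code you would pass the return value of `app.search(...)` in place of `sample_results`.
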

To configure the search further, refer to our [API documentation](/api-reference/pipeline/search).

### (Optional) Step 4: Deploying Your RAG Pipeline

Want to go live? Deploy your pipeline with these options:

- Deploy on the Embedchain Platform
- Self-host on your preferred cloud provider

For detailed deployment instructions, follow these guides:

- [Deploying on Embedchain Platform](/get-started/deployment#deploy-on-embedchain-platform)
- [Self-hosting Guide](/get-started/deployment#self-hosting)

----

This guide helps you quickly set up a semantic search pipeline with Embedchain, making it easier to find and analyze specific information in large data sources.

## Need help?

If you run into issues, feel free to contact us via any of the following methods:

<Snippet file="get-help.mdx" />