---
title: '🔍 Semantic Search'
---
Semantic search, which interprets the intent and contextual meaning behind a query instead of matching keywords alone, is another popular use case of RAG. It applies across many domains:

- **Information Retrieval**: Enhances search accuracy in databases and websites
- **E-commerce**: Improves product discovery in online shopping
- **Customer Support**: Powers smarter chatbots for effective responses
- **Content Discovery**: Aids in finding relevant media content
- **Knowledge Management**: Streamlines document and data retrieval in enterprises
- **Healthcare**: Facilitates medical research and literature search
- **Legal Research**: Assists in legal document and case law search
- **Academic Research**: Aids in academic paper discovery
- **Language Processing**: Enables multilingual search capabilities

Embedchain offers a simple yet customizable `search()` API for semantic search. The example in the next section shows how to use it.
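
To make the idea of matching by meaning concrete, here is a minimal, library-free sketch of the ranking step behind most semantic search systems: documents and the query are represented as vectors, and results are ordered by cosine similarity. The three-dimensional vectors and the helper names (`cosine_similarity`, `rank_by_similarity`) are invented for illustration; this is not Embedchain's internal implementation, which relies on a real embedding model and a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return the names of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hand-made toy "embeddings"; a real pipeline would get these from an embedding model.
docs = {
    "release notes": [0.9, 0.1, 0.0],
    "upgrade guide": [0.7, 0.3, 0.1],
    "forum thread on CSS": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranked = rank_by_similarity(query, docs, top_k=2)
print(ranked)
```

The query vector points in nearly the same direction as the first two documents, so they rank above the unrelated forum thread even though no keywords are compared.
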
## Example: Semantic Search over Next.js Website + Forum

### Step 1: Set Up Your RAG Pipeline

First, create your RAG pipeline. Open your Python environment and enter:

```python Create pipeline
from embedchain import App

app = App()
```

This initializes your application.
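
Before creating the app, it can help to confirm that the credentials your configuration needs are available. The sketch below assumes a default setup that reads an OpenAI key from the `OPENAI_API_KEY` environment variable; that variable name and the helper `missing_env_vars` are assumptions for illustration, so adjust the list to match the providers you actually configure.

```python
import os


def missing_env_vars(required=("OPENAI_API_KEY",), env=None):
    """Return the names of required environment variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    # Assumption: the default configuration needs these; change the tuple for other providers.
    print("Set these environment variables before creating the App:", ", ".join(missing))
```
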
### Step 2: Populate Your Pipeline with Data

Now, add data to your pipeline. We'll include the Next.js website, its documentation, and the community forum:

```python Ingest data sources
# Add the Next.js website and docs
app.add("https://nextjs.org/sitemap.xml", data_type="sitemap")

# Add Next.js forum data
app.add("https://nextjs-forum.com/sitemap.xml", data_type="sitemap")
```

This step incorporates over **15K pages** from the Next.js website and forum into your pipeline. For more data source options, check the [Embedchain data sources overview](/components/data-sources/overview).
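
When you ingest several sitemaps, one unreachable URL should not stop the rest. Here is a minimal sketch of a wrapper around the `app.add(..., data_type="sitemap")` call shown above. The helper `add_sitemaps` and the `_RecordingApp` stand-in are hypothetical names introduced for this example; the stand-in exists only so the sketch runs without network access, and with a real pipeline you would pass the `app` from Step 1 instead.

```python
def add_sitemaps(app, sitemap_urls):
    """Add each sitemap to the pipeline and return (added, failed) lists."""
    added, failed = [], []
    for url in sitemap_urls:
        try:
            app.add(url, data_type="sitemap")
            added.append(url)
        except Exception as exc:  # deliberately broad: record the failure and keep going
            failed.append((url, str(exc)))
    return added, failed


class _RecordingApp:
    """Offline stand-in that mimics only the add() call used above."""

    def __init__(self):
        self.calls = []

    def add(self, source, data_type=None):
        if "unreachable" in source:
            raise ConnectionError("could not fetch sitemap")
        self.calls.append((source, data_type))


demo_app = _RecordingApp()
added, failed = add_sitemaps(
    demo_app,
    ["https://nextjs.org/sitemap.xml", "https://unreachable.example/sitemap.xml"],
)
print("added:", added)
print("failed:", failed)
```
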
### Step 3: Local Testing of Your Pipeline

Test the pipeline on your local machine:
```python Search App
app.search("Summarize the features of Next.js 14?")
[
    {
        'context': 'Next.js 14 | Next.jsBack to BlogThursday, October 26th 2023Next.js 14Posted byLee Robinson@leeerobTim Neutkens@timneutkensAs we announced at Next.js Conf, Next.js 14 is our most focused release with: Turbopack: 5,000 tests passing for App & Pages Router 53% faster local server startup 94% faster code updates with Fast Refresh Server Actions (Stable): Progressively enhanced mutations Integrated with caching & revalidating Simple function calls, or works natively with forms Partial Prerendering',
        'metadata': {
            'source': 'https://nextjs.org/blog/next-14',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    },
    {
        'context': 'Next.js 13.3 | Next.jsBack to BlogThursday, April 6th 2023Next.js 13.3Posted byDelba de Oliveira@delba_oliveiraTim Neutkens@timneutkensNext.js 13.3 adds popular community-requested features, including: File-Based Metadata API: Dynamically generate sitemaps, robots, favicons, and more. Dynamic Open Graph Images: Generate OG images using JSX, HTML, and CSS. Static Export for App Router: Static / Single-Page Application (SPA) support for Server Components. Parallel Routes and Interception: Advanced',
        'metadata': {
            'source': 'https://nextjs.org/blog/next-13-3',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    },
    {
        'context': 'Upgrading: Version 14 | Next.js MenuUsing App RouterFeatures available in /appApp Router.UpgradingVersion 14Version 14 Upgrading from 13 to 14 To update to Next.js version 14, run the following command using your preferred package manager: Terminalnpm i next@latest react@latest react-dom@latest eslint-config-next@latest Terminalyarn add next@latest react@latest react-dom@latest eslint-config-next@latest Terminalpnpm up next react react-dom eslint-config-next -latest Terminalbun add next@latest',
        'metadata': {
            'source': 'https://nextjs.org/docs/app/building-your-application/upgrading/version-14',
            'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
        }
    }
]
```
Each result pairs a `context` (the matching text chunk) with its `metadata`. The `source` key in the metadata holds the URL of the page that the chunk came from.

To configure the search further, refer to our [API documentation](/api-reference/pipeline/search).
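
Since `search()` returns a plain list of dictionaries, a little post-processing goes a long way when showing results to users. The sketch below assumes the result shape shown in the output above (`context` plus `metadata.source`). The helpers `unique_sources` and `preview` are hypothetical names, and `sample_results` abbreviates the contexts from that output so the example runs without a pipeline.

```python
def unique_sources(results):
    """Return source URLs in ranked order, skipping duplicates and missing values."""
    seen = set()
    ordered = []
    for item in results:
        source = item.get("metadata", {}).get("source")
        if source and source not in seen:
            seen.add(source)
            ordered.append(source)
    return ordered


def preview(results, width=80):
    """Format results as a numbered list: source URL plus the first `width` characters."""
    lines = []
    for rank, item in enumerate(results, start=1):
        snippet = item.get("context", "")[:width]
        source = item.get("metadata", {}).get("source", "unknown")
        lines.append(f"{rank}. {source}\n   {snippet}")
    return "\n".join(lines)


# Abbreviated copy of the output above; with a live pipeline, use app.search(...) instead.
sample_results = [
    {"context": "Next.js 14 | Next.js", "metadata": {"source": "https://nextjs.org/blog/next-14"}},
    {"context": "Next.js 13.3 | Next.js", "metadata": {"source": "https://nextjs.org/blog/next-13-3"}},
    {
        "context": "Upgrading: Version 14 | Next.js",
        "metadata": {"source": "https://nextjs.org/docs/app/building-your-application/upgrading/version-14"},
    },
]

print(preview(sample_results))
print(unique_sources(sample_results))
```

Deduplicating by `source` is useful because several chunks of the same page can appear in one result list.
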
### (Optional) Step 4: Deploying Your RAG Pipeline

Want to go live? Deploy your pipeline with these options:

- Deploy on the Embedchain Platform
- Self-host on your preferred cloud provider

For detailed deployment instructions, follow these guides:

- [Deploying on Embedchain Platform](/get-started/deployment#deploy-on-embedchain-platform)
- [Self-hosting Guide](/get-started/deployment#self-hosting)

----

With these steps, you can quickly set up a semantic search pipeline with Embedchain, making it easier to find and analyze specific information in large data sources.

## Need help?

If you run into issues, feel free to contact us through any of the following channels:

<Snippet file="get-help.mdx" />