---
title: '⚙️ Custom'
---
The `custom` data type lets you tailor the loader and chunker to your needs. To use it, pass your own loader and chunker instances to the `add` method.
```python
from embedchain import App

# `my_module` is a placeholder for the module that defines your classes.
from my_module import CustomChunker, CustomLoader

app = App()
loader = CustomLoader()
chunker = CustomChunker()

# "source" stands in for whatever input your loader knows how to read.
app.add("source", data_type="custom", loader=loader, chunker=chunker)
```
<Note>
A custom loader must be a class that inherits from [`BaseLoader`](https://github.com/embedchain/embedchain/blob/main/embedchain/loaders/base_loader.py), and a custom chunker must be a class that inherits from [`BaseChunker`](https://github.com/embedchain/embedchain/blob/main/embedchain/chunkers/base_chunker.py).
</Note>
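
The inheritance requirement can be sketched as follows. This is a minimal illustration, not the library's reference implementation: `ParagraphLoader`, `SentenceSplitter`, and `SentenceChunker` are hypothetical names, and the sketch assumes that a loader implements `load_data` and returns a dict with `doc_id` and `data` keys, and that `BaseChunker` is constructed with a text splitter exposing `split_text`. Check both assumptions against the linked base classes before relying on them. The fallback classes exist only so the snippet runs without Embedchain installed.

```python
import hashlib
import re

try:
    from embedchain.chunkers.base_chunker import BaseChunker
    from embedchain.loaders.base_loader import BaseLoader
except ImportError:
    # Stand-ins so the sketch runs without embedchain installed.
    # They mirror only the parts of the interface assumed here.
    class BaseLoader:
        def load_data(self, source, **kwargs):
            raise NotImplementedError

    class BaseChunker:
        def __init__(self, text_splitter):
            self.text_splitter = text_splitter


class ParagraphLoader(BaseLoader):
    """Hypothetical loader: treats the source string as raw text, one entry per paragraph."""

    def load_data(self, source, **kwargs):
        paragraphs = [p.strip() for p in source.split("\n\n") if p.strip()]
        data = [
            {"content": p, "meta_data": {"url": "local", "paragraph": i}}
            for i, p in enumerate(paragraphs)
        ]
        # Assumed return shape: a stable document id plus content/metadata entries.
        doc_id = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return {"doc_id": doc_id, "data": data}


class SentenceSplitter:
    """Hypothetical splitter exposing the `split_text` method the chunker is assumed to call."""

    def split_text(self, text):
        # Split after sentence-ending punctuation followed by whitespace.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class SentenceChunker(BaseChunker):
    """Hypothetical chunker that hands the base class a sentence-level splitter."""

    def __init__(self):
        super().__init__(SentenceSplitter())


loader = ParagraphLoader()
chunker = SentenceChunker()

result = loader.load_data("First paragraph. It has two sentences.\n\nSecond paragraph.")
sentences = chunker.text_splitter.split_text(result["data"][0]["content"])
```

Instances of such classes would then be passed to `add` through the `loader` and `chunker` arguments, with `data_type="custom"`.
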
<Note>
If `data_type` is not a valid data type, the `add` method falls back to the `custom` data type and expects the user to pass a custom loader and chunker.
</Note>
Example:
```python
from embedchain import App
from embedchain.loaders.github import GithubLoader

app = App()

# "ghp_xxx" is a placeholder; substitute your own GitHub token.
loader = GithubLoader(config={"token": "ghp_xxx"})

app.add("repo:embedchain/embedchain type:repo", data_type="github", loader=loader)

app.query("What is Embedchain?")
# Answer: Embedchain is a Data Platform for Large Language Models (LLMs).
# It allows users to seamlessly load, index, retrieve, and sync unstructured
# data in order to build dynamic, LLM-powered applications. There is also a
# JavaScript implementation called embedchain-js available on GitHub.
```