Update documentation based on user feedback (#1141)

Deshraj Yadav 1 year ago
parent
commit
5fa6221f91

+ 16 - 7
docs/components/data-sources/csv.mdx

@@ -2,18 +2,27 @@
 title: '📊 CSV'
 ---
 
-To add any csv file, use the data_type as `csv`. `csv` allows remote urls and conventional file paths. Headers are included for each line, so if you have an `age` column, `18` will be added as `age: 18`. Eg:
+You can load any csv file from your local file system or through a URL. Headers are included for each line, so if you have an `age` column, `18` will be added as `age: 18`.
+
+## Usage
+
+### Load from a local file
 
 ```python
 from embedchain import App
+app = App()
+app.add('/path/to/file.csv', data_type='csv')
+```
+
+### Load from URL
 
+```python
+from embedchain import App
 app = App()
 app.add('https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv', data_type="csv")
-# Or add using the local file path
-# app.add('/path/to/file.csv', data_type="csv")
-
-app.query("Summarize the air travel data")
-# Answer: The air travel data shows the number of flights for the months of July in the years 1958, 1959, and 1960. In July 1958, there were 491 flights, in July 1959 there were 548 flights, and in July 1960 there were 622 flights.
 ```
 
-Note: There is a size limit allowed for csv file beyond which it can throw error. This limit is set by the LLMs. Please consider chunking large csv files into smaller csv files.
+<Note>
+There is a size limit allowed for csv file beyond which it can throw error. This limit is set by the LLMs. Please consider chunking large csv files into smaller csv files.
+</Note>
+

+ 25 - 22
docs/components/data-sources/pdf-file.mdx

@@ -1,14 +1,31 @@
 ---
-title: '📰 PDF file'
+title: '📰 PDF'
 ---
 
-To add any pdf file, use the data_type as `pdf_file`. Eg:
+You can load any pdf file from your local file system or through a URL.
+
+## Setup
+Install the following packages for loading youtube videos which help in transcription.
+
+```bash
+pip install pytube youtube-transcript-api
+```
+
+## Usage
+
+### Load from a local file
 
 ```python
 from embedchain import App
-
 app = App()
+app.add('/path/to/file.pdf', data_type='pdf_file')
+```
+
+### Load from URL
 
+```python
+from embedchain import App
+app = App()
 app.add('https://arxiv.org/pdf/1706.03762.pdf', data_type='pdf_file')
 app.query("What is the paper 'attention is all you need' about?", citations=True)
 # Answer: The paper "Attention Is All You Need" proposes a new network architecture called the Transformer, which is based solely on attention mechanisms. It suggests that complex recurrent or convolutional neural networks can be replaced with a simpler architecture that connects the encoder and decoder through attention. The paper discusses how this approach can improve sequence transduction models, such as neural machine translation.
@@ -23,25 +40,11 @@ app.query("What is the paper 'attention is all you need' about?", citations=True
 #             ...
 #         }
 #     ),
-#     (
-#         'Attention Visualizations Input ...',
-#         {
-#             'page': 12,
-#             'url': 'https://arxiv.org/pdf/1706.03762.pdf',
-#             'score': 0.41679039679873736,
-#             ...
-#         }
-#     ),
-#     (
-#         'sequence learning ...',
-#         {
-#             'page': 10,
-#             'url': 'https://arxiv.org/pdf/1706.03762.pdf',
-#             'score': 0.4188303600897153,
-#             ...
-#         }
-#     )
 # ]
 ```
 
-Note that we do not support password protected pdfs.
+We also store the page number under the key `page` with each chunk that helps understand where the answer is coming from. You can fetch the `page` key while during retrieval (refer to the example given above).
+
+<Note>
+Note that we do not support password protected pdf files.
+</Note>

+ 1 - 1
docs/get-started/quickstart.mdx

@@ -5,7 +5,7 @@ description: '💡 Create a RAG app on your own data in a minute'
 
 ## Installation
 
-First install the python package.
+First install the Python package:
 
 ```bash
 pip install embedchain