PDF Files
Upload PDF documents as knowledge sources.
Upload PDF documents to train your chatbot on manuals, guides, and other documents.
Uploading PDFs
- Go to Sources > Add Source
- Select File
- Click to upload or drag and drop your PDF
- Wait for processing to complete
Supported Formats
| Format | Support |
|---|---|
| PDF | Full support |
| DOCX | Coming soon |
| TXT | Coming soon |
File Requirements
- Maximum file size: 10 MB per file
- Maximum pages: 200 pages per PDF
- Text-based PDFs: Must contain selectable text
- Language: English (other languages in beta)
Scanned PDFs (images of text) are not supported. The PDF must contain actual text, not just images.
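Some of these requirements can be checked locally before uploading. The following is a minimal sketch, not part of the product: the `preflight` function and `MAX_BYTES` constant are hypothetical names, and it assumes the 10 MB limit means 10 × 1024 × 1024 bytes. It checks only the file extension, the file size, and the `%PDF-` marker that a well-formed PDF starts with. Verifying the page count or the presence of selectable text requires a PDF parsing library and is not covered here.

```python
from pathlib import Path

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024


def preflight(path):
    """Return a list of problems found before upload (empty if none were found)."""
    problems = []
    p = Path(path)

    # The upload accepts PDF files only.
    if p.suffix.lower() != ".pdf":
        problems.append("file extension is not .pdf")

    # Compare the size on disk against the per-file limit.
    size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(
            f"file is {size / (1024 * 1024):.1f} MB, over the 10 MB limit"
        )

    # A well-formed PDF begins with the marker "%PDF-".
    with p.open("rb") as fh:
        if fh.read(5) != b"%PDF-":
            problems.append("file does not start with the %PDF- marker")

    return problems
```

For example, `preflight("manual.pdf")` returns an empty list when none of these three checks fail. An empty list does not guarantee that processing will succeed, since a scanned or password-protected PDF passes all three checks.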
How PDFs Are Processed
- Text extraction - All text is extracted from the PDF
- Chunking - Content is split into meaningful sections
- Indexing - Sections are indexed for search
- Ready - Content available for chat responses
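To make the chunking and indexing steps concrete, here is a simplified illustration in Python. It is not the product's actual implementation: the functions `chunk_text`, `build_index`, and `search`, the `max_chars` parameter, and the choice of a plain keyword index are all assumptions made for the example. It splits already-extracted text on blank lines, packs paragraphs into chunks, and indexes each chunk by the words it contains.

```python
import re
from collections import defaultdict


def chunk_text(text, max_chars=500):
    """Split text on blank lines and pack paragraphs into chunks.

    Chunks aim to stay within max_chars, but a single paragraph longer
    than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def build_index(chunks):
    """Map each lowercase word to the set of chunk positions containing it."""
    index = defaultdict(set)
    for i, chunk in enumerate(chunks):
        for word in re.findall(r"[a-z0-9]+", chunk.lower()):
            index[word].add(i)
    return index


def search(index, chunks, query):
    """Return the chunks that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return [chunks[i] for i in sorted(hits)]
```

The sketch also shows why the formatting advice in this guide matters: chunk boundaries follow the document's own paragraph breaks, so well-separated sections on a single topic produce chunks that are easier to match to a question.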
Best Practices
Document Quality
For best results:
- Use PDFs with clear formatting
- Ensure headings are properly formatted
- Avoid PDFs that are mostly images
- Present charts and tables as text rather than images when possible
Document Organization
- Use descriptive file names
- One topic per PDF when possible
- Keep PDFs under 50 pages for faster processing
- Remove unnecessary pages (cover pages, blank pages)
Content Tips
- Include context and definitions
- Use complete sentences
- Avoid abbreviations without explanations
- Ensure technical terms are defined
Updating Documents
PDFs cannot be refreshed in place. To update a document:
- Delete the old source
- Upload the new version
- Wait for processing
Troubleshooting
Upload Failed
Possible causes:
- File too large (>10 MB)
- Corrupted PDF
- Unsupported PDF format
- Network timeout
Solutions:
- Compress the PDF to reduce size
- Re-export the PDF from the source application
- Split large PDFs into smaller files
- Retry the upload on a stable connection if it timed out
No Content Extracted
Possible causes:
- Scanned PDF (image-based)
- Password-protected PDF
- PDF contains only images
Solutions:
- Use OCR software to create a text-based PDF
- Remove password protection before uploading
- Export with text layer from source application
Poor Quality Responses
Possible causes:
- Poorly formatted content
- Missing context
- Undefined technical jargon
Solutions:
- Reformat the document for clarity
- Add introductory context
- Include a glossary section