Restrictions
- The web page must be hosted on a publicly accessible website.
- When you create a Knowledge Source from a web page, Knowledge AI processes all visible text on the web page, including potentially unwanted text, such as cookie notices.
- Knowledge AI doesn’t process content on web pages that use anti-crawling measures.
- Knowledge AI doesn’t support images or Optical Character Recognition (OCR) capabilities.
Chunking Process
When processing a web page, Knowledge AI:
- Visits the URL as a page in a browser session.
- Scrolls to the bottom of the web page.
- Accesses lazy-loaded¹ content by checking for any text changes until the web page is stable and no longer loads additional text.
- Generates Knowledge Source content based on the visible text result.
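The "scroll, then check until the text is stable" steps above can be illustrated with a minimal Python sketch. This is not Knowledge AI's actual implementation. The function `collect_stable_text`, its parameters `stable_checks` and `max_rounds`, and the `FakeLazyPage` class are all hypothetical names introduced for illustration. The fake page stands in for a real browser session and assumes new content appears synchronously on each scroll.

```python
from typing import Callable, List


def collect_stable_text(
    get_visible_text: Callable[[], str],
    scroll_to_bottom: Callable[[], None],
    stable_checks: int = 2,   # assumed: unchanged reads needed to call the page stable
    max_rounds: int = 50,     # assumed: safety cap for pages that never stop loading
) -> str:
    """Scroll repeatedly until the visible text stops changing, then return it."""
    last_text = get_visible_text()
    unchanged = 0
    for _ in range(max_rounds):
        scroll_to_bottom()  # may trigger lazy-loaded content
        current = get_visible_text()
        if current == last_text:
            unchanged += 1
            if unchanged >= stable_checks:
                break  # no new text for several checks: treat the page as stable
        else:
            unchanged = 0
            last_text = current
    return last_text


class FakeLazyPage:
    """Hypothetical stand-in for a browser page: each scroll reveals one more block."""

    def __init__(self, blocks: List[str]) -> None:
        self._blocks = blocks
        self._loaded = 1  # only the first block is visible initially

    def scroll_to_bottom(self) -> None:
        self._loaded = min(self._loaded + 1, len(self._blocks))

    def visible_text(self) -> str:
        return "\n".join(self._blocks[: self._loaded])


page = FakeLazyPage(["Intro", "Section A", "Section B"])
text = collect_stable_text(page.visible_text, page.scroll_to_bottom)
print(text)
```

In a real browser session, the two callables would wrap the automation tool's scroll and text-extraction calls, and a short wait between checks would likely be needed so that asynchronously loaded content has time to appear.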
More Information
1: Lazy loading is a web development technique that delays loading non-critical or non-visible content until it is necessary. This technique improves web page loading times and user experience.