Text Extraction with the Basic Parser

Text Extraction with the Basic Parser is a Cognigy solution that extracts text content from files and splits it into chunks of a fixed token length. Based on our research, splitting by a fixed token length returns the best results.
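To illustrate what splitting by a fixed token length means, here is a minimal sketch. It is not the Basic Parser's implementation: the function name chunk_by_tokens, the chunk size, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for this example.

```python
def chunk_by_tokens(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most chunk_size tokens.

    Assumption: tokens are approximated by whitespace-separated words.
    The tokenizer and chunk size used by the Basic Parser are not
    documented here and may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    # Step through the token list in strides of chunk_size.
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]


sample = "one two three four five six seven"
print(chunk_by_tokens(sample, chunk_size=3))
# prints ['one two three', 'four five six', 'seven']
```

Every chunk except possibly the last has the same token count, which keeps chunk sizes predictable regardless of how the source file is laid out.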

We recommend using this parser in combination with Top K set to 5 in the Search Extract Output Node.
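As a rough illustration of what a Top K value of 5 implies, the sketch below keeps the five highest-scoring chunks from a list of candidates. The function top_k_chunks, the (chunk, score) pair format, and the sample scores are hypothetical; the Search Extract Output Node performs its own retrieval and this is not its API.

```python
import heapq


def top_k_chunks(scored_chunks, k=5):
    """Return the k (chunk, score) pairs with the highest scores.

    Assumption: each candidate is a (chunk_text, relevance_score) pair
    and a higher score means a more relevant chunk.
    """
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[1])


# Hypothetical candidates with made-up relevance scores.
candidates = [
    ("chunk A", 0.91), ("chunk B", 0.15), ("chunk C", 0.78),
    ("chunk D", 0.42), ("chunk E", 0.66), ("chunk F", 0.05),
    ("chunk G", 0.83),
]

for chunk, score in top_k_chunks(candidates, k=5):
    print(chunk, score)
```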

Supported Formats

Cognigy.AI versions    Formats
4.79 or later          .pdf, .docx, .pptx, and .txt
4.78 and earlier       .pdf, .docx, and .txt
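If you want to check files before uploading them, the table above can be expressed as a small helper. This is a sketch only: the names supported_formats and is_supported are hypothetical, and the version string is assumed to have the form major.minor (for example, 4.79).

```python
from pathlib import Path

# Formats listed in the table for every Cognigy.AI version.
BASE_FORMATS = {".pdf", ".docx", ".txt"}
# According to the table, .pptx is supported in 4.79 or later.
PPTX_MIN_VERSION = (4, 79)


def parse_version(version: str) -> tuple[int, int]:
    """Parse 'major.minor' (extra parts are ignored) into a tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supported_formats(version: str) -> set[str]:
    """Return the file extensions the table lists for this version."""
    formats = set(BASE_FORMATS)
    if parse_version(version) >= PPTX_MIN_VERSION:
        formats.add(".pptx")
    return formats


def is_supported(filename: str, version: str) -> bool:
    """Check a file name's extension, ignoring case."""
    return Path(filename).suffix.lower() in supported_formats(version)


print(is_supported("slides.pptx", "4.79"))  # prints True
print(is_supported("slides.pptx", "4.78"))  # prints False
```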

Availability

  • Available in all environments.

How to Use

By default, the Advanced Parser is used for text extraction. To use the Basic Parser, follow these steps:

  1. Switch to the Basic Parser
  2. Upload a File for Text Extraction

Switch to the Basic Parser

To switch to the Basic Parser, follow these steps:

  1. In your Project, navigate to Manage > Settings.
  2. On the Settings page, go to Knowledge AI Settings > Document Processing.
  3. From the Content Parser list, select Basic.
  4. Click Save.

Upload a File for Text Extraction

To upload a file for text extraction with the Basic Parser, follow these steps:

  1. In your Project, navigate to Build > Knowledge.
  2. Open the existing Knowledge Store or create a new one.
  3. On the Knowledge Store page, click + New Knowledge Sources in the upper-left corner.
  4. In the New Knowledge Sources window, select File (basic). The label basic means that the Basic Parser will be used for text extraction.
  5. Drag and drop a .pdf, .docx, .txt, or .pptx file, or click Browse Files to select a file from your computer.
  6. Click Create.

More Information