Known Limitations

  1. Text extraction succeeds only if the printed document itself contains text information. If the document is an image, drawing, metafile, or similar graphic content, the printer driver cannot extract any text.

  2. In some cases, the printing application sends the text, or parts of the text, as glyphs. Glyph codes cannot be converted back to character codes, so the text file contains unreadable characters. To keep the driver from saving these glyphs to the text file, enable the Filter Junk Characters option in the .INI file.

  3. The coordinates of the beginning of the text are reported by the printing application. The coordinates of the end of the text are calculated by the driver, based on the resolution, the font, and the actual characters in the text. A small variation of 1 to 5 pixels is normal.

  4. The coordinates of the text are saved as they are received from the printing application. Some applications, such as Quicken, change the coordinate system during printing, and the part of the driver that generates the text output is not aware of this change. In these cases, the coordinates saved to the text file may not be relative to the upper left corner of the image. There is no workaround for this issue.

  5. When printing cells from Excel, the contents of the cells are not separated in the text file. Use the Add Space option in the .INI file to instruct the printer driver to add an extra space after each text output command.

  6. Words can appear cut in half. For example, the line “Black Ice test” may be extracted from a document by the PDF X1 Printer Driver and saved to the text file with a space inside a word, such as “Bla ck Ice test”. This issue is caused by the printing application, which sends the text as several separate commands. The printer driver cannot tell whether the text is “correct”; it saves the text exactly as it is received from the printing application. This is most likely to happen with applications such as Word and Notepad when, for example, one part of a word or sentence uses one font and another part uses a different font, or when a word was typed and later edited. Disabling the Add Space option resolves most of these occurrences.
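
The fragmentation described in items 3, 5, and 6 can sometimes be repaired after extraction by using the saved coordinates. The Python sketch below is not part of the printer driver and does not read the driver's text file format. It assumes the output has already been parsed into one record per text output command, with a start x coordinate, an end x coordinate, a y coordinate, and the text. The `Fragment` class, the `merge_fragments` function, and both tolerance parameters are hypothetical names chosen for illustration; the 5-pixel defaults are taken from the variation mentioned in item 3.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One text output command (hypothetical layout, not the driver's format)."""
    x_start: int  # start coordinate, as reported by the printing application
    x_end: int    # end coordinate, as calculated by the driver
    y: int
    text: str


def merge_fragments(fragments, gap_tolerance=5, line_tolerance=5):
    """Rebuild text lines, joining fragments that touch horizontally."""
    # Group fragments into rows: a fragment joins the current row when its
    # y coordinate is within line_tolerance of the row's first fragment.
    rows = []
    for frag in sorted(fragments, key=lambda f: f.y):
        if rows and abs(frag.y - rows[-1][0].y) <= line_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])

    lines = []
    for row in rows:
        row.sort(key=lambda f: f.x_start)
        text = row[0].text
        for prev, cur in zip(row, row[1:]):
            gap = cur.x_start - prev.x_end
            already_spaced = prev.text.endswith(" ") or cur.text.startswith(" ")
            if gap <= gap_tolerance or already_spaced:
                # Pieces touch (or a space is already present): join directly.
                text += cur.text
            else:
                # Visible gap: treat as separate words or cells.
                text += " " + cur.text
        lines.append(text)
    return lines
```

With the Add Space option disabled, fragments such as "Bla" and "ck " that end and start within a few pixels of each other are joined into one word, while fragments separated by a wider gap, such as neighbouring Excel cells, still receive a separating space. This approach cannot help when the printing application changes the coordinate system, as described in item 4.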

NOTE: The demo version of the printer driver will only extract data from the first page of the document. The data from subsequent pages will not be extracted to the text file.