Configure extractTextOption in JavaScript PDF Viewer
14 Jan 20262 minutes to read
The extractTextOption property controls how text is extracted and optimizes memory usage. It also affects the data returned in the extractTextCompleted event. Choose one of the following options to determine the text and layout data to retrieve.
Available Options:
None: No text information is extracted or returned. This is useful when you want to optimize memory usage and don’t need any text data.
TextOnly: Extracts only the plain text from the document. This option excludes any layout or positional information.
BoundsOnly: Extracts layout information, such as bounds or coordinates, without including the plain text data.
TextAndBounds: Extracts both the plain text and the layout (bounds) information, which is the default behavior.
The following example demonstrates how to configure the extractTextOption property to control the level of text extraction:
var pdfviewer = new ej.pdfviewer.PdfViewer({
documentPath: 'https://cdn.syncfusion.com/content/pdf/pdf-succinctly.pdf',
resourceUrl: "https://cdn.syncfusion.com/ej2/31.2.2/dist/ej2-pdfviewer-lib",
});
ej.pdfviewer.PdfViewer.Inject(ej.pdfviewer.TextSelection, ej.pdfviewer.TextSearch, ej.pdfviewer.Print, ej.pdfviewer.Navigation, ej.pdfviewer.Toolbar,
ej.pdfviewer.Magnification, ej.pdfviewer.Annotation, ej.pdfviewer.FormDesigner, ej.pdfviewer.FormFields, ej.pdfviewer.PageOrganizer);
pdfviewer.extractTextOption = 'None'; // Options: 'None', 'TextOnly', 'BoundsOnly', 'TextAndBounds'
pdfviewer.appendTo('#PdfViewer');Description of Each Option
extractTextOption.TextAndBounds (default): This option returns both plain text and its positional data (bounds). Use this option when you need to access both the content of the PDF and its layout for further processing or analysis.
extractTextOption.TextOnly: This option returns only the plain text from the PDF. No positional or layout data is included. Note that when using this option, text search functionality will be disabled. In such cases, it is recommended to use findTextAsync for text searching.
extractTextOption.BoundsOnly: This option returns only the layout information (bounds) of the text, excluding the actual content. It is useful when the focus is on the position of text elements rather than the text itself.
extractTextOption.None: This option does not extract or return any text or layout information. It is used to optimize memory usage when no text extraction is necessary. This setting is only relevant for the extractTextCompleted event and cannot be used with the ExtractText method.
NOTE
Text search: When using the
extractTextOption.TextOnlyorextractTextOption.Noneoptions, thefindTextmethod is unavailable. Use thefindTextAsyncmethod to perform text searches asynchronously.
See Also
Find Text
Text Search Events
Text Search Features
Extract Text
Extract Text Completed