About

This tool supports academic work by searching the text of multiple PDF files simultaneously, which makes it easier to find potential quotations in PDF references.

It supports text-based searches, including regular expressions, and returns a list of text extracts surrounding each match found in each PDF file.
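As a rough illustration of how extracts around matches could be produced, here is a minimal sketch in plain PHP. It is not taken from the tool's source: the function name `findExtracts`, the `$context` parameter, and the sample text are all hypothetical, and the real implementation may differ.

```php
<?php
// Hypothetical helper, not the tool's actual code: returns a snippet of
// text around every match of a regular expression.
function findExtracts(string $text, string $pattern, int $context = 40): array
{
    $extracts = [];

    // PREG_OFFSET_CAPTURE makes each match an array of [matched string, byte offset].
    // preg_match_all() returns 0 when nothing matches and false on an invalid pattern.
    if (!preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        return $extracts;
    }

    foreach ($matches[0] as [$match, $offset]) {
        // Keep up to $context bytes on each side of the match.
        // Offsets are in bytes, so multibyte characters may be cut at the edges.
        $start = max(0, $offset - $context);
        $length = ($offset - $start) + strlen($match) + $context;
        $extracts[] = substr($text, $start, $length);
    }

    return $extracts;
}

// A plain search term can be turned into a pattern with preg_quote().
$text = 'The quick brown fox jumps over the lazy dog.';
$pattern = '/' . preg_quote('fox', '/') . '/i';

foreach (findExtracts($text, $pattern, 6) as $extract) {
    echo $extract, PHP_EOL; // prints "brown fox jumps"
}
```

Escaping plain terms with `preg_quote()` while passing user-written regular expressions through unchanged is one way to support both kinds of search with the same function.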

It also offers the option of downloading the plain text extracted from the uploaded PDF files.

Developed in PHP, using the standalone PHP library PdfParser to extract data from PDF files.

Functionalities:

  1. Upload one or more PDF or plain text (txt) files.
  2. Define search terms. Regular expressions are accepted.
  3. Read the extracted text or download it to a plain text file.
