ClassifAI Expands OCR Scanning Capabilities To PDFs


The recent 1.7 release of ClassifAI, our free plugin that augments WordPress-powered websites with artificial intelligence and machine learning technology, expanded its Optical Character Recognition (OCR) scanning capabilities to support multi-page PDF files.

ClassifAI leverages cloud-based services like IBM Watson and Microsoft Azure AI to enhance content management in WordPress with:

  • Automated content tagging and classification
  • Automated image tagging and descriptive alt text assignment
  • Smart focal point cropping of images
  • Bulk scanning of existing content
  • OCR text generation for screenshots, images, and PDFs

The integration of OCR scanning for screenshots and other imagery was introduced in the 1.6 release, on the heels of Facebook and Instagram dropping support for open embeds. Now, ClassifAI uses the same automated text scanning technology to scan and index text within PDF documents of all sizes, adding the text content to the media description field.

The addition of OCR scanning for PDFs is a boon to content creators whose sites contain large archives of scanned paperwork and PDF documents, which we’ve found to be especially common among higher education and government clients.

Storing PDF text as metadata in WordPress dramatically improves on-site and admin search by making the text within a PDF searchable. This makes more site content discoverable and helps visitors find the right information faster and more easily. For content managers and editorial teams, the ability to quickly search the media library for information within a PDF document helps avoid accidental duplication, which in turn can help prevent visitors and search engines from being served outdated or inconsistent information.
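
To make the search benefit concrete, here is a minimal sketch of the idea in Python. It is not ClassifAI's code and does not use the WordPress API: `MediaItem`, `store_extracted_text`, and `search_media` are hypothetical names, and the OCR output is hard-coded rather than fetched from a cloud service. It only illustrates how saving extracted text in a description field lets a plain text search find a PDF by its contents.

```python
from dataclasses import dataclass


@dataclass
class MediaItem:
    """Hypothetical stand-in for a media library entry."""
    title: str
    description: str = ""  # field where extracted PDF text would be stored


def store_extracted_text(item: MediaItem, pages: list[str]) -> None:
    """Join non-empty per-page OCR output and save it in the description."""
    item.description = "\n".join(p.strip() for p in pages if p.strip())


def search_media(library: list[MediaItem], query: str) -> list[MediaItem]:
    """Case-insensitive substring match against title and description."""
    needle = query.lower()
    return [
        item for item in library
        if needle in item.title.lower() or needle in item.description.lower()
    ]


library = [
    MediaItem(title="Admissions form 2019"),
    MediaItem(title="Campus map"),
]

# Before any text is stored, only titles can match the query.
before = search_media(library, "tuition")

# Assume an OCR service returned this text for a two-page PDF.
store_extracted_text(
    library[0],
    ["Tuition and fees schedule", "Payment deadlines"],
)
after = search_media(library, "tuition")
```

In this sketch, `before` is empty because no title mentions tuition, while `after` contains the admissions form once its page text has been stored.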

As we look to the future, we remain focused on integrating features that simplify the content management process for all content creators and publishers. Our Open Source Practice is currently exploring automatically transcribing audio files for podcasts and integrating with popular personalization services.

Get ClassifAI

Register for a free license key to download the plugin. Registration allows us to keep adopters apprised of major updates and beta testing opportunities, gather feedback, and prioritize common use cases.

ClassifAI is developed on GitHub, where we are actively reviewing feedback and issue reports. Designed to be extended, it provides hooks and filters for developers to customize service providers and integration points.

To explore innovative ways to adopt artificial intelligence and machine learning technologies as part of your digital strategy, get in touch.

