I’m a relative newbie, working on building an Omeka site that will contain a massive number of PDFs. Because I’m already formatting and OCRing them, I’m hoping to avoid copying and pasting each one.
I’ve found the PDFText plugin, which looks like it could really streamline the process, but it requires that pdftotext be installed. To be frank, I’m not clear on how to install it, or where it belongs in Reclaim’s default file structure, and I’m a bit out of my depth. Could anyone help me or point me to a resource that might clarify things?
pdftotext has to be installed at the server level, and individual cPanel accounts don’t have the permissions needed to do that. We’ve already done this on one or two servers, so if it’s not available in your account, put in a support ticket with us and we’ll be happy to add it and follow up!
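If you’d like to check whether pdftotext is already available before opening a ticket, here is a minimal sketch. It assumes you have Terminal or SSH access with Python 3 in your account, which may not be the case on every plan. The function name `find_tool` is just made up for this example and is not part of Omeka or the PDFText plugin.

```python
import shutil


def find_tool(name="pdftotext"):
    """Return the full path to `name` if it is on this account's PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        # Not visible to this account: it needs a server-level install.
        print("pdftotext was not found on PATH; a support ticket is the next step.")
    else:
        print(f"pdftotext is available at {path}")
```

If it reports a path, the binary is at least visible to your account. If it reports nothing found, it needs the server-level install described above.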