3 packages
anyfileparser
A Node.js library that can parse text from any kind of file, e.g. text, image, PDF, and office files.
filesyscrawler
File System Crawler reads file system information for any user-selected folder. It also extracts text from files, including PDF files, and can perform OCR on image files to extract legible text from them. Support for reading many other popular
officeparser
A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, ods, and pdf files.