OpenPipe - Document Pipeline

OpenPipe is an open source, scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps (operations) performed on a document to convert it from its raw form into something ready to be put into the index.
The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction, and submitting the document to a search engine.
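
The idea of a pipeline as an ordered set of steps can be illustrated with a minimal sketch. This is not OpenPipe's API: the names `Document`, `Step`, `normalize_whitespace`, `add_length` and `run_pipeline` are hypothetical and exist only for this example, and the two steps are simple stand-ins for the field manipulation operations mentioned above.

```python
from typing import Callable, Dict, List

# Assumption: a document is modeled as a flat mapping of field names to strings.
Document = Dict[str, str]
# Assumption: a step takes a document and returns a new, transformed document.
Step = Callable[[Document], Document]


def normalize_whitespace(doc: Document) -> Document:
    """Field manipulation: collapse runs of whitespace in the 'body' field."""
    return {**doc, "body": " ".join(doc.get("body", "").split())}


def add_length(doc: Document) -> Document:
    """Field manipulation: add a 'length' field with the size of 'body'."""
    return {**doc, "length": str(len(doc.get("body", "")))}


def run_pipeline(steps: List[Step], doc: Document) -> Document:
    """Apply each step in order, feeding the output of one step into the next."""
    for step in steps:
        doc = step(doc)
    return doc


raw = {"id": "1", "body": "  Hello   pipeline  world "}
result = run_pipeline([normalize_whitespace, add_length], raw)
print(result)
```

Order matters here: `add_length` runs after `normalize_whitespace`, so it measures the cleaned text rather than the raw input. Each step returns a new mapping instead of modifying its argument, which leaves the raw document untouched.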
OpenPipe supports extracting content from databases and file systems. It can extract content or metadata from any file format.