Behemoth - Large-Scale Document Processing based on Apache Hadoop
Behemoth is an open-source platform for large-scale document processing based on Apache Hadoop. It consists of a simple annotation-based implementation of a document and a number of modules that operate on these documents. One of Behemoth's main goals is to simplify the deployment of document analysers at scale.
https://github.com/jnioche/behemoth
License:
Tech:
Tags:
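
To make the "annotation-based implementation of a document" in the description above more concrete, here is a minimal sketch in Python. It is not Behemoth's actual API and does not run on Hadoop: the names `Annotation`, `Document` and `tokenize` are hypothetical, chosen only to illustrate the general idea of a document that carries its text plus typed spans, with a processing module that adds annotations to it.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed span over the document text (hypothetical, for illustration)."""
    type: str
    start: int  # inclusive character offset into the text
    end: int    # exclusive character offset into the text
    features: dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    """Text plus the annotations that processing modules attach to it."""
    url: str
    text: str
    annotations: list[Annotation] = field(default_factory=list)


def tokenize(doc: Document) -> Document:
    """Toy module: add one 'Token' annotation per whitespace-separated word."""
    offset = 0
    for word in doc.text.split():
        # Search from the end of the previous token so repeated words
        # are assigned distinct, increasing offsets.
        start = doc.text.index(word, offset)
        end = start + len(word)
        doc.annotations.append(Annotation("Token", start, end))
        offset = end
    return doc


doc = tokenize(Document(url="file:///example.txt", text="Hadoop scales out"))
tokens = [doc.text[a.start:a.end] for a in doc.annotations]
```

Because annotations store offsets rather than copies of the text, several independent modules can add their own annotation types to the same document without modifying its content.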