Mohomine, a provider of software for integrating text into enterprise applications, has developed mohoClassifier v2.3 (mC), a classification engine capable of categorizing documents written in complex character-based languages.
The software technology is targeted for use in intelligence
community applications.
mC reviews text information in e-mails, file systems, intranets and extranets, including the Internet, and provides automated document
classification and routing based on recognized patterns in the document as a whole.
It reports on user-definable properties such as topic, country source, subject, and tone/urgency of the author, among others.
The classifier can make fine-grained distinctions between
categories, can process both text and numbers and, in testing, has reached accuracy rates as high as 98 percent.
According to Robert Ellsworth, former Deputy Secretary of Defense and U.S. Ambassador to NATO, "High document throughput and character-based language interpretation are critical requirements for
the intelligence community. Since 9/11, the intelligence community has been inundated by ever-increasing volumes of material and the need for greater foreign language proficiencies. Mohomine's offering can
greatly impact the speed and ease with which the intelligence
community filters through the material they have to review."
mC uses a learn-by-example methodology and needs only a small number of training documents, usually between 15 and 20, to reach high
levels of accuracy. Once trained, the software can classify documents in real time at a throughput of 100MB/minute or more. As an example, approximately 166 e-mails of 10K each could be analyzed per second, or nearly 10,000 e-mails per minute.
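The e-mail figures follow directly from the quoted throughput. The short Python sketch below reproduces the arithmetic; it assumes decimal units (1 MB = 1,000 KB), and the function and constant names are illustrative only, not part of any Mohomine product.

```python
# Figures quoted in the announcement.
THROUGHPUT_MB_PER_MIN = 100   # stated classification throughput
EMAIL_SIZE_KB = 10            # stated example e-mail size

# Assumption: decimal units, 1 MB = 1,000 KB.
KB_PER_MB = 1000


def emails_per_minute(throughput_mb=THROUGHPUT_MB_PER_MIN,
                      email_kb=EMAIL_SIZE_KB,
                      kb_per_mb=KB_PER_MB):
    """Number of e-mails of the given size that fit into one minute of throughput."""
    return throughput_mb * kb_per_mb / email_kb


def emails_per_second(throughput_mb=THROUGHPUT_MB_PER_MIN,
                      email_kb=EMAIL_SIZE_KB,
                      kb_per_mb=KB_PER_MB):
    """Same rate expressed per second."""
    return emails_per_minute(throughput_mb, email_kb, kb_per_mb) / 60


if __name__ == "__main__":
    print(f"{emails_per_minute():,.0f} e-mails per minute")
    print(f"{emails_per_second():.1f} e-mails per second")
```

With these assumptions the result is 10,000 e-mails per minute, or about 166.7 per second, which matches the rounded figures in the announcement. Using binary units instead (1 MB = 1,024 KB) would give 10,240 e-mails per minute.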
Mohomine initially introduced mC in December 2001 with the ability to analyze documents in English and Western European languages.
In light of the events of September 11th, and due to the importance of complex character-based languages to the intelligence community, the
company redefined its priorities for additional languages.
The classifier is inherently language-independent, but qualifying it for character-based languages required the development
of a certification test suite and accuracy benchmark.
Designed from its inception to be embedded in large-scale
enterprise and intelligence community applications, mohoClassifier v2.3 is a server-based modular solution that can be integrated into other applications -- with integration cycles as short as one week. Based on a highly scalable client-server
architecture, the software can be embedded in applications for portals, content management systems, business process management and security, as well as Web-enabled applications and Web services environments.
This version also introduces an administrative console (GUI) for managing the generation of the model created through the learn-by-example workflow.
Samat added, "Our solution is differentiated by its language
independence, high document throughput and easy-to-integrate, scalable client-server architecture. These factors are key to any organization, but particularly to the intelligence community."
www.mohomine.com.
