Search And Mining Tools with Linguistic Analysis

Domain-specific search through statistical language modeling


The Samtla system has been designed as a language-agnostic research environment for quantifying text corpora through phrase searches and comparative methods.

Samtla adopts practices developed in Information Retrieval, including character-based suffix trees, statistical language models, and Named Entity Recognition, as well as data mining techniques such as clustering and classification. Samtla presents search results according to the underlying principles and structure of the language found in domain-specific corpora.
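
To give a feel for what character-based statistical language modeling can look like in a search setting, here is a minimal sketch in Python. It is not Samtla's implementation: the class and function names (CharNgramModel, score, rank), the trigram order, the smoothing weight, and the use of plain n-gram counts in place of a suffix tree are all assumptions made for illustration. Each document gets a character n-gram model, and documents are ranked by the smoothed log-likelihood of the query.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class CharNgramModel:
    """Maximum-likelihood character n-gram model (hypothetical helper)."""

    def __init__(self, text, n=3):
        self.n = n
        self.counts = Counter(char_ngrams(text, n))
        self.total = sum(self.counts.values())

    def prob(self, gram):
        # Relative frequency of the n-gram; 0.0 for an empty model.
        return self.counts[gram] / self.total if self.total else 0.0


def score(query, doc_model, corpus_model, lam=0.5):
    """Log query likelihood, mixing document and corpus estimates.

    lam is an assumed interpolation weight (Jelinek-Mercer smoothing).
    """
    logp = 0.0
    for gram in char_ngrams(query, doc_model.n):
        p = lam * doc_model.prob(gram) + (1 - lam) * corpus_model.prob(gram)
        # Floor unseen n-grams so the logarithm stays defined.
        logp += math.log(p) if p > 0 else math.log(1e-12)
    return logp


def rank(query, docs, n=3, lam=0.5):
    """Return document names ordered from best to worst match."""
    corpus_model = CharNgramModel(" ".join(docs.values()), n)
    scored = [
        (score(query, CharNgramModel(text, n), corpus_model, lam), name)
        for name, text in docs.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


docs = {
    "letter": "the king gave the scribe a letter",
    "granary": "wheat and barley were stored in the granary",
}
ranking = rank("king", docs)
```

Because the model works on characters rather than words, it needs no tokenizer or language-specific preprocessing, which is one reason a character-level approach suits a language-agnostic setting. In this toy corpus the query shares trigrams only with the first document, so that document is ranked first.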

If you wish to find out more about Samtla, please contact us.

Team of researchers