Monday, 19 January 2015

Mixing Text Mining with Semantic Technologies - sample application.

Natural language processing is a very broad subject and an incredibly hot one nowadays. In many cases, a regular text mining approach is not adequate for the problems we are facing. Text mining methods are therefore combined with Natural Language Processing (NLP) methods and with semantic technologies, which gives better results. One such problem is how to find out whether two sentences are semantically equivalent.

A solution to this problem could be used in many fields. One of them is the detection of abusive clauses in a contract. Sometimes it is really hard, even for specialists, to correctly understand the exact meaning of a clause in a contract. For the sake of presentation, I have developed a simple application prototype which attempts to solve this problem. The application was developed in C# and uses the Ontorion SDK.

Input

Before running the application we need three files:
  1. A file with the contract in which we will attempt to detect abusive clauses.
  2. A file with known abusive clauses.
  3. A file with the ontology.
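
To make the role of the first two inputs concrete, here is a minimal Python sketch of a purely lexical baseline. It is not the prototype's code (which is written in C# against the Ontorion SDK) and it does not use the ontology at all. The one-clause-per-line file layout, the function names and the 0.5 threshold are all assumptions made for illustration.

```python
import re
from pathlib import Path


def read_clauses(path):
    """Return the non-empty lines of a text file (assumed: one clause per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def tokens(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)


def flag_clauses(contract_clauses, abusive_clauses, threshold=0.5):
    """Pair contract clauses with the abusive clauses they lexically resemble.

    The threshold is a hypothetical cut-off, not a tuned value.
    """
    hits = []
    for clause in contract_clauses:
        for abusive in abusive_clauses:
            score = jaccard(clause, abusive)
            if score >= threshold:
                hits.append((clause, abusive, score))
    return hits
```

With hypothetical file names, it could be called as `flag_clauses(read_clauses("contract.txt"), read_clauses("abusive_clauses.txt"))`. A baseline like this catches a clause that repeats a known abusive clause almost word for word, but it misses a paraphrase such as "The vendor can alter the cost whenever it wishes." against "The seller may change the price at any time", which share almost no words. That gap is the reason for bringing in the ontology and semantic reasoning.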

Friday, 16 January 2015

Using the rOntorion package in R / RStudio and Fluent Editor

The rOntorion package is a port of Cognitum's Semantic Technologies to R. R has become an important tool among statisticians and data scientists, and we are proud to provide this community with an enhanced Linked Data manipulation experience that allows them to edit, store and reason over structured data (in the supported formats ocnl, rdf and owl), thereby opening new horizons in data analysis. rOntorion also makes it possible to extend Fluent Editor from R, which in turn gives users the capability to create their own custom functionality.

rOntorion in R

To demonstrate the use of rOntorion directly from R, let us go through a minimal example. In this example we are going to reason over a set of dummy sentences and infer a single logical conclusion by querying the semantic engine with a question expressed in ocnl format. First we need to install rOntorion: to do so, issue the following command in an R Console: