Friday, 6 February 2015

Reasoning about ontologies - fast vs. complete answers


In this article you will gain more intuition about:
- how to query your ontology
- the difference between a reasoner and a materialized graph
- what the OWL-DL and OWL-RL+ materialization modes are
- when you can safely use the faster OWL-RL+ reasoning mode

You will see two example ontologies:
- about books (using data types, cardinality restriction, data type restrictions)
- about political preferences (SWRL rules, defining concepts by enumeration)

You can reproduce the steps by downloading the ontologies:
- my_books.encnl
- political_parties.encnl
and opening them with FluentEditor on your computer.

About reasoners and materialized graph





With recent FluentEditor releases the user has three tools to query the ontology:
  • a reasoner of choice (in this example the HermiT reasoner is used)
  • a materialized graph (in either the OWL-DL or the OWL-RL+ materialization mode)
  • SPARQL queries

I will focus on the first two options.
The reasoner processes Controlled Natural Language. We can query it about concepts and instances by asking questions that begin with Who-Or-What... It returns three types of answers:
  • instances satisfying the concept (first column)
  • subconcepts of a given concept (second column)
  • superconcepts (third column)


The reasoner always returns all the knowledge that can be inferred from our ontology.
A materialized graph is a graph that stores:
  • all known and inferred triples (Mary has-child Julia, Anna has-child Matt, Matt has-age equal-to 16.)
  • information about the superconcepts of our instances (Mary is a mother, Mary is a woman.)



We can ask the materialized graph questions similar to those we ask the reasoner. The answers are then looked up in the graph.

Allowed queries are rewritten into SPARQL graph queries. The SPARQL query engine is much faster than the reasoner, so if time is the crucial factor, query the materialized graph rather than the reasoner. Additionally, as long as you don't change your ontology, the same materialized graph can be reused for subsequent queries, which makes the time advantage even greater.
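The "build once, query many times" idea can be sketched in a few lines of Python. This is only an illustration, not how Fluent Editor is implemented: the class MaterializedGraph, the "is-a" predicate name and the rule "every mother is a woman" are assumptions made for the example, and the triples are the ones used as examples in the text.

```python
# Triples used as examples in the text; "Mary is a mother" is treated
# as a stated fact here (an assumption made for this sketch).
STATED = {
    ("Mary", "has-child", "Julia"),
    ("Anna", "has-child", "Matt"),
    ("Matt", "has-age", 16),
    ("Mary", "is-a", "mother"),
}

# Hypothetical schema knowledge: every mother is a woman.
SUBCLASS_OF = {"mother": "woman"}


class MaterializedGraph:
    """Hypothetical helper: infers all triples once, then answers by lookup."""

    def __init__(self, stated, subclass_of):
        self._stated = set(stated)
        self._subclass_of = dict(subclass_of)
        self._triples = None   # built lazily, on the first query
        self.build_count = 0   # how many times the graph was constructed

    def _materialize(self):
        triples = set(self._stated)
        changed = True
        while changed:  # repeat until no new superconcept triple appears
            changed = False
            for s, p, o in list(triples):
                if p == "is-a" and o in self._subclass_of:
                    new = (s, "is-a", self._subclass_of[o])
                    if new not in triples:
                        triples.add(new)
                        changed = True
        self.build_count += 1
        return triples

    def query(self, predicate, obj):
        """Who-Or-What <predicate> <obj>? Answered purely by lookup."""
        if self._triples is None:
            self._triples = self._materialize()
        return sorted(s for s, p, o in self._triples if p == predicate and o == obj)


graph = MaterializedGraph(STATED, SUBCLASS_OF)
women = graph.query("is-a", "woman")                  # triggers the one-time build
parents_of_julia = graph.query("has-child", "Julia")  # reuses the same graph
```

The first query pays for the construction of the graph; the second one is a plain lookup, and build_count stays at 1.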

Tricky ontology about books

You can follow the example with  my_books.encnl.

A query to the materialized graph may not retrieve all the results. That happens when you had in mind some abstract information that could not be materialized in the graph. Such use cases are rare. Below we present one of them.
Please read the simple ontology about books. It contains value comparisons (lower, equal, greater).


Every book has-number-of-pages one (some integer value).
Pride-And-Prejudice is a book.
Crime-And-Punishment is a book.
Atonement is a book.
Pride-And-Prejudice has-number-of-pages equal-to 272.
Crime-And-Punishment has-number-of-pages lower-or-equal-to 500.
Atonement has-number-of-pages lower-or-equal-to 443.
Atonement has-number-of-pages greater-or-equal-to 443.

Consider a question: 
Who-Or-What has-number-of-pages lower-or-equal-to 1000?
It is not hard for us to say that there are three correct answers: all three books mentioned in the ontology surely have no more than 1000 pages. How difficult is this for an automated reasoning engine? Try asking the question in three scenarios:
  • in the reasoner window
  • in the materialized graph window, with the materialization mode set to OWL-DL
  • in the materialized graph window, with the materialization mode set to OWL-RL+
You can switch between the materialization modes on the Home tab at the top of the Fluent Editor window.

You will see that each scenario gives different results: some answers are missing.


CASE 1: Pride and Prejudice


That is an easy case, since the number of pages was stated directly. The answer appears in all the scenarios: in the reasoner and in the materialized graph, regardless of the materialization mode.


CASE 2: Crime and Punishment


That answer appears only in the reasoner results. It is never listed in the materialized graph results, regardless of the materialization mode.
This example shows the difference between the way the reasoner and the materialized graph process the question.

The reasoner assumes that Crime and Punishment has some unknown number of pages and works with whatever information about that number is available.

The materialized graph is created only once, before the question itself is analyzed. It contains only the instances and precise values that could be inferred from the ontology. When the question is asked, only the graph is analyzed.
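The contrast between the two approaches can be sketched in Python for the first two books. This is a simplified illustration, not the algorithm of any real reasoner: the functions ask_reasoner and ask_graph and the ("eq"/"le", value) encoding are assumptions made for the example.

```python
import math

# Page-count statements from the ontology, encoded as (comparison, value).
STATEMENTS = {
    "Pride-And-Prejudice": [("eq", 272)],
    "Crime-And-Punishment": [("le", 500)],
}


def upper_bound(statements):
    """Smallest upper bound implied by the statements (infinity if none)."""
    bound = math.inf
    for op, value in statements:
        if op in ("eq", "le"):
            bound = min(bound, value)
    return bound


def ask_reasoner(limit):
    """Reasoner-style answer: works with the unknown value through its bounds."""
    return sorted(book for book, st in STATEMENTS.items() if upper_bound(st) <= limit)


def ask_graph(limit):
    """Graph-style answer: only precise, materialized values can be compared."""
    exact = {book: v for book, st in STATEMENTS.items() for op, v in st if op == "eq"}
    return sorted(book for book, v in exact.items() if v <= limit)


reasoner_answers = ask_reasoner(1000)
graph_answers = ask_graph(1000)
```

Here reasoner_answers contains both books, while graph_answers contains only Pride-And-Prejudice, because Crime-And-Punishment never received a precise value that could be stored as a triple.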

Later, when you process lengthy and complex ontologies, you will notice that the first question to the materialized graph takes longer (the graph is being constructed), while subsequent questions are answered quickly (the graph is only examined). This is much quicker than asking the reasoner.


CASE 3: Atonement


That is the most astonishing case. The OWL-RL+ materialization mode may give fewer results than OWL-DL. In fact, Atonement is listed only if the materialization mode is set to OWL-DL (the official OWL DL standard specification).
Have you noticed that some sentences were highlighted in orange when you chose OWL-RL+? The sentences are perfectly correct in terms of grammar, but they are outside the OWL-RL+ profile. OWL-RL+ extends the OWL RL profile (the official OWL 2 RL standard specification) with SWRL sentences. The highlighting is a custom Fluent Editor feature that indicates what may cause trouble during materialization.


And indeed, some of the highlighted sentences do cause trouble. Two pieces of knowledge, that the number of pages is lower than or equal to 443 and that it is greater than or equal to 443, were not combined properly.
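Why the combination step matters can be shown with a small Python sketch. It does not claim to reproduce either materialization mode. The function derive_exact and its join_bounds flag are hypothetical, and only illustrate that a precise value appears in the graph only if some step actually merges the two bounds.

```python
import math


def derive_exact(statements, join_bounds):
    """Return a precise value derivable from the statements, or None.

    statements: list of (comparison, value) pairs, comparison in "eq", "le", "ge".
    join_bounds: whether a lower and an upper bound may be merged into one value.
    """
    lower, upper = -math.inf, math.inf
    for op, value in statements:
        if op == "eq":
            return value            # stated directly, nothing to infer
        if op == "ge":
            lower = max(lower, value)
        elif op == "le":
            upper = min(upper, value)
    if join_bounds and lower == upper:
        return lower                # both bounds meet in a single value
    return None                     # the value stays anonymous


# The two statements about Atonement from the ontology.
ATONEMENT = [("le", 443), ("ge", 443)]

with_join = derive_exact(ATONEMENT, join_bounds=True)
without_join = derive_exact(ATONEMENT, join_bounds=False)
```

With the merging step, with_join is 443 and Atonement can be found by a lookup. Without it, without_join is None and the book is missing from the answers.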

The most tedious task for a reasoner is assuming that there must exist some anonymous instance or value which has certain properties but is not known precisely, and then reasoning over it. OWL-RL+ is a custom Fluent Editor materialization mode that uses a different algorithm (forward-chaining rules).
The OWL-DL and OWL-RL+ modes, as well as materialization itself, are described in detail in the Fluent Editor help.
Later, when you process lengthy and complex ontologies, you may notice that the materialized graph is prepared faster in OWL-RL+ mode than in OWL-DL mode.






Typical ontology about politics

You can follow the example with  political_parties.encnl.

Happily, most typical ontologies do not cause trouble for the materialized graph, even in OWL-RL+ mode. The answers are complete.
Our simple ontology about political preferences consists of a few simple rules:

Every person supports a political-party.
Something is a political-party if-and-only-if-it is either Republican-Party or Democratic-Party.
Something is an adult if-and-only-if-it is a person and has-age greater-or-equal-to 18.
If an adult supports a political-party then the adult votes-for the political-party.

The rules are followed by information about a few people.

Barack has-age equal-to 53.
Barack does-not support Republican-Party.
George has-age equal-to 68.
George does-not support Democratic-Party.
Tom has-age equal-to 17.
Tom supports Democratic-Party.
Anna has-age equal-to 20.
Anna supports Democratic-Party.
Mary has-age equal-to 28.
Mary supports Republican-Party.

As you can see, even the fast reasoning mode properly processes the If ... then ... rules and complex definitions of the form Something ... if-and-only-if-it ...
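Forward chaining over such rules can be sketched in Python as follows. This is a minimal illustration, not Fluent Editor's engine. The listing above does not state who is a person, so the sketch assumes that every individual with an age is one. The negative statements about Barack and George are left out, and the sketch does not try to infer which party they support.

```python
# Ages and positive "supports" statements from the ontology listing.
AGES = {"Barack": 53, "George": 68, "Tom": 17, "Anna": 20, "Mary": 28}
SUPPORTS = {
    ("Tom", "Democratic-Party"),
    ("Anna", "Democratic-Party"),
    ("Mary", "Republican-Party"),
}

# ASSUMPTION (not in the listing): every individual with an age is a person.
PERSONS = set(AGES)


def forward_chain():
    """Apply the two rules repeatedly until no new fact is derived."""
    facts = {(name, "is-a", "person") for name in PERSONS}
    facts |= {(name, "supports", party) for name, party in SUPPORTS}
    changed = True
    while changed:
        derived = set()
        # Rule 1: a person with has-age greater-or-equal-to 18 is an adult.
        for name, age in AGES.items():
            if (name, "is-a", "person") in facts and age >= 18:
                derived.add((name, "is-a", "adult"))
        # Rule 2: if an adult supports a party, the adult votes-for the party.
        for s, p, o in facts:
            if p == "supports" and (s, "is-a", "adult") in facts:
                derived.add((s, "votes-for", o))
        changed = not derived <= facts
        facts |= derived
    return facts


facts = forward_chain()
adults = sorted(s for s, p, o in facts if p == "is-a" and o == "adult")
votes = sorted((s, o) for s, p, o in facts if p == "votes-for")
```

Under these assumptions, adults contains Anna, Barack, George and Mary, and votes contains the pairs for Anna and Mary. Tom supports a party but is 17, so the second rule never fires for him.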




Summary

Hopefully, you will now be able to tailor the reasoning technique to your needs. Remember that the reasoner always gives the complete result. In contrast, the materialized graph in OWL-RL+ mode may miss some answers if the ontology contains sentences outside the OWL-RL+ profile. To get complete results, you also have to make sure that you ask about materialized triples, not about abstract information.
However, the materialized graph is faster and it is reliable in most use cases. You can ask many queries in a row, and if you don't change your ontology along the way, the subsequent answers will be very quick, because the materialized graph is constructed only once. Additionally, materialization is much faster if we change the mode to OWL-RL+.


