Technically, Ask Data Anything can perform projection, sub-setting, grouping and aggregation operations, answering queries that involve the following information:
- What? Any column of your data table is treated as a quantitative field over which queries can be run.
- How? How the output is to be shown. The results of the query can be presented as a table, a histogram or a map.
- Where? (Optional) The "in" preposition restricts the search to a specific named group of items, as happens for instance with continents, which can be seen as groups of countries.
- Of? (Optional) The "of" preposition lets you dive into the data, restricting the results to a certain set of types (concepts in the Fluent Editor sense) by searching a given column for instances (in the Fluent Editor sense) of those types; we call this material sub-setting.
- By? (Optional) The type (in the Fluent Editor sense) by which you would like to group the results for aggregation purposes.
- When? (Optional) Queries can contain time constraints.
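As a rough illustration of how these parts combine, the sketch below models a query as a small record and renders it back into text. The `Query` class, its field names and the word order in `render` are hypothetical, inferred only from the example queries in this document; they are not the tool's actual grammar or API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical container for the parts of a query (not the tool's API)."""
    what: str                              # column to query, e.g. "Quantity"
    operation: str                         # "summed" or "averaged"
    where: Optional[str] = None            # the "in" part, e.g. "Europe"
    of: Optional[Tuple[str, str]] = None   # (column, type), e.g. ("item", "t-shirt")
    by: Optional[str] = None               # grouping type, e.g. "country"
    when: Optional[str] = None             # time constraint; kept as data only
    how: Optional[str] = None              # "table", "histogram" or "map"

    def render(self) -> str:
        """Join the parts that are present, in the order the examples use."""
        parts = [self.what, self.operation]
        if self.where:
            parts += ["in", self.where]
        if self.of:
            parts += ["for", self.of[0], "of", self.of[1]]
        if self.by:
            parts += ["by", self.by]
        if self.how:
            parts += ["on", self.how]
        # "when" is not rendered: no example of its syntax is available here.
        return " ".join(parts)


query = Query("Quantity", "summed", where="Europe", by="country", how="map")
print(query.render())
```

Only `what` and the operation are mandatory in this sketch; every optional part is simply skipped when it is absent.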
Inside Ask Data Anything
Ask Data Anything consists of the following blocks:
Ontology modeling adds semantic layers on top of your data, expanding the search dimensionality and, in turn, providing a more insightful querying experience.
Operational Semantics
Next, we briefly describe the four exploration modes available: projection, aggregation, sub-setting (circumstantial) and material sub-setting (conceptual).
For demonstration purposes, let us take the data represented in the following table as our sample dataset:
|T-002||Men's Dark One Button Suit||1395||15||Krakow||07/12/2013||Armani|
|T-003||Solid Polo Shirt||580||18||Krakow||17/08/2012||Gucci|
|T-004||Men's Mallow Graphic Tee||163||9||Warsaw||19/03/2013||Nike|
|T-005||I'm Bob Graphic Tee||73||5||Berlin||01/02/2014||Zara|
|T-006||21 Years Old Women's Dark T-Shirt||386||7||Alicante||22/12/2012||Armani|
|T-007||Hanes Men's Comfortblend EcoSmart Jersey Polo, 2 Pack||425||11||Berlin||14/03/2013||Chanel|
|T-008||Men's Short Sleeve Stripe Polo||820||7||Boston||05/09/2014||Tommy Hilfiger|
|T-009||Women's Button Down Roll Tab Shirt||244||12||Munchen||29/08/2014||Lacoste|
|T-011||Men's Geometric Print Short Sleeve Shirt||975||12||Alicante||23/08/2014||Armani|
|T-012||Men's Sasquatch Hunter Graphic Tee||180||10||Boston||19/11/2012||Gucci|
|T-013||Women's Essential V-Neck Tee||147||3||Madrid||21/07/2013||Nike|
|T-014||Men's Bass Guitar Guy Graphic Tee||86||2||Krakow||22/03/2012||Zara|
|T-015||Men's Essential shirt||754||2||Boston||01/04/2013||Zara|
|T-016||Women's Scoopneck Tee 2-Pack||448||6||Munchen||26/07/2012||Chanel|
What's inside?
To start exploring the possibilities, it is always useful to know what is inside:
The dimensions are the columns of the data (quantitative fields); the possible operations are summing and averaging; and the outputs are histogram, table and map.
Projection
This identity operation projects over the data, retrieving the subset of it that meets requirements expressed as mathematical conditions.
Example query: Item with price > 700
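The effect of this query on the sample table can be sketched in Python as a simple filter. This is only an illustration of the semantics, not how Ask Data Anything is implemented; the column order (id, item, price, quantity, city, date, brand) is an assumption, since the table has no header row, and `project` is a hypothetical helper.

```python
# Rows copied from the sample table. Assumed column order:
# id, item, price, quantity, city, date, brand.
ROWS = [
    ("T-002", "Men's Dark One Button Suit", 1395, 15, "Krakow", "07/12/2013", "Armani"),
    ("T-003", "Solid Polo Shirt", 580, 18, "Krakow", "17/08/2012", "Gucci"),
    ("T-004", "Men's Mallow Graphic Tee", 163, 9, "Warsaw", "19/03/2013", "Nike"),
    ("T-005", "I'm Bob Graphic Tee", 73, 5, "Berlin", "01/02/2014", "Zara"),
    ("T-006", "21 Years Old Women's Dark T-Shirt", 386, 7, "Alicante", "22/12/2012", "Armani"),
    ("T-007", "Hanes Men's Comfortblend EcoSmart Jersey Polo, 2 Pack", 425, 11, "Berlin", "14/03/2013", "Chanel"),
    ("T-008", "Men's Short Sleeve Stripe Polo", 820, 7, "Boston", "05/09/2014", "Tommy Hilfiger"),
    ("T-009", "Women's Button Down Roll Tab Shirt", 244, 12, "Munchen", "29/08/2014", "Lacoste"),
    ("T-011", "Men's Geometric Print Short Sleeve Shirt", 975, 12, "Alicante", "23/08/2014", "Armani"),
    ("T-012", "Men's Sasquatch Hunter Graphic Tee", 180, 10, "Boston", "19/11/2012", "Gucci"),
    ("T-013", "Women's Essential V-Neck Tee", 147, 3, "Madrid", "21/07/2013", "Nike"),
    ("T-014", "Men's Bass Guitar Guy Graphic Tee", 86, 2, "Krakow", "22/03/2012", "Zara"),
    ("T-015", "Men's Essential shirt", 754, 2, "Boston", "01/04/2013", "Zara"),
    ("T-016", "Women's Scoopneck Tee 2-Pack", 448, 6, "Munchen", "26/07/2012", "Chanel"),
]

PRICE = 2  # index of the (assumed) price column


def project(rows, predicate):
    """Return the rows for which predicate(row) holds, in their original order."""
    return [row for row in rows if predicate(row)]


# "Item with price > 700"
expensive = project(ROWS, lambda row: row[PRICE] > 700)
for row in expensive:
    print(row[0], row[1], row[PRICE])
```

Under these assumptions the filter keeps four rows: T-002, T-008, T-011 and T-015.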
Aggregation
Aggregation is performed over hierarchical data, modeled in ontologies through typed instances (instances of concepts) related either by a "part of" property or by the ordinary time embedding, i.e., days as part of months and months as part of years. In this way, instances of the concept country are related to the concept continent, as in "Every country is part of a continent".
Quantity summed by month on histogram:
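What this query computes can be sketched as follows, assuming the dates in the sample table are in day/month/year order and that "by month" means one bucket per calendar month of each year (both are assumptions, as is the role of each column). The helper `quantity_by_month` is hypothetical and only mirrors the time embedding of days into months.

```python
from collections import defaultdict
from datetime import datetime

# (id, quantity, date) taken from the sample table; column roles are assumed.
ROWS = [
    ("T-002", 15, "07/12/2013"),
    ("T-003", 18, "17/08/2012"),
    ("T-004", 9, "19/03/2013"),
    ("T-005", 5, "01/02/2014"),
    ("T-006", 7, "22/12/2012"),
    ("T-007", 11, "14/03/2013"),
    ("T-008", 7, "05/09/2014"),
    ("T-009", 12, "29/08/2014"),
    ("T-011", 12, "23/08/2014"),
    ("T-012", 10, "19/11/2012"),
    ("T-013", 3, "21/07/2013"),
    ("T-014", 2, "22/03/2012"),
    ("T-015", 2, "01/04/2013"),
    ("T-016", 6, "26/07/2012"),
]


def quantity_by_month(rows):
    """Sum quantity per (year, month), rolling each day up into its month."""
    totals = defaultdict(int)
    for _id, quantity, date_text in rows:
        day = datetime.strptime(date_text, "%d/%m/%Y")
        totals[(day.year, day.month)] += quantity
    return dict(sorted(totals.items()))


monthly = quantity_by_month(ROWS)
for (year, month), total in monthly.items():
    print(f"{year}-{month:02d}: {total}")
```

Each printed line corresponds to one bar of the histogram; months with no rows simply do not appear in this sketch.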
Sub-setting (Circumstantial sub-setting)
Sub-setting retrieves a subset of the data through a (circumstantial) belonging relation. This means we can ask about the results in a given country or in a modeled group of instances; in the latter case we can constrain the results to groups of brands categorized by origin, e.g., Spanish, American, etc.
Quantity summed in Europe by country on map:
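A minimal sketch of this query is given below. The `PART_OF` dictionary stands in for the ontology's "part of" relations and is written by hand here as an assumption (city to country to continent); the real model would come from the ontology, and `summed_in_by_country` is a hypothetical helper.

```python
from collections import defaultdict

# (id, quantity, city) taken from the sample table; column roles are assumed.
ROWS = [
    ("T-002", 15, "Krakow"), ("T-003", 18, "Krakow"), ("T-004", 9, "Warsaw"),
    ("T-005", 5, "Berlin"), ("T-006", 7, "Alicante"), ("T-007", 11, "Berlin"),
    ("T-008", 7, "Boston"), ("T-009", 12, "Munchen"), ("T-011", 12, "Alicante"),
    ("T-012", 10, "Boston"), ("T-013", 3, "Madrid"), ("T-014", 2, "Krakow"),
    ("T-015", 2, "Boston"), ("T-016", 6, "Munchen"),
]

# Hand-written stand-in for the ontology: each name maps to what it is part of.
PART_OF = {
    "Krakow": "Poland", "Warsaw": "Poland",
    "Berlin": "Germany", "Munchen": "Germany",
    "Alicante": "Spain", "Madrid": "Spain",
    "Boston": "United States",
    "Poland": "Europe", "Germany": "Europe", "Spain": "Europe",
    "United States": "North America",
}


def ancestors(name):
    """Return everything `name` is part of, nearest first."""
    chain = []
    while name in PART_OF:
        name = PART_OF[name]
        chain.append(name)
    return chain


def summed_in_by_country(rows, group):
    """Sum quantity per country, keeping only rows whose city lies in `group`."""
    totals = defaultdict(int)
    for _id, quantity, city in rows:
        chain = ancestors(city)
        if group in chain:
            totals[chain[0]] += quantity  # chain[0] is the city's country
    return dict(totals)


europe = summed_in_by_country(ROWS, "Europe")
for country, total in europe.items():
    print(country, total)
```

The Boston rows are dropped because their chain leads to North America rather than Europe; the remaining rows are rolled up from city to country before summing.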
Material sub-setting (Conceptual sub-setting)
Material sub-setting allows sub-setting by diving into the data properties as modeled in the provided ontology. This feature allows us to write quite expressive queries, such as:
Quantity summed in Europe for item of (type) t-shirt:
By default, aggregation is performed on the target type, marked here by the query part "in Europe", which subsets the data using a continent as the discriminant; the query therefore returns the quantity per city (city is the type present in the data and is part of a continent). This behavior can be changed by adding a "by" part, as in "by brand", which retrieves the sum of the quantitative field Quantity aggregated by brand (Lacoste, Armani, etc.).
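The default grouping and the effect of "by brand" can be sketched as below. Two things are assumed and written by hand: the set of European cities, and the set of Item values treated as instances of the t-shirt type (chosen here from item names containing "Tee" or "T-Shirt"). In the real tool both would come from the ontology, and `summed_in_of` is a hypothetical helper.

```python
from collections import defaultdict

# (id, item, quantity, city, brand) from the sample table; column roles assumed.
ROWS = [
    ("T-002", "Men's Dark One Button Suit", 15, "Krakow", "Armani"),
    ("T-003", "Solid Polo Shirt", 18, "Krakow", "Gucci"),
    ("T-004", "Men's Mallow Graphic Tee", 9, "Warsaw", "Nike"),
    ("T-005", "I'm Bob Graphic Tee", 5, "Berlin", "Zara"),
    ("T-006", "21 Years Old Women's Dark T-Shirt", 7, "Alicante", "Armani"),
    ("T-007", "Hanes Men's Comfortblend EcoSmart Jersey Polo, 2 Pack", 11, "Berlin", "Chanel"),
    ("T-008", "Men's Short Sleeve Stripe Polo", 7, "Boston", "Tommy Hilfiger"),
    ("T-009", "Women's Button Down Roll Tab Shirt", 12, "Munchen", "Lacoste"),
    ("T-011", "Men's Geometric Print Short Sleeve Shirt", 12, "Alicante", "Armani"),
    ("T-012", "Men's Sasquatch Hunter Graphic Tee", 10, "Boston", "Gucci"),
    ("T-013", "Women's Essential V-Neck Tee", 3, "Madrid", "Nike"),
    ("T-014", "Men's Bass Guitar Guy Graphic Tee", 2, "Krakow", "Zara"),
    ("T-015", "Men's Essential shirt", 2, "Boston", "Zara"),
    ("T-016", "Women's Scoopneck Tee 2-Pack", 6, "Munchen", "Chanel"),
]

# Assumed stand-ins for the ontology.
EUROPE = {"Krakow", "Warsaw", "Berlin", "Munchen", "Alicante", "Madrid"}
T_SHIRTS = {
    "Men's Mallow Graphic Tee", "I'm Bob Graphic Tee",
    "21 Years Old Women's Dark T-Shirt", "Men's Sasquatch Hunter Graphic Tee",
    "Women's Essential V-Neck Tee", "Men's Bass Guitar Guy Graphic Tee",
    "Women's Scoopneck Tee 2-Pack",
}


def summed_in_of(rows, cities, instances, by="city"):
    """Sum quantity for rows in `cities` whose item is in `instances`."""
    key_index = {"city": 3, "brand": 4}[by]
    totals = defaultdict(int)
    for row in rows:
        if row[3] in cities and row[1] in instances:
            totals[row[key_index]] += row[2]
    return dict(totals)


by_city = summed_in_of(ROWS, EUROPE, T_SHIRTS)               # default grouping
by_brand = summed_in_of(ROWS, EUROPE, T_SHIRTS, by="brand")  # with "by brand"
print(by_city)
print(by_brand)
```

Both calls apply the same two filters; only the grouping key changes, which is the role the "by" part plays in the query.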
You can go further and make a consistency check by retrieving all t-shirts from the Item column:
Semantic Modeling
The key feature offered by Ask Data Anything is the addition of semantic layers on top of the data (layers that are not explicit in the data itself). This increases the dimensionality of the data and enhances the possibilities for exploring it.
In this way, we can ask queries such as Price averaged in French-Brands, with French-Brands being a modeled instance of some "brands-by-origin" concept, which adds a grouping abstraction over the (modeled) brand instances that is not otherwise present in the data. Here we have models for the brand instances "Chanel" and "Lacoste", and therefore the averaging is performed over all occurrences of these two brands in the data.
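The French-Brands example can be sketched as follows. The dictionary `BRANDS_BY_ORIGIN` is a hand-written stand-in for the modeled "brands-by-origin" concept, and `price_averaged_in` is a hypothetical helper; the column roles in the sample rows are assumed.

```python
# (id, price, brand) taken from the sample table; column roles are assumed.
ROWS = [
    ("T-002", 1395, "Armani"), ("T-003", 580, "Gucci"), ("T-004", 163, "Nike"),
    ("T-005", 73, "Zara"), ("T-006", 386, "Armani"), ("T-007", 425, "Chanel"),
    ("T-008", 820, "Tommy Hilfiger"), ("T-009", 244, "Lacoste"),
    ("T-011", 975, "Armani"), ("T-012", 180, "Gucci"), ("T-013", 147, "Nike"),
    ("T-014", 86, "Zara"), ("T-015", 754, "Zara"), ("T-016", 448, "Chanel"),
]

# The grouping lives in the model, not in the data: no column says "French".
BRANDS_BY_ORIGIN = {"French-Brands": {"Chanel", "Lacoste"}}


def price_averaged_in(rows, group):
    """Average the price over every row whose brand belongs to `group`."""
    brands = BRANDS_BY_ORIGIN[group]
    prices = [price for _id, price, brand in rows if brand in brands]
    return sum(prices) / len(prices)


average = price_averaged_in(ROWS, "French-Brands")
print(round(average, 2))
```

Three rows match (two Chanel, one Lacoste), so the average is taken over three prices even though the data itself contains no notion of brand origin.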
In summary, Ask Data Anything can handle queries involving:
- User-defined concepts
- Instances of predefined concepts
Watch a quick overview of Ask Data Anything: https://goo.gl/XnaIq3
To learn more about Semantic Technologies visit: http://goo.gl/7pkWIQ