Technically, Ask Data Anything is capable of performing projection, sub-setting, grouping and aggregation operations, providing answers for queries involving the following information:
- What? Any column of your data table is treated as a quantitative field over which queries can be performed,
- How? How the output is to be shown. The results of the query can be returned as a table, a histogram or a map,
- Where? (Optional) The "in" preposition restricts the search to a specific named group of items, as happens for instance with continents, which can be seen as groups of countries,
- Of? (Optional) The "of" preposition lets you dive into the data, restricting the results to a certain set of types (concepts in the Fluent Editor sense) by searching a given column for instances (in the Fluent Editor sense) of those types; we call this material sub-setting,
- By? (Optional) The type (in the Fluent Editor sense) by which you would like to group the results for aggregation purposes,
- When? (Optional) Queries can contain time constraints.
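
As a rough illustration of the six query parts listed above, the sketch below models them as a small Python data structure. The class name `Query`, its field names and the `optional_parts` helper are hypothetical and chosen only for this example; they are not part of Ask Data Anything itself.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Output kinds named in the text; everything else here is illustrative.
OUTPUTS = ("table", "histogram", "map")


@dataclass(frozen=True)
class Query:
    """Hypothetical container for the parts of a query."""
    what: str                     # column to query, e.g. "Quantity"
    how: str                      # how the result is shown
    where: Optional[str] = None   # the "in" part, e.g. "Europe"
    of: Optional[str] = None      # the "of" part, e.g. "t-shirt"
    by: Optional[str] = None      # grouping type, e.g. "country"
    when: Optional[str] = None    # time constraint

    def __post_init__(self):
        # Only the three output kinds from the text are accepted.
        if self.how not in OUTPUTS:
            raise ValueError(f"unknown output: {self.how!r}")

    def optional_parts(self):
        """Return only the optional parts that were actually given."""
        names = [f.name for f in fields(self) if f.name not in ("what", "how")]
        return {n: getattr(self, n) for n in names if getattr(self, n) is not None}


query = Query(what="Quantity", how="map", where="Europe", by="country")
print(query.optional_parts())
```

Only "What?" and "How?" are mandatory in this sketch, mirroring the "(Optional)" markers in the list.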
Inside Ask Data Anything
Ask Data Anything is built from several blocks. Among them, the ontology modeling adds semantic layers on top of your data, expanding the search dimensionality and in turn providing an insightful querying experience.
Operational Semantics
Next we briefly describe the four exploring modes available: projection, aggregation, sub-setting (circumstantial) and material sub-setting (conceptual).
For demonstration purposes, let us take the data in the following table as our sample dataset:
Transaction | Item | Price | Quantity | City | Date | Trademark |
---|---|---|---|---|---|---|
T-001 | Sleeve Shirt | 543 | 11 | Warsaw | 03/07/2013 | Lacoste |
T-002 | Men's Dark One Button Suit | 1395 | 15 | Krakow | 07/12/2013 | Armani |
T-003 | Solid Polo Shirt | 580 | 18 | Krakow | 17/08/2012 | Gucci |
T-004 | Men's Mallow Graphic Tee | 163 | 9 | Warsaw | 19/03/2013 | Nike |
T-005 | I'm Bob Graphic Tee | 73 | 5 | Berlin | 01/02/2014 | Zara |
T-006 | 21 Years Old Women's Dark T-Shirt | 386 | 7 | Alicante | 22/12/2012 | Armani |
T-007 | Hanes Men's Comfortblend EcoSmart Jersey Polo, 2 Pack | 425 | 11 | Berlin | 14/03/2013 | Chanel |
T-008 | Men's Short Sleeve Stripe Polo | 820 | 7 | Boston | 05/09/2014 | Tommy Hilfiger |
T-009 | Women's Button Down Roll Tab Shirt | 244 | 12 | Munchen | 29/08/2014 | Lacoste |
T-010 | Men's jeans | 184 | 17 | Munchen | 06/02/2012 | Nike |
T-011 | Men's Geometric Print Short Sleeve Shirt | 975 | 12 | Alicante | 23/08/2014 | Armani |
T-012 | Men's Sasquatch Hunter Graphic Tee | 180 | 10 | Boston | 19/11/2012 | Gucci |
T-013 | Women's Essential V-Neck Tee | 147 | 3 | Madrid | 21/07/2013 | Nike |
T-014 | Men's Bass Guitar Guy Graphic Tee | 86 | 2 | Krakow | 22/03/2012 | Zara |
T-015 | Men's Essential shirt | 754 | 2 | Boston | 01/04/2013 | Zara |
T-016 | Women's Scoopneck Tee 2-Pack | 448 | 6 | Munchen | 26/07/2012 | Chanel |
What's inside?
To start exploring the possibilities, it is always useful to know what is inside. The dimensions are the columns of the data (quantitative fields), the possible operations are sum and average, and the outputs are histogram, table and map.
Projection
This identity operation allows for projections over the data, retrieving the subsets of it that meet certain requirements expressed through mathematical expressions.
Example query:
Item with price > 700
Aggregation
Aggregation is performed over hierarchical data, modeled in ontologies through typed instances (instances of concepts) related either by a "part of" property or by the ordinary time embedding, i.e., days as part of months and months as part of years. This way, instances of the concept country are related to the concept continent as "Every country is part of a continent".
Example query:
Quantity summed by month on histogram
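
To make the time embedding concrete, here is a minimal Python sketch of what "Quantity summed by month" computes over the sample table. It is not how Ask Data Anything is implemented; the function name `sum_quantity` is hypothetical, and grouping by year and month together (rather than by calendar month across years) is an assumption.

```python
from collections import defaultdict
from datetime import datetime

# (quantity, date as DD/MM/YYYY) pairs copied from the sample table.
SALES = [
    (11, "03/07/2013"), (15, "07/12/2013"), (18, "17/08/2012"),
    (9, "19/03/2013"), (5, "01/02/2014"), (7, "22/12/2012"),
    (11, "14/03/2013"), (7, "05/09/2014"), (12, "29/08/2014"),
    (17, "06/02/2012"), (12, "23/08/2014"), (10, "19/11/2012"),
    (3, "21/07/2013"), (2, "22/03/2012"), (2, "01/04/2013"),
    (6, "26/07/2012"),
]


def sum_quantity(rows, period_format):
    """Sum quantities per period; the format string picks the level."""
    totals = defaultdict(int)
    for quantity, date_text in rows:
        day = datetime.strptime(date_text, "%d/%m/%Y")
        # Days roll up into months ("%Y-%m") or years ("%Y").
        totals[day.strftime(period_format)] += quantity
    return dict(sorted(totals.items()))


by_month = sum_quantity(SALES, "%Y-%m")
by_year = sum_quantity(SALES, "%Y")
print(by_month)
print(by_year)
```

The same function serves both levels of the hierarchy: changing the period format moves the aggregation from months up to years.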
Sub-setting (Circumstantial sub-setting)
Sub-setting retrieves a subset of the data through a (circumstantial) belonging relation. This means we can ask about the specific results in a given country or in a modeled group of instances; in the latter case, we can constrain the result to groups of brands categorized by origin, e.g., Spanish, American, etc.
Example query:
Quantity summed in Europe by country on map
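
The following Python sketch illustrates the idea behind "Quantity summed in Europe by country". The `PART_OF` mapping stands in for the ontology and is an assumption made for this example (the post does not list the modeled cities and countries); the helper names are hypothetical as well.

```python
from collections import defaultdict

# (quantity, city) pairs copied from the sample table.
SALES = [
    (11, "Warsaw"), (15, "Krakow"), (18, "Krakow"), (9, "Warsaw"),
    (5, "Berlin"), (7, "Alicante"), (11, "Berlin"), (7, "Boston"),
    (12, "Munchen"), (17, "Munchen"), (12, "Alicante"), (10, "Boston"),
    (3, "Madrid"), (2, "Krakow"), (2, "Boston"), (6, "Munchen"),
]

# Assumed "part of" relations: city -> country -> continent.
PART_OF = {
    "Warsaw": "Poland", "Krakow": "Poland",
    "Berlin": "Germany", "Munchen": "Germany",
    "Alicante": "Spain", "Madrid": "Spain",
    "Boston": "USA",
    "Poland": "Europe", "Germany": "Europe", "Spain": "Europe",
    "USA": "North America",
}
COUNTRIES = {"Poland", "Germany", "Spain", "USA"}  # instances of "country"


def groups_of(instance):
    """Follow the "part of" chain upwards, nearest group first."""
    chain = []
    while instance in PART_OF:
        instance = PART_OF[instance]
        chain.append(instance)
    return chain


def sum_quantity_in(group, by_instances):
    """Sum quantity for rows located in `group`, keyed by the given level."""
    totals = defaultdict(int)
    for quantity, city in SALES:
        chain = groups_of(city)
        if group in chain:  # the "in Europe" restriction
            key = next(g for g in chain if g in by_instances)
            totals[key] += quantity
    return dict(totals)


europe_by_country = sum_quantity_in("Europe", COUNTRIES)
print(europe_by_country)
```

Note that neither "country" nor "continent" appears in the data table: both come from the assumed `PART_OF` relations, which is the point of the circumstantial sub-setting described above.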
Material sub-setting (Conceptual sub-setting)
Material sub-setting allows sub-setting by diving into the data properties as modeled in the provided ontology. This feature allows quite expressive queries such as:
Quantity summed in Europe for item of (type) t-shirt
By default the aggregation is performed on the target type marked here by the query sub-part "in Europe", which subsets the data using a continent as the discriminant, so the query returns the quantity per city (city being the type present in the data, and a part of a continent). This behavior can be modified by adding the "by" part, as in "by brand", which would retrieve the aggregated sum of the quantitative field Quantity by brand (Lacoste, Armani, etc.).
You can go further and run a consistency check by retrieving all t-shirts from the Item column.
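
The sketch below walks through the t-shirt query in plain Python: the default grouping by city, the "by brand" variant and the consistency check. Which items count as instances of the t-shirt concept, and which cities belong to Europe, would come from the ontology; the `T_SHIRTS` and `EUROPEAN_CITIES` sets here are assumptions made for illustration, as are all function names.

```python
from collections import defaultdict, namedtuple

Sale = namedtuple("Sale", "item quantity city brand")

# Rows copied from the sample table (Trademark column used as brand).
SALES = [
    Sale("Sleeve Shirt", 11, "Warsaw", "Lacoste"),
    Sale("Men's Dark One Button Suit", 15, "Krakow", "Armani"),
    Sale("Solid Polo Shirt", 18, "Krakow", "Gucci"),
    Sale("Men's Mallow Graphic Tee", 9, "Warsaw", "Nike"),
    Sale("I'm Bob Graphic Tee", 5, "Berlin", "Zara"),
    Sale("21 Years Old Women's Dark T-Shirt", 7, "Alicante", "Armani"),
    Sale("Hanes Men's Comfortblend EcoSmart Jersey Polo, 2 Pack", 11, "Berlin", "Chanel"),
    Sale("Men's Short Sleeve Stripe Polo", 7, "Boston", "Tommy Hilfiger"),
    Sale("Women's Button Down Roll Tab Shirt", 12, "Munchen", "Lacoste"),
    Sale("Men's jeans", 17, "Munchen", "Nike"),
    Sale("Men's Geometric Print Short Sleeve Shirt", 12, "Alicante", "Armani"),
    Sale("Men's Sasquatch Hunter Graphic Tee", 10, "Boston", "Gucci"),
    Sale("Women's Essential V-Neck Tee", 3, "Madrid", "Nike"),
    Sale("Men's Bass Guitar Guy Graphic Tee", 2, "Krakow", "Zara"),
    Sale("Men's Essential shirt", 2, "Boston", "Zara"),
    Sale("Women's Scoopneck Tee 2-Pack", 6, "Munchen", "Chanel"),
]

# Assumed ontology content: instances of "t-shirt" and cities in Europe.
T_SHIRTS = {
    "Men's Mallow Graphic Tee", "I'm Bob Graphic Tee",
    "21 Years Old Women's Dark T-Shirt", "Men's Sasquatch Hunter Graphic Tee",
    "Women's Essential V-Neck Tee", "Men's Bass Guitar Guy Graphic Tee",
    "Women's Scoopneck Tee 2-Pack",
}
EUROPEAN_CITIES = {"Warsaw", "Krakow", "Berlin", "Munchen", "Alicante", "Madrid"}


def t_shirt_quantity_in_europe(group_by="city"):
    """Sum quantity of European t-shirt sales, grouped by city or brand."""
    totals = defaultdict(int)
    for sale in SALES:
        if sale.item in T_SHIRTS and sale.city in EUROPEAN_CITIES:
            totals[getattr(sale, group_by)] += sale.quantity
    return dict(totals)


by_city = t_shirt_quantity_in_europe()           # default grouping
by_brand = t_shirt_quantity_in_europe("brand")   # the "by brand" variant
# Consistency check: every item the model treats as a t-shirt.
t_shirt_items = [sale.item for sale in SALES if sale.item in T_SHIRTS]
print(by_city, by_brand, t_shirt_items, sep="\n")
```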
Semantic Modeling
The key feature offered by Ask Data Anything is the addition of semantic layers on top of the data (layers that are not explicit in the data itself), which increases its dimensionality and enhances the possibilities for data exploration.
This way we can ask queries such as Price averaged in French-Brands, with French-Brands being a modeled instance of some "brands-by-origin" concept, which adds a grouping abstraction over the (modeled) brand instances that is not otherwise present in the data. Here we have models for the brand instances "Chanel" and "Lacoste", and therefore the average is computed over all occurrences of these two brands in the data.
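
As a worked example of the French-Brands query, the Python sketch below averages Price over the rows whose brand belongs to the modeled group. The dictionary `BRANDS_BY_ORIGIN` is a hypothetical stand-in for the "brands-by-origin" concept; only the membership of Chanel and Lacoste is taken from the text.

```python
# (price, brand) pairs copied from the sample table.
SALES = [
    (543, "Lacoste"), (1395, "Armani"), (580, "Gucci"), (163, "Nike"),
    (73, "Zara"), (386, "Armani"), (425, "Chanel"),
    (820, "Tommy Hilfiger"), (244, "Lacoste"), (184, "Nike"),
    (975, "Armani"), (180, "Gucci"), (147, "Nike"), (86, "Zara"),
    (754, "Zara"), (448, "Chanel"),
]

# Modeled group that exists only in the semantic layer, not in the data.
BRANDS_BY_ORIGIN = {"French-Brands": {"Chanel", "Lacoste"}}


def average_price_in(group):
    """Average Price over all rows whose brand is a member of `group`."""
    members = BRANDS_BY_ORIGIN[group]
    prices = [price for price, brand in SALES if brand in members]
    return sum(prices) / len(prices)


french_average = average_price_in("French-Brands")
print(french_average)  # (543 + 244 + 425 + 448) / 4 = 415.0
```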
In summary Ask Data Anything can handle queries involving:
- Location
- Time
- User-defined concepts
- Instances of predefined concepts
Watch a quick overview of Ask Data Anything: https://goo.gl/XnaIq3
To learn more about Semantic Technologies visit: http://goo.gl/7pkWIQ
Hello,
I have a question related to the data loading process and the data type recognition.
In the example above you have the column City, which contains the city names, and with ADA you are able to aggregate over them. I would like to understand whether ADA recognizes those values as cities by the column name (City), by the column content, or through some configuration file.
Thanks a lot in advance.