Wednesday, 28 October 2015

Ask Data Anything - Election results example

In modern organizations, data management is both a major challenge and a major resource. In our experience, the first challenge facing a business that wants to use its data is getting a unified view of it. Data inside an organization is generally stored in different databases, often with proprietary APIs, which makes it difficult to move from one database to another. Furthermore, even when the same technology is used to store the data, semantic problems remain, such as different terminologies and languages.


The bigger the company, the harder it is to standardize procedures so that such situations do not arise. This happens because we are human and naturally interpret data through our own experience and knowledge. We cannot expect the technical team to name every part of a car with exactly the same terminology as the logistics department. This is why our solution aims to standardize the way end users interact with the data without changing the data source itself.

Ask Data Anything (ADA) allows companies to add a semantic layer on top of their data without copying it. The product handles term disambiguation, aggregation of data using hierarchies defined in ontologies, and data integration across different data sources.
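
To make the idea of a semantic layer more concrete, here is a minimal Python sketch of term disambiguation and hierarchy-based aggregation. It is not ADA's implementation or ontology format: the `SYNONYMS` and `PARENT` tables, the car-part terms, and the function names are all assumptions made up for illustration. The point is only that the mapping lives beside the data, so the source records are never copied or rewritten.

```python
# Hypothetical taxonomy: every name below is an illustrative assumption,
# not ADA's actual ontology format or API.
SYNONYMS = {
    "bonnet": "hood",   # term used by one department
    "hood": "hood",     # canonical term
    "boot": "trunk",
    "trunk": "trunk",
}

# Child -> parent links, as a hierarchy in an ontology might define them.
PARENT = {
    "hood": "body",
    "trunk": "body",
    "body": "car",
}


def canonical(term: str) -> str:
    """Map a department-specific term to the shared canonical term."""
    key = term.lower()
    return SYNONYMS.get(key, key)


def ancestors(term: str) -> list[str]:
    """Return the chain of broader concepts above a term."""
    chain = []
    current = canonical(term)
    while current in PARENT:
        current = PARENT[current]
        chain.append(current)
    return chain


def roll_up(counts: dict[str, int], level: str) -> int:
    """Sum the values of all terms that equal `level` or fall under it."""
    total = 0
    for term, value in counts.items():
        name = canonical(term)
        if name == level or level in ancestors(name):
            total += value
    return total


# One source that names the same parts with different words.
stock = {"bonnet": 3, "hood": 2, "boot": 4}
print(canonical("Bonnet"))     # hood
print(roll_up(stock, "body"))  # 9
```

The `stock` dictionary is left untouched; only the way it is read changes, which mirrors the goal of standardizing how users interact with data rather than changing its source.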

This article presents the ADA tool using sample data about the 2014 election results in the Leeds borough, taken from DATA.GOV.UK (http://data.gov.uk/dataset/election-results). It shows example operations that can be performed over the data: first summarizing results, then aggregation, and finally projection. Each section includes example queries and result views. The overall knowledge, which combines the information from the data file (for example, a CSV file) with a taxonomy in the form of a CNL ontology, allows ADA to answer statistical queries such as parties' results in different regions, as well as results for specific candidates. ADA can handle queries involving time, location, and both predefined and user-defined concepts.


Summarizing data

One possibility is to ask ADA to summarize the parties' overall results and present them on a pie chart. The query result is computed from the data in the data file and the semantically modelled taxonomy. Thanks to that knowledge, ADA's engine can recognize the concepts included in the query and perform aggregation on demand.

Example query:
Summarize party by result on piechart



Performing aggregation

Each result can be output in several forms: table, histogram, pie chart, or map. It is also possible to apply different types of aggregation to numerical values, such as sum or average. The query below produces the overall sum of votes for each party as a histogram:

Example query:
Sum votes by party on histogram
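
For readers who think in code, the aggregation behind such a query can be sketched as a group-by followed by a reduction. This is a rough Python equivalent under stated assumptions, not ADA's engine: the rows are made up (they are not the real Leeds results), and the column names and the `aggregate` helper are hypothetical.

```python
from collections import defaultdict

# Made-up rows for illustration only; these are NOT the real Leeds results.
# Column names ("area", "party", "votes") are assumptions about the data file.
rows = [
    {"area": "Kirkstall", "party": "Party A", "votes": 1200},
    {"area": "Kirkstall", "party": "Party B", "votes": 800},
    {"area": "Beeston", "party": "Party A", "votes": 500},
    {"area": "Beeston", "party": "Party B", "votes": 900},
    {"area": "Otley", "party": "Party A", "votes": 300},
]


def aggregate(rows, group_by, value, how="sum"):
    """Group rows by one column and reduce another column per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])
    if how == "sum":
        return {key: sum(values) for key, values in groups.items()}
    if how == "average":
        return {key: sum(values) / len(values) for key, values in groups.items()}
    raise ValueError(f"unsupported aggregation: {how}")


print(aggregate(rows, "party", "votes"))  # {'Party A': 2000, 'Party B': 1700}
print(aggregate(rows, "party", "votes", how="average"))
```

Switching between sum and average changes only the reduction step; the grouping stays the same, which is why the two can share one query shape.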


Limiting the aggregation area

The domain of an operation can be limited to a specific area declared in the taxonomy. Here we present the result for a question similar to the previous one, but using only votes from the Kirkstall borough. This time the output type is a table, which is the default option.

Example query:
Sum votes by party in Kirkstall



It is possible to combine restrictions to answer more sophisticated queries. Once again one can ask for the votes given to parties, but this time summed over two boroughs: Kirkstall and Beeston.

Example query:
Sum votes by party in Kirkstall and Beeston on histogram
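
Restricting the aggregation area amounts to filtering rows before grouping them. The Python sketch below shows one plausible reading, assuming that "in Kirkstall and Beeston" means rows from either area. The data is made up, and `sum_votes_by_party` is a hypothetical helper, not part of ADA.

```python
from collections import defaultdict

# Made-up rows for illustration only; these are NOT the real Leeds results.
rows = [
    {"area": "Kirkstall", "party": "Party A", "votes": 1200},
    {"area": "Kirkstall", "party": "Party B", "votes": 800},
    {"area": "Beeston", "party": "Party A", "votes": 400},
    {"area": "Beeston", "party": "Party B", "votes": 900},
    {"area": "Otley", "party": "Party A", "votes": 300},
]


def sum_votes_by_party(rows, areas=None):
    """Sum votes per party, optionally keeping only rows from `areas`."""
    wanted = None if areas is None else set(areas)
    totals = defaultdict(int)
    for row in rows:
        # Assumption: several areas are combined as a union (either area).
        if wanted is None or row["area"] in wanted:
            totals[row["party"]] += row["votes"]
    return dict(totals)


print(sum_votes_by_party(rows, ["Kirkstall"]))
# {'Party A': 1200, 'Party B': 800}
print(sum_votes_by_party(rows, ["Kirkstall", "Beeston"]))
# {'Party A': 1600, 'Party B': 1700}
```

In ADA the set of valid area names comes from the taxonomy; here it is simply whatever appears in the `area` column.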


Performing projection over data

Another way to shape a query's meaning is the "with" keyword. A clause added after "with" filters the output so that it contains only specific results, as in the query below:

Example query:
Surname in Alwoodley with votes > 1000


Besides strictly statistical data, one can also ask for more specific information. The example below lists all parties whose candidates won mandates (seats) in the Otley borough.

Example query:
Party in Otley with mandate > 0
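
The "with" queries in this section can be read as a filter followed by a projection: keep the rows that match the area and the condition, then return a single column. The Python sketch below illustrates that reading. The rows, surnames, and the `project` helper are invented for the example, and treating `mandate` as a count of seats won is an assumption.

```python
import operator

# Comparison operators a "with" clause might use (assumed set).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}

# Made-up rows for illustration only; these are NOT the real Leeds results.
rows = [
    {"area": "Alwoodley", "surname": "Adams", "party": "Party A", "votes": 1500, "mandate": 1},
    {"area": "Alwoodley", "surname": "Baker", "party": "Party B", "votes": 900, "mandate": 0},
    {"area": "Alwoodley", "surname": "Clark", "party": "Party C", "votes": 1100, "mandate": 0},
    {"area": "Otley", "surname": "Davies", "party": "Party B", "votes": 2000, "mandate": 1},
    {"area": "Otley", "surname": "Evans", "party": "Party A", "votes": 700, "mandate": 0},
]


def project(rows, column, area, field, op, threshold):
    """Return distinct values of `column` for rows in `area` passing the test."""
    compare = OPS[op]
    result = []
    for row in rows:
        if row["area"] == area and compare(row[field], threshold):
            if row[column] not in result:  # keep first occurrence only
                result.append(row[column])
    return result


# "Surname in Alwoodley with votes > 1000"
print(project(rows, "surname", "Alwoodley", "votes", ">", 1000))  # ['Adams', 'Clark']
# "Party in Otley with mandate > 0"
print(project(rows, "party", "Otley", "mandate", ">", 0))         # ['Party B']
```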



Summary


Ask Data Anything supports data exploration by applying a semantic layer on top of the raw data, which makes it possible to run an analysis without explicitly stating all the information. It enables a variety of operations, such as aggregation or projection, to be performed over a data set when needed. ADA lets users formulate queries intuitively in natural language and presents results in a form convenient for them (table, histogram, map, or pie chart). Expanding the taxonomy makes it possible to ask more complex questions and extends the knowledge base with minimal effort.
