Monday, 19 January 2015

Mixing Text Mining with Semantic Technologies - sample application.

Natural language processing is a very broad and very active subject nowadays. In many cases a regular text mining approach is not adequate for the problems we are facing. Text mining methods are therefore combined with Natural Language Processing (NLP) methods and with semantic technologies, which gives better results. One such problem is how to determine whether two sentences are semantically equal.

A solution to this problem could be used in many fields. One of them is the detection of abusive clauses in a contract. Sometimes it is really hard to understand the exact meaning of a contract clause correctly, even for a specialist. For the sake of presentation I have developed a simple application prototype which attempts to solve this problem. The application was developed in C# and uses the Ontorion SDK.

Input

Before running the application we need three files:
  1. A file with the contract in which we will attempt to detect abusive clauses.
  2. A file with abusive clauses.
  3. A file with the ontology.

Files 1 and 2 can be in MS Word or txt format:


Figure 1


The ontology file was created in Fluent Editor. It contains knowledge extracted from files 1 and 2. In this sample the ontology is very small and simple:



Figure 2

The above ontology was exported to Ontorion so I could easily explore it from within the application using the Ontorion SDK.

In the application we just load the file with the contract and the file with abusive clauses, then specify the name of the Ontorion database in which the ontology resides, and then we can start the matching process.

Figure 3 

Algorithm

The sentence-matching algorithm does the following:
  1. Sentence detection in files 1 and 2.
  2. Tokenization of the sentences from files 1 and 2.
  3. Stemming of the words of each sentence from files 1 and 2.
  4. Obtaining knowledge from the ontology using the Ontorion SDK: the relations between concepts/instances along with their annotations.
  5. Calculating the improved semantic cosine similarity measure between every sentence of file 1 and every sentence of file 2.
This algorithm doesn't take word order into account, which in this particular case is an advantage. The ontology plays the most important part in the whole process: the richer the ontology is, the more accurate the results will be.
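
To make steps 1-3 and 5 more concrete, here is a minimal sketch in Python (the prototype itself is written in C#, and its code is not shown in this post). Everything in it is an assumption made for illustration: the function names, the regex-based sentence splitter, the crude suffix-stripping stemmer, and the plain term-frequency cosine, which is not the "improved semantic" measure used by the application. Step 4 (querying the ontology) is left out here.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately crude suffix list; a real system would use a proper stemmer.
SUFFIXES = ("ing", "able", "ment", "s")

def split_sentences(text):
    """Step 1: split after sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Step 2: lowercase the sentence and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", sentence.lower())

def stem(word):
    """Step 3: strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Steps 1-3 combined: one list of stemmed tokens per sentence."""
    return [[stem(t) for t in tokenize(s)] for s in split_sentences(text)]

def cosine(tokens_a, tokens_b):
    """Step 5 (plain variant): cosine of the two term-frequency vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Word order is ignored: the same words in a different order are a perfect match.
same_words = cosine(tokenize("money is not refundable"),
                    tokenize("refundable is not money"))
```

Because the sentences are reduced to bags of words, `same_words` comes out as 1 even though the word order differs, which is the property mentioned above.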

Example

As an example I have prepared one contract clause and one abusive clause. The ontology used is the same as in Figure 2. We will start from the end, that is, from the result of the comparison, and then describe it:

Figure 4

So the contract clause is as follows:

Payment for the training: after the charge, money is not refundable.

and the abusive clause is:

Payment for the service: after the payment the money is not returnable.

From the relations described in the ontology we know that:


Every training is a service.

Thus the words "training" and "service" can be considered semantically equal here. In our ontology we also defined annotations for each instance/concept, and these annotations are treated as synonyms. For the "payment" concept we defined, inter alia, a "charge" annotation, and for the "refundable" concept a "returnable" annotation. So despite the different words used in the two sentences, their meaning is the same, which is why the semantic similarity rate equals 1.

You can see this sample application in action in the video below:
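
The effect of the ontology on this pair of clauses can be sketched in Python as follows. This is only an illustration under stated assumptions, not the application's code: the `CANONICAL` dictionary is a hand-written stand-in for what the application obtains through the Ontorion SDK, the stop-word list is my own assumption, and the cosine is the plain term-frequency one.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for knowledge fetched from the ontology:
# the relation "Every training is a service." and the synonym annotations.
CANONICAL = {
    "training": "service",
    "charge": "payment",
    "returnable": "refundable",
}

# Assumption: function words are dropped before the comparison.
STOP_WORDS = {"the", "a", "an", "for", "is"}

def terms(sentence, mapping=None):
    """Count content words, optionally replacing each word by its canonical concept."""
    mapping = mapping or {}
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(mapping.get(w, w) for w in words if w not in STOP_WORDS)

def cosine(a, b):
    """Cosine similarity of two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

contract = "Payment for the training: after the charge, money is not refundable."
abusive = "Payment for the service: after the payment the money is not returnable."

plain = cosine(terms(contract), terms(abusive))                           # no ontology
semantic = cosine(terms(contract, CANONICAL), terms(abusive, CANONICAL))  # with ontology
```

Without the mapping the two clauses share only some of their words, so `plain` stays clearly below 1. With the mapping applied both clauses reduce to the same bag of concepts, and `semantic` equals 1.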






To learn more about Ontorion visit: http://www.cognitum.eu/semantics/

