IJCATR Volume 3 Issue 12

Sentence Validation by Statistical Language Modeling and Semantic Relations

Lakshay Arya
10.7753/IJCATR0312.1012
Keywords: language modeling, smoothing, chunking, statistical, semantic

This paper deals with Sentence Validation, a sub-field of Natural Language Processing. It has applications in many areas because it involves understanding natural language (English in most cases) and manipulating it. The effort is directed at understanding and extracting the important information delivered to the computer, making efficient human-computer interaction possible. Sentence Validation is approached in two ways: a statistical approach and a semantic approach. In both approaches the database is trained on sample sentences from the Brown corpus in NLTK. The statistical approach uses a trigram technique based on the N-gram Markov model, with modified Kneser-Ney smoothing to handle zero probabilities. As a second statistical test, sentences containing named entities are tagged and chunked using pre-defined grammar rules and semantic tree parsing, and the chunked sentences are fed into another database, on which testing is carried out. Finally, semantic analysis is carried out by extracting entity relation pairs, which are then tested. After the results of all three approaches are compiled, the three models are compared. Graphs of the probabilities produced by the three approaches are plotted and their variations studied; the graphs clearly demarcate the approaches and throw light on the findings of the project.
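The trigram scoring step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the function names `train_trigram_counts` and `sentence_logprob` are hypothetical, a three-sentence toy corpus stands in for the Brown corpus, and simple add-k smoothing is used as a stand-in for modified Kneser-Ney smoothing, purely to show how unseen trigrams avoid zero probability.

```python
import math
from collections import Counter

def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        # Pad so the first real word also has a two-word history.
        padded = ["<s>", "<s>"] + [t.lower() for t in tokens] + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            ctx[history] += 1
            tri[history + (padded[i],)] += 1
    return tri, ctx, vocab

def sentence_logprob(tokens, tri, ctx, vocab, k=1.0):
    """Log-probability of a sentence under an add-k smoothed trigram model.

    Assumption: add-k smoothing replaces the modified Kneser-Ney smoothing
    named in the abstract; it is simpler but serves the same purpose of
    giving unseen trigrams a non-zero probability.
    """
    padded = ["<s>", "<s>"] + [t.lower() for t in tokens] + ["</s>"]
    v = len(vocab)
    logp = 0.0
    for i in range(2, len(padded)):
        history = (padded[i - 2], padded[i - 1])
        numerator = tri[history + (padded[i],)] + k
        denominator = ctx[history] + k * v
        logp += math.log(numerator / denominator)
    return logp

# Toy training data (hypothetical; the paper trains on the Brown corpus).
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["a", "dog", "sleeps"],
]
tri, ctx, vocab = train_trigram_counts(corpus)

plausible = sentence_logprob(["the", "dog", "sleeps"], tri, ctx, vocab)
scrambled = sentence_logprob(["sleeps", "the", "dog"], tri, ctx, vocab)
print(plausible > scrambled)
```

In this sketch a word order that resembles the training sentences receives a higher log-probability than a scrambled order, which is the kind of signal a statistical validator can threshold on. A fuller implementation would replace the add-k estimate with a discounting and back-off scheme such as Kneser-Ney.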
@article{l3122014ijcatr03121012,
Title = "Sentence Validation by Statistical Language Modeling and Semantic Relations",
Journal ="International Journal of Computer Applications Technology and Research(IJCATR)",
Volume = "3",
Issue ="12",
Pages ="812 - 814",
Year = "2014",
Authors ="Lakshay Arya"}