An investigative design of optimum stochastic language model for Bangla autocomplete

Md. Iftakher Alam Eyamin, Md. Tarek Habib, Muhammad Ifte Khairul Islam, Md. Sadekur Rahman, Md. Abbas Ali Khan

Abstract


Word completion and word prediction are two important typing aids that greatly help people with disabilities and students when using a keyboard or similar devices. Such autocomplete techniques also help students significantly during the learning process by constructing proper keywords during web searches. Much work has been done for the English language, but work for Bangla is still inadequate, and the metrics used to compute performance are not yet rigorous. Bangla is one of the most widely spoken languages (3.05% of the world population) and ranks seventh among all the languages in the world. In this paper, stochastic (i.e., N-gram based) language models are proposed for word prediction on Bangla sentences, autocompleting a sentence by predicting a set of words rather than the single word predicted in previous work. A novel approach is proposed to find the optimum language model based on a performance metric. In addition, to obtain better performance, a large Bangla corpus of different word types is used.
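
As a rough illustration of N-gram based prediction that returns a set of candidate words, the sketch below builds a bigram (N = 2) model and suggests the k most frequent successors of a word. It is not the authors' implementation: the function names `train_bigram` and `predict_next`, the unigram fallback, and the three-sentence transliterated toy corpus are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count unigrams and bigram successors from whitespace-tokenized sentences."""
    unigrams = Counter()
    successors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        # Record each adjacent (previous word, next word) pair.
        for prev, nxt in zip(tokens, tokens[1:]):
            successors[prev][nxt] += 1
    return unigrams, successors


def predict_next(prev_word, unigrams, successors, k=3):
    """Return up to k candidate next words, most frequent first.

    Falls back to overall word frequency when prev_word has no
    recorded successor (an assumed, very simple back-off).
    """
    counts = successors.get(prev_word)
    if not counts:
        counts = unigrams
    return [word for word, _ in counts.most_common(k)]


# Hypothetical toy corpus (romanized Bangla), for illustration only.
corpus = [
    "ami bhat khai",
    "ami bhat ranna kori",
    "ami boi pori",
]
unigrams, successors = train_bigram(corpus)
suggestions = predict_next("ami", unigrams, successors, k=2)
print(suggestions)
```

With a real corpus, higher-order models (trigram and beyond) would be trained the same way by keying on the preceding N-1 words instead of one.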


Keywords


Word prediction, natural language processing, language model, N-gram, machine learning, eager learning, performance metric


DOI: http://doi.org/10.11591/ijeecs.v13.i2.pp671-676


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
