NLP & ML

Worked on topic modelling, automatic bot QA, turn text response, text similarity, LDA, tf-idf, filtering of exceptional questions and slang, etc. The app helps children discover top kids' videos, songs, and other learning content.

Worked on multilingual machine translation, speech-to-text, phonetic typing, text-to-speech, etc. The app is a messaging and calling app built around privacy and security.

Scope: SRBD Icon of the Month (July-August 2018) for outstanding contribution to the development of the Bangla ASR PoC, participation in an international NLP workshop in Jordan, and the inauguration of a collaborative ASR project with BUET professor Muhammad Abdullah Adnan, PhD, and his students to develop a full-scale Bangla ASR for Samsung Electronics. Published a research paper for Samsung titled “Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language” at the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019) in Minneapolis, USA. Guided several interns in achieving their project goals on time.

LDA (Latent Dirichlet Allocation) is a machine learning algorithm that extracts topics and their related keywords from a collection of documents. With this information about each document, it is possible to do context-based file searching, ranking files by their relevance to a search keyword, or to classify documents by context. In LDA, a document may contain several different topics, each with its own related terms. The number of topics is specified in advance; the algorithm then uses a probabilistic model to find that many topics and extract their related keywords. For example, a document may contain topics that could be classified as beach-related and weather-related. The beach topic may contain related words such as sand, ocean, and water. Similarly, the weather topic may contain related words such as sun, temperature, and clouds.
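The idea can be sketched in a few lines of plain Python. This is a minimal illustration, assuming collapsed Gibbs sampling as the inference method (one common choice, not necessarily the one used in the work described here). The names `lda_gibbs` and `top_words`, the toy documents, and the hyperparameter values `alpha` and `beta` are all illustrative assumptions.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: topic counts per document, word counts per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return up to n most frequent words for each topic."""
    result = []
    for counts in topic_word:
        seen = [w for w in counts if counts[w] > 0]
        result.append(sorted(seen, key=lambda w: (-counts[w], w))[:n])
    return result


# Toy corpus mixing beach-related and weather-related words.
docs = [
    "sand ocean water sand ocean".split(),
    "sun temperature clouds sun clouds".split(),
    "ocean water sand sun temperature".split(),
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word))
print(doc_topic)
```

Each row of `doc_topic` shows how many tokens of a document were assigned to each topic, which is the signal that context-based search or classification would build on. On a corpus this small the topics found depend heavily on the seed and hyperparameters, so the output is best read as a demonstration of the mechanics rather than a meaningful result.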

  • Spatio Temporal Relation Extraction From Online News Corpora In Different Languages.

Scope: Relation extraction (spatial and temporal together: person-location-time) from raw web news text in different languages (English, German, Hindi). Various Natural Language Processing and Machine Learning techniques were applied. Feature engineering was the most critical part; I designed a feature set common to all the languages in the experiments. Experimented with Support Vector Machines and later with ensemble learning. Before that, I designed a specialized annotation scheme, applied it to the collected web data, and stored the annotated data in Extensible Markup Language (XML) format for further use and improvement. Here are some resources about this project.

This project was carried out at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University, under the supervision of Judith Gelernter, PhD.

  • Isolated Bangla Word Recognition and Speaker Detection by Semantic Modular Time Delay Neural Network.

Scope: Developed a neural network model for recognizing isolated spoken Bengali words on a collected corpus. I combined concepts from neuroscience and brain research with Natural Language Processing techniques, using a neural network as the Machine Learning approach. The network was designed in a modular way, so that every module had a semantic role, with time-delay receptors to accumulate experience over time. Here are some resources about this project.

  • Extracting Semantic Relatedness For Bangla Words.

This project presents a framework for extracting semantically related words in Bangla. It primarily investigates the extraction of synonyms, antonyms, hyponyms, hypernyms, meronyms, holonyms, and polysemy using a rule-based model. For every word, two further items are also presented for clarification: its concept and its part-of-speech category. A semantic analyzer is used to extract these relations from nouns, adjectives, and verbs. Here are some resources about this project.

AI