Recent Projects

  • Detection of sockpuppets in Wikipedia [Aug’21 - July’22]

This project focuses on developing a framework to detect sockpuppets in Wikipedia. I am working on this project as part of my MS thesis.

  • Development of a search engine [Aug’21 - Dec’21]

In this project, I developed a search engine capable of query formation/suggestions and identifying candidate resources based on relevance ranking. [Project URL]

Past Projects

  • Movie recommendation system [Oct’21]

As part of this project, I built a content-based recommender system using the Disney+ movies and TV shows dataset. [Project URL]

  • Improving the representation of search engine result pages for children [Jan’21 - Aug’21]

This project improved search engine result pages for children by adding like, dislike, bookmark, and navigation icons to foster interaction. [Project URL]

  • Readability model for children [Jan’21 - Aug’21]

Developed a readability model for children’s text using lexical and syntactic features and BERT fine-tuning. [Project URL]

  • BERT fine-tuning [Apr’21]

In this project, I classified the WOS dataset using PyTorch with a Hugging Face BERT tokenizer and a pre-trained transformer model. [Project URL]

  • Context-free grammar design [Feb’21]

This project focuses on generating a context-free grammar for the Tamarian language with the NLTK library, using part-of-speech tags instead of words. [Project URL]

  • Identified misinformation on social media [Dec’20]

In this research project, misinformation was detected on social media datasets utilizing network, text, and news features. [Project URL]

  • Semi-optimal player for suspicion game [Dec’20]

In this project, optimization of the game moves was implemented using uncertainty approaches. [Project URL]

  • Agricultural change calculation [Dec’20]

Calculated agricultural trends and measured the impact of urban development and climate change on the rural land of Idaho. [Data: USDA Census of Agriculture, US Census, TerraClimate] [Project URL]

  • Assessing the prevalence of suspicious activities in asphalt pavement construction using algorithmic logics and machine learning [Aug’18 - July’20]

Civil Engineering MS thesis: It is estimated that $340 billion is lost globally each year due to corruption in the construction industry. Asphalt pavement construction is especially prone to suspicious activities, which can have large socioeconomic repercussions given the size of the industry. The Idaho Transportation Department (ITD) relies on Contractor-produced Quality Control (QC) and State-produced Quality Assurance (QA) test results for the payment of hot mix asphalt (HMA) pavement projects. In 2017, a case study found unnatural trends in which 74% of the ITD test results did not match the Contractor results. ITD’s effort to track the accuracy of mix design and volumetric test data inspired this research into suspicious activities in asphalt pavement projects. The first objective of this research was to develop an Artificial Intelligence system to recognize patterns of discrepancies in agency- and Contractor-produced QC/QA test results. This was possible with a unique dataset that ITD collected from several dozen HMA projects, in which all instances of data entry to the material testing report file were recorded in the background, without the operator’s knowledge. Our solution included developing an algorithm to automatically detect and categorize suspicious instances when multiple data entries were observed. Modern data mining approaches were used to explore latent insights and screen out suspicious incidences, identifying the chances of suboptimal materials being used for paving and of extra payment in HMA pavement projects. We also successfully applied a supervised machine learning technique to distinguish suspicious alteration (S.A.) instances from plausible correction (P.C.) cases. The second objective of this research was to calculate the monetary losses due to data manipulation and alteration.
We replicated ITD’s procedure for HMA payment calculation and quantified payment-related parameters and the associated payment for each project for two cases: (1) when the first parameter value categorized as S.A. was used for payment calculation, and (2) when the last S.A. parameter value was used for payment. Our findings show substantial overpayment on construction projects across Idaho due to the manipulations. Overall, we found overpayment ranging from $14,000 to $360,000 per project for the available audit data. Further analysis shows that manipulating the value of each major material testing parameter can cause roughly $1,000 to $5,000 of overpayment. We also note that data manipulation did not always pursue monetary gain; other possible motives include passing Percent Within Limit and precision criteria. Throughout the research, we strove to automate suspicious activity detection, calculate the associated extra payment, and develop tools to help practitioners improve the quality of HMA projects. [Project URL]

  • Detect fraudulent activity [Aug’18 - July’20]

This project was developed with the goal of applying machine learning algorithms to quantify fraudulent activity. [Project URL]