The effectiveness of a community Question Answering (cQA) forum depends on how well its questions are tagged. An appropriately tagged question reaches further, because the right audience is notified. It also makes search within the cQA more efficient. TagMyQuestion is a recommendation system that takes a user-given question title, together with a knowledge base of questions, and recommends additional relevant tags to the user. This improves the efficiency of the cQA and takes the tagging burden off the user.

The thesis can be viewed here

A draft of the research paper to be published can be viewed here

Hadoop single node setup

A single-node Hadoop cluster was set up for demonstration purposes, using WordCount as an example problem.

The documentation can be viewed here
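
For readers unfamiliar with WordCount, the map, shuffle and reduce phases can be sketched in plain Python. This is only an illustration of the idea, not the Hadoop job used in the setup; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the tokenisation (lowercasing, whitespace split) is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated token in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a MapReduce framework does
    between the map and reduce phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in a single process (hypothetical driver)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
```

Calling `word_count(["a b a", "b c"])` counts each word across both lines. On a real cluster the three phases run on separate tasks and the framework performs the shuffle.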

Document partitioned Search

Distributed inverted indexes are created for efficient keyword search in a corpus containing a large number of documents. The distribution can be obtained by partitioning either the keywords or the documents. In this minor project, document-partitioned indexes were created with 10%, 25%, 50% and 100% of the documents in a single inverted index. The BM25 ranking function was then used to rank the documents for keyword searches. The execution time of the ranking was compared across these partitionings to find the most efficient one.

The detailed report can be viewed here
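
A minimal sketch of document-partitioned search with BM25 is given below. It is not the project's code: the class and function names (`Shard`, `partition`, `search_all`) are hypothetical, the parameters `k1 = 1.5` and `b = 0.75` are commonly used defaults assumed here, and each shard computes its statistics locally, which may differ from what the report does. With these assumptions, one shard corresponds to the 100% case, two shards to 50%, four to 25% and ten to 10%.

```python
import heapq
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class Shard:
    """One partition: an inverted index over a subset of the documents."""

    def __init__(self, docs: Dict[int, str]) -> None:
        self.doc_len: Dict[int, int] = {}
        # term -> {doc_id: term frequency}
        self.postings: Dict[str, Dict[int, int]] = defaultdict(dict)
        for doc_id, text in docs.items():
            terms = text.lower().split()
            self.doc_len[doc_id] = len(terms)
            for term, tf in Counter(terms).items():
                self.postings[term][doc_id] = tf
        self.avg_len = sum(self.doc_len.values()) / max(len(self.doc_len), 1)

    def search(self, query: str, k1: float = 1.5, b: float = 0.75) -> Dict[int, float]:
        """Score the documents of this shard with BM25 (shard-local statistics)."""
        n = len(self.doc_len)
        scores: Dict[int, float] = defaultdict(float)
        for term in query.lower().split():
            plist = self.postings.get(term, {})
            idf = math.log((n - len(plist) + 0.5) / (len(plist) + 0.5) + 1.0)
            for doc_id, tf in plist.items():
                norm = k1 * (1 - b + b * self.doc_len[doc_id] / self.avg_len)
                scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
        return scores


def partition(docs: Dict[int, str], num_shards: int) -> List[Shard]:
    """Assign each document to a shard by doc_id modulo num_shards (an assumption)."""
    buckets: List[Dict[int, str]] = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        buckets[doc_id % num_shards][doc_id] = text
    return [Shard(bucket) for bucket in buckets]


def search_all(shards: List[Shard], query: str, top_k: int = 10) -> List[Tuple[int, float]]:
    """Query every shard and merge the partial results into one ranked list."""
    merged = [(score, doc_id)
              for shard in shards
              for doc_id, score in shard.search(query).items()]
    return [(doc_id, score) for score, doc_id in heapq.nlargest(top_k, merged)]
```

Timing `search_all` for different values of `num_shards` over the same corpus would reproduce the kind of comparison described above, although a single-process sketch says nothing about the gains from querying shards in parallel.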

Travelling Salesman Problem Approximation

The Travelling Salesman Problem (TSP), in its decision form, belongs to the class of NP-complete problems. However, there are several algorithms that find approximate solutions to TSP. One such approach uses a Genetic Algorithm. Further description can be found here.
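
The sketch below shows one common way to apply a genetic algorithm to TSP. It is an illustration under stated assumptions, not the project's implementation: tours are permutations of city indices, parents are picked by tournament selection, children are built with order crossover and mutated by swapping two cities, and the two best tours survive unchanged. All names and parameter values (`solve_tsp`, `pop_size`, `generations`, the mutation rate) are hypothetical. A genetic algorithm gives no guarantee of optimality.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[int], cities: Sequence[Point]) -> float:
    """Length of the closed tour that visits the cities in the given order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def order_crossover(p1: List[int], p2: List[int], rng: random.Random) -> List[int]:
    """Copy a slice from p1, then fill the gaps in the order cities appear in p2."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[i:j + 1] = p1[i:j + 1]
    kept = set(p1[i:j + 1])
    fill = (c for c in p2 if c not in kept)
    return [c if c is not None else next(fill) for c in child]


def mutate(tour: List[int], rng: random.Random, rate: float = 0.1) -> None:
    """With probability `rate`, swap two cities in place."""
    if rng.random() < rate:
        a, b = rng.sample(range(len(tour)), 2)
        tour[a], tour[b] = tour[b], tour[a]


def solve_tsp(cities: Sequence[Point], pop_size: int = 60,
              generations: int = 200, seed: int = 0) -> List[int]:
    """Return the best tour found; needs at least 2 cities and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(cities)

    def fitness(tour: List[int]) -> float:
        return tour_length(tour, cities)

    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: keep the two shortest tours
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random tours, twice.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            child = order_crossover(p1, p2, rng)
            mutate(child, rng)
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)
```

For example, `solve_tsp([(0, 0), (0, 1), (1, 1), (1, 0)])` returns a permutation of `[0, 1, 2, 3]`; `tour_length` can then be used to compare runs with different seeds or parameters.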

Nine Tile Solver

A* search was employed to solve the Nine Tile problem efficiently. Further details can be found here.
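
As an illustration, the sketch below applies A* to a 3x3 sliding-tile board. It assumes, without confirmation from the project, that the puzzle is the usual 3x3 grid with eight numbered tiles and one blank (written `0`), that the goal is `1..8` followed by the blank, and that the heuristic is the Manhattan distance. The names `GOAL`, `manhattan`, `neighbours` and `solve` are hypothetical.

```python
import heapq
from typing import Dict, Iterator, List, Optional, Tuple

State = Tuple[int, ...]
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # assumed goal layout, 0 is the blank


def manhattan(state: State) -> int:
    """Sum of the row and column distances of every tile from its goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile:  # the blank does not count
            goal_idx = tile - 1
            total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total


def neighbours(state: State) -> Iterator[State]:
    """Yield every board reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            target = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            yield tuple(nxt)


def solve(start: State) -> Optional[List[State]]:
    """Return a shortest list of boards from start to GOAL, or None if unsolvable."""
    frontier = [(manhattan(start), 0, start)]  # entries are (f = g + h, g, state)
    best_cost: Dict[State, int] = {start: 0}
    parent: Dict[State, Optional[State]] = {start: None}
    while frontier:
        _, cost, state = heapq.heappop(frontier)
        if state == GOAL:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        if cost > best_cost[state]:
            continue  # stale queue entry, a cheaper route was found later
        for nxt in neighbours(state):
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = state
                heapq.heappush(frontier, (new_cost + manhattan(nxt), new_cost, nxt))
    return None
```

Because the Manhattan distance never overestimates the remaining number of moves, the first time the goal is popped from the queue the path found is a shortest one.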

BPlus Store

A simple database was implemented using B+ trees. Data is stored along with the index. Further details can be found here.
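
To make the idea of storing data along with the index concrete, here is a minimal in-memory B+ tree sketch in which the records live in the leaves and the leaves are linked for ordered scans. It is an assumption-laden illustration, not the project's code: the names (`Node`, `BPlusTree`, `order`) are hypothetical, and deletion and on-disk storage are left out.

```python
import bisect
from typing import Any, Iterator, List, Optional, Tuple


class Node:
    def __init__(self, leaf: bool) -> None:
        self.leaf = leaf
        self.keys: List[Any] = []
        self.children: List["Node"] = []     # internal nodes only
        self.values: List[Any] = []          # leaves only: the stored records
        self.next: Optional["Node"] = None   # leaves only: right sibling


class BPlusTree:
    def __init__(self, order: int = 4) -> None:
        self.order = order  # maximum number of keys a node holds before it splits
        self.root = Node(leaf=True)

    def get(self, key: Any) -> Optional[Any]:
        """Walk from the root to a leaf and return the record, or None."""
        node = self.root
        while not node.leaf:
            node = node.children[bisect.bisect_right(node.keys, key)]
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.values[i]
        return None

    def insert(self, key: Any, value: Any) -> None:
        split = self._insert(self.root, key, value)
        if split is not None:  # the root itself split, so the tree grows one level
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node: Node, key: Any, value: Any) -> Optional[Tuple[Any, Node]]:
        if node.leaf:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # existing key: overwrite the record
                return None
            node.keys.insert(i, key)
            node.values.insert(i, value)
        else:
            i = bisect.bisect_right(node.keys, key)
            split = self._insert(node.children[i], key, value)
            if split is None:
                return None
            sep, right = split
            node.keys.insert(i, sep)
            node.children.insert(i + 1, right)
        return self._split(node) if len(node.keys) > self.order else None

    def _split(self, node: Node) -> Tuple[Any, Node]:
        """Split an overfull node; return the separator key and the new right node."""
        mid = len(node.keys) // 2
        right = Node(leaf=node.leaf)
        if node.leaf:
            right.keys, right.values = node.keys[mid:], node.values[mid:]
            node.keys, node.values = node.keys[:mid], node.values[:mid]
            right.next, node.next = node.next, right
            return right.keys[0], right  # separator is copied up, it stays in the leaf
        sep = node.keys[mid]             # separator is moved up out of the node
        right.keys, right.children = node.keys[mid + 1:], node.children[mid + 1:]
        node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
        return sep, right

    def items(self) -> Iterator[Tuple[Any, Any]]:
        """Yield (key, record) pairs in key order by following the leaf links."""
        node = self.root
        while not node.leaf:
            node = node.children[0]
        while node is not None:
            yield from zip(node.keys, node.values)
            node = node.next
```

A short usage example: after `tree = BPlusTree()` and `tree.insert(42, "row-42")`, the call `tree.get(42)` returns the stored record and `list(tree.items())` lists all records in key order.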