The efficiency of a community Question Answering (cQA) forum depends on how well its questions are tagged. An appropriately tagged question reaches further because the right audience is notified, and it also makes search within the cQA more efficient. TagMyQuestion is a recommendation system that takes a user-given question title, together with a knowledge base of questions, and recommends additional relevant tags to the user. This improves the efficiency of the cQA and takes the burden of tagging away from the user.
The thesis can be viewed here
A draft of the research paper to be published can be viewed here
A single-node Hadoop cluster was set up for demonstration purposes, using WordCount as the example problem.
The documentation can be viewed here
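To illustrate the WordCount example problem mentioned above, here is a minimal sketch of its map, shuffle and reduce phases in plain Python. It does not use Hadoop and is not the code from the project; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the tokenization rule are assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    # Assumption: words are lowercase runs of letters, digits or apostrophes.
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", line.lower())]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cat", "The dog the"]))
```

On a real cluster the map and reduce functions run as separate tasks and the framework performs the shuffle; here everything runs in one process only to show the data flow.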
Distributed inverted indexes are created for efficient keyword search in a corpus containing a large number of documents. The distribution can be obtained by partitioning either the keywords or the documents. In this minor project, document-partitioned indexes were created with 10%, 25%, 50% and 100% of the documents in a single inverted index. The BM25 ranking function was then used to rank the documents for keyword searches. The execution time of this ranking was compared across the partitionings to find the most efficient one.
The detailed report can be viewed here
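The idea of document-partitioned indexes ranked with BM25 can be sketched as follows. This is a small in-memory illustration, not the project's implementation: the class and function names are hypothetical, the BM25 variant (idf with a `+1` inside the logarithm) and the defaults `k1=1.2`, `b=0.75` are common choices assumed here, and collection statistics are computed globally across partitions so that scores do not depend on how the documents are split.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Assumption: whitespace tokenization on lowercased text."""
    return text.lower().split()


class PartitionIndex:
    """Inverted index over one partition: term -> {doc_id: term frequency}."""

    def __init__(self, docs):
        self.postings = defaultdict(dict)
        self.doc_len = {}
        for doc_id, text in docs.items():
            tokens = tokenize(text)
            self.doc_len[doc_id] = len(tokens)
            for term, tf in Counter(tokens).items():
                self.postings[term][doc_id] = tf


def partition_documents(docs, fraction):
    """Split docs so that each index holds about `fraction` of the corpus."""
    ids = sorted(docs)
    size = max(1, math.ceil(len(ids) * fraction))
    return [
        PartitionIndex({i: docs[i] for i in ids[start:start + size]})
        for start in range(0, len(ids), size)
    ]


def bm25_search(partitions, query, k1=1.2, b=0.75):
    """Score documents in every partition and merge into one ranked list."""
    n_docs = sum(len(p.doc_len) for p in partitions)
    avg_len = sum(sum(p.doc_len.values()) for p in partitions) / n_docs
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        # Document frequency is summed over all partitions (global statistic).
        df = sum(len(p.postings.get(term, {})) for p in partitions)
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for p in partitions:
            for doc_id, tf in p.postings.get(term, {}).items():
                norm = k1 * (1 - b + b * p.doc_len[doc_id] / avg_len)
                scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    corpus = {1: "apple banana apple", 2: "banana cherry",
              3: "apple cherry cherry date", 4: "date egg"}
    for fraction in (0.25, 0.5, 1.0):
        print(fraction, bm25_search(partition_documents(corpus, fraction), "apple"))
```

Wrapping the `bm25_search` call with `time.perf_counter()` for each fraction would give a rough execution-time comparison of the kind described above, although timings on a toy corpus say little about a real one.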
The Travelling Salesman Problem (TSP) is NP-hard, and its decision version is NP-complete, so no polynomial-time exact algorithm is known. However, several heuristic and approximation algorithms can find near-optimal tours. One such approach uses a Genetic Algorithm. Further description can be found here
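A genetic algorithm for TSP can be sketched roughly as below. This is an illustrative version under assumptions, not the project's code: it uses tournament selection, order crossover, swap mutation and elitism, and all function names and parameter defaults (`pop_size`, `generations`, `mutation_rate`, `seed`) are hypothetical. It returns a good tour, with no guarantee of optimality.

```python
import math
import random


def tour_length(tour, cities):
    """Total Euclidean length of the closed tour visiting cities in order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def order_crossover(p1, p2, rng):
    """Copy a slice of p1, then fill the gaps with p2's cities in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(p1[i:j + 1])
    fill = [city for city in p2 if city not in kept]
    for pos in range(n):
        if child[pos] is None:
            child[pos] = fill.pop(0)
    return child


def mutate(tour, rng, rate):
    """With probability `rate`, swap two randomly chosen cities in place."""
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]


def tournament(population, cities, rng, k=3):
    """Pick the shortest tour among k randomly sampled individuals."""
    return min(rng.sample(population, k), key=lambda t: tour_length(t, cities))


def solve_tsp(cities, pop_size=50, generations=200, mutation_rate=0.2, seed=0):
    """Evolve a population of tours; needs at least 2 cities and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        best = min(population, key=lambda t: tour_length(t, cities))
        next_gen = [best[:]]  # elitism: the best tour always survives
        while len(next_gen) < pop_size:
            child = order_crossover(
                tournament(population, cities, rng),
                tournament(population, cities, rng),
                rng,
            )
            mutate(child, rng, mutation_rate)
            next_gen.append(child)
        population = next_gen
    return min(population, key=lambda t: tour_length(t, cities))


if __name__ == "__main__":
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]
    best_tour = solve_tsp(square)
    print(best_tour, tour_length(best_tour, square))
```

Order crossover is used because it always yields a valid permutation of the cities, which a naive one-point crossover on tours would not.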