International Data Science & Engineering Symposium

Adopting Machine Learning Algorithms for Cloud-Based Application Categorization

Çağatay ÇATAL, Besme ELNACCAR, Özge ÇOLAKOĞLU, Bedir TEKİNERDOĞAN

Abstract

Manual categorization of applications in software repositories such as SourceForge is often time-consuming and error-prone. Automating this process not only simplifies the daily work of administrators but also helps project owners add their projects to the appropriate subcategory of the repository without delay. In this study, we propose a cloud-based application categorization system that applies machine learning algorithms to support the classification of applications. The system consists of a web-based client application that parses, processes, and submits the project source code; a web service that automatically classifies applications into domain categories; and a cloud-computing platform that hosts the categorization service. Several multi-class classification algorithms, including Artificial Neural Networks, Logistic Regression, Decision Jungle, and Decision Forest, were adopted to validate the effectiveness of the system in multiple case studies. The case studies were performed on three public datasets generated from 3286 Java applications in the SourceForge repository. Our study shows that the highest accuracy was achieved with the Artificial Neural Networks (ANN) algorithm. The resulting prediction model was transformed into a web service and then deployed on the Azure cloud platform.
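
The classification step summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify the features, categories, or training setup, so everything below is an assumption made for illustration. The sketch assumes bag-of-words features over identifier tokens, a hand-written multinomial logistic regression (one of the algorithm families named above), three hypothetical categories, and six toy Java-like snippets standing in for the SourceForge datasets.

```python
import math
import re
from collections import Counter

# Hypothetical domain categories; the paper's actual categories are not listed here.
CATEGORIES = ["Games", "Database", "Networking"]
# Small illustrative subset of Java keywords to ignore (assumption).
STOPWORDS = {"public", "private", "static", "class", "void", "int", "return", "new"}

def tokenize(source):
    """Split source text into lowercase word tokens, breaking camelCase apart.
    All-caps acronyms are dropped by this simple pattern."""
    words = re.findall(r"[A-Z]?[a-z]+", source)
    return [w.lower() for w in words if w.lower() not in STOPWORDS]

def build_vocab(samples):
    """Sorted list of every token seen in the training samples."""
    return sorted({tok for s in samples for tok in tokenize(s)})

def featurize(source, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(source))
    return [float(counts[w]) for w in vocab]

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(x, weights):
    return [sum(w * v for w, v in zip(weights[c], x)) for c in CATEGORIES]

def train(samples, labels, vocab, epochs=200, lr=0.1):
    """Multinomial logistic regression fitted with plain stochastic gradient descent."""
    weights = {c: [0.0] * len(vocab) for c in CATEGORIES}
    rows = [featurize(s, vocab) for s in samples]
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            probs = softmax(class_scores(x, weights))
            for c, p in zip(CATEGORIES, probs):
                grad = p - (1.0 if c == y else 0.0)  # d(cross-entropy)/d(score)
                weights[c] = [w - lr * grad * v for w, v in zip(weights[c], x)]
    return weights

def predict(source, weights, vocab):
    scores = class_scores(featurize(source, vocab), weights)
    return CATEGORIES[scores.index(max(scores))]

# Toy training data (invented for this sketch, not taken from the paper's datasets).
TRAIN = [
    ("public class Player { int score; void renderSprite() {} }", "Games"),
    ("class Level { void spawnEnemy(Player hero) {} }", "Games"),
    ("class QueryRunner { void executeQuery(Table table) {} }", "Database"),
    ("class TableIndex { int rowCount; void commitTransaction() {} }", "Database"),
    ("class SocketServer { void acceptConnection(int port) {} }", "Networking"),
    ("class PacketRouter { void sendPacket(Socket socket) {} }", "Networking"),
]
samples = [s for s, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = build_vocab(samples)
weights = train(samples, labels, vocab)

if __name__ == "__main__":
    print(predict("class BossFight { void renderEnemy(Sprite boss) {} }", weights, vocab))
    print(predict("class PortScanner { void openSocket(int port) {} }", weights, vocab))
```

In a deployment like the one described, a function such as predict would sit behind the web service endpoint, while the client application would handle parsing and submission. A real system would also need held-out evaluation data, which this toy example omits.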



Conference
International Data Science & Engineering Symposium
Keywords
Cloud Computing, Software Maintenance, Machine Learning, Application Categorization, End-To-End Cloud System

Language
English

Subject
Engineering
