ResearchHub | Open Science Community

Mining software insights: uncovering the frequently occurring issues in low-rating software applications

Nek Khan et al.Jul 10, 2024

In today's digital world, app stores have become an essential part of software distribution, providing customers with a wide range of applications and opportunities for software developers to showcase their work. This study elaborates on the importance of end-user feedback for software evolution. However, in the literature, more emphasis has been given to high-rating & popular software apps while ignoring comparatively low-rating apps. Therefore, the proposed approach focuses on end-user reviews collected from 64 low-rated apps representing 14 categories in the Amazon App Store. We critically analyze feedback from low-rating apps and developed a grounded theory to identify various concepts important for software evolution and improving its quality including user interface (UI) and user experience (UX), functionality and features, compatibility and device-specific, performance and stability, customer support and responsiveness and security and privacy issues. Then, using a grounded theory and content analysis approach, a novel research dataset is curated to evaluate the performance of baseline machine learning (ML), and state-of-the-art deep learning (DL) algorithms in automatically classifying end-user feedback into frequently occurring issues. Various natural language processing and feature engineering techniques are utilized for improving and optimizing the performance of ML and DL classifiers. Also, an experimental study comparing various ML and DL algorithms, including multinomial naive Bayes (MNB), logistic regression (LR), random forest (RF), multi-layer perception (MLP), k-nearest neighbors (KNN), AdaBoost, Voting, convolutional neural network (CNN), long short-term memory (LSTM), bidirectional long short term memory (BiLSTM), gated recurrent unit (GRU), bidirectional gated recurrent unit (BiGRU), and recurrent neural network (RNN) classifiers, achieved satisfactory results in classifying end-user feedback to commonly occurring issues. Whereas, MLP, RF, BiGRU, GRU, CNN, LSTM, and Classifiers achieved average accuracies of 94%, 94%, 92%, 91%, 90%, 89%, and 89%, respectively. We employed the SHAP approach to identify the critical features associated with each issue type to enhance the explainability of the classifiers. This research sheds light on areas needing improvement in low-rated apps and opens up new avenues for developers to improve software quality based on user feedback.

How Do Crowd-Users Express Their Opinions Against Software Applications in Social Media? A Fine-Grained Classification Approach

Nek Khan et al.Jan 1, 2024

App stores allow users to search, download, and purchase software applications to accomplish daily tasks. Also, they enable crowd-users to submit textual feedback or star ratings to the downloaded software apps based on their satisfaction. Recently, crowd-user feedback contains critical information for software developers, including new features, issues, non-functional requirements, etc. Previously, identifying software bugs in low-star software applications was ignored in the literature. For this purpose, we proposed a natural language processing-based (NLP) approach to recover frequently occurring software issues in the Amazon Software App (ASA) store. The proposed approach identified prevalent issues using NLP part-of-speech (POS) analytics. Also, to better understand the implications of these issues on end-user satisfaction, different machine learning (ML) algorithms are used to identify crowd-user emotions such as anger, fear, sadness, and disgust with the identified issues. To this end, we shortlisted 45 software apps with comparatively low ratings from the ASA Store. We investigated how crowd-users reported their grudges and opinions against the software applications using the grounded theory & content analysis approaches and prepared a grounded truth for the ML experiments. ML algorithms, such as MNB, LR, RF, MLP, KNN, AdaBoost, and Voting Classifier, are used to identify the associated emotions with each captured issue by processing the annotated end-user data set. We obtained satisfactory classification results, with MLP and RF classifiers having 82% and 80% average accuracies, respectively. Furthermore, the ROC curves for better-performing ML classifiers are plotted to identify the best-performing under or oversampling classifier to be selected as the final best classifier. Based on our knowledge, the proposed approach is considered the first step in identifying frequently occurring issues and corresponding end-user emotions for low-ranked software applications. The software vendors can utilize the proposed approach to improve the performance of low-ranked software apps by incorporating it into the software evolution process promptly.

Mining software insights: uncovering the frequently occurring issues in low-rating software applications

How Do Crowd-Users Express Their Opinions Against Software Applications in Social Media? A Fine-Grained Classification Approach

Scan to connect with one of our mobile apps

Coinbase Wallet app

Coinbase app

Or try the Coinbase Wallet browser extension