From Conventional Methods to Large Language Models: A Systematic Review of Techniques in Mobile App Review Analysis
This paper reviews app review analysis techniques, motivated by the rapid growth of the mobile app market and by advances in natural language processing (NLP) techniques for optimizing mobile app user experiences.
Owing to technological advancements, app review analysis has rapidly evolved. This study examines both conventional and emerging techniques, including current advancements such as large language models (LLMs) in app review analysis. It provides an overview of the various methods used across different categories of app review analysis, comparing effective strategies for identifying user concerns and enhancing app functionality.
A systematic review was conducted following two standard guidelines, PRISMA and Kitchenham’s guidelines, covering the period from 2014 to 2024. After defining the review protocol, papers were identified through keyword-based searches on six major online databases: Scopus, Web of Science, IEEE Xplore, ACM Digital Library, ScienceDirect, and Springer. After screening and excluding papers against defined quality criteria, 53 papers were included in this study. The use of PRISMA ensures a transparent and reproducible review process, while Kitchenham’s guidelines provide a structured and rigorous approach for evaluating and synthesizing the literature.
This review aims to evaluate the current state of knowledge on app review analysis techniques for improving mobile app user experiences. It groups the existing state-of-the-art papers into eight categories, including sentiment analysis, review classification, summarization, and prioritization, and examines the challenges related to app review analysis. Furthermore, the study emphasizes the potential of LLMs for optimizing and automating app review analysis and provides future directions to address gaps in user-centric app development.
Among the eight main categories defined in app review analysis, sentiment analysis is the most prevalent, followed by review classification and information extraction. Most studies combine several of these categories to achieve a comprehensive goal. Prioritization techniques such as risk matrices, thumbs-up count-based approaches, and anomaly detection are widely used to identify emerging issues. Extracting meaningful information and evaluating the proposed approach are the most common challenges identified. LLMs such as ChatGPT significantly enhance review analysis by automating the process, improving feature extraction, and enabling context-aware review classification.
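As an illustration of the thumbs-up count-based prioritization mentioned above, the following is a minimal sketch and not a method taken from any of the reviewed papers. The `Review` fields, the rating cutoff of 2, and the scoring rule (one point per negative review plus its thumbs-up count) are all assumptions made for the example; the issue labels are assumed to come from an earlier classification step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Review:
    text: str
    rating: int      # star rating, 1 to 5
    thumbs_up: int   # number of users who marked the review as helpful
    issue: str       # issue label, assumed to come from an earlier classifier


def prioritize_issues(reviews, max_rating=2):
    """Rank issues by a thumbs-up-weighted count of low-rated reviews.

    max_rating is a hypothetical cutoff: reviews rated above it are
    treated as non-negative and ignored.
    """
    scores = defaultdict(int)
    for r in reviews:
        if r.rating <= max_rating:
            # Each negative review counts once, plus once per endorsement.
            scores[r.issue] += 1 + r.thumbs_up
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


reviews = [
    Review("Crashes on login", 1, 40, "crash"),
    Review("Login crash again", 2, 5, "crash"),
    Review("Battery drains fast", 2, 12, "battery"),
    Review("Love the new design", 5, 30, "ui"),
]
ranking = prioritize_issues(reviews)
print(ranking)
```

In this toy data the "crash" issue ranks first because its two negative reviews carry the most endorsements, while the positive "ui" review is ignored.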
Combining conventional approaches with novel LLM-based methods can enhance both the efficiency and the accuracy of identifying and addressing critical issues raised in mobile app user reviews. Such a combination prioritizes user concerns effectively by leveraging the strengths of both traditional preprocessing techniques and advanced LLMs.
Researchers are encouraged to explore the integration of emerging technologies like LLMs to enhance app review analysis, particularly in feature-specific sentiment analysis.
The results of this study contribute to enhancing the mobile app user experience through effective app review analysis, which improves user satisfaction and supports user-centered app development. This ultimately leads to a better mobile app ecosystem, benefiting both users and developers.
In the future, this research can be extended in multiple directions. Researchers can target the research gaps that LLM-based approaches have yet to close, particularly in prioritizing user concerns. Additionally, there is potential for further research on tool implementations that identify persistent issues through time series analysis, taking into account the app version and the date of each review. Moreover, there is a need to develop comprehensive frameworks that generalize across different apps and categories, with a focus on identifying user concerns related to specific features.
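The idea of identifying persistent issues from app version and review date could be sketched as follows. This is a hedged illustration only: the `(version, date, issue)` tuple layout, the `persistent_issues` function, and the rule that an issue is "persistent" once it appears in at least `min_versions` distinct versions are assumptions made for the example, not a design proposed in the reviewed literature.

```python
from collections import defaultdict
from datetime import date


def persistent_issues(reviews, min_versions=2):
    """Find issues reported across several app versions.

    reviews: iterable of (version, review_date, issue) tuples, where the
    issue label is assumed to come from an earlier classification step.
    Returns {issue: (distinct_version_count, first_seen, last_seen)} for
    issues seen in at least min_versions distinct versions.
    """
    versions = defaultdict(set)
    first_seen = {}
    last_seen = {}
    for version, day, issue in reviews:
        versions[issue].add(version)
        first_seen[issue] = min(first_seen.get(issue, day), day)
        last_seen[issue] = max(last_seen.get(issue, day), day)
    return {
        issue: (len(vs), first_seen[issue], last_seen[issue])
        for issue, vs in versions.items()
        if len(vs) >= min_versions
    }


sample = [
    ("2.0", date(2024, 1, 5), "crash"),
    ("2.1", date(2024, 2, 10), "crash"),
    ("2.1", date(2024, 2, 12), "battery"),
    ("2.2", date(2024, 3, 1), "crash"),
]
persistent = persistent_issues(sample)
print(persistent)
```

Here "crash" is flagged because it recurs in three versions over roughly two months, whereas "battery" appears in only one version and is not reported.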