One of our customers’ mobile apps had a major memory issue: an “out of memory” error (OOME) that eventually crashed the app, drawing many complaints from its users. The crash logs confirmed that the crashes were caused by exhausted memory, but the real problem was not the crash per se; it was identifying why the app was crashing.
Our testing context
We use Robotium for automation to help us test for regression and reliability, and as part of our Continuous Integration. Our exploratory testers pick up the APK once a build passes the automated checks. Our focus is on users and the business: our customer’s app connects their business to its users, and we test for problems that threaten the value of the product. We bring our customers different kinds of information, not just functional validation.
In the particular context of this OOME crash, our automated scripts helped us identify it; on investigating, we found that the script which surfaced it was a reliability check for infinite scrolling.
Part 1 of our investigation: Root cause analysis
We analyzed the crash logs and other relevant logs using the following tools:
- Little Eye
- Eclipse MAT (Memory Analyzer Tool)
- Robotium, our automation tool
Regular performance analysis with Little Eye showed that heap memory was not being freed; the heap eventually exceeded its allocated size, which is what crashed the app. But we still could not narrow down which objects were holding the memory, so we turned to Eclipse MAT. It revealed that TextViews and ImageViews were causing the problem.
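Alongside these tools, the runtime itself reports the numbers the OOME is thrown against: an allocation fails when it cannot fit under the process’s maximum heap. A minimal sketch in plain Java (names here are ours, not from the app under test) of sampling heap pressure at runtime:

```java
// Minimal sketch: sampling heap usage at runtime to spot growth that is
// never reclaimed. Runtime.maxMemory() is the per-process heap cap that
// an OutOfMemoryError is thrown against when an allocation cannot fit.
public class HeapProbe {

    // Bytes currently in use on the heap.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Fraction of the maximum heap currently in use (0.0 to 1.0).
    static double usedFraction() {
        return (double) usedBytes() / Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %d bytes (%.1f%% of max)%n",
                usedBytes(), usedFraction() * 100);
    }
}
```

Logging these values around the scrolling scenario gives a cheap signal that usage climbs and never drops, before reaching for a full heap-dump analysis.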
Part 2 of our investigation: What can help us serve users better?
The results of Part 1 of the investigation were shared with our customer’s dev and tech teams. They debugged further and identified that three TextView objects inside a ViewPager were not releasing their memory; if those objects were exercised heavily by certain user actions, the app ran out of memory.
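We do not have the customer’s code to show, but the general failure mode is a pager-style adapter that caches every page view it creates and never evicts in its destroy callback, so heap use grows with every page visited. A plain-Java sketch of the pattern and its fix (all class and method names are illustrative, not the customer’s):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch (hypothetical names): a pager-style adapter that
// caches the views it creates. Without eviction, the cache only grows.
class LeakyPagerAdapter {
    // Cache of created page views; left unbounded, this is the leak.
    final Map<Integer, List<String>> pageViews = new HashMap<>();

    void instantiateItem(int position) {
        // Stand-in for inflating three text views per page.
        pageViews.put(position, new ArrayList<>(Arrays.asList(
                "title-" + position, "body-" + position, "footer-" + position)));
    }

    // The fix: release the page's views when the pager destroys the page.
    void destroyItem(int position) {
        pageViews.remove(position);
    }
}

public class LeakDemo {
    public static void main(String[] args) {
        LeakyPagerAdapter adapter = new LeakyPagerAdapter();
        // Simulate infinite scrolling: each new page retires the previous one.
        for (int page = 0; page < 1000; page++) {
            adapter.instantiateItem(page);
            if (page > 0) adapter.destroyItem(page - 1);
        }
        // With destroyItem wired up, only the current page is retained.
        System.out.println(adapter.pageViews.size()); // prints 1
    }
}
```

Without the `destroyItem` call, the map would hold all 1000 pages’ views after the loop, which is exactly the growth pattern an infinite-scroll reliability script exposes.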
The ImageViews were consuming a lot of memory, but they genuinely needed it. The home page allocated 25% of the heap to the image cache; to give the ImageViews more headroom, this cache was reduced to 12%. After this memory adjustment, no OOME crash was observed.
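Image caches on Android are typically budgeted as a fraction of the process’s maximum heap, which is how the 25%-to-12% trim above works out in bytes. A small sketch of that arithmetic (the percentages are from this report; the helper and the 64 MB figure are illustrative):

```java
// Sketch of sizing an image cache as a fraction of the max heap, the way
// the report describes trimming the home-page image cache from 25% to 12%.
// The percentages come from the report; everything else is illustrative.
public class CacheBudget {

    // Returns a cache budget in bytes for the given fraction of the max heap.
    static long cacheBytes(long maxHeapBytes, double fraction) {
        return (long) (maxHeapBytes * fraction);
    }

    public static void main(String[] args) {
        long maxHeap = 64L * 1024 * 1024;        // e.g. a 64 MB heap cap
        long before  = cacheBytes(maxHeap, 0.25); // old 25% budget
        long after   = cacheBytes(maxHeap, 0.12); // new 12% budget
        System.out.println("bytes freed for image views: " + (before - after));
    }
}
```

On a 64 MB heap, that trim returns roughly 8 MB of headroom to the ImageViews without removing the cache entirely.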
During this investigation, we also sought an answer to the question: did our custom image loader cause the out-of-memory leak? Our tests and the inferences from the investigation answered yes, so we moved to Google’s Volley image loader. This update has gone live, and we are closely monitoring what users say; we have not heard of any OOME since.
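The design point behind that switch is that a well-behaved image loader is backed by a bounded, least-recently-used cache, so cached images can never outgrow a fixed byte budget. The sketch below illustrates that idea in plain Java; it is not Volley’s actual classes, and all names are ours:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not Volley's API): the kind of bounded, LRU-evicting
// cache an image loader is backed by, so memory use stays under a budget.
class BoundedImageCache {
    private final int maxBytes;
    private int usedBytes = 0;
    // accessOrder = true makes iteration order least-recently-used first.
    private final LinkedHashMap<String, byte[]> cache =
            new LinkedHashMap<>(16, 0.75f, true);

    BoundedImageCache(int maxBytes) { this.maxBytes = maxBytes; }

    void put(String url, byte[] imageBytes) {
        byte[] old = cache.put(url, imageBytes);
        if (old != null) usedBytes -= old.length;
        usedBytes += imageBytes.length;
        // Evict least-recently-used entries until we are back under budget.
        Iterator<Map.Entry<String, byte[]>> it = cache.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    byte[] get(String url) { return cache.get(url); }
    int size() { return cache.size(); }
}

public class CacheDemo {
    public static void main(String[] args) {
        BoundedImageCache cache = new BoundedImageCache(100);
        for (int i = 0; i < 10; i++) cache.put("img-" + i, new byte[40]);
        // Only the two most recent 40-byte images fit in a 100-byte budget.
        System.out.println(cache.size()); // prints 2
    }
}
```

The contrast with the leaky custom loader is the eviction loop: no matter how many images the infinite scroll requests, the cache’s footprint stays capped.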
Why are we sharing this with the world?
We are tired of seeing organizations hide the good work of their employees in the belief that sharing it would hand an advantage to competitors. We are more interested in making the world a better place than in keeping things close to our chest. We hope this report helps someone facing an OOME crash and trying to debug it.
The team from Moolya that gets the credit for this work:
- Manikandan MG, Software Developer in Test
- Dhyanesh Munde, Exploratory Software Tester
- Ankur Jain, Exploratory Software Tester
- With adequate support from our customer team