What is the average size of an Android app? What is the average rating, and how many permissions do apps request? These are some of the questions that might pop into your head when you are playing around with a huge data-set of apps from the Play Store.
A recently acquired data-set containing details about more than 200,000 applications from the Play Store sparked our interest in creating some cool statistics from it. Although the data-set itself was going to be used for an entirely different purpose, it would be a shame to lose the opportunity to bring forth our inner data scientist.
In the rest of this article we are going to present and comment on some graphs generated from that data-set. By applications we mean only games and apps from the Play Store, not movies, books, series, audio-books etc. Please note that we will simply present the graphs and comment briefly on each one; the goal is to share them, not to provide an in-depth analysis explaining them.
The following statistics are shown:
- Installation Size
- Star Rating + Free/Paid
- Free vs Paid percentage
- Number of apps released per year
- Number of apps updated per year
- Number of apps never updated per year
- Number of apps per creator
- Star Rating for different number of downloads
- Comments count per Star Rating
- Number of apps per category
- Number of permissions requested
- Percentages on the different permissions
As shown in the figure above, the installation size can vary a lot between the different apps, with an average size of approximately 45 MB.
The two figures above show, first, the average rating for all applications, which is 4.07 out of 5, and second, the average when we split the apps into the Free (4.06) and the Paid (4.19) groups.
A rather simple graph showing that the paid apps make up less than 5% of our data-set.
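The averages above come down to a grouped mean. The following is a minimal sketch of that calculation, not the code used for the figures: the record layout, the field names `rating` and `price`, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; the field names and values are assumptions,
# not the layout of the actual data-set.
apps = [
    {"name": "a", "rating": 4.0, "price": 0.0},
    {"name": "b", "rating": 4.5, "price": 0.0},
    {"name": "c", "rating": 5.0, "price": 1.99},
    {"name": "d", "rating": None, "price": 0.0},  # unrated apps are skipped
]


def average_rating_by_pricing(records):
    """Return the mean rating overall and per 'free'/'paid' group."""
    groups = defaultdict(list)
    for app in records:
        if app["rating"] is None:
            continue  # an app without ratings cannot contribute to a mean
        key = "free" if app["price"] == 0 else "paid"
        groups[key].append(app["rating"])

    result = {key: mean(values) for key, values in groups.items()}
    # The overall mean is taken over all rated apps, not over the group means.
    result["all"] = mean(r for values in groups.values() for r in values)
    return result


summary = average_rating_by_pricing(apps)
print(summary)  # {'free': 4.25, 'paid': 5.0, 'all': 4.5}
```

Note that the overall average is weighted by group size, which is why a data-set dominated by free apps ends up with an overall value very close to the free-app average.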
It is interesting to see here how many applications from our data-set were released each year. It is also nice to see the difference between the free and the paid apps: the release rate for paid apps seems to have changed much less than the release rate for free apps.
It is interesting to see that, in all cases, 2022 was by far the year with the most application updates. This leads us to the question: are there perhaps apps that were released and never updated?
Although we are still early in 2023, we can already see how many new releases and updates there are. For 2023 it makes sense to have releases but no updates yet, as these are new apps. What about the previous years though? We can see that roughly 5-10k apps from each of 2020 and 2021 were never updated after release. Given how rare it is for an application not to require any kind of update after release, it might be that a big percentage of these apps have been abandoned by their creators.
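One way to count never-updated apps per release year is sketched below. It assumes, purely for illustration, that each record carries a `released` and an `updated` date and that an app counts as never updated when the two coincide; the actual criterion behind the graph may differ.

```python
from collections import Counter
from datetime import date

# Hypothetical records; "released" and "updated" are assumed field names.
apps = [
    {"released": date(2020, 3, 1), "updated": date(2020, 3, 1)},
    {"released": date(2020, 5, 9), "updated": date(2022, 1, 4)},
    {"released": date(2021, 7, 2), "updated": date(2021, 7, 2)},
    {"released": date(2021, 8, 15), "updated": date(2021, 8, 15)},
]


def never_updated_per_year(records):
    """Count, per release year, the apps whose last update is the release itself."""
    counts = Counter()
    for app in records:
        # Assumption: no update after release means updated <= released.
        if app["updated"] <= app["released"]:
            counts[app["released"].year] += 1
    return dict(counts)


never_updated = never_updated_per_year(apps)
print(never_updated)  # {2020: 1, 2021: 2}
```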
It is amazing that there are so many companies/creators with over 100 apps each. Among all these companies the name "Kirill Sidorov" stands out as, unless I am mistaken, it is the only non-company creator in this list with multiple applications.
There are two interesting things to notice in this graph. The first is how the two top graphs, for 100 million and 10 million downloads, are more concentrated around a star rating of 4, while the bottom two are more spread out. If we think about it, that makes sense: the more popular an application becomes, the more attention its developers give to improving it further.
The second thing to notice, clearly visible in the first figure for 100 million downloads, is that these apps are concentrated not only around a star rating of 4, as mentioned before, but also within the first 50k positions. The apps were collected by mimicking the behavior of a normal user, which means that the Play Store itself would promote applications that have a higher chance of being liked by the user. Applications that already have a high number of downloads are of course very likely to be liked by other users, so they are promoted by the Play Store and therefore shown to the user before others.
Another rather simple and expected graph, which shows that applications with a high star rating have a higher comment count. Around the 4.5 star rating we can see applications with several million comments.
Of course, games beat any other single app category by a wide margin. The "Education" category comes second, which was a surprise to me, as I was expecting something "fancier" to be there.
A very interesting graph, especially from a security perspective. It shows that the most common case (almost 16k apps in our data-set) is an app requesting just 8 permissions, while some apps, cropped from this graph for readability, request more than 200 permissions!
Seeing this graph makes us wonder about the type of permissions requested, which brings us to the following graph:
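A distribution like this one can be built by counting how many apps request each number of permissions. The sketch below is only an illustration under assumed inputs: the package names and the mapping from package name to permission list are hypothetical.

```python
from collections import Counter

# Hypothetical input: package name -> permissions declared by the app.
apps = {
    "com.example.one": ["android.permission.INTERNET", "android.permission.CAMERA"],
    "com.example.two": ["android.permission.INTERNET", "android.permission.VIBRATE"],
    "com.example.three": ["android.permission.INTERNET"],
}


def permission_count_histogram(permissions_by_app):
    """Map 'number of permissions requested' to 'number of apps'."""
    # set() guards against the same permission being listed twice for an app.
    return Counter(len(set(perms)) for perms in permissions_by_app.values())


histogram = permission_count_histogram(apps)
most_common_count, number_of_apps = histogram.most_common(1)[0]
print(most_common_count, number_of_apps)  # 2 2
```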
Let us now explain what we are seeing in this graph. As you might know, Android itself classifies the different permissions. The categories we have in this graph are: Dangerous, Signature and Third-Party. The first two refer to the "Protection Level", while the third one refers to the note "Not for use by third-party applications" that can be found in the description of some permissions. For example, INSTALL_PACKAGES is one of those permissions that are not supposed to be used by third-party applications, yet, as the graph shows, they are requested by such applications anyway.
So we can see that approximately 50% of the apps request at least one permission that is classified as dangerous, 30% do not request any permission belonging to any of the three categories, and 12% request permissions that belong to both the dangerous and the signature categories.
Feel free to analyze the figures on your own and draw your own conclusions. If this is of interest to you, and you have more ideas about what other graphs can be created using the data available, please reach out to me and let me know.
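The grouping behind these percentages can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the graph: the three permission sets are tiny hand-picked subsets of Android's classification, and the app data is hypothetical.

```python
from collections import Counter

# Illustrative subsets only; a full analysis would load the complete
# classification from the Android permission reference.
DANGEROUS = {"android.permission.CAMERA", "android.permission.READ_CONTACTS"}
SIGNATURE = {"android.permission.BIND_VPN_SERVICE"}
NOT_FOR_THIRD_PARTY = {"android.permission.INSTALL_PACKAGES"}

# Hypothetical input: package name -> permissions requested by the app.
apps = {
    "com.example.one": ["android.permission.CAMERA"],
    "com.example.two": ["android.permission.INTERNET"],
    "com.example.three": ["android.permission.CAMERA",
                          "android.permission.BIND_VPN_SERVICE"],
    "com.example.four": ["android.permission.READ_CONTACTS"],
}


def categories_of(permissions):
    """Return the combination of categories an app's permissions fall into."""
    requested = set(permissions)
    labels = []
    if requested & DANGEROUS:
        labels.append("dangerous")
    if requested & SIGNATURE:
        labels.append("signature")
    if requested & NOT_FOR_THIRD_PARTY:
        labels.append("third-party")
    return tuple(labels)  # () means none of the three categories


def category_percentages(permissions_by_app):
    """Percentage of apps per category combination."""
    counts = Counter(categories_of(p) for p in permissions_by_app.values())
    total = sum(counts.values())
    return {combo: 100 * n / total for combo, n in counts.items()}


percentages = category_percentages(apps)
print(percentages[("dangerous",)])  # 50.0
```

Keying on the combination of categories, rather than on each category separately, keeps the groups disjoint, so the percentages add up to 100.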