In this article we are going to introduce apkInspector, a tool designed to provide insights about the zip structure and the AndroidManifest even in cases where static analysis evasion techniques are employed. But before diving into the details of apkInspector, it's important to establish its origins and the specific need it addresses.
On June 28, 2023, Joe Security posted an analysis on a social media platform. This analysis pertained to a submission made to their tool that could not be thoroughly assessed using ordinary static analysis methods/tools. You can access the specific submission analysis here (re-uploaded it as it was removed). The Yara rule that is triggered during this analysis has the description 'Yara detected an APK with invalid zip compression'.
Although this behavior is not entirely unprecedented, delving into the intricacies behind it proved to be a particularly intriguing endeavor, revealing a critical gap in existing tools. Remarkably, it became apparent that none of the current tooling seemed to cover the evasion tactics this APK employed. This is where apkInspector steps in, purposefully designed to fill this crucial void.
Without diving into the details of what makes this particular APK unique and why typical static analysis tools stumble, it becomes apparent that it has undergone tampering. This tampering extends beyond merely deviating from the zip specification standard; it also deviates from the expected structure of the AndroidManifest.xml file.
The AndroidManifest.xml file, crucial for Android app configuration, is deliberately structured in a manner contrary to conventional norms. Given that most tools operate based on established specifications, their failure is almost inevitable when faced with such non-compliance.
Of course, this is only intriguing because, despite these alterations, Android is still able to install and run the APK without any issues.
Top 3 existing tools
To provide a comprehensive understanding of apkInspector's functionality and capabilities, the most effective approach is to illustrate its operation through practical examples, contrasting it with other existing tools. So before seeing what apkInspector has to offer, let's see how other known tools behave when attempting to statically analyze this tampered APK.
We will assess each tool in two distinct scenarios: first with the original tampered APK, and second with a modified APK ("partially fixed" APK hereafter) in which only the tampering of the ZIP structure has been rectified.
1) apktool
apktool needs no introduction, as it is one of the standard tools used when attempting to reverse engineer an APK. The following figure shows the error message returned by apktool when attempting to decode the original APK:
As can be seen the tool fails with the highlighted error message. The compression method used for one of the files is not among the standard ones.
When using the "partially fixed" APK, as can also be seen in the following figure, the error is different this time as is the cause.
2) Jadx
Jadx, as one of the best tools for reverse engineering Android applications, could not be missing from this list. The following figure shows the output error when attempting to load the APK:
Again we can verify that the error is related to the tampered compression method.
Trying jadx against the "partially fixed" APK yields the following results:
As you can see jadx is now able to load the source code from the decompiled .dex files within the APK, but it is unable to process the AndroidManifest.xml file.
3) Androguard
Last but not least, Androguard is the most versatile and comprehensive tool for Android application analysis. When we put it to the test with the APK, here are the results:
Again, as with the tools shown before, the behavior is the same, as is the error.
When trying it against the "partially fixed" APK we have the following:
Now there is no error related to the zip structure but we see a very interesting warning related to the tampering of the AndroidManifest.xml file. Still though Androguard is unable to process the AndroidManifest.xml file properly.
apkInspector: Used as CLI
In this section, we will showcase the command-line interface (CLI) version of apkInspector and go through the options it offers in its current version (1.1.6). As our example APK, we will use the same malicious APK referenced above.
The following snippet shows the help message offered by the tool:
$ apkInspector -h
usage: apkInspector [-h] [-apk APK] [-f FILENAME] [-ll] [-lc] [-la] [-e] [-x] [-xa] [-m] [-sm SPECIFY_MANIFEST] [-a] [-v]
apkInspector is a tool designed to provide detailed insights into the zip structure of APK files, offering the capability to extract content and decode the
AndroidManifest.xml file.
options:
-h, --help show this help message and exit
-apk APK APK to inspect
-f FILENAME, --filename FILENAME
Filename to provide info for
-ll, --list-local List all files by name from local headers
-lc, --list-central List all files by name from central directory header
-la, --list-all List all files from both central directory and local headers
-e, --export Export to JSON. What you list from the other flags, will be exported
-x, --extract Attempt to extract the file specified by the -f flag
-xa, --extract-all Attempt to extract all files detected in the central directory header
-m, --manifest Extract and decode the AndroidManifest.xml
-sm SPECIFY_MANIFEST, --specify-manifest SPECIFY_MANIFEST
Pass an encoded AndroidManifest.xml file to be decoded
-a, --analyze Check an APK for static analysis evasion techniques
-v, --version Retrieves version information
The flags --list-central and --list-local provide information about the central directory entries and the local header entries respectively. The information retrieved in each case follows the ZIP specification and can be exported to JSON using the flag --export. In most cases, though, an APK contains a large number of entries, so inspecting the entire list can be time consuming, especially when we are looking for a specific file. For this reason, the flag --filename can be used to specify a single filename, for which the central directory entry and the local header entry will be fetched.
Given that the tricks employed to evade static analysis tools aim to cause issues during the extraction process, the --extract and --extract-all flags are of high interest. The other target of these evasion tactics is the AndroidManifest file, which can either be extracted with a combination of the flags --filename and --extract, or extracted and decoded directly using the flag --manifest. Finally, the --analyze flag attempts to detect whether an APK employs any evasion tactics.
Let us now try apkInspector against the same APK file as before:
The figure above shows apkInspector using the flag -xa to extract all content from the APK, which completed successfully. We can also see that the AndroidManifest.xml was extracted with the correct size.
If we now attempt to parse the AndroidManifest.xml, it yields the following results:
As can be seen in the figure above, the manifest was extracted and decoded successfully.
Finally, the flag to provide us details as to what was tampered for a given APK is shown below:
$ apkInspector -apk infected.apk -a
apkInspector Version: 1.1.6
Copyright 2023 erev0s <[email protected]>
1 file(s) listed below appear to have a tampered zip structure!
--------------- AndroidManifest.xml ---------------
central compression method : 2920
local compression method : 9154
actual compression method : STORED_TAMPERED
differing headers : ['compression_method']
The AndroidManifest.xml file was tampered using the following patterns:
file_type: 0
string count: 212
real string count: 131
dummy attributes: found (verify manually)
dummy data: found
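The mismatch between the two compression method values reported above can be reproduced on a toy archive. The following is a minimal sketch using only Python's standard library, not apkInspector's own code. It builds a one-entry zip, overwrites the compression method field in both headers with the values from the output above, and then checks whether `zipfile` can still read the entry. The helper names (`build_tampered_zip`, `zipfile_can_read`, `read_as_stored`) are hypothetical, and the sketch assumes a single entry whose bytes are stored uncompressed, in line with the STORED_TAMPERED label.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"    # local file header signature
CENTRAL_SIG = b"PK\x01\x02"  # central directory header signature


def build_tampered_zip(payload: bytes) -> bytes:
    """Create a one-entry STORED zip, then overwrite both method fields."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("AndroidManifest.xml", payload)
    data = bytearray(buf.getvalue())
    local = data.find(LOCAL_SIG)       # single entry: header at the start
    central = data.rfind(CENTRAL_SIG)  # single entry: only one such header
    # The method field sits 8 bytes into the local header
    # and 10 bytes into the central directory header.
    struct.pack_into("<H", data, local + 8, 9154)
    struct.pack_into("<H", data, central + 10, 2920)
    return bytes(data)


def zipfile_can_read(data: bytes) -> bool:
    """Return True if the standard library manages to read the entry."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.read("AndroidManifest.xml")
        return True
    except (NotImplementedError, zipfile.BadZipFile):
        return False


def read_as_stored(data: bytes) -> bytes:
    """Ignore the method field and copy the entry bytes as they are."""
    local = data.find(LOCAL_SIG)
    # Compressed size, uncompressed size, name length, extra field length.
    _, size, name_len, extra_len = struct.unpack_from("<IIHH", data, local + 18)
    start = local + 30 + name_len + extra_len
    return data[start:start + size]


tampered = build_tampered_zip(b"<manifest/>")
print("zipfile can read entry:", zipfile_can_read(tampered))
print("read as stored:", read_as_stored(tampered))
```

A tool that trusts the declared method gives up at this point, while one that falls back to reading the raw bytes still recovers the content.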
apkInspector: Used as Library
The CLI is backed by the library offered by apkInspector. In the current version (1.1.6) there are five modules available within the library. The purpose of this section is to showcase a few of the available methods.
The provided code snippets serve as illustrative examples demonstrating the utilization of apkInspector as a library. Please note that these snippets are not exhaustive and are intended to showcase various use cases. The code operates on the same APK file as previously mentioned.
The one to start with is the headers module, which is responsible for identifying the proper parts of the APK following the zip specification. This module contains the classes that together compose a zip file. The class ZipEntry contains a classmethod named parse, which can be used to identify the end of central directory record, the central directory entries and all the local header entries.
The following snippets show examples of how to read the APK file, list the contents and decode the AndroidManifest.xml.
Parsing the APK:
>>> from apkInspector.headers import ZipEntry
>>> with open('infected.apk', 'rb') as apk_file:
...     zipentry = ZipEntry.parse(apk_file)
The "end of central directory" record is the first thing that is parsed as part of the APK. The following snippet shows the representation of it as a dictionary:
>>> zipentry.eocd.to_dict()
{'signature': b'PK\x05\x06', 'number_of_this_disk': 0, 'disk_where_central_directory_starts': 0, 'number_of_central_directory_records_on_this_disk': 13, 'total_number_of_central_directory_records': 13, 'size_of_central_directory': 906, 'offset_of_start_of_central_directory': 401408, 'comment_length': 0, 'comment': ''}
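To make the fields of this record more concrete, here is a minimal standard-library sketch of how an "end of central directory" record could be located and unpacked. It is not apkInspector's implementation: `parse_eocd` and `make_sample_zip` are hypothetical helpers, the field names are borrowed from the dictionary above, and the sketch assumes a non-zip64 archive whose comment does not itself contain the signature bytes.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
EOCD_FORMAT = "<4sHHHHIIH"  # fixed 22-byte part of the record, little endian
EOCD_FIELDS = (
    "signature",
    "number_of_this_disk",
    "disk_where_central_directory_starts",
    "number_of_central_directory_records_on_this_disk",
    "total_number_of_central_directory_records",
    "size_of_central_directory",
    "offset_of_start_of_central_directory",
    "comment_length",
)


def parse_eocd(data: bytes) -> dict:
    """Find the last EOCD signature and unpack the record that follows it."""
    pos = data.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("end of central directory record not found")
    record = dict(zip(EOCD_FIELDS, struct.unpack_from(EOCD_FORMAT, data, pos)))
    start = pos + struct.calcsize(EOCD_FORMAT)
    # The comment is kept as raw bytes here.
    record["comment"] = data[start:start + record["comment_length"]]
    return record


def make_sample_zip(names) -> bytes:
    """Build a small in-memory archive to try the parser on."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"data for " + name.encode())
    return buf.getvalue()


sample = make_sample_zip(["AndroidManifest.xml", "classes.dex", "resources.arsc"])
eocd = parse_eocd(sample)
print(eocd["total_number_of_central_directory_records"])
print(eocd["offset_of_start_of_central_directory"])
```

The offset and size fields are what allow a parser to jump straight to the central directory without scanning the whole file.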
The information provided by the 'end of central directory' record is enough to pinpoint the 'central directory' and get a list of all the entries contained within the APK. This is handled automatically by apkInspector, which returns a dictionary with all the entries detected along with their details. The following snippet is an example of how to retrieve this dictionary:
>>> zipentry.central_directory.to_dict()
{'AndroidManifest.xml': {'version_made_by': 20, 'version_needed_to_extract': 10, 'general_purpose_bit_flag': 2048, 'compression_method': 2920, [...]}, 'resources.arsc': {'version_made_by': 20, 'version_needed_to_extract': 10, 'general_purpose_bit_flag': 2048, 'compression_method': 0,[...]
Furthermore, the local header of each entry in the central directory is available in the zipentry object. Fetching the local header of a specific file, for example the AndroidManifest.xml, is shown below:
>>> zipentry.get_local_header_dict('AndroidManifest.xml')
{'version_needed_to_extract': 10, 'general_purpose_bit_flag': 2048, 'compression_method': 9154, 'file_last_modification_time': 41339, 'file_last_modification_date': 22168, 'crc32_of_uncompressed_data': 3352709148, 'compressed_size': 6214, 'uncompressed_size': 17756, 'file_name_length': 19, 'extra_field_length': 3, 'filename': 'AndroidManifest.xml', 'extra_field': '\x00\x00\x00'}
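Placing the central directory entry and the local header side by side makes the tampering easy to spot. The sketch below is a plain-Python illustration rather than part of apkInspector: `differing_fields` is a hypothetical helper, and the two dictionaries only contain the fields and values that are visible in the outputs shown in this article.

```python
# Values copied from the outputs shown in this article.
# Fields elided there with [...] are deliberately left out.
central_entry = {
    "version_needed_to_extract": 10,
    "general_purpose_bit_flag": 2048,
    "compression_method": 2920,
}
local_entry = {
    "version_needed_to_extract": 10,
    "general_purpose_bit_flag": 2048,
    "compression_method": 9154,
}


def differing_fields(central: dict, local: dict) -> list:
    """List the fields present in both headers whose values disagree."""
    shared = [key for key in local if key in central]
    return [key for key in shared if central[key] != local[key]]


print(differing_fields(central_entry, local_entry))  # ['compression_method']
```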
Utilizing the information gathered from the headers module, the extract module takes care of extracting a single entry or all the entries detected in the central directory header.
For convenience, there are two methods within the extract module that handle the extraction process without the need to deal with the headers module as well. The following snippet shows an example of both methods: the first extracts all the entries of the APK and the second extracts a single file from it:
>>> from apkInspector.extract import extract_single_file, extract_all
>>> apk_path = "/path/to/infected.apk"
>>> extract_all(apk_path, "./")
Extraction successful for: /path/to/infected
>>>
>>>
>>> extracted_data = extract_single_file(apk_path, "AndroidManifest.xml", save=True)
Data saved to EXTRACTED_AndroidManifest.xml
Finally, the axml module takes care of decoding the AndroidManifest.xml. It may be used directly on a raw AndroidManifest.xml file, or on an APK, in which case it extracts only the AndroidManifest.xml and then decodes it. The second case is shown below:
>>> from apkInspector.axml import parse_apk_for_manifest
>>> manifest = parse_apk_for_manifest(apk_path, save=False)
>>> manifest
'<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="wyija.utykuvr.uwpexgh"[...]
apkInspector to the TEST
I might claim that apkInspector works and yields proper results, but without actual field testing and validation, there is no way to be sure about the quality of the tool.
For this reason, it was decided to utilize our access from a previous project, and fetch a bit more than 1500 of the top applications from all categories from Play Store as of November 2023. There are two assumptions that we should consider for this experiment:
- Given that these apps belong to the TOP apps, no static analysis evasion techniques are expected to be used and therefore we can compare the results from the extraction process and the decoding of the AndroidManifest with other well known tools.
- The efficiency of apkInspector against applications that employ static analysis evasion techniques can not be compared with other tools and thus validated, as to our knowledge there is no other tool capable of handling these APKs.
The objective is to subject these applications to analysis using apkInspector in conjunction with androguard and zipfile for the purpose of validating the accuracy of apkInspector's results.
The methodology involves a dual approach. Firstly, leveraging zipfile, we extracted the contents of each APK. Subsequently, apkInspector was employed for the same task. The ensuing directories from both methods were then compared to ensure the exact same output, thereby validating the consistency of results.
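One way such a directory comparison could be carried out is by hashing every file in both extraction directories and comparing the results. The sketch below is only an illustration of that idea under our own assumptions; it is not the script used in the test. The names `tree_digest`, `same_tree` and `demo` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def same_tree(dir_a, dir_b) -> bool:
    """True when both directories hold the same files with the same bytes."""
    return tree_digest(dir_a) == tree_digest(dir_b)


def demo() -> tuple:
    """Compare two throwaway trees, first identical, then with one file changed."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for root in (a, b):
            (Path(root) / "res").mkdir()
            (Path(root) / "res" / "values.xml").write_bytes(b"<resources/>")
            (Path(root) / "classes.dex").write_bytes(b"dex\n035\x00")
        before = same_tree(a, b)
        (Path(b) / "classes.dex").write_bytes(b"changed")
        after = same_tree(a, b)
    return before, after


print(demo())
```

In an actual run, the two arguments to `same_tree` would be the directory produced by zipfile and the one produced by apkInspector for the same APK.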
Moving to the second phase, our focus shifts to extracting the AndroidManifest.xml files. This was performed separately using both androguard and apkInspector. The contents of these manifest files were then compared using xml.etree.ElementTree to verify that they contain identical elements, adding another layer of validation to the analysis process.
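As a rough illustration of an element-level comparison with xml.etree.ElementTree, the sketch below turns each decoded manifest into a nested tuple and compares the two. It is an assumption about how such a check can be written, not the exact code used in the test. The names `canonical` and `same_manifest` are hypothetical, and the sketch treats attribute order and surrounding whitespace as irrelevant while keeping the order of child elements significant.

```python
import xml.etree.ElementTree as ET


def canonical(element) -> tuple:
    """Reduce an element to (tag, sorted attributes, stripped text, children)."""
    text = (element.text or "").strip()
    attributes = tuple(sorted(element.attrib.items()))
    children = tuple(canonical(child) for child in element)
    return (element.tag, attributes, text, children)


def same_manifest(xml_a: str, xml_b: str) -> bool:
    """True when both documents contain the same elements and attributes."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))


first = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android" '
    'package="com.example">'
    '<application android:label="x" android:debuggable="false"/>'
    "</manifest>"
)
# Same content, different attribute order and whitespace.
second = (
    '<manifest package="com.example" '
    'xmlns:android="http://schemas.android.com/apk/res/android">\n'
    '  <application android:debuggable="false" android:label="x" />\n'
    "</manifest>"
)
print(same_manifest(first, second))
```

Comparing parsed elements instead of raw strings avoids false mismatches caused purely by formatting differences between the two decoders.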
An overview of the test is available here, and the results are here. The results were very positive and, as also stated in the overview on the GitHub page linked above:
The results of the test indicate that apkInspector can unzip an APK and decode the AndroidManifest.xml reliably and efficiently, comparable to other tools such as androguard.
Conclusion
apkInspector properly and reliably handles the extraction of the entries of an APK and the decoding of the AndroidManifest file. It shines specifically when static analysis evasion tactics are employed, as it still manages to process the APK when other tooling fails to do so.
Hopefully, apkInspector can become a widely used tool, especially when dealing with malware APKs. Please keep in mind that the apkInspector project is a work in progress: more features and options are expected to be added, and bugs and issues may come up. If you have ideas on what could be added, or you have found a bug, please open an issue on GitHub to take it further.
As always feel free to reach out for any interesting ideas or suggestions.