Analyze and Visualize Data with Android APK

September 21, 2017 | Views: 2745

Analyze and Visualize Data Android APK with LINT in Python CGI Programming

In this brief tutorial, I will show you, step by step, how to analyze an Android APK with Android Lint and how to present the result graphically on a Python CGI web page.

Note: Make sure you have installed Android Lint (part of the Android SDK) on your operating system, and make sure you have enabled and configured the Python CGI directory on your web server.

Step 1: Use Android LINT to analyze Android APK

Before we start, you should know what Android Lint is. According to tools.android.com, Android Lint is a new tool introduced in ADT 16 (and Tools 16) which scans Android project sources for potential bugs.

Here are some examples of the types of errors that it looks for:

  • Missing translations (and unused translations)

  • Layout performance problems (all the issues the old layoutopt tool used to find, and more)

  • Unused resources

  • Inconsistent array sizes (when arrays are defined in multiple configurations)

  • Accessibility and internationalization problems (hardcoded strings, missing contentDescription, etc)

  • Icon problems (like missing densities, duplicate icons, wrong sizes, etc)

  • Usability problems (like not specifying an input type on a text field)

  • Manifest errors

It also finds hardcoded strings in the Java classes of your app. So it is a really good tool for anyone who wants to improve their code and check an APK's code for bugs or vulnerabilities. Now that we have covered the important theory behind Android Lint, let's start coding.

To generate the analysis data, I will use the tool "KeepSafe/android-resource-remover" from GitHub (link: https://github.com/KeepSafe/android-resource-remover), but I will customize the code slightly. Instead of removing the reported issues, I will alter the code so that it only finds them.

You can install it with pip or just download it from the GitHub link that I provided:

Command: pip install android-resource-remover

(Note: I downloaded the zip file from GitHub so I could edit and customize the code.)

After downloading the zip file of the code, open the android_clean_app.py file. Comment out the part of the code that removes the resource and value (the part shown in the screenshot), so that the script only finds the issues for us instead of deleting anything.

From this point you have two choices:

1. Use Python version 2 to run the program:

python android_clean_app.py [--lint LINT] [--app APP] [--xml XML]

2. Use Python version 3 to launch the program:

I will choose the second option, because in my project I have to build an automated tool for analyzing Android APKs in Python version 3. Because android_clean_app.py is written for Python version 2, I use a subprocess call to execute the script from Python 3. When I tried to convert the program to Python version 3, I ran into some dependency problems, so to save time I simply execute it and pipe the result.

The following Python version 3 code executes android_clean_app.py:

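#!/usr/bin/python3
import subprocess

lint_path = "/usr/share/android-sdk/lint"  # path of my lint tools
app_path = ""  # fill with your own app directory (the output is written there)

# android_clean_app.py is a Python 2 script, so it is run with the Python 2
# interpreter; adjust the interpreter name ('python2') to match your system
process = subprocess.Popen(
    ['python2', 'android_clean_app.py', '--lint', lint_path, '--app', app_path],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, error = process.communicate()  # print out and error for debugging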

After you execute the program, it will create an XML file (lint-result.xml, the name used in the following steps) containing the results inside the directory you specified.
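Before building the web page, it can help to confirm that the XML file really contains what you expect. The following is a minimal sketch, assuming each finding is stored as an issue element with id and severity attributes (the attributes the later code reads). The helper name summarize_lint_result and the sample XML are hypothetical and only serve as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the shape of a lint result file.
SAMPLE_XML = """<issues>
    <issue id="UnusedResources" severity="Warning" message="m1" summary="s1">
        <location file="res/values/strings.xml"/>
    </issue>
    <issue id="HardcodedText" severity="Warning" message="m2" summary="s2">
        <location file="res/layout/main.xml"/>
    </issue>
</issues>"""


def summarize_lint_result(xml_text):
    """Return a list of (id, severity) tuples, one per <issue> element."""
    root = ET.fromstring(xml_text)
    return [(issue.get("id"), issue.get("severity"))
            for issue in root.iter("issue")]


summary = summarize_lint_result(SAMPLE_XML)
for issue_id, severity in summary:
    print(issue_id, severity)
```

For a real run you would read the text of lint-result.xml from your app directory instead of using the sample string.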

Step 2: Organize the XML into CGI web

Now that you have the result, you can move to the second step, which is organizing the result in a Python CGI page. This is the code that I used in my project, so please edit or customize it according to your needs. In my scenario, this program receives the application directory path from another, main Python CGI program through a CGI form field.

The following is the code that I used:

#!/usr/bin/python3
import xml.etree.ElementTree as ET
import numpy as np
import cgi
import cgitb
import os
import html
import sys

cgitb.enable()  # for debugging

form = cgi.FieldStorage()
path_lint = form.getvalue('path')  # get the path from the main python cgi program
os.chdir(path_lint)

xml = ET.ElementTree(file="lint-result.xml")  # create an object to handle the result
root = xml.getroot()

print("Content-type: text/html\r\n\r\n")
print("<h1 align='center'>Lint Analysis:</h1>")

# zip the two tags (issue, location); this pairing assumes one location per issue
for child, location in zip(root, root.iter(tag='location')):
    print("<table width='100%'>")
    print("<tr>")
    print("<th width='20%' bgcolor='#6666ff'>")
    print(child.tag)
    print("</th>")
    print("<th width = '80%' bgcolor='#6666ff'>")
    print("</th>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Id:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["id"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Severity:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["severity"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Message:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["message"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Summary:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["summary"])
    print("</td>")
    print("</tr>")
    if "errorLine1" in child.attrib:  # sometimes the result has no errorLine1 attribute
        print("<tr>")
        print("<td width='20%'>")
        print("Errorline:")
        print("</td>")
        print("<td width = '80%'>")
        # escape the value so that source code characters such as < and > display correctly
        print(html.escape(child.attrib["errorLine1"]))
        print("</td>")
        print("</tr>")
    print("<tr>")
    print("<td width='20%' bgcolor='#6666ff'>")
    print(location.tag)
    print("</td>")
    print("<td width = '80%' bgcolor='#6666ff'>")
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Location:")
    print("</td>")
    print("<td width = '80%'>")
    print(location.attrib["file"])
    print("</td>")
    print("</tr>")
    print("</table>")
    print("<br/>")
    print("<br/>")
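The long run of print calls above repeats the same row pattern for every attribute. As a hedged alternative (this is not the code from my project), the rows could be built by a small helper that also escapes every value, not only errorLine1. The function names table_row and issue_table and the example dictionary are hypothetical.

```python
import html


def table_row(label, value):
    """Build one two-column HTML table row, escaping the value."""
    return ("<tr><td width='20%'>{}:</td>"
            "<td width='80%'>{}</td></tr>").format(label, html.escape(str(value)))


def issue_table(attrib,
                fields=("id", "severity", "message", "summary", "errorLine1")):
    """Render the chosen attributes of one issue as an HTML table.

    Attributes missing from the issue (for example errorLine1) are skipped.
    """
    rows = [table_row(name.capitalize(), attrib[name])
            for name in fields if name in attrib]
    return "<table width='100%'>" + "".join(rows) + "</table>"


# Hypothetical attribute dictionary standing in for child.attrib.
example = {"id": "HardcodedText", "severity": "Warning",
           "message": "Use <string> resources", "summary": "Hardcoded text"}
print(issue_table(example))
```

Inside the loop, a call such as print(issue_table(child.attrib)) would then replace most of the individual print lines.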

 

 

Step 3: Visualize LINT result into graph

Now, this is the tricky part. You cannot simply put the graph on the CGI website. According to the SciPy Cookbook (http://scipy-cookbook.readthedocs.io):

Trying to use matplotlib in a python CGI script naïvely will most likely result in the following error:

 

...
352, in _get_configdir
raise RuntimeError("'%s' is not a writable dir; you must set
environment variable HOME to be a writable dir "%h)
RuntimeError: '<WebServer DocumentRoot>' is not a writable dir; you must set
environment variable HOME to be a writable dir

Matplotlib needs the environment variable HOME to point to a writable directory. One way to accomplish this is to set this environment variable from within the CGI script on runtime (another way would be to modify the file but that would be not as portable).

You can check the code at this link: (http://scipy-cookbook.readthedocs.io/items/Matplotlib_Using_MatPlotLib_in_a_CGI_script.html)
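The key point of the quoted passage is ordering: HOME has to point to a writable directory before matplotlib is imported. The following is a minimal sketch of that idea, assuming a temporary directory is acceptable as the writable location. The helper name use_writable_home is hypothetical, and setting MPLCONFIGDIR as well is my own addition rather than something the cookbook text above mentions.

```python
import os
import tempfile


def use_writable_home():
    """Point HOME and MPLCONFIGDIR at a fresh writable temporary directory."""
    writable_dir = tempfile.mkdtemp(prefix="mpl-cgi-")
    os.environ["HOME"] = writable_dir
    os.environ["MPLCONFIGDIR"] = writable_dir
    return writable_dir


config_dir = use_writable_home()
print(os.access(config_dir, os.W_OK))

# Only after this point would the CGI script import matplotlib, for example:
#   import matplotlib
#   matplotlib.use("Agg")
#   import matplotlib.pyplot as plt
```

In my own code I reuse the application directory (path_lint) as HOME instead of a temporary directory, which works as long as the web server user can write to it.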

Here is the code that I used to generate the bar graph:

(Note: The code from Step 2 and the code below are used in the same CGI program. I broke it down into two parts to make it easier to explain what each section does.)

#!/usr/bin/python3
import xml.etree.ElementTree as ET
import numpy as np
import cgi
import cgitb
import os
import html
import sys

data1 = []
data2 = []

# path_lint comes from the first part of the program (Step 2)
os.environ['HOME'] = path_lint  # must be set before matplotlib is imported

import matplotlib
matplotlib.use('Agg')  # choose a non-GUI backend before importing pylab/pyplot
import pylab
import matplotlib.pyplot as plt

xml = ET.ElementTree(file="lint-result.xml")  # create an object to handle the result
root = xml.getroot()

for child in root:  # loop to find the unique ids
    if child.attrib["id"] not in data1:
        data1.append(child.attrib["id"])

for line in data1:  # loop to count how many times each issue occurs in the lint result
    count = 0
    for line_root in root:
        if line == line_root.attrib["id"]:  # exact match, so one id cannot match inside another
            count += 1
    data2.append(count)

data_x = [x for x in range(0, len(data1))]
data_x1 = ['A' + str(x) for x in range(0, len(data1))]

# The bars are placed at numeric x positions, so I map those positions
# to string labels with the xticks function
plt.xticks(data_x, data_x1)
pylab.bar(data_x, data2)
pylab.savefig("chartlint.png", format='png')  # save the figure to the directory

Link to the resulting picture: https://imgur.com/a/aiSCx
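The two counting loops in the code above can also be expressed with collections.Counter from the standard library. This is only a sketch of an alternative, not what I used in the project. The sample XML and the function name count_issue_ids are hypothetical, and the first-seen ordering relies on Python 3.7 or newer.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample with one repeated issue id.
SAMPLE_XML = """<issues>
    <issue id="UnusedResources" severity="Warning"/>
    <issue id="HardcodedText" severity="Warning"/>
    <issue id="UnusedResources" severity="Warning"/>
</issues>"""


def count_issue_ids(root):
    """Count how often each issue id appears, keeping first-seen order."""
    return Counter(issue.get("id") for issue in root.iter("issue"))


counts = count_issue_ids(ET.fromstring(SAMPLE_XML))
labels = list(counts.keys())     # plays the role of data1
heights = list(counts.values())  # plays the role of data2
print(labels, heights)
```

Because Counter compares whole strings, an id that happens to be contained in a longer id is not counted twice.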

 

Step 4: Assemble the code into one and put it into Python CGI web

For those of you who are a little confused about how to put the code together, here is the full code that I use in my project.

 

#!/usr/bin/python3
import xml.etree.ElementTree as ET
import numpy as np
import cgi
import cgitb
import os
import html
import sys

cgitb.enable()

form = cgi.FieldStorage()
path_lint = form.getvalue('path')
os.chdir(path_lint)

xml = ET.ElementTree(file="lint-result.xml")
root = xml.getroot()

print("Content-type: text/html\r\n\r\n")
print("<h1 align='center'>Lint Analysis:</h1>")

data1 = []
data2 = []

os.environ['HOME'] = path_lint  # set before matplotlib is imported

import matplotlib
matplotlib.use('Agg')

for child in root:
    if child.attrib["id"] not in data1:
        data1.append(child.attrib["id"])

for line in data1:
    count = 0
    for line_root in root:
        if line == line_root.attrib["id"]:
            count += 1
    data2.append(count)

import pylab
import matplotlib.pyplot as plt

data_x = [x for x in range(0, len(data1))]
data_x1 = ['A' + str(x) for x in range(0, len(data1))]

plt.xticks(data_x, data_x1)
pylab.bar(data_x, data2)  # bar graph, as in Step 3
pylab.savefig("chartlint.png", format='png')

for child, location in zip(root, root.iter(tag='location')):
    print("<table width='100%'>")
    print("<tr>")
    print("<th width='20%' bgcolor='#6666ff'>")
    print(child.tag)
    print("</th>")
    print("<th width = '80%' bgcolor='#6666ff'>")
    print("</th>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Id:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["id"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Severity:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["severity"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Message:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["message"])
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Summary:")
    print("</td>")
    print("<td width = '80%'>")
    print(child.attrib["summary"])
    print("</td>")
    print("</tr>")
    if "errorLine1" in child.attrib:
        print("<tr>")
        print("<td width='20%'>")
        print("Errorline:")
        print("</td>")
        print("<td width = '80%'>")
        print(html.escape(child.attrib["errorLine1"]))
        print("</td>")
        print("</tr>")
    print("<tr>")
    print("<td width='20%' bgcolor='#6666ff'>")
    print(location.tag)
    print("</td>")
    print("<td width = '80%' bgcolor='#6666ff'>")
    print("</td>")
    print("</tr>")
    print("<tr>")
    print("<td width='20%'>")
    print("Location:")
    print("</td>")
    print("<td width = '80%'>")
    print(location.attrib["file"])
    print("</td>")
    print("</tr>")
    print("</table>")
    print("<br/>")
    print("<br/>")

 

That's all! It may feel pretty slow for those of you who are using ElementTree and matplotlib for the first time, but give it a try and you will pick up the pace once you grasp the essentials. As a final note and warning, be careful when you put a lot of data into one graph because, by default, the graph does not adapt to the size of the data. It will probably look cramped and not give you a nice output, so you will want to customize the matplotlib graph to handle a large amount of data.
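One possible way to handle that last point is to scale the figure width with the number of bars. The following is a rough sketch under my own assumptions: the helper names figure_size_for and save_bar_chart, and the numbers for inches per bar and minimum width, are hypothetical starting values rather than anything matplotlib prescribes.

```python
def figure_size_for(n_bars, inches_per_bar=0.4, min_width=6.0, height=4.0):
    """Return a (width, height) tuple in inches that grows with the bar count."""
    return (max(min_width, n_bars * inches_per_bar), height)


def save_bar_chart(labels, counts, filename="chartlint.png"):
    """Save a bar chart whose width is scaled to the number of labels."""
    # Imported here so the sizing helper works even without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-GUI backend, as in the tutorial
    import matplotlib.pyplot as plt

    positions = list(range(len(labels)))
    fig, ax = plt.subplots(figsize=figure_size_for(len(labels)))
    ax.bar(positions, counts)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=90)  # vertical labels take less room
    fig.tight_layout()
    fig.savefig(filename, format="png")
    plt.close(fig)


print(figure_size_for(5))
print(figure_size_for(50))
```

With data1 and data2 from the tutorial code, save_bar_chart(data1, data2) would write chartlint.png with a width that grows as more issue ids are found.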
