
Apache Logs Analysis


By: Masood

June 27, 2017

Log analysis can be a tedious task. Raw logs do not reveal much unless they are processed through a log analysis engine or a security information and event management (SIEM) solution. In this article, I will pull the Apache logs of my site into a log collector (the free version of Sumo Logic, a cloud-based data analytics service) and see what information I can get out of them. I will not show how to ingest the logs into Sumo Logic, as that is a separate topic and Sumo Logic's documentation covers it. In this scenario, a log collector agent has been installed on the Ubuntu server to upload the logs to the Sumo Logic cloud.

Let’s take a look at the raw logs (/var/log/apache2/access.log) and see what they look like:

1  06/26/2017 19:13:42.000 -0400
157.55.39.123 - - [26/Jun/2017:23:13:42 +0000] "GET / HTTP/1.1" 200 6534 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +https://www.bing.com/bingbot.htm)"
Host: jupiter  Name: /var/log/apache2/access.log  Category: dev/os/linux

2  06/26/2017 19:11:28.000 -0400
157.55.39.123 - - [26/Jun/2017:23:11:28 +0000] "GET /wp-content/uploads/2017/06/cropped-shutterstock_114469858.jpg HTTP/1.1" 200 150607 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
Host: jupiter  Name: /var/log/apache2/access.log  Category: dev/os/linux

3  06/26/2017 19:06:33.000 -0400
209.150.43.6 - - [26/Jun/2017:23:06:33 +0000] "GET /favicon.ico HTTP/1.1" 200 203 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.79 Safari/537.36 Edge/14.14393"
Host: jupiter  Name: /var/log/apache2/access.log  Category: dev/os/linux

Without any parsing, this isn’t telling me much. Now let’s parse out the IP address (the first field of each log entry) and do a count:

_source = "APACHE" | parse "* " as visitor_ip

Here’s the output:
visitor_ip        count   %
91.200.12.103     1,470   84.92%
209.150.43.6         99    5.72%
162.243.14.44        53    3.06%
51.15.6.15           33    1.91%
35.188.122.97        27    1.56%
66.87.116.165        12    0.69%
157.55.39.123        11    0.64%
207.46.13.172         9    0.52%
69.164.211.144        9    0.52%
78.85.148.127         8    0.46%
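
The same per-IP count can be reproduced outside Sumo Logic. The following is a minimal Python sketch, not the query Sumo Logic runs: it assumes each line starts with the client IP followed by a space (as in the access.log entries above), and the names count_by_ip, share_table, and sample_lines are made up for illustration. The sample data is just the three log lines shown earlier.

```python
from collections import Counter


def count_by_ip(lines):
    """Count requests per client IP (assumed to be the first field of each line)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def share_table(counts):
    """Return (ip, count, percent of all requests) rows, busiest IP first."""
    total = sum(counts.values())
    return [(ip, n, round(100.0 * n / total, 2)) for ip, n in counts.most_common()]


# The three access.log entries quoted in the article.
sample_lines = [
    '157.55.39.123 - - [26/Jun/2017:23:13:42 +0000] "GET / HTTP/1.1" 200 6534 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +https://www.bing.com/bingbot.htm)"',
    '157.55.39.123 - - [26/Jun/2017:23:11:28 +0000] '
    '"GET /wp-content/uploads/2017/06/cropped-shutterstock_114469858.jpg HTTP/1.1" '
    '200 150607 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '209.150.43.6 - - [26/Jun/2017:23:06:33 +0000] "GET /favicon.ico HTTP/1.1" 200 203 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/51.0.2704.79 Safari/537.36 Edge/14.14393"',
]

for ip, n, pct in share_table(count_by_ip(sample_lines)):
    print(f"{ip:<18}{n:>6}{pct:>8.2f}%")
```

To run it against a real file, pass an open file handle to count_by_ip instead of sample_lines, since iterating over a file yields its lines.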
 
Interesting: one IP is responsible for about 85% of visits! Let's see if we can locate this IP using Sumo Logic’s geo lookup feature:

_source = "APACHE" | parse "* " as visitor_ip
| lookup latitude, longitude, country_code, country_name, region, city, postal_code, area_code, metro_code from geo://default on ip=visitor_ip
| count by latitude, longitude, country_code, country_name, region, city, postal_code, area_code, metro_code
| sort _count

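#  latitude  longitude   country_code  country_name   region  city           postal_code  area_code  metro_code  _count
1  50.45000  30.52330    UA            Ukraine                                            0          0           1,470
2  40.74541  -73.90540   US            United States  NY      Woodside       11377        718        501         99
3  40.73080  -73.99750   US            United States  NY      New York       10011        212        501         53
4  37.41920  -122.05740  US            United States  CA      Mountain View  94043        650        807         4
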
Now let's see what this IP is trying to accomplish, using logreduce:

_source = "APACHE" | parse "* " as visitor_ip
| lookup latitude, longitude, country_code, country_name, region, city, postal_code, area_code, metro_code from geo://default on ip=visitor_ip
| where country_name = "Ukraine"
| logreduce

 

1,466 matches:
91.200.12.103 - - [$DATE] "POST /wp-login.php HTTP/1.1" 200 3512 "http://masoodrahman.com/wp-login.php" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1; 125LA; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)"

4 matches:
91.200.12.103 - - [$DATE] "GET /***** HTTP/1.1" ***** "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
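
The idea behind these signatures, collapsing lines that differ only in variable fields, can be roughly approximated in a few lines of Python. This is only a sketch of the concept and not Sumo Logic's logreduce algorithm: it masks just the bracketed timestamp and groups lines that are otherwise identical. The function names are made up, and the timestamps in the sample lines are invented for illustration (the article's output only shows $DATE).

```python
import re
from collections import Counter

# Matches the first [...] block, which holds the timestamp in Apache access logs.
TIMESTAMP = re.compile(r"\[[^\]]+\]")


def signature(line):
    """Replace the timestamp with a placeholder so repeated requests look identical."""
    return TIMESTAMP.sub("[$DATE]", line.strip(), count=1)


def group_by_signature(lines):
    """Count how many non-blank lines share each signature."""
    return Counter(signature(line) for line in lines if line.strip())


# Hypothetical sample: the timestamps are invented for this example.
UA = "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
demo_lines = [
    f'91.200.12.103 - - [26/Jun/2017:22:00:01 +0000] "POST /wp-login.php HTTP/1.1" 200 3512 "-" "{UA}"',
    f'91.200.12.103 - - [26/Jun/2017:22:00:02 +0000] "POST /wp-login.php HTTP/1.1" 200 3512 "-" "{UA}"',
    f'91.200.12.103 - - [26/Jun/2017:22:00:03 +0000] "GET / HTTP/1.1" 200 6534 "-" "{UA}"',
]

for sig, n in group_by_signature(demo_lines).most_common():
    print(n, sig)
```

Unlike this sketch, the logreduce output above also wildcards the path and status fields (shown as *****), so it should be read as an illustration of the grouping step only.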

With 1,466 POST requests and only 4 GET requests, it is clear that the POST requests to wp-login.php are a brute-force attempt to log in to WordPress. Evidently, someone had already scanned the site and found that it runs WordPress.
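
This kind of pattern can also be flagged automatically. The sketch below counts POST requests to the login page per IP and reports IPs above a threshold. It assumes the Apache combined log format seen in the entries above; the names LOG_RE, login_posts_by_ip, and suspects, the threshold value, and the generated POST lines (with invented timestamps) are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Leading fields of an Apache combined-format line: IP, timestamp, request, status.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def login_posts_by_ip(lines, login_path="/wp-login.php"):
    """Count POST requests to login_path per client IP; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("method") == "POST" and m.group("path") == login_path:
            counts[m.group("ip")] += 1
    return counts


def suspects(counts, threshold):
    """Return IPs with more than `threshold` login POSTs, busiest first."""
    return [ip for ip, n in counts.most_common() if n > threshold]


# Hypothetical sample: five login POSTs with invented timestamps, plus one ordinary GET.
demo_lines = [
    f'91.200.12.103 - - [26/Jun/2017:22:00:0{i} +0000] "POST /wp-login.php HTTP/1.1" '
    '200 3512 "http://masoodrahman.com/wp-login.php" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"'
    for i in range(5)
]
demo_lines.append(
    '209.150.43.6 - - [26/Jun/2017:23:06:33 +0000] "GET /favicon.ico HTTP/1.1" 200 203 "-" "Mozilla/5.0"'
)

counts = login_posts_by_ip(demo_lines)
print(suspects(counts, threshold=3))
```

A sensible threshold depends on the site: a handful of login POSTs from one address is normal, while hundreds in a short period is not.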

With this information, we can be proactive and harden the WordPress site.
