Log Analysis: Access Log Analysis Using Command Line

Happy New Year 2017! This is my first entry for January. Hopefully it will help with web attack investigations.

First, we need to know the log format:
 

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O"

%h Remote host, the IP address of the request.
%l Remote logname. This is "-" as long as IdentityCheck is off; it is only included for backwards compatibility.
%u Remote user if HTTP authentication is being used (may be bogus if return status (%s) is 401)
%t Time the request was received in the format [day/month/year:hour:minute:second zone]
%r First line of the request
%>s The final HTTP status code, see full list of possible status codes in the HTTP 1.1 specification (RFC2616 section 10).
%b Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a ‘-’ rather than a 0 when no bytes are sent.
%{Referer}i The “Referer” (sic) HTTP request header. It is supplied by the client, so it may be bogus.
%{User-Agent}i The User-Agent HTTP request header. It is supplied by the client, so it may be bogus.
%I Bytes received, including request and headers.
%O Bytes sent, including headers.
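
To make the field positions concrete, here is a minimal sketch that runs one made-up log line through awk. The IP address, path and user agent are invented for illustration and are not taken from a real log. It assumes the format string above: with whitespace splitting the timestamp spans $4 and $5 and the request path is $7, while splitting on double quotes isolates the request line, referer and user agent.

```shell
# One hypothetical entry in the format shown above (values are invented).
sample='203.0.113.9 - - [01/Jan/2017:10:15:32 +0000] "GET /feed/ HTTP/1.1" 200 512 "-" "curl/7.50.1" 120 780'

# Whitespace-split view: host, timestamp, path, status, response size.
# prints: host=203.0.113.9 time=[01/Jan/2017:10:15:32 +0000] path=/feed/ status=200 bytes=512
printf '%s\n' "$sample" | awk '{print "host=" $1, "time=" $4 " " $5, "path=" $7, "status=" $9, "bytes=" $10}'

# Quote-split view: full request line, referer, user agent (one per line).
printf '%s\n' "$sample" | awk -F'"' '{print "request=" $2; print "referer=" $4; print "agent=" $6}'
```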


Next, a short introduction to the tools used for the analysis:

cat –  prints the content of a file in the terminal window
grep – searches and filters based on patterns
awk –  can sort each row into fields and display only what is needed
sed –  performs find and replace functions
sort – arranges output in an order
uniq – compares adjacent lines and can report, filter or provide a count of duplicates

wc – displays the number of lines, words, and bytes contained in each input file, or in standard input if no file is specified
head – displays the first lines of a file
tail – displays the last lines of a file

awk '{print $1}' access.log          # ip address (%h)
awk '{print $2}' access.log          # RFC 1413 identity (%l)
awk '{print $3}' access.log          # userid (%u)
awk '{print $4,$5}' access.log      # date/time (%t)
awk '{print $9}' access.log          # status code (%>s)
awk '{print $10}' access.log        # size (%b)
awk -F\" '{print $2}' access.log   # request line (%r)
awk -F\" '{print $4}' access.log   # referer
awk -F\" '{print $6}' access.log   # user agent
wc -l access.log                        # display number of lines
head -1 access.log                        # first line of the file
tail -1 access.log                          # last line of the file
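
The list above stops at the user agent. Because a user agent usually contains spaces, the two trailing counters (%I and %O) have no fixed field number under whitespace splitting. A minimal sketch, assuming the log was written with the format string shown earlier, is to count from the end of the line with NF. The sample line is invented for illustration.

```shell
# Hypothetical sample line; the user agent deliberately contains spaces.
line='198.51.100.7 - - [01/Jan/2017:10:15:32 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)" 310 2390'

# Timestamp (%t) with the surrounding square brackets removed.
# prints: 01/Jan/2017:10:15:32 +0000
printf '%s\n' "$line" | awk '{ts = $4 " " $5; gsub(/\[|\]/, "", ts); print ts}'

# Bytes received (%I) and bytes sent (%O), taken as the last two fields.
# prints: 310 2390
printf '%s\n' "$line" | awk '{print $(NF-1), $NF}'
```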

Identify the IP addresses making the most requests:

analyst#~awk '{print $1}' www-access.log | sort | uniq -c | sort -n | tail -10
   3 190.166.87.164
   4 114.111.36.26
   4 123.4.59.174
   4 92.62.43.77
   6 208.80.69.69
  12 221.192.199.35
  14 208.80.69.74
  18 10.0.1.14
  36 65.88.2.5
 241 10.0.1.2
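
Request count is only one way to rank clients. As a rough sketch, the same idea can be applied to traffic volume by summing the bytes-sent counter (%O) per IP address. The helper name bytes_per_client is hypothetical, and it assumes every line ends with the %I and %O fields from the format above.

```shell
# Hypothetical helper: total bytes sent (%O, the last field) per client IP,
# largest total first. $1 = path to the access log.
bytes_per_client() {
    awk '{sent[$1] += $NF} END {for (ip in sent) print sent[ip], ip}' "$1" | sort -rn
}

# Usage: bytes_per_client www-access.log | head
```

If the log was written without %O, $NF would be the tail of the user agent instead, so check one line by hand before trusting the totals.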

 

Show the files requested in the first ten log entries:

analyst#~ head www-access.log | awk '{print $7}'
/feed/
/feed/
/feed/
/feed/
http://proxyjudge1.proxyfire.net/fastenv
/feed/
/feed/
http://www.wantsfly.com/prx2.php?hash=FABB83E72D135F1018046CC4005088B36F8D0BEDCEA7
/feed/
/feed/


Count how often each requested file occurs:

analyst#~awk '{print $7}' www-access.log | sort | uniq -c | head
   9 /
 272 /feed/
   3 /login/
   2 /robots.txt
  20 /signup/
   1 /wp-admin
   1 /wp-admin/
  18 /wp-cron.php?doing_wp_cron
   3 72.51.18.254:6677
   4 92.62.43.77:6667


Sort by most popular and axe all but the top few matches:


analyst#~awk '{print $7}' www-access.log | sort | uniq -c | sort -rn | head
 272 /feed/
  20 /signup/
  18 /wp-cron.php?doing_wp_cron
  15 http://proxyjudge1.proxyfire.net/fastenv
  12 http://www.wantsfly.com/prx2.php?hash=FABB83E72D135F1018046CC4005088B36F8D0BEDCEA7
   9 /
   4 92.62.43.77:6667
   3 http://72.51.18.254:6677
   3 72.51.18.254:6677
   3 /login/
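
Several of the top entries above are not local paths at all: they are full URLs on other hosts or bare host:port targets, which is likely a sign of clients testing the server as an open proxy. A minimal sketch for pulling those requests out, together with the client and HTTP method, follows. The helper name non_local_targets is hypothetical, and the "does not start with /" test is an assumption about what counts as suspicious.

```shell
# Hypothetical helper: list client, method and target for requests whose
# target ($7) does not start with "/", grouped and counted.
# $1 = path to the access log.
non_local_targets() {
    awk '$7 !~ /^\// {
        method = substr($6, 2)      # drop the leading double quote from "GET
        print $1, method, $7
    }' "$1" | sort | uniq -c | sort -rn
}

# Usage: non_local_targets www-access.log
```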


Count the different server response codes:

analyst#~awk '{print $9}' www-access.log | sort | uniq -c | sort -n
   2 502
   3 400
   4 500
   8 301
  13 302
  29 404
 306 200
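
Once the status distribution is known, a natural next step is to see which files produced a given code, for example the 404s. The following is a sketch only; the helper name paths_for_status is hypothetical, and it relies on the same field positions as the commands above ($9 for the status, $7 for the requested file).

```shell
# Hypothetical helper: count requested paths for one status code,
# most frequent first. $1 = status code (e.g. 404), $2 = path to the access log.
paths_for_status() {
    awk -v code="$1" '$9 == code {print $7}' "$2" | sort | uniq -c | sort -rn
}

# Usage: paths_for_status 404 www-access.log | head
```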


List all user agents ordered by the number of times they appear (descending order):

analyst#~awk -F\" '{print $6}' www-access.log | sed 's/(\([^;]\+; [^;]\+\)[^)]*)/(\1)/' | sort | uniq -c | sort -fr
 272 Apple-PubSub/65.12.1
  20 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
  18 WordPress/2.9.2; http://www.domain.org
  15 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
  13 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
   8 -
   6 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.21.8 (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10
   4 Mozilla/4.0 (compatible; NaverBot/1.0; http://help.naver.com/customer_webtxt_02.jsp)
   3 pxyscand/2.1
   3 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1059 Safari/532.5
   1 Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.9.0.19) Gecko/2010031422 Firefox/3.0.19
   1 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1045 Safari/532.5
   1 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.22.7 (KHTML, like Gecko) Version/4.0.5 Safari/531.22.7

Requests that returned 200 ("OK"):


analyst#~awk '($9 == 200)' www-access.log | awk '{print $9,$7}' | sort | uniq
200 /
200 /feed/
200 /login/
200 /robots.txt
200 /signup/
200 /wp-cron.php?doing_wp_cron


Identify blank user agents (an indication that the request came from an automated script or from someone who really values their privacy):

analyst#~awk -F\" '($6 ~ /^-?$/)' www-access.log | awk '{print $1}' | sort | uniq
193.109.122.15
193.109.122.18
193.109.122.33
221.194.47.162
92.62.43.77
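
The list of addresses says who sent requests without a user agent, but not what they asked for. A small sketch that keeps the request line next to each address is shown below. The helper name blank_agent_requests is hypothetical; it uses the same quote-split test as the command above.

```shell
# Hypothetical helper: for requests with an empty or "-" user agent, count
# each (client IP, request line) pair, most frequent first.
# $1 = path to the access log.
blank_agent_requests() {
    awk -F'"' '$6 ~ /^-?$/ {
        split($1, parts, " ")       # $1 holds: IP, logname, user, timestamp
        print parts[1], $2          # client IP followed by the request line
    }' "$1" | sort | uniq -c | sort -rn
}

# Usage: blank_agent_requests www-access.log
```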


Display the domain associated with each address (logresolve ships with Apache httpd):

analyst#~awk '{print $1}' www-access.log | sort | uniq -c | sort -n | tail -10 | awk '{print $2,$2,$1}' | logresolve | awk '{printf "%6d %s (%s)\n",$3,$1,$2}'
     3 164.87.166.190.f.sta.codetel.net.do (190.166.87.164)
     4 114.111.36.26 (114.111.36.26)
     4 hn.kd.ny.adsl (123.4.59.174)
     4 proxyscanner.quakenet.org (92.62.43.77)
     6 trueventures.pier38.web-pass.com (208.80.69.69)
    12 221.192.199.35 (221.192.199.35)
    14 69.80.208.web-pass.com (208.80.69.74)
    18 10.0.1.14 (10.0.1.14)
    36 65.88.2.5 (65.88.2.5)
   241 10.0.1.2 (10.0.1.2)


Credit to ~ http://www.the-art-of-web.com




 





