RETURN TO THE ENTRANCE HOME PAGE

Entrance Application Note:
ANALYZING WEB SERVER LOGS

      


Where are your web site vistors coming from?  This step-by-step guide shows you how to find out using Entrance with the MaxMind GeoLite or GeoIP datasets.

Step 1:  Import the log data
First, get the log data you want to study into a MySQL table, if it is not already.  If your server is using a standard Apache log format, this will be easy.  You can use the Entrance Import Dialog, setting the type to either "Apache Common Log Format" or "Apache Combined Log Format".  

Common Log Format lines look like this:
  127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Combined Log Format lines have two additional fields, one for the referrer and one for the browser:
  1127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326         
         "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"

Step 2:  Create the GeoLite City tables
Go to the  the GeoLite web site , and download "the latest GeoLite City CSV". Using Tools | Import data..., load the two files you download into tables named "GeoLiteCity_Blocks" and "GeoLiteCity_Location".   They have Copyright lines, so check "Skip header".  Also, check "First data row contains column names":


Step 3:  Look up latitude and longitude
Many IP addresses will appear multiple times in the log table, so to save time we make a new table where each IP address appears just once:

-- calculate_ipnums.sql

-- drop table if exists ACCESS_IPNUMS;


create table ACCESS_IPNUMS (
   ip       char(15),
   ipnum    long,
   begin_ip long,
   locid    long
);

insert into ACCESS_IPNUMS(ip)
  select  distinct(ip) from ACCESS;

update ACCESS_IPNUMS
  set ipnum = INET_ATON(ip);

update ACCESS_IPNUMS
  set locid = (select locid from maxmind.GEOLITECITY_BLOCKS as b
                  where b.startipNum <= ipnum and ipnum <= b.endipNum limit 1);


There will also be repeating locids, so we make a new table where each appears just once:

-- calculate_latlon.sql

- drop table if exists ACCESS_LATLON;


create table ACCESS_LATLON (
   locid     long,
   lat       double default 0,
   lon       double default 0
);


insert into ACCESS_LATLON(locid)
  select distinct locid from ACCESS_IPNUMS;


update ACCESS_LATLON
  set lat = (select latitude from maxmind.GEOLITECITY_LOCATION
                where ACCESS_LATLON.locid = locid limit 1),

      lon = (select longitude from maxmind.GEOLITECITY_LOCATION
                where  ACCESS_LATLON.locid = locid limit 1);



Step 4:  Make a simple EarthChart plot
Plotting is relatively easy:

plot EarthChart
    x, very small filled yellow  circle
  with
    gray gridlines
select lon, lat
   from ACCESS_LATLON;

Step 5:   Playing with EarthChart
Location dots are easier to pick out against a black background, and you can "flip" back and forth between an EarthChart with a normal background and one with a black background.  You can produce a chart with a black background this way:

plot EarthChart

    x, very small filled yellow  circle
  with
    gray gridlines
    background black
    no bitmap
select lon, lat
   from ACCESS_LATLON;



You can also change colors and mark locations using data labels using techniques like this:

plot EarthChart
    x, very small filled yellow  circle,  white circles, datalabels right, color override
  with
    gray gridlines
    foreground white

select lon, lat, null, null, 'yellow'

   from ACCESS_IPS_0424

union

select lon, lat, null, null, 'red'
   from ACCESS_IPS_0425

union
select 10.02, null, 53.55, 'Hamburg', null ; 



Some final notes
Jeremy Cole's blog describes a way to use MySQL spatial indexes to do the lat,lon lookup more quickly.   It looks promising.   Check the comments for his blog, too, which include a tip from Andy Skelton about using:  SELECT * FROM ip2location WHERE 123456789 <= ip_to LIMIT 1.

EarthChart is a special kind of XY chart, so anything you can do with an Entrance XY chart you can do with it.  (We saw that color overrides could be used in the last example)  If you use a simple query on one table, you will be able to drag out a box over data points to select them.  If you include a color override and and primary key in our query you will be able to do data painting.  The Entrance Help Menu entry for PLOT syntax will get you started.