If you have been downloading log files regularly and running log analyzers, you might want to consider the option of importing the log files into a database. The advantage of the database is that you can run your own custom queries on the database. Here are a few sample queries
How many people came to page A from page B?
What search terms did people use to find page A?
The more expensive analyzers would definitely give answers to such queries. But the less expensive ones don’t give such details. These are just two examples. We will talk about more examples later in this article. If you would like to know more about log files, read our article on basics of log files.
Another reason why you might want to consider inserting log files into database is to double check data being reported by tools. Tools are not always very transparent about how it reports data. So it’s handy to have access to the raw log files so that you can be sure about the data being reported.
Now that we have established the reason behind inserting log file data into a database, let's move onto how this can be done.
This article is not meant to be a detailed technical article and we would like to refer you to this article from Microsoft. The article goes into detail on how to insert log files into a database. But for you who are familiar with databases can probably do this without having to go through the article.
For the purpose of this article, I am going to assume that the log file format is W3C.
Now that data is safely inside a database, we can start looking at how to use this.
For the sample queries, we are going to assume the following table structure.
TABLE [logfile]
[date] [datetime] NULL,
[time] [datetime] NULL,
[c-ip] [varchar] (50) NULL,
[cs-method] [varchar] (50) NULL,
[cs-uri-stem] [varchar] (255) NULL,
[cs-uri-query] [varchar] (2048) NULL,
[sc-status] [int] NULL,
[sc-bytes] [int] NULL,
[time-taken] [int] NULL,
[cs(User-Agent)] [varchar] (255) NULL,
[cs(Cookie)] [varchar] (2048) NULL,
[cs(Referrer)] [varchar] (2048) NULL
Sample query 1
Want to know how many people came from a certain website, www.somesite.com?
Select * from logfile where [cs(referrer)] like ‘%somesite.com%’
Sample query 2
Want to know how many people came to page A from page B?
Select * from logfile where [cs-uri-stem] like ‘pageA-url’ and [cs(referrer)] like ‘pageB-url’
Sample query 3
Want to know how many people came to page A directly?
Select * from logfile where [cs-uri-stem] like ‘pageA-url’ and [cs(referrer)] is NULL
Sample query 4
How many people came to page A from Google?
Select * from logfile where [cs-uri-stem] like ‘pageA-url’ and [cs(referrer)] like ‘%google%’
Sample query 5
How many people came to page A from other pages of the website?
Select * from logfile where [cs-uri-stem] like ‘pageA-url’ and [cs(referrer)] like ‘%yoursite.com%’
Sample query 6
Want to know the dates and the pages visited by Google crawler?
Select [date],[cs-url-stem],[cs-uri-query] from logfile where [cs(referrer)] like ‘%googlebot%’
Please remember that logfiles could be large and in a short time your database could become too big and too slow. We recommend removing all files except the web pages. For example, you can run the following queries to reduce the size of the database.
Delete from logfiles where [cs-uri-stem] like ‘%.gif%’
Delete from logfiles where [cs-uri-stem] like ‘%.jpg%’
Delete from logfiles where [cs-uri-stem] like ‘%.css%’
Delete from logfiles where [cs-uri-stem] like ‘%.js%’
Delete from logfiles where [cs-uri-stem] like ‘%.swf%’
These commands remove the images, css files, Javascript files and flash files and reduce the database size to almost 10%.
Another way of handling the size would be to store each month’s data to a separate table. Storing 6 months data in a single table, for example, could be very time consuming and cumbersome to manage. Especially if your website gets couple of thousand page views every day.
Fill in our contact us form to know how Nabler can help. It would be our pleasure to assist you in making your business succeed.
If you would like to know more about our Web Analytics services, please see our Web Analytics Services page.