Server Logs, Hit Counters, Web Statistics

 
In the early days of the web, the statistic that most people collected and cared about was the number of requests made by clients to the web server, also called the number of hits received by the web server. This gave a reasonable idea of a web site's performance, since early web pages had no graphics or other embedded objects and the browser received the entire content of a page with a single request.

• Hits vs. Page Views

As web pages became richer in content by including images and other objects, the browser needed to make multiple requests to the server to receive the total content of a single page. The number of hits a server received could therefore no longer be equated to the number of pages requested. This gave rise to the idea of separating hits from page views. A page view counts only those hits that relate to a file identified as a page in log analysis. Hits count all requests made to the web server for content, which may be a web page, an image, a sound file, a CSS file, a JavaScript file, a video file etc.
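The distinction can be sketched in a few lines of Python. This is only an illustration: the function name and the set of file extensions treated as "pages" are assumptions, and real analysis software lets you configure what counts as a page.

```python
import os

# Assumption: which extensions count as "pages" is a choice made by the
# analysis software; this set is illustrative only.
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php"}


def count_hits_and_page_views(requested_paths):
    """Return (hits, page_views) for a list of requested URL paths."""
    hits = 0
    page_views = 0
    for path in requested_paths:
        hits += 1  # every request reaching the server is a hit
        extension = os.path.splitext(path)[1].lower()
        if extension in PAGE_EXTENSIONS:
            page_views += 1  # only requests for pages are page views
    return hits, page_views


# One page that pulls in a style sheet, a script and two images.
requests_for_one_page = [
    "/index.html", "/style.css", "/logo.gif", "/menu.js", "/photo.jpg",
]
print(count_hits_and_page_views(requests_for_one_page))  # (5, 1)
```

Here a single page view produces five hits, which is why the two numbers have to be reported separately.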

• Visits/Visitors

To track the number of visitors using the web site, an additional method was devised. A series of requests coming from the same client computer is treated as requests from the same visitor. A visitor's visit is assumed to be over after a certain period of inactivity, say 30 minutes. If the server receives a request from the same client after a gap of 30 minutes from the last request, it is considered to be a new visit.
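The inactivity rule can be sketched as follows. This is a minimal illustration, not the method of any particular analysis package: the function name, the sample client identifiers and the choice to treat a gap of exactly 30 minutes as a new visit are all assumptions.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # the inactivity period used in the text


def count_visits(requests):
    """Count visits in a list of (client_id, timestamp) tuples.

    A request starts a new visit when the client has not been seen before
    or has been inactive for VISIT_TIMEOUT or longer (assumed boundary).
    """
    last_seen = {}
    visits = 0
    for client_id, timestamp in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(client_id)
        if previous is None or timestamp - previous >= VISIT_TIMEOUT:
            visits += 1
        last_seen[client_id] = timestamp
    return visits


sample_requests = [
    ("client-a", datetime(2024, 1, 1, 9, 0)),
    ("client-a", datetime(2024, 1, 1, 9, 10)),  # same visit, 10 minute gap
    ("client-a", datetime(2024, 1, 1, 10, 0)),  # new visit, 50 minute gap
    ("client-b", datetime(2024, 1, 1, 9, 5)),
]
print(count_visits(sample_requests))  # 3
```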

• Unreal Traffic » Search Engine Spider visit

Search engines crawl web sites to update their index. This is done by software programs called spiders. The visits/hits that the web server receives from spiders do not represent real visits made by humans.

These hits/requests have to be eliminated to get a clear idea of the real human traffic to the web site. The web analysis software does this by ignoring traffic from known search engine spiders.
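One common way of recognising spiders is by the user agent string they send with each request. The sketch below assumes that approach; the short list of markers and the function names are illustrative only, and real analysis software maintains a much longer, regularly updated list.

```python
# Assumption: these substrings are illustrative, not a complete list.
KNOWN_SPIDER_MARKERS = ("googlebot", "bingbot", "slurp")


def is_spider(user_agent):
    """Return True if the user agent looks like a known search engine spider."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in KNOWN_SPIDER_MARKERS)


def human_requests(requests):
    """Keep only the (path, user_agent) requests not made by known spiders."""
    return [request for request in requests if not is_spider(request[1])]


sample_requests = [
    ("/index.html", "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"),
    ("/index.html", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("/about.html", "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"),
]
print(len(human_requests(sample_requests)))  # 1
```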

• Masked Traffic » Proxies/Dynamic IP Addresses

In the earlier days a client computer used to be identified by its IP address. With the advent of proxy technology, many computers started accessing the internet indirectly through another computer (a proxy server), so that all the requests they make to web servers appear as requests made by the proxy server itself. This made it difficult to tell unique visitors apart.

Tracking and recognising subsequent visits made by the same client computer also became difficult on account of dynamic allocation of IP addresses. A computer connecting to the internet through an Internet Service Provider (ISP) is typically allocated an IP address dynamically. Thus the client may not have the same IP address if it disconnects and connects again (even within a short time span).

To overcome this problem, the software used to analyse web stats started using cookies to identify client computers. A cookie is stored by the browser and sent back with later requests, so it stays the same even when the IP address changes.
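A rough sketch of cookie based identification is given below, using Python's standard http.cookies module. The cookie name visitor_id, the one year lifetime and the function name are assumptions made for illustration; actual web stat software chooses its own.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name
ONE_YEAR_IN_SECONDS = 60 * 60 * 24 * 365  # assumed cookie lifetime


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None for a returning visitor, otherwise it is the
    value to send back in a Set-Cookie response header.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie:
        return cookie[COOKIE_NAME].value, None  # returning visitor

    visitor_id = uuid.uuid4().hex  # new random identifier
    new_cookie = SimpleCookie()
    new_cookie[COOKIE_NAME] = visitor_id
    new_cookie[COOKIE_NAME]["max-age"] = ONE_YEAR_IN_SECONDS
    return visitor_id, new_cookie[COOKIE_NAME].OutputString()


first_id, set_cookie = identify_visitor("")                        # first request
second_id, _ = identify_visitor(COOKIE_NAME + "=" + first_id)      # later request
print(first_id == second_id)  # True
```

The identifier travels with the browser, so the visitor is recognised regardless of the proxy or IP address the request arrives from.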

• Missed Traffic » Caching

The second and subsequent requests for the same resource from a web site may be served by the browser from its own cache. The request then never actually reaches the web server, which results in missed traffic. This may sometimes result in the server losing track of the path the visitor took in moving from one page to another.

Though caching can be regulated by choosing appropriate options on the web server, this is usually not done for fear of degrading performance for the user.

• Sources of Data

Data used for analysing a web site's performance is collected from two different sources: (1) the web server log and (2) JavaScript placed in the web pages, through a process called page tagging.

» Server Logs

Each time a web browser accesses a web server, a lot of communication takes place between the client computer and the server computer. The data relating to such communication is stored in log files by the server. Such log files can be read using programming code (scripts) in a file hosted on the server.

Sample of the Server Log
With an explanation for the php code used to obtain the data from server variables
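As a rough illustration of reading such a log file, the sketch below parses one line in the Common Log Format, a layout used by many web servers. Python is used here only for illustration. The sample line is made up, the field names are assumptions, and your server's log format may differ.

```python
import re

# Assumption: the log uses the Common Log Format:
# host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# A made-up sample line, for illustration only.
sample_line = (
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326'
)
entry = parse_log_line(sample_line)
print(entry["ip"], entry["path"], entry["status"])  # 192.0.2.1 /index.html 200
```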

» Page Tagging

To overcome the limitations of web analysis using log files, as well as the problems arising from proxies, web page caching, web site spidering etc., an alternative method called page tagging was devised.

This method requires each page on the web site to include a small visible or invisible image along with a piece of JavaScript. The image is hosted on the web server that performs the analysis. Whenever the web page is loaded in the browser of a client computer, a request for the image is made to the web server providing the web stat analysis.

Along with the image request, the web server performing the analysis collects information that is not available in the server logs, such as the client's screen resolution, color depth, browser name etc. This data is analysed and presented in the form of graphs, images and numerical values to give an idea of the web site's traffic performance.
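One way the extra information can travel with the image request is as query string parameters appended to the image URL by the script in the page. The sketch below shows only the collecting side and assumes that approach; the image name tag.gif and the parameter names page, screen, depth and browser are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def read_tag_request(request_url):
    """Extract the data sent along with a tracking image request.

    Returns a dict holding the first value of each query string parameter.
    """
    query = parse_qs(urlsplit(request_url).query)
    return {name: values[0] for name, values in query.items()}


# Hypothetical request generated by the script in a tagged page.
sample_url = "/tag.gif?page=%2Findex.html&screen=1024x768&depth=24&browser=Firefox"
data = read_tag_request(sample_url)
print(data["page"], data["screen"], data["depth"])  # /index.html 1024x768 24
```

Each such request is recorded by the server doing the analysis, which is how it builds up statistics without access to the log files of the web site being analysed.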

• Server Log vs Page Tag

Both data analysis using server logs and page tagging have their own advantages and disadvantages.

Server log analysis uses data that is readily available in the server log files, while page tagging uses data collected by JavaScript placed in the web page. For page tagging to work, the client browser has to respond to the data requests, which basically means it should be JavaScript enabled (as 99%+ of browsers are). Page tagging enables collection of a greater variety of data and thus gives a richer web site analysis.

Hit Counter  
 
A hit counter is a numerical counter that displays the number of times the web site has been visited by users. Where the web analysis is done by page tagging, the page count indicates the number of times the page has been accessed by users. It is obtained as the number of times the image placed in the web page has been requested from the web site hosting it.

Where the data is not maintained for each page separately, the count indicates the cumulative value of page visits. This value should be understood as the number of visits to the web site.

This is the simplest tool available to you.
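The idea behind such a counter can be sketched as below. The class and method names are made up for illustration, and a real counter would store its values in a file or database rather than in memory.

```python
from collections import Counter


class HitCounter:
    """Counts image requests per page and for the web site as a whole."""

    def __init__(self):
        self.per_page = Counter()

    def record(self, page):
        """Record one request for the counter image placed on `page`."""
        self.per_page[page] += 1
        return self.per_page[page]  # the value displayed on that page

    def total(self):
        """Cumulative count across all pages."""
        return sum(self.per_page.values())


counter = HitCounter()
counter.record("/index.html")
counter.record("/index.html")
counter.record("/about.html")
print(counter.per_page["/index.html"], counter.total())  # 2 3
```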

Web Statistics  
 
The data in the server log files, coupled with the data collected using page tagging, can be interpreted in a number of different ways. This is what web sites providing web statistics do. Many of them offer a free account, which is useful for small web sites and for those who need information relating only to a short past period.

If you need analysis over a longer duration, you may buy a commercial account from such a web stat provider.

Register with the web site whose services you intend to use, provide some preliminary details relating to your blog and your preferences, collect the code generated, and paste it into the web page whose visitors you wish to analyse.

Stat                      What it lets you know
------------------------  ---------------------------------------------------------------
Popular Pages             Pages that have been viewed the highest number of times
Entry Pages               The first page accessed by users
Exit Pages                The last page viewed by visitors
Came From                 URLs of web pages from which users come to your blog
Recent Came From          URLs of web pages from which users come to your blog recently
Keyword Analysis          Keywords used as search terms in locating your site
Recent Keyword Activity   Recently used keywords as search terms in locating your site
Search Engine Wars        Which search engine is driving what % of traffic to your blog
Visitor Paths             Analysis of blog visitors
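To give a feel for how a few of these stats could be derived, the sketch below computes popular, entry and exit pages from a list of visits, where each visit is the ordered list of pages one visitor viewed. The function name and the sample data are made up, and this is not how any particular web stat provider works.

```python
from collections import Counter


def summarise_visits(visits):
    """Return (popular, entry, exit) page Counters for a list of visits.

    Each visit is the ordered list of pages viewed during that visit.
    """
    popular = Counter(page for visit in visits for page in visit)
    entry_pages = Counter(visit[0] for visit in visits if visit)
    exit_pages = Counter(visit[-1] for visit in visits if visit)
    return popular, entry_pages, exit_pages


# Made-up visitor paths, for illustration only.
sample_visits = [
    ["/", "/a.html", "/b.html"],
    ["/a.html", "/b.html"],
    ["/", "/b.html"],
]
popular, entry_pages, exit_pages = summarise_visits(sample_visits)
print(popular.most_common(1))      # [('/b.html', 3)]
print(entry_pages.most_common(1))  # [('/', 2)]
print(exit_pages.most_common(1))   # [('/b.html', 3)]
```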

In web sites using templates for page formatting, pasting the code into the template amounts to including the code in all the pages that use the template.

Google Analytics  
 
Another tool that can be used for free is Google Analytics. This is a program developed by Urchin and bought by Google. Google combines it with its online advertising program "AdWords" and provides it without restrictions to AdWords customers.

Non-AdWords customers can analyse up to 5 million (50,00,000) page views per month. If your web site has more traffic, create a Google AdWords account, which would cost you just $5.

It does not provide any hit counter to be displayed on the web page. It is invisible code that tracks your web pages.

Author Credit : The Edifier

♣ Copyright © Krishbhavara. All rights reserved