Webalizer Webpage and SEO Measurements

Webalizer is a fast, free web server log file analysis program. It produces highly detailed, easily configurable usage reports in HTML format for viewing with a standard web browser. Reports should be checked monthly to help determine the health of your website and the success of your online marketing.

Features

  • Is written in C to be extremely fast and highly portable. On my 1.6GHz laptop, it can process close to 70,000 records per second, which means a log file with roughly 2 million hits can be analyzed in about 30 seconds.

  • Handles standard Common Logfile Format (CLF) server logs, several variations of the NCSA Combined logfile format, wu-ftpd/proftpd xferlog (FTP) format logs, Squid proxy server native format, and W3C Extended log formats. In addition, gzip (.gz) and bzip2 (.bz2) compressed logs may be used directly, without uncompressing them first. (A sample CLF record and a minimal parse are sketched after this list.)

  • Generated reports can be configured from the command line or, more commonly, through one or more configuration files (a minimal configuration sketch also appears after this list). Detailed information on configuration options can be found in the README file supplied with all distributions.

  • Supports multiple languages. Currently, Albanian, Arabic, Catalan, Chinese (traditional and simplified), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Polish, Portuguese (Portugal and Brazil), Romanian, Russian, Serbian, Slovak, Slovene, Spanish, Swedish, Thai, Turkish and Ukrainian are available.

  • Unlimited log file sizes and partial logs are supported, allowing logs to be rotated as often as needed, and eliminating the need to keep huge monthly files on the system.

  • Fully supports IPv4 and IPv6 addresses. Includes built-in distributed DNS lookup capability and native Geolocation services.

  • Distributed under the GNU General Public License. Complete source code is available, as well as binary distributions for some of the more popular platforms. Please read the Copyright notices for additional information.
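
To make the supported formats concrete, below is a record in Common Logfile Format and a minimal Python sketch that parses it. The regular expression and the sample values (IP address, path, timestamp) are illustrative assumptions, not Webalizer's internal parser.

    import re

    # CLF fields: host ident authuser [timestamp] "request" status bytes
    CLF = re.compile(
        r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
        r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
        r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    )

    line = ('203.0.113.42 - - [10/Feb/2013:14:55:36 -0500] '
            '"GET /index.html HTTP/1.1" 200 5120')
    m = CLF.match(line)
    if m:
        print(m.group('host'), m.group('status'), m.group('request'))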
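
Configuration is similarly compact. Here is a minimal webalizer.conf sketch using a handful of common directives; the paths and host name are placeholders, and the README documents the full set.

    # Sample webalizer.conf (paths and host name are placeholders)
    LogFile     /var/log/httpd/access_log
    LogType     clf
    OutputDir   /var/www/html/usage
    HostName    www.example.com
    Incremental yes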

Webalizer Example and Use for Benchmarking

In March 2012 we took on a small Charleston, SC real estate client whose site previously averaged about 400 visits per month.

We immediately created a local viral SEO event that netted good results for April: 2,798 visits and 139,097 hits (without AdWords).

However, that same month our client contracted with YoXXX. In May, the YoXXX landing pages, with YoXXX-owned phone numbers and email addresses, siphoned off all of the viral campaign's gains. When the YoXXX campaign was canceled at the end of the month, its landing pages remained live for several months, leaving our client with numerous landing pages that led nowhere. Phone numbers to nowhere! Email addresses to nowhere!

Even so, the client's website saw steady growth each month.

In February 2013 our client received 10,686 visits and 128,581 hits directly to their website.

Are you getting this kind of internet activity directly to your website? The numbers highlighted in yellow and blue in the Webalizer report are the important ones.

Analytical Terms

Website traffic analysis is produced by grouping and aggregating various data items that the web server captures in its log files while visitors browse the website. Some of the most commonly used website traffic analysis terms are listed below:

URL: A Uniform Resource Locator (URL) uniquely identifies the resource requested by the user's browser.

Hit: Each HTTP request submitted by the browser is counted as one hit. Note that HTTP requests may be submitted for non-existent content, in which case they are still counted. For example, if one of five image files referenced by a page is missing, the web server will still count six HTTP requests; five will be marked as successful (one HTML file and four images) and one as a failed request (the missing image).

Page: A page is a successful HTTP request for a resource that constitutes the website's primary content. Pages are usually identified by a file extension (e.g. .html, .php, .asp) or by a missing extension, in which case the subject of the HTTP request is considered a directory and the default page for that directory is served.
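
As a sketch of that rule, the following Python function classifies a requested URL as a page by its extension. The extension list here is an assumption for illustration; in Webalizer, page extensions are configured with the PageType directive.

    # Hypothetical page classifier mirroring the rule described above
    PAGE_EXTENSIONS = {'.html', '.htm', '.php', '.asp'}

    def is_page(url):
        path = url.split('?', 1)[0]           # drop any query string
        if path.endswith('/'):
            return True                       # directory request: default page served
        dot = path.rfind('.')
        if dot == -1 or dot < path.rfind('/'):
            return True                       # no extension: treated as a directory
        return path[dot:].lower() in PAGE_EXTENSIONS

    print(is_page('/index.html'))    # True
    print(is_page('/images/a.png'))  # False
    print(is_page('/blog/'))         # True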

File: Each successful HTTP request is counted as a file.

Visitor: A visitor is the actual person browsing the website. A typical website serves content to anonymous visitors and cannot associate requests with the actual person behind them. Visitor identification may be based on the IP address or on an HTTP cookie. The former approach is simple to implement, but results in all visitors browsing the website from behind the same firewall being counted as a single visitor. The latter approach requires special configuration of the web server (i.e. to log HTTP cookies) and is more expensive to implement. Note that neither approach identifies the actual person browsing the website, and neither provides 100% accuracy in determining whether the same visitor has visited the website again.
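
A toy sketch of the IP-based approach, assuming CLF input where the client address is the first field of each record; note how the two requests from the same address below collapse into a single visitor.

    # Count unique client addresses as a rough proxy for visitors
    def count_visitors(log_lines):
        hosts = set()
        for line in log_lines:
            hosts.add(line.split(' ', 1)[0])  # first CLF field: client host
        return len(hosts)

    sample = [
        '203.0.113.42 - - [10/Feb/2013:14:55:36 -0500] "GET /a.html HTTP/1.1" 200 512',
        '203.0.113.42 - - [10/Feb/2013:14:56:01 -0500] "GET /b.html HTTP/1.1" 200 512',
        '198.51.100.7 - - [10/Feb/2013:15:02:10 -0500] "GET /a.html HTTP/1.1" 200 512',
    ]
    print(count_visitors(sample))  # 2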

Visit: A visit is a series of HTTP requests submitted by a visitor, with the maximum time between requests not exceeding an amount configured by the webmaster, typically 30 minutes. For example, if a visitor requested page A, then page B 10 minutes later, and then page C 40 minutes after that, this visitor generated two visits: one when pages A and B were requested, and another when page C was requested.
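
The visit rule is easy to sketch in code. Below is a hedged Python illustration of the 30-minute timeout, replayed against the page A, B and C example above; the host address and timestamps are made up for the demonstration.

    from datetime import datetime, timedelta

    TIMEOUT = timedelta(minutes=30)

    def count_visits(requests):
        """requests: list of (host, datetime) pairs parsed from a log."""
        visits = 0
        last_seen = {}                       # host -> time of previous request
        for host, ts in sorted(requests, key=lambda r: r[1]):
            prev = last_seen.get(host)
            if prev is None or ts - prev > TIMEOUT:
                visits += 1                  # gap too long: a new visit begins
            last_seen[host] = ts
        return visits

    t0 = datetime(2013, 2, 10, 14, 0)
    reqs = [('203.0.113.42', t0),                          # page A
            ('203.0.113.42', t0 + timedelta(minutes=10)),  # page B, same visit
            ('203.0.113.42', t0 + timedelta(minutes=50))]  # page C, 40 min later
    print(count_visits(reqs))  # 2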

Host: In general, a host is the visitor's machine running the browser. Hosts are often identified by IP addresses or domain names. Those web traffic analysis tools that use IP addresses to identify visitors use the words hosts, domain names and IP addresses interchangeably.

User Agent: The user agent is the software, typically a web browser, that submits HTTP requests on the visitor's behalf.

To illustrate the difference between hits, files and pages, consider a user requesting an HTML file that references five images, one of which is missing. In this case the web server will log six hits (one for the HTML file itself, four for the successfully retrieved images and one for the missing image), five files (the five successful HTTP requests) and one page (the HTML file).
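
The same arithmetic can be checked with a short sketch. The file names and status codes below are assumptions chosen to match the example (a 404 for the missing image), and pages are identified by the .html extension as described earlier.

    # Tally hits, files and pages for the six requests in the example
    records = [('/index.html', 200),
               ('/img/1.png', 200), ('/img/2.png', 200),
               ('/img/3.png', 200), ('/img/4.png', 200),
               ('/img/5.png', 404)]                           # the missing image

    hits  = len(records)                                      # every request
    files = sum(1 for _, status in records if status == 200)  # successful only
    pages = sum(1 for url, status in records
                if status == 200 and url.endswith('.html'))   # primary content

    print(hits, files, pages)  # 6 5 1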