awstats

Last updated: 2018-08-08

 


 


Installing awstats

 

# awstats is packaged in EPEL

yum install awstats

# Location of the Perl helper scripts:

/usr/share/awstats/tools/

# Fix folder permission

$ find ./awstats -type d -exec chmod 701 '{}' \;
$ find ./awstats -not -type d -exec chmod 404 '{}' \;

chmod 400 /etc/awstats/*.conf

htpasswd -c /etc/awstats/htpasswd.users admin

cd /usr/share/awstats/wwwroot/cgi-bin/

vi .htaccess

AuthName "STOP - Do not continue unless you are authorized to view this site! - Server Access"
AuthType Basic
AuthUserFile /etc/awstats/htpasswd.users
Require valid-user

chmod 444 .htaccess

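If reports are only generated manually, the ".htaccess" can simply deny all web access (Apache 2.2 syntax; Apache 2.4 uses "Require all denied"):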

Deny from all

# Back up the default cron job

mkdir /etc/backup

mv /etc/cron.hourly/*awstats /etc/backup

 


Create the cron job (it builds the awstats report data)

 

/etc/cron.hourly/awstats

#!/bin/bash
exec /usr/share/awstats/tools/awstats_updateall.pl now \
     -configdir="/etc/awstats"  \
     -awstatsprog="/usr/share/awstats/wwwroot/cgi-bin/awstats.pl" >/dev/null

/etc/logrotate.d/httpd

# Make sure awstats ingests the last records before Apache rotates its logs

/var/log/httpd/*log {
  missingok
  notifempty
  sharedscripts
  prerotate
    /var/www/awstats/awstats.pl -update -config=somesite.net
  endscript
  postrotate
    /etc/init.d/httpd reload > /dev/null 2>&1 || true
  endscript
}

P.S.

After setting up /etc/awstats/awstats.myserver.conf,

visiting http://myserver/awstats/awstats.pl?config=myserver

shows

"Last Update:     Never updated"

The reason is that /etc/cron.hourly/awstats has not run even once yet.

 


GeoIP

 

You must choose between the Maxmind plugin and the GeoIPfree plugin.

Reference

https://datahunter.org/geoip

Database

rpm -qa | grep -i  geoip

GeoIP-1.5.0-11.el7.x86_64

ls -1 /usr/share/GeoIP/*.dat

/usr/share/GeoIP/GeoIP.dat
...
/usr/share/GeoIP/GeoLiteCity.dat

# Install the library

# REQUIRED MODULES: Geo::IP OR Geo::IP::PurePerl (from Maxmind)

# yum install perl-Geo-IP

perl -MCPAN -e 'install Geo::IP'

Reading '/root/.cpan/Metadata'
  Database was generated on Fri, 27 Apr 2018 21:54:26 GMT
Geo::IP is up to date (1.51).

perl -MCPAN -e 'install Geo::IP::PurePerl'

Reading '/root/.cpan/Metadata'
  Database was generated on Fri, 27 Apr 2018 21:54:26 GMT
Geo::IP::PurePerl is up to date (1.26).

AWStats Config:

# Memory(GEOIP_MEMORY_CACHE) or File(GEOIP_STANDARD) lookups

PARAMETERS: [GEOIP_STANDARD | GEOIP_MEMORY_CACHE]

# GeoIP_City_Maxmind

LoadPlugin="geoip GEOIP_MEMORY_CACHE /usr/share/GeoIP/GeoIP.dat"

Remark

The plugin must already be enabled when running -update; otherwise no GeoIP data is recorded!

 


Configuration

 

Change apache log file format:

CustomLog /yourlogpath/yourlogfile common

to

CustomLog /yourlogpath/yourlogfile combined
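A quick way to confirm that an existing log is already in combined format is to test one line against the expected field layout. The sketch below is only a rough check: the function name is_combined and the sample line are made up for illustration, and the pattern does not handle escaped quotes inside fields.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a log line looks like Apache "combined" format,
# i.e. a common-format line followed by quoted referer and user-agent fields.
is_combined() {
    printf '%s\n' "$1" | grep -Eq \
        '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
}

# Illustrative sample line, not taken from a real log
line='127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.08"'

if is_combined "$line"; then
    echo "combined"
else
    echo "not combined"
fi
```

If the check fails on real log lines, LogFormat=1 in the awstats config will most likely report them as corrupted records.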

 

Apache config file /etc/httpd/conf.d/awstats.conf:

# Alternatively, run the following commands in the output directory
# cp -a /usr/share/awstats/wwwroot/classes/ .
# cp -a /usr/share/awstats/wwwroot/css/ .
# cp -a /usr/share/awstats/wwwroot/icon/ .
# (paths below follow the /usr/share/awstats layout of the yum package; a manual install typically uses /usr/local/awstats)
Alias /awstatsclasses "/usr/share/awstats/wwwroot/classes/"
Alias /awstatscss "/usr/share/awstats/wwwroot/css/"
Alias /awstatsicons "/usr/share/awstats/wwwroot/icon/"

# If reports are generated manually, comment out the following directive with "#"
# ScriptAlias /awstats/ "/usr/share/awstats/wwwroot/cgi-bin/"

<Directory "/usr/share/awstats/wwwroot">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

# Additional Perl modules
<IfModule mod_env.c>
    SetEnv PERL5LIB /usr/share/awstats/lib:/usr/share/awstats/plugins
</IfModule>

 

awstats config file: awstats.mysite.conf

* Items marked in red must be changed

####################################### LOG SETTING
LogFile="/home/vhosts/datahunter.org/logs/access.log"

# W - For a web log file
# M - For a mail log file
# F - For a ftp log file
LogType=W

# 1 - Apache (native combined log format)
# 2 - IIS (IIS W3C log format).
# 3 - Webstar native log format.
# 4 - Apache or Squid native common log format (CLF log format)
LogFormat=1

# This parameter is not used if LogFormat is a predefined value (1,2,3,4)
LogSeparator=" "

####################################### Hostname settings
# If several virtual web servers share the same log file,
# this parameter tells AWStats to keep only the records for this virtual host name
SiteDomain="datahunter.org"

# This parameter is used to analyze the referer field in the log file and
# helps AWStats tell whether a referer URL is a local URL of the same site or a URL of another site.
# Regular expressions can be used by writing the value as REGEX[value].
HostAliases="www.datahunter.org localhost 127.0.0.1 REGEX[mydomain\.(net|org)$]"


#######################################
# hosts reported by name instead of ip address
# Default: 2
# 0 - No DNS Lookup
# 1 - DNS Lookup is fully enabled
# 2 - DNS Lookup is made only from static DNS cache file
DNSLookup=0

# Mind the permissions on DirData
DirData="/var/lib/awstats/datahunter.org"

# If you build static reports
# path of icon directory relative to the output directory 'outputpath'.
DirIcons="/awstats/icon"

# AWStats saves and sorts its database on a month basis
# (2 - Allowed on CLI only, -Year- value in combo is visible but not allowed)
AllowFullYearView=2

#default: Lang="auto"
Lang="en"

# Each time you run the update process, AWStats overwrites the 'historic file'
# for the month (awstatsMMYYYY[.*].txt) with the updated one.
# When a write error occurs (I/O, disk full, ...), this historic file can be corrupted and must be deleted.
# Setting this to 1 makes AWStats keep the last non-corrupted file as a .bak file in DirData.
KeepBackupOfHistoricFiles=1

############################################################ 

# This parameter is used only when AWStats is run from command line
# with -output option
DirCgi="/awstats"

# AWStats adds a button on report page to allow to "update" statistics from a web browser.
AllowToUpdateStatsFromBrowser=1

# permission
AllowAccessFromWebToAuthenticatedUsersOnly=1
AllowAccessFromWebToFollowingAuthenticatedUsers="admin webmaster"
AllowAccessFromWebToFollowingIPAddresses="127.0.0.1 192.168.123.1-254"


############################################################ OPTIONAL Setting
# DNS
DNSStaticCacheFile="dnscache.txt"
DNSLastUpdateCacheFile="dnscachelastupdate.txt"
SkipDNSLookupFor=""

BuildHistoryFormat=xml
BuildReportFormat=xhtml

# Do not include access from clients that match following criteria
# Use space between each value
# regular expression format: REGEX[^10\.0\.0\.]
SkipHosts="127.0.0.1"

 


Personalized_log_format

 

%host Client hostname or IP address (or Sender host for mail log)
%host_r Receiver hostname or IP address (for mail log)

%lognamequot Authenticated login/user with format: "alex"
%logname Authenticated login/user with format: alex

%time1 Date and time with format: [dd/mon/yyyy:hh:mm:ss +0000] or [dd/mon/yyyy:hh:mm:ss]
%time2 Date and time with format: yyyy-mm-dd hh-mm-ss
%time3 Date and time with format: Mon dd hh:mm:ss or Mon dd hh:mm:ss yyyy
%time4 Date and time with unix timestamp format: dddddddddd

%methodurl Method and URL with format: "GET /index.html HTTP/x.x"
%methodurlnoprot Method and URL with format: "GET /index.html"
%method Method with format: GET
%url URL only with format: /index.html
%query Query string (used by URLWithQuery option)

%code Return code status (with format for web log: 999)

%bytesd Size of document in bytes

%refererquot Referer page with format: "http://from.com/from.htm"
%referer Referer page with format: http://from.com/from.htm

%uaquot User agent with format: "Mozilla/4.0 (compatible, ...)"
%ua User agent with format: Mozilla/4.0_(compatible...)

%gzipin mod_gzip compression input bytes: In:XXX
%gzipout mod_gzip compression output bytes & ratio: Out:YYY:ZZpct.
%gzipratio mod_gzip compression ratio: ZZpct.
%deflateratio mod_deflate compression ratio with format: (ZZ)

%email EMail sender (for mail log)
%email_r EMail receiver (for mail log)

%virtualname Web server virtual hostname.
Use this tag when the same log contains data from several virtual web servers.
AWStats will discard records that match neither SiteDomain nor HostAliases.

%cluster If the log file comes from several computers
(merged by logresolvemerge.pl), this tag defines the field holding the cluster id.

%extraX Another field that you plan to use for building a personalized report with ExtraSection feature (See later).

# If your log format has some fields not included in this list, use
%other Means another field
%otherquot Means another not used double quoted field

 

Rackspace Cloud Files log format:

client_ip - - [day/month/year:hour:minute:second timezone] "method request HTTP_version" return_code bytes_sent "referrer" "user_agent"

Example

x.x.x.x - - [30/10/2014:02:26:18 +0000] "GET /???/your_file.pdf HTTP/1.1" 0 20296 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

Matching AWStats setting:

LogFormat = "%host - - %time1 %methodurl %code %bytesd %refererquot %uaquot"

 


Generating reports manually

 

<0> Locate awstats.pl

rpm -ql awstats | grep -w awstats.pl

<1> Building/updating statistics database

# The first log analysis should be done manually from the command line

# since the process may be long and it's easier to solve problems when you can see the command output

# Running "find . -name awstats.pl" under /usr also locates awstats.pl

cd /usr/share/awstats/wwwroot/cgi-bin

perl awstats.pl -config=mysite -update

# -config matches YOUR_SITE_NAME in /etc/awstats/awstats.YOUR_SITE_NAME.conf

Output

From data in log file "/home/virtualhosts/carer/log/access_log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 932605
 Found 0 dropped records,
 Found 0 comments,
 Found 0 blank records,
 Found 803 corrupted records,
 Found 0 old records,
 Found 931802 new qualified records.

Example: build databases with different time breaks

/usr/share/awstats/wwwroot/cgi-bin/awstats.pl --config=YOUR_SITE_NAME -DatabaseBreak=day
/usr/share/awstats/wwwroot/cgi-bin/awstats.pl --config=YOUR_SITE_NAME -DatabaseBreak=month
/usr/share/awstats/wwwroot/cgi-bin/awstats.pl --config=YOUR_SITE_NAME -DatabaseBreak=year

DirData

AWStats saves its statistics database files in the directory defined by the DirData parameter in the configuration file.

Each time AWStats updates its statistics, it writes the results of its analysis to these files.

grep ^DirData /etc/awstats/awstats.mysite.conf

ls /var/lib/awstats/mysite

-rw-r--r-- 1 root root 9.7K Dec 16 17:49 awstats102014.MyWebsite.txt
-rw-r--r-- 1 root root 221K Dec 16 17:52 awstats112014.MyWebsite.txt
-rw-r--r-- 1 root root  39K Dec 16 17:52 awstats122014.MyWebsite.txt
....................................
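The file names in the listing follow the pattern awstatsMMYYYY.CONFIG.txt. As a small sketch, assuming the default monthly database break, the helper below (history_file is a made-up name) prints the file name expected for a given month, year and config:

```shell
#!/bin/bash
# Hypothetical helper: print the history file name for MONTH YEAR CONFIG.
# Assumes the monthly naming seen in the listing: awstatsMMYYYY.CONFIG.txt
history_file() {
    # ${1#0} drops one leading zero so "08" is not parsed as octal by printf
    printf 'awstats%02d%04d.%s.txt\n' "${1#0}" "$2" "$3"
}

history_file 12 2014 MyWebsite   # prints awstats122014.MyWebsite.txt
```

Checking whether that file exists under DirData is a quick way to see if an update has already produced data for the month.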

<3> ACL

.htaccess

# access control
AuthName "Restricted Area"
AuthType Basic
AuthBasicProvider file
AuthUserFile ${rootDir}/htpasswd
Require valid-user

<4> icon, css

Troubleshoot

Icons and similar assets are missing

Solution

grep ^Alias /etc/httpd/conf.d/awstats.conf

DirIcons="icons"
StyleSheet="css/awstats_default.css"

cp -a /var/www/awstats/icon/ ./

cp -a /var/www/awstats/css/ ./

<5> Run reports: Building and reading reports

# The output must be redirected to a file

/var/www/awstats/awstats.pl -config=mysite -output -staticlinks > awstats.mysite.html

 

awstats_buildstaticpages

awstats_buildstaticpages is equivalent to running a series of awstats.pl commands with -staticlinks, such as:

/var/www/awstats/awstats.pl -config=MyWebsite -output -staticlinks > awstats.mysite.html

Usage:

awstats_buildstaticpages.pl (awstats_options) [awstatsbuildstaticpages_options]

awstats_options

    -config=configvalue is value for -config parameter (REQUIRED)
    -update option used to update statistics before generating pages

** Reported period:    Month Dec 2014

    -lang=LL to output a HTML report in language LL (en,de,es,fr,...)
    -month=MM to output a HTML report for an old month=MM
    -year=YYYY to output a HTML report for an old year=YYYY

options

-awstatsprog=pathtoawstatspl gives AWStats software (awstats.pl) path

-dir=outputdir to set output directory for generated pages

-builddate=%YY%MM%DD Used to add build date in built pages filenames

-staticlinksext=xxx For pages with .xxx extension instead of .html

-buildpdf[=pathtohtmldoc] Build a PDF file after building HTML pages.

The output directory must contain the icon directory when this option is used (needs 'htmldoc').  <--- yum install htmldoc

Troubleshoot

Error: Can't find AWStats program ('awstats.pl').

Solution

Use -awstatsprog option to solve this.

awstats_buildstaticpages.pl -awstatsprog=/var/www/html/awstats/awstats.pl -config=xxx -dir=/var/www/html/awstats/12

ExtraTrackedRowsLimit-500
Launch update process : "/var/www/awstats/awstats.pl" -config=MyWebsite -configdir=
Build alldomains page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=alldomains
Build allhosts page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=allhosts
Build lasthosts page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=lasthosts
Build unknownip page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=unknownip
Build allrobots page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=allrobots
Build lastrobots page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=lastrobots
Build session page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=session
Build urldetail page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=urldetail
Build urlentry page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=urlentry
Build urlexit page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=urlexit
Build osdetail page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=osdetail
Build unknownos page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=unknownos
Build browserdetail page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=browserdetail
Build unknownbrowser page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=unknownbrowser
Build downloads page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=downloads
Build refererse page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=refererse
Build refererpages page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=refererpages
Build keyphrases page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=keyphrases
Build keywords page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=keywords
Build errors404 page: "/var/www/awstats/awstats.pl" -config=MyWebsite -staticlinks -output=errors404
21 files built.
Main HTML page is 'awstats.MyWebsite.html'.

# Generate the report for a given month

awstats_buildstaticpages.pl -month=11 -awstatsprog=/var/www/html/awstats/awstats.pl -config= -dir=/var/www/html/awstats/11

 

# Script to build single report

/root/scripts/gen_awstats.sh

#!/bin/bash
# * This script only generates the report for the current month; to build another month, change _MONTH (passed as "-month=MM")

# Location of the awstats.pl script
_AWSTATS=/usr/share/awstats/wwwroot/cgi-bin/awstats.pl

# Matches /etc/awstats/awstats.X.X.conf; it is shown at the top of the page as "Statistics for: X.X"
_SITE=X.X

# Folder - /home/vhosts/$_SITE/public_html/awstats must contain the subfolders "classes" "css" "icon"
_OUT=/home/vhosts/$_SITE/public_html/awstats

# Current month as MM
_MONTH=$(date +%m)

# The config sets DirData="..."; create that directory if needed, e.g.
# mkdir /home/vhosts/$_SITE/public_html/awstats/DB



#### to update statistics ($DirData/awstatsMMYYYY*.txt)
$_AWSTATS -config=$_SITE -update


#### Gen Report ####

# Main Page
$_AWSTATS -config=$_SITE -month=$_MONTH -output -staticlinks > $_OUT/$_MONTH.html

# GeoIP
$_AWSTATS -config=$_SITE -output=alldomains -staticlinks > $_OUT/awstats.$_SITE.alldomains.html

# Hosts
$_AWSTATS -config=$_SITE -output=allhosts  -staticlinks  > $_OUT/awstats.$_SITE.allhosts.html
$_AWSTATS -config=$_SITE -output=lasthosts -staticlinks  > $_OUT/awstats.$_SITE.lasthosts.html
$_AWSTATS -config=$_SITE -output=unknownip -staticlinks  > $_OUT/awstats.$_SITE.unknownip.html

####    END    ####

# To create specific individual reports,

# specify the report name on the command line as follows:

... -output=alllogins -staticlinks > awstats.mysite.alllogins.html
... -output=lastlogins -staticlinks > awstats.mysite.lastlogins.html
... -output=allrobots -staticlinks > awstats.mysite.allrobots.html
... -output=lastrobots -staticlinks > awstats.mysite.lastrobots.html
... -output=urldetail -staticlinks > awstats.mysite.urldetail.html
... -output=urlentry -staticlinks > awstats.mysite.urlentry.html
... -output=urlexit -staticlinks > awstats.mysite.urlexit.html
... -output=browserdetail -staticlinks > awstats.mysite.browserdetail.html
... -output=osdetail -staticlinks > awstats.mysite.osdetail.html
... -output=unknownbrowser -staticlinks > awstats.mysite.unknownbrowser.html
... -output=unknownos -staticlinks > awstats.mysite.unknownos.html
... -output=refererse -staticlinks > awstats.mysite.refererse.html
... -output=refererpages -staticlinks > awstats.mysite.refererpages.html
... -output=keyphrases -staticlinks > awstats.mysite.keyphrases.html
... -output=keywords -staticlinks > awstats.mysite.keywords.html
... -output=errors404 -staticlinks > awstats.mysite.errors404.html
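Since every individual report uses the same command with only the report name changing, the list above can be driven by a loop. The following is a sketch under assumptions: build_reports and the AWSTATS variable are hypothetical names, and the default path is the one used by the script earlier on this page.

```shell
#!/bin/bash
# Path to awstats.pl; can be overridden from the environment (assumed variable name).
AWSTATS=${AWSTATS:-/usr/share/awstats/wwwroot/cgi-bin/awstats.pl}

# Hypothetical helper: build_reports SITE OUTDIR REPORT...
# Writes one static page per report name into OUTDIR.
build_reports() {
    local site=$1 outdir=$2 report
    shift 2
    for report in "$@"; do
        "$AWSTATS" -config="$site" -output="$report" -staticlinks \
            > "$outdir/awstats.$site.$report.html" || return 1
    done
}

# Usage (not executed here):
#   build_reports mysite /var/www/html/awstats alllogins lastlogins errors404
```

The file names follow the awstats.SITE.REPORT.html convention used in the list, which is what -staticlinks expects when pages link to each other.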

 


Update from a browser

 

AllowToUpdateStatsFromBrowser=1

Browser links:

# Monthly

http://YOUR_SITE_NAME/awstats/awstats.pl?config=YOUR_SITE_NAME&month=08&year=2013&databasebreak=day

# Daily

http://YOUR_SITE_NAME/awstats/awstats.pl?config=YOUR_SITE_NAME&month=08&year=2013&day=23&databasebreak=day
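The two links differ only in the optional day parameter, so they can be assembled by a small helper. This is a sketch that simply mirrors the two URLs above, including databasebreak=day in both; awstats_url is a made-up name and example.org a placeholder host.

```shell
#!/bin/bash
# Hypothetical helper: awstats_url HOST CONFIG YEAR MONTH [DAY]
# Prints a report URL shaped like the monthly and daily links above.
awstats_url() {
    local host=$1 config=$2 year=$3 month=$4 day=${5:-}
    local url="http://$host/awstats/awstats.pl?config=$config&month=$month&year=$year"
    if [ -n "$day" ]; then
        url="$url&day=$day"
    fi
    printf '%s\n' "$url&databasebreak=day"
}

awstats_url example.org mysite 2013 08      # monthly link
awstats_url example.org mysite 2013 08 23   # daily link
```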

 


Setting how many entries to show (Top X)

 

##### MaxNbOfX #####
# Stats by countries/domains
MaxNbOfDomain = 25

# Hosts (Top 10)
MaxNbOfHostsShown = 99

# Pages-URL (Top 10)
MaxNbOfPageShown = 99

# Stats for keywords
MaxNbOfKeywordsShown = 20
MaxNbOfKeyphrasesShown = 20

# Stats by referers
MaxNbOfRefererShown = 50

# Downloads (Top 10)
MaxNbOfDownloadsShown = 99

#########################################
MaxNbOfOsShown = 10
MaxNbOfBrowsersShown = 10
MaxNbOfRobotShown = 10
MaxNbOfScreenSizesShown = 5

 


"-update" Type

 

Dropped records

These are records discarded because they were not "user HTTP requests" or because they matched AWStats filters

(see the SkipHosts, SkipUserAgents, SkipFiles, OnlyHosts, OnlyUserAgents and OnlyFiles parameters).

To see which lines were dropped, add the -showdropped option on the command line.

Corrupted records

These are records that do not match the log format defined by the "LogFormat" parameter in the AWStats configuration file.

Any web server typically produces a few corrupted records (<5%) even when everything works correctly.

This can happen for several reasons:

1) web server internal bugs,

2) bad requests made by buggy browsers,

3) an unclean web server shutdown, such as unplugging the server.

To see which lines are corrupted, add the -showcorrupted option on the command line.
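To compare a run against the 5% rule of thumb, the share of corrupted records can be computed from the counts printed by -update. A minimal sketch using the figures from the sample output earlier on this page (803 corrupted out of 932605 parsed lines); corrupted_pct is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: corrupted_pct CORRUPTED PARSED
# Prints the percentage of corrupted records with two decimals.
corrupted_pct() {
    awk -v c="$1" -v p="$2" 'BEGIN {
        if (p == 0) {
            print "0.00"
            exit
        }
        printf "%.2f\n", 100 * c / p
    }'
}

corrupted_pct 803 932605   # prints 0.09
```

A result well under 5, as here, suggests that LogFormat matches the log; a much higher value usually points to a wrong LogFormat setting.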

Old records

These are simply records that were already processed by a previous update session.

Although it is not necessary to purge your log file after each update process,

it is highly recommended that you do so as often as possible.

New records

These are records in your log file that were successfully used to build/update the statistics database.

 


DOC