The internet is a very different place from the one I experienced growing up. Back then there were no ad networks, no browser fingerprinting, no drive-by exploits, no obfuscated code running cryptomining software while you read, and none of the other shady money-driven tactics you see today. It was a simpler time. Now, without ad/tracker-blocking browser extensions, the web is almost unusable.
So what can an individual still operating their own website do today? Well, we can stop using third-party analytics solutions. There is a reason those services are free for you to use: you are paying with your visitors' privacy and the eventual erosion of their trust in you. And since you are including third-party code on the client, you are risking your visitors' security as well.
Let's get to it. I've been using GoAccess as a replacement for web traffic analytics:
GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
GoAccess analyzes web server log files directly to generate analytics reports. It is compatible with Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, and others. It's open-source, so if you're using some obscure web server with unusual log formats, you can contribute to the project via their GitHub.
Simple Web Server Configuration
You will need to install GoAccess wherever your web server's log files are stored. For the sake of simplicity, we will assume that's on your web server. Here's how to build and install GoAccess from source:
wget https://tar.goaccess.io/goaccess-1.4.tar.gz
tar -xzvf goaccess-1.4.tar.gz
cd goaccess-1.4/
./configure --enable-utf8 --enable-geoip=legacy
make
make install
For more information see the download page on the project's website.
If you're running your web server on a Linux machine, its log files are likely generated and pruned by a utility named "logrotate". The default max age for log files is quite short (14 days), so you will want to increase this to something like 6 months or a year. Configuration files for logrotate are located in /etc/logrotate.d. Here is an example for nginx (/etc/logrotate.d/nginx):
/var/log/nginx/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 640 nginx adm
    sharedscripts
    postrotate
        if [ -f /var/run/nginx.pid ]; then
            # Signal nginx to reopen its log files after rotation
            kill -USR1 "$(cat /var/run/nginx.pid)"
        fi
    endscript
}
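The line that you will want to change is rotate 14. This is the number of rotated log files to keep; because rotation happens daily here, it is effectively the maximum age (in days) for log files. Files older than this setting will be pruned/deleted. Setting it to rotate 365, for example, keeps roughly a year of logs.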
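If you would rather script that change than edit the file by hand, something like the following could work. This is only a sketch: set_rotate_count is a hypothetical helper name, and it assumes the config contains a single "rotate N" line like the example above.

```shell
# Hypothetical helper: set the "rotate" count in a logrotate config file.
# Usage: set_rotate_count <config-file> <count>
set_rotate_count() {
    conf="$1"
    count="$2"
    tmp="$(mktemp)" || return 1
    # Rewrite only lines of the form "rotate <number>" (leading whitespace
    # is preserved); "postrotate" does not match this pattern.
    sed -E "s/^([[:space:]]*)rotate[[:space:]]+[0-9]+/\1rotate ${count}/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"   # overwrite in place so ownership and mode are kept
    rm -f "$tmp"
}

# Example, run as root on the server:
#   set_rotate_count /etc/logrotate.d/nginx 365
#   logrotate --debug /etc/logrotate.d/nginx   # dry run, changes nothing
```

The logrotate --debug run at the end is a quick way to check that the edited file still parses before the next scheduled rotation.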
To generate a report from your web server's current log files:
goaccess \
-f /var/log/nginx/access.log* \
--log-format=COMBINED \
--ignore-crawlers \
--output html
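One thing to watch for: the logrotate example above uses compress, so older rotated logs end up as .gz files. A common approach, and the one I believe the GoAccess man page suggests, is to decompress them to standard output and pipe the result into goaccess. Here is a rough sketch of that idea. The function names and the report.html output file are my own assumptions, not anything GoAccess requires.

```shell
# Print every given log file to stdout, whether plain or gzip-compressed.
# "gzip -cdf" decompresses .gz input and passes other files through unchanged.
cat_logs() {
    gzip -cdf "$@"
}

# Hypothetical wrapper: build one HTML report from current and rotated logs.
# The trailing "-" tells goaccess to read the log data from stdin.
build_report() {
    cat_logs /var/log/nginx/access.log* |
        goaccess --log-format=COMBINED --ignore-crawlers -o report.html -
}

# Example, run on the server:
#   build_report
```

Keeping the decompression step in its own function makes it easy to check what is being fed to goaccess, for example with cat_logs /var/log/nginx/access.log* | head.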
Using Docker
If you're using Docker to run your web server, have a look at "Running GoAccess in Docker", a detailed guide to running GoAccess in its own container. One thing that guide does not mention is that you will need a mounted volume to store your web server's logs, so that they are accessible to GoAccess.