regina - Analytics for Nginx Websites
Ruling Empress Generating In-depth Nginx Analytics
regina is a Python program that visualizes data from the nginx access.log.
It parses the log and stores the important data in an SQLite database.
It can then create an analytics HTML page with useful plots and numbers.
With regina, you can collect usage data about your website over a long period of time, without any JavaScript or third parties.
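To illustrate the idea of keeping parsed requests in SQLite and deriving numbers from them, here is a minimal sketch. The table name `request`, its columns, and the sample rows are assumptions made up for this example; they are not regina's actual schema.

```python
import sqlite3

# Hypothetical, already-parsed requests: (ip, time, path, status)
SAMPLE_REQUESTS = [
    ("203.0.113.5", "2023-05-01T10:00:00", "/index.html", 200),
    ("203.0.113.5", "2023-05-01T10:00:05", "/about.html", 200),
    ("198.51.100.7", "2023-05-01T11:30:00", "/index.html", 200),
    ("198.51.100.7", "2023-05-01T11:31:00", "/missing", 404),
]


def store_requests(conn, requests):
    """Create the (assumed) table if needed and insert the given rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS request ("
        "ip TEXT, time TEXT, path TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO request VALUES (?, ?, ?, ?)", requests)
    conn.commit()


def hits_per_path(conn):
    """Return (path, hits) pairs for successful requests, most popular first."""
    cursor = conn.execute(
        "SELECT path, COUNT(*) AS hits FROM request "
        "WHERE status = 200 GROUP BY path ORDER BY hits DESC, path"
    )
    return cursor.fetchall()


def unique_visitors(conn):
    """Count distinct client IP addresses."""
    return conn.execute("SELECT COUNT(DISTINCT ip) FROM request").fetchone()[0]


conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
store_requests(conn, SAMPLE_REQUESTS)
print(hits_per_path(conn))
print(unique_visitors(conn))
```

Because the data lives in a database rather than in the rotating log files, statistics like these can span a much longer period than the logs themselves.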
Getting started
Dependencies
- nginx: You need an nginx webserver that writes the access log in the combined format, which is the default
- Python 3.10
- Python/matplotlib
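For reference, a line in nginx's combined format consists of the client address, the remote user, the local time, the request line, the status code, the number of body bytes sent, the referer, and the user agent. The following is a simplified sketch of how such a line could be parsed; the regular expression and the function name `parse_combined` are illustrative assumptions and not regina's own parser, and the sample line is invented.

```python
import re
from datetime import datetime

# Simplified pattern for the combined format; it does not handle
# escaped quotes inside the request, referer, or user agent fields.
COMBINED_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["body_bytes_sent"] = int(entry["body_bytes_sent"])
    # %b expects English month abbreviations (true for the default C locale).
    entry["time_local"] = datetime.strptime(
        entry["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return entry


# Invented sample line using documentation-only addresses.
SAMPLE_LINE = (
    '203.0.113.5 - - [01/May/2023:10:00:00 +0200] '
    '"GET /index.html HTTP/1.1" 200 1024 '
    '"https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_combined(SAMPLE_LINE)
print(entry["remote_addr"], entry["request"], entry["status"])
```

If your nginx uses a custom log_format, a pattern like this would no longer match, which is why the combined format is listed as a requirement.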
Installation
You can install regina with python-pip:
cd regina
python3 -m pip install .
You can also install it system-wide using sudo python3 -m pip install .
If you also want to install the man-page and the zsh completion script:
sudo gzip /usr/share/man/man1/regina.1
sudo cp _regina.compdef.zsh /usr/share/zsh/site-functions/_regina
sudo chmod +x /usr/share/zsh/site-functions/_regina
Configuration
Create a regina directory, for example ~/.regina. It can be anywhere and have any name.
Copy the default configuration and template from the git directory to your regina directory:
cp template.html ~/.regina/template.html
Now edit the configuration to fit your needs. The variables you will definitely have to change are db, access_log, img_dir, img_location, template_html, and html_out_path.
The default configuration should be documented well enough for you to know what to do. I recommend only using absolute paths starting at /.
Now you can run regina. She will collect data from the nginx log specified as access_log in the configuration, create and fill the database db, create images and statistics, replace all variables in template_html, and output the result to html_out_path. If html_out_path is served by your webserver, you should now be able to access the generated site.
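The last step, replacing variables in a template, can be pictured with the following sketch. It uses the `$name` placeholder syntax of Python's `string.Template`, which is an assumption chosen for illustration; regina's template has its own variable names and syntax, so check the shipped template.html for those. The variables `server_name` and `visitor_count` and the file name `ranking.svg` are hypothetical.

```python
from string import Template

# Hypothetical template text; not taken from regina's template.html.
TEMPLATE_HTML = """<html><body>
<h1>Analytics for $server_name</h1>
<p>Visitors: $visitor_count</p>
<img src="$img_location/ranking.svg">
</body></html>"""


def render(template_text, values):
    """Fill in known placeholders and leave unknown ones untouched."""
    # safe_substitute does not raise on missing keys, so a half-filled
    # template is still written out instead of aborting the run.
    return Template(template_text).safe_substitute(values)


page = render(
    TEMPLATE_HTML,
    {"server_name": "example.com", "visitor_count": 42, "img_location": "/images"},
)
print(page)
```

Writing `page` to the file named by html_out_path would then correspond to the final output step described above.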
Automation
You will probably run regina once per day, after nginx has filled the daily access log. The easiest way to do that is using a cronjob.
Run crontab -e and enter:
This will run your regina command every day, ten minutes after midnight. After each day, the log of the previous day is rotated (on most systems by logrotate rather than by nginx itself) and gets a .x suffix, where x is the number of days since the log was recorded.
So you will probably want to append .1 to your access_log.
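As a rough illustration of the two conventions just described, the sketch below builds a daily crontab entry from a minute and an hour, and derives the name of a rotated log. Both helper functions are hypothetical, the log path is only a common default, and "<your regina command>" is a placeholder rather than a real invocation.

```python
def cron_line(minute, hour, command):
    """Build a crontab entry that runs `command` once a day at hour:minute."""
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute must be 0-59 and hour 0-23")
    # The three '*' fields mean: every day of month, every month, every weekday.
    return f"{minute} {hour} * * * {command}"


def rotated_log(access_log, days_ago=1):
    """Return the name of the log rotated `days_ago` days ago."""
    if days_ago < 1:
        raise ValueError("days_ago must be at least 1")
    return f"{access_log}.{days_ago}"


# Ten minutes after midnight, with a placeholder instead of a real command.
print(cron_line(10, 0, "<your regina command>"))
# Yesterday's log, assuming the common default location of the nginx log.
print(rotated_log("/var/log/nginx/access.log"))
```

Note that, depending on the rotation settings, logs older than one day may be compressed, so only the .1 file is guaranteed to be plain text in a typical setup.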
Logfile permissions
By default, nginx logs are -rw-r----- root root, so you cannot access them as a regular user. You could either run regina as root, which I strongly advise against, or create a root cronjob that changes ownership of the log after midnight.
Run sudo crontab -e and enter:
This will make you the owner of the log nine minutes after midnight, just before regina needs read access.
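To check whether the permission fix worked, you can inspect a file's mode and test read access from Python. This is a small sketch using only the standard library; the function name `describe_access` is made up for this example, and the demonstration runs on a temporary file rather than on a real nginx log.

```python
import os
import stat
import tempfile


def describe_access(path):
    """Return (mode_string, readable) for `path` as seen by the current user."""
    mode_string = stat.filemode(os.stat(path).st_mode)
    readable = os.access(path, os.R_OK)
    return mode_string, readable


# Demonstrate on a temporary file that mimics the mode mentioned above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o640)  # owner: read/write, group: read, others: nothing
mode_string, readable = describe_access(demo_path)
print(mode_string, readable)
os.remove(demo_path)
```

Calling `describe_access` on the path configured as access_log, as the user that runs regina, shows whether the ownership change took effect.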
GeoIP
regina can show you which country or city a visitor is from, but you will need an ip2location database.
You can acquire such a database for free at ip2location.com. You will need to create an account, after which you can download several different databases in different formats.
For regina, download the IP-COUNTRY-REGION-CITY database as csv. By default, regina only tells you which country a user is from. To see the individual cities of a country, append its two-letter country code to the get_cities_for_contries option in the data-collection section of the config file.
After that, load the GeoIP data into your database:
Depending on how many countries you specified, this might take a long time. You can delete the csv afterwards.
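To give an idea of what such a database contains, the sketch below looks up an IPv4 address in range-based rows. It assumes the column order ip_from, ip_to, country_code, country_name, region_name, city_name, with addresses stored as integers; verify this against the file you downloaded. The rows, country codes, and place names are invented, and the functions are illustrative rather than regina's own import code.

```python
import bisect
import csv
import io
import ipaddress

# Made-up sample rows in the assumed column order:
# ip_from, ip_to, country_code, country_name, region_name, city_name
SAMPLE_CSV = '''"3221225984","3221226239","XA","Exampleland","North Region","Example City"
"3325256704","3325256959","XB","Samplestan","South Region","Sample Town"
'''


def load_ranges(csv_file):
    """Read rows into a list sorted by the start of each IP range."""
    ranges = []
    for ip_from, ip_to, code, country, region, city in csv.reader(csv_file):
        ranges.append((int(ip_from), int(ip_to), code, country, region, city))
    ranges.sort()
    return ranges


def lookup(ranges, ip):
    """Return (country_code, country, region, city) for an IPv4 address, or None."""
    number = int(ipaddress.IPv4Address(ip))
    # Rebuilt on every call for brevity; cache this list for large files.
    starts = [r[0] for r in ranges]
    index = bisect.bisect_right(starts, number) - 1
    if index < 0:
        return None
    ip_from, ip_to, *location = ranges[index]
    return tuple(location) if ip_from <= number <= ip_to else None


ranges = load_ranges(io.StringIO(SAMPLE_CSV))
print(lookup(ranges, "192.0.2.17"))
print(lookup(ranges, "198.51.100.200"))
print(lookup(ranges, "10.0.0.1"))
```

The full csv covers the whole address space with many rows per country, which is one reason why importing city-level data for several countries can take a while.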