regina - Analytics for Nginx Websites

Ruling Empress Generating In-depth Nginx Analytics

regina is a Python program that visualizes data from the nginx access.log. It parses the log and stores the important data in an SQLite database. It can then create an analytics HTML page with many useful plots and numbers.

With regina, you can collect usage data about your website over a long period of time, without any javascript or third parties.


Visualization options

regina can generate the following things:

plots:

  • unique visitor count / timespan
  • unique request count / timespan
  • referrer ranking (from which site people visit)
  • file ranking (accessed locations)
  • browser ranking
  • platform ranking (operating systems)
  • city ranking
  • country ranking

numbers:

  • mobile visitor percentage

All of these plots and numbers can be generated both for the last x days (you choose x) and for all time. The plots are saved as svg by default, but you can change the file type.

Plots generated by regina

Getting started

Dependencies

  • nginx: You need an nginx web server that writes the access log in the combined format, which is the default.
  • Python 3.10
  • Python/matplotlib
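
For reference, the combined format puts the client address, user, timestamp, request line, status, response size, referrer and user agent on each line. The following is a minimal sketch of how such a line can be parsed in Python. It is not regina's actual parser; the pattern name COMBINED, the function parse_line and the sample log line are made up for illustration.

```python
import re
from datetime import datetime

# nginx "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["body_bytes_sent"] = int(entry["body_bytes_sent"])
    # Example timestamp: 10/Oct/2023:13:55:36 +0000
    entry["time_local"] = datetime.strptime(
        entry["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return entry


# Invented sample line using an address reserved for documentation
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "https://example.com/" "Mozilla/5.0"'
)
entry = parse_line(sample)
print(entry["remote_addr"], entry["status"], entry["http_referer"])
```

If your nginx uses a custom log_format, a pattern like this will not match, which is why the default combined format is required.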

Installation

You can install regina with python-pip:

git clone https://github.com/MatthiasQuintern/regina.git
cd regina
python3 -m pip install .

You can also install it system-wide using sudo python3 -m pip install .

If you also want to install the man-page and the zsh completion script:

sudo cp regina.1.man /usr/share/man/man1/regina.1
sudo gzip /usr/share/man/man1/regina.1
sudo cp _regina.compdef.zsh /usr/share/zsh/site-functions/_regina
sudo chmod +x /usr/share/zsh/site-functions/_regina

Configuration

Create a regina directory, for example ~/.regina. It can be anywhere and have any name. Copy the default configuration and template from the git directory to your regina directory:

cp default.conf ~/.regina/regina.conf
cp template.html ~/.regina/template.html

Now edit the configuration to fit your needs. The variables you will definitely have to change are db, access_log, img_dir, img_location, template_html and html_out_path. The default configuration should be documented well enough for you to know what to do. I recommend only using absolute paths starting at /.

Now you can run regina. She will collect data from the nginx log specified as access_log in the configuration, create and fill the database db, create images and statistics, replace all variables in template_html and write the result to html_out_path. If html_out_path is served by your web server, you should now be able to access the generated site.

/usr/local/bin/regina --config /home/user/.regina/regina.conf --collect --visualize

Automation

You will probably run regina once per day, after nginx has filled the daily access log. The easiest way to do that is with a cronjob. Run crontab -e and enter:

10 0 * * * /usr/local/bin/regina --config /home/user/.regina/regina.conf --collect --visualize

This will run your regina command every day, ten minutes after midnight. After each day, a .x is appended to the name of the previous day's log, where x is the number of days since the log was recorded. So you will probably want to append .1 to your access_log.

Logfile permissions

By default, nginx logs are -rw-r----- root root, so you cannot read them as a regular user. You could either run regina as root, which I strongly do not recommend, or create a root cronjob that changes the ownership of the log after midnight. Run sudo crontab -e and enter:

9 0 * * * chown your-username: /var/log/nginx/nginx-access.log.1

This will make you the owner of the log 9 minutes after midnight, just before regina needs read access.
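
To confirm that the ownership change worked, you can test whether the log is readable by your user before regina runs. The following is a minimal sketch of such a check. The function check_log_readable is hypothetical and not part of regina, and the path is the one used in the cronjob above.

```python
import os


def check_log_readable(path):
    """Return None if the file can be read, otherwise a short error message."""
    if not os.path.isfile(path):
        return f"{path} does not exist or is not a regular file"
    # os.access checks the permissions of the user running this script
    if not os.access(path, os.R_OK):
        return f"{path} is not readable by the current user"
    return None


problem = check_log_readable("/var/log/nginx/nginx-access.log.1")
print(problem or "log is readable")
```

Run it as the same user that owns the regina cronjob, since the result depends on who executes the check.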

GeoIP

regina can show you which country or city a visitor is from, but you will need an ip2location database. You can get such a database for free at ip2location.com. You will need to create an account and can then download several different databases in different formats.
For regina, download the IP-COUNTRY-REGION-CITY database as csv. By default, regina only tells you which country a user is from. To see the individual cities of a country, append its two-letter country code to the get_cities_for_contries option in the data-collection section of the config file. After that, load the GeoIP data into your database:

regina --config regina.conf --update-geoip path-to-csv

Depending on how many countries you specified, this might take a long time. You can delete the csv afterwards.
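
To give an idea of what the csv contains: each row maps a range of IP addresses, stored as integers, to a location. The following is a rough sketch of a lookup over such ranges. It assumes the column order ip_from, ip_to, country_code, country_name, region_name, city_name, and it is not how regina does it, since regina loads the data into its database. The functions load_ranges and lookup and the sample row are invented for illustration.

```python
import bisect
import csv
import io
import ipaddress


def load_ranges(csv_file):
    """Read (ip_from, ip_to, country_code, city) tuples, sorted by ip_from."""
    ranges = []
    for row in csv.reader(csv_file):
        # Assumed column order:
        # ip_from, ip_to, country_code, country_name, region_name, city_name
        ip_from, ip_to, country_code, _country, _region, city = row[:6]
        ranges.append((int(ip_from), int(ip_to), country_code, city))
    ranges.sort()
    return ranges


def lookup(ranges, starts, ip):
    """Return (country_code, city) for an IPv4 address, or None if unknown."""
    ip_int = int(ipaddress.IPv4Address(ip))
    # Find the last range whose start is <= ip_int
    i = bisect.bisect_right(starts, ip_int) - 1
    if i >= 0 and ranges[i][0] <= ip_int <= ranges[i][1]:
        return ranges[i][2], ranges[i][3]
    return None


# Invented sample row, using an address block reserved for documentation
first = int(ipaddress.IPv4Address("203.0.113.0"))
last = int(ipaddress.IPv4Address("203.0.113.255"))
sample = io.StringIO(f'"{first}","{last}","DE","Germany","Berlin","Berlin"\n')

ranges = load_ranges(sample)
starts = [r[0] for r in ranges]  # precomputed once for binary search
print(lookup(ranges, starts, "203.0.113.7"))
print(lookup(ranges, starts, "198.51.100.1"))
```

City-level data splits each country into many more ranges than country-level data, which is one reason the import takes longer when more countries are listed in the option.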


License

GNU General Public License 3