Central logfiles¶
Logfiles from all servers are streamed to the central log server using the systemd services systemd-journal-remote.service on the server side and systemd-journal-upload.service on the client side.
After trying a number of central logfile solutions, this turned out to be the most suckless option.
You are welcome to configure a graphical / WebUI log management system on the log server - Blunix will be happy to help unless it includes elasticsearch.
How logs are streamed to the central log server¶
All logs are streamed in real time over the wireguard mesh to the central log server cus-util-prod-log-1 on port 19532 tcp. The transmission protocol is http.
The option to enable ssl encryption (https) for the connection is not used, as the systemd service does not run as root and hence has no access to the default letsencrypt certificate (/etc/letsencrypt/live/{{ inventory_hostname }}.{{ internal_private_domain }}) installed on all servers. The connection is however already encrypted, as it runs over the wireguard mesh.
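To check on a client that its logs are actually being shipped, a small helper along these lines may be useful. It is only a sketch: the function name check_log_upload is made up for this example, and it assumes the unit name mentioned above.

```shell
# Sketch: report whether the upload service on this client is running.
# check_log_upload is a hypothetical helper name, not part of systemd.
check_log_upload() {
    unit=systemd-journal-upload.service
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit is NOT active, recent messages:"
        # Show the last lines the service logged to the local journal.
        journalctl -u "$unit" -n 20 --no-pager
        return 1
    fi
}
```

Run check_log_upload as root on any client. A non-zero exit status means the service is not running and logs from that host are not reaching the log server.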
Viewing and live-streaming logs¶
Log in to cus-util-prod-log-1 via ssh, for example with cake: cake ssh log
Logfiles of all hosts can be viewed and streamed live using the journalctl cli tool.
All logfiles are saved in the file /var/log/journal/remote/all.journal.
Man page of journalctl¶
man journalctl
Streaming all logs from all servers live as they come in¶
journalctl --file /var/log/journal/remote/all.journal -f
Filtering logfiles¶
Each log line is split up into components called "fields".
List available output fields¶
journalctl --fields
Description of output fields¶
man systemd.journal-fields
View examples of output fields¶
journalctl --file /var/log/journal/remote/all.journal -n 10 -o verbose
View only specific fields¶
journalctl --file /var/log/journal/remote/all.journal -o cat --output-fields='_HOSTNAME,MESSAGE'
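Restricting the output to a single field makes it easy to build quick statistics, for example which hosts logged the most in the last hour. The following is a sketch: count_values is a made-up helper name, and it assumes a journalctl version in which -o cat honours --output-fields, printing only the value of the selected field.

```shell
# Sketch: count identical lines on stdin, most frequent first.
# count_values is a hypothetical helper name.
count_values() {
    sort | uniq -c | sort -rn
}

# Intended usage on the log server (not executed here):
#   journalctl --file /var/log/journal/remote/all.journal -q --no-pager \
#       --since=-1h -o cat --output-fields=_HOSTNAME | count_values
```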
Grep the nginx logs of the production webservers for specific strings¶
journalctl --file /var/log/journal/remote/all.journal -f _HOSTNAME=cus-www-prod-web-1 _HOSTNAME=cus-www-prod-web-2 _COMM=nginx --grep='404'
Matching specific fields¶
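Fields can be given as arguments to journalctl in the form FIELD=value. Matches on different fields are combined with a logical AND, while several matches on the same field are combined with a logical OR.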
View logs of a specific host¶
journalctl --file /var/log/journal/remote/all.journal -f _HOSTNAME=cus-util-prod-git-1
View logs of multiple hosts¶
journalctl --file /var/log/journal/remote/all.journal -f _HOSTNAME=cus-www-prod-web-1 _HOSTNAME=cus-www-prod-web-2
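When the host list gets longer, typing one _HOSTNAME= match per host becomes tedious. A minimal sketch of a helper that builds the matches from plain host names follows. The name follow_hosts and the JOURNAL_FILE variable are assumptions made for this example.

```shell
# Sketch: follow the logs of all hosts given as arguments.
# follow_hosts and JOURNAL_FILE are hypothetical names.
follow_hosts() {
    # Replace every positional parameter by a _HOSTNAME= match.
    for host in "$@"; do
        set -- "$@" "_HOSTNAME=$host"
        shift
    done
    journalctl --file "${JOURNAL_FILE:-/var/log/journal/remote/all.journal}" -f "$@"
}
```

Calling follow_hosts cus-www-prod-web-1 cus-www-prod-web-2 runs the same command as the example above. The matches all refer to the same field, so they select entries from any of the listed hosts.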
Filter by process (command) name¶
journalctl --file /var/log/journal/remote/all.journal -f _COMM=sshd
journalctl --file /var/log/journal/remote/all.journal -f _COMM=nginx
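_COMM matches the name of the process that wrote the entry. To filter by systemd unit instead, match on the _SYSTEMD_UNIT field. The exact unit names depend on the hosts, so it helps to list the values a field actually takes first. A sketch, with field_values and JOURNAL_FILE as made-up names:

```shell
# Sketch: print every distinct value a journal field takes.
# field_values and JOURNAL_FILE are hypothetical names.
field_values() {
    journalctl --file "${JOURNAL_FILE:-/var/log/journal/remote/all.journal}" \
        -q --no-pager -F "$1"
}

# Intended usage on the log server (not executed here):
#   field_values _SYSTEMD_UNIT
#   journalctl --file /var/log/journal/remote/all.journal -f _SYSTEMD_UNIT=nginx.service
```

The unit name nginx.service in the usage comment is an assumption. Use whatever field_values prints for your hosts.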
Viewing logs from a specific time period¶
systemd-journal-remote has a feature to limit the size that /var/log/journal/remote/all.journal may reach before it is "rotated". However, this feature is not available in Debian 11 at the time of this writing.
This is documented here: git.blunix.com/ansible-roles/role-systemd-journal-remote
Because of this, it is rather annoying to find a journal file from a specific timeframe. Additionally, the naming scheme for the rotated files is not human readable (and I did not find documentation as to how the files are named).
Hence, we recommend looking at the filesystem timestamps to select the right file:
root@cus-util-prod-log-1 ~ # ls -tlah /var/log/journal/remote/
total 6,6G
-rw-r----- 1 systemd-journal-remote systemd-journal-remote 56M 22. Feb 18:13 all.journal
-rw-r----- 1 systemd-journal-remote systemd-journal-remote 88M 20. Feb 10:53 all@c5fd66026f8b4134a83d07b8174ee8ce-00000000005ec084-0005f52e9513970a.journal
[...]
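The filesystem timestamp only shows when a file was last written. To see the time span a file actually covers, the first and last entry can be read from the file itself. A sketch, with journal_time_range as a made-up helper name:

```shell
# Sketch: print "first-entry  last-entry  path" for one journal file.
# journal_time_range is a hypothetical helper name.
journal_time_range() {
    file=$1
    # short-iso puts the timestamp into the first space-separated column;
    # -q suppresses informational header lines.
    first=$(journalctl --file "$file" -q --no-pager -o short-iso | head -n 1 | cut -d' ' -f1)
    last=$(journalctl --file "$file" -q --no-pager -o short-iso -n 1 | cut -d' ' -f1)
    printf '%s  %s  %s\n' "$first" "$last" "$file"
}

# Intended usage on the log server (not executed here):
#   for f in /var/log/journal/remote/*.journal; do journal_time_range "$f"; done
```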
When you have found the right file, you can select a time range like so (timestamps use the format YYYY-MM-DD HH:MM:SS):
journalctl --file /var/log/journal/remote/all@c5fd66026f8b4134a83d07b8174ee8ce-00000000005ec084-0005f52e9513970a.journal --since="2023-02-20 08:00:00" --until="2023-02-20 08:30:00"
You can also use relative time periods (like "since one hour ago"):
journalctl --file /var/log/journal/remote/all.journal --since=-1h
journalctl --file /var/log/journal/remote/all.journal --since=today
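As an alternative to picking a single file, journalctl can read all journal files in a directory with --directory and select the matching entries itself. This sketch has not been measured against the several gigabytes of rotated files on the log server, so it may be slow. The names journal_window and JOURNAL_DIR are made up for this example.

```shell
# Sketch: query every journal file in the directory for a time window.
# journal_window and JOURNAL_DIR are hypothetical names.
journal_window() {
    start=$1
    end=$2
    shift 2
    # Remaining arguments are passed on, e.g. field matches or --grep.
    journalctl --directory "${JOURNAL_DIR:-/var/log/journal/remote}" --no-pager \
        --since "$start" --until "$end" "$@"
}

# Intended usage on the log server (not executed here):
#   journal_window "2023-02-20 08:00:00" "2023-02-20 08:30:00" _COMM=nginx
```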
Viewing logfiles in production¶
journalctl is rather complex - we recommend that you:
- read the manual page carefully (man journalctl)
- create bash scripts within cus-util-prod-log-1:/root/log-scripts/ for longer journalctl commands that are often used
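Such a script could look like the following sketch. The file name web-404.sh, the function names jall and web_404, and the JOURNAL_FILE variable are assumptions made for this example. The matches are the ones from the nginx grep example above.

```shell
#!/bin/sh
# Sketch of a hypothetical /root/log-scripts/web-404.sh.
# Set JOURNAL_FILE to point the helpers at a rotated journal file instead.
JOURNAL_FILE=${JOURNAL_FILE:-/var/log/journal/remote/all.journal}

# journalctl against the central journal; extra arguments are passed through.
jall() {
    journalctl --file "$JOURNAL_FILE" "$@"
}

# 404s in the nginx logs of the production webservers.
web_404() {
    jall _HOSTNAME=cus-www-prod-web-1 _HOSTNAME=cus-www-prod-web-2 \
        _COMM=nginx --grep='404' "$@"
}
```

After sourcing the file, web_404 -f follows new matches live and web_404 --since=-1h shows the last hour.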
Alerts¶
There is currently no proper solution within systemd-journal-remote to generate alerts / emails / similar on specific log strings.
You can however set up a WebUI central log solution for that. Most of them accept input from journalctl.
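As a stopgap until such a solution is in place, a followed journal can be piped into a small filter that runs a command for every matching line. This is a rough sketch, not a replacement for real alerting: it has no rate limiting and does not survive restarts. The name alert_on is made up for this example.

```shell
# Sketch: read log lines from stdin and run a handler command for every
# line containing a fixed string. alert_on is a hypothetical helper name.
alert_on() {
    needle=$1
    shift
    while IFS= read -r line; do
        case $line in
            # The handler gets the line as its last argument. Its stdin is
            # detached so it cannot swallow the remaining log lines.
            *"$needle"*) "$@" "$line" </dev/null ;;
        esac
    done
}

# Intended usage on the log server (not executed here):
#   journalctl --file /var/log/journal/remote/all.journal -f -q _COMM=sshd \
#       | alert_on 'Failed password' logger -t log-alert
```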
Problems¶
Known problems are documented and collected at git.blunix.com/ansible-roles/role-systemd-journal-remote
Please notify us if you find any additional problems.
Reasons we implemented it regardless of the problems¶
- no elasticsearch
- reliable
- streams old logs if any component (server or client) is down temporarily
- logs are signed on the server to detect tampering
- journalctl search and filtering is quite powerful