Zabbix Database Error: No such file or directory After unattended-upgrade

After unattended-upgrade automatically ran on one of my Zabbix servers, the Zabbix frontend started showing a confusing database error.

When I opened the Zabbix web UI, the page only showed:

Database error
No such file or directory

At first glance, this looked like a missing database file or a broken database configuration.

However, the database files were not missing, and MariaDB itself was not corrupted.

The actual problem was caused by an automatic service restart transaction triggered by unattended-upgrade and needrestart.

MariaDB was stopped as part of that transaction. It was supposed to start again later in the same transaction, but the transaction became blocked because zabbix-server.service did not finish stopping.

In other words, MariaDB did not simply fail to start by itself. Its start job was waiting behind a stuck Zabbix server stop job.

During that time, the local MySQL/MariaDB socket did not exist, so the Zabbix frontend could not connect to the database and displayed:

Database error
No such file or directory

Environment

The server was running:

Ubuntu 26.04
Zabbix Server 7.4
MariaDB 11.8
NGINX
PHP-FPM

The Zabbix frontend was served through NGINX and PHP-FPM.

Zabbix server used a local MariaDB database.

The symptom

The frontend error was:

Database error
No such file or directory

This error can be misleading.

In this case, “No such file or directory” did not mean that the Zabbix database files were gone.

It meant the frontend could not connect to the local MariaDB socket:

/var/run/mysqld/mysqld.sock

or:

/run/mysqld/mysqld.sock

That socket only exists when MariaDB is running.

Because MariaDB had been stopped during the restart transaction, the socket was missing, and the Zabbix frontend could not connect to the database.
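A quick way to confirm this is to test for the socket directly. The following is a minimal sketch: the helper name socket_state is hypothetical, the two paths are the ones listed above, and [ -S ] is the standard shell test for a Unix socket. The "(2)" that appears in the Zabbix server log is errno 2 (ENOENT), which is the same "No such file or directory" condition.

```shell
# Hypothetical helper: report whether a path is an existing Unix socket.
socket_state() {
    if [ -S "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Check both socket locations mentioned above.
for sock in /var/run/mysqld/mysqld.sock /run/mysqld/mysqld.sock; do
    echo "$sock: $(socket_state "$sock")"
done
```

If both lines report "missing" while the database files are still on disk, the frontend error points at a stopped MariaDB rather than at lost data.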

First check: service status

I first checked the service status:

systemctl status mariadb zabbix-server nginx php8.5-fpm
systemctl list-jobs

The important part was that MariaDB was inactive:

mariadb.service: inactive (dead)

At the same time, Zabbix server was not fully stopped. It was stuck in the stopping state:

zabbix-server.service: deactivating (stop-sigterm)

This was the key clue.

MariaDB was not down because it had lost data or because its database files were damaged.

MariaDB had already been stopped by the restart transaction. It was supposed to start again, but the transaction could not continue because zabbix-server.service was still stuck in the stop phase.
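The combination of states described above can be checked in one step. This is only a sketch under assumptions: classify_incident is a hypothetical helper name, and the live values are read with systemctl show -p ... --value, which prints just the property value.

```shell
# Hypothetical helper: classify the three observed states.
# Arguments: MariaDB ActiveState, zabbix-server ActiveState, zabbix-server SubState.
classify_incident() {
    if [ "$1" = "inactive" ] && [ "$2" = "deactivating" ] && [ "$3" = "stop-sigterm" ]; then
        echo "blocked: zabbix-server is stuck stopping while MariaDB is down"
    elif [ "$1" = "active" ] && [ "$2" = "active" ]; then
        echo "healthy"
    else
        echo "other: inspect systemctl status and list-jobs"
    fi
}

# Read the live values when systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
    db_state=$(systemctl show -p ActiveState --value mariadb 2>/dev/null || true)
    zbx_state=$(systemctl show -p ActiveState --value zabbix-server 2>/dev/null || true)
    zbx_sub=$(systemctl show -p SubState --value zabbix-server 2>/dev/null || true)
    classify_incident "$db_state" "$zbx_state" "$zbx_sub"
fi
```

The "blocked" branch matches only the exact pattern from this incident. Anything else falls through to a manual check.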

The actual restart sequence

The automatic upgrade did not simply restart Zabbix server alone.

The chain looked like this:

unattended-upgrade
  ↓
needrestart
  ↓
bulk restart request for multiple services

needrestart requested a bulk restart of multiple services, including MariaDB, NGINX, PHP-FPM, and Zabbix server.

The important part was not the textual order of the service names. The important part was the effective transaction state observed on the server:

1. MariaDB was stopped successfully.
2. systemd attempted to stop zabbix-server.
3. zabbix-server did not finish stopping.
4. MariaDB's start job was waiting for the transaction to continue.
5. zabbix-server could not start again because its stop phase had not completed.

So the important point is:

MariaDB was supposed to start again, but its start job was waiting for zabbix-server.service to finish stopping.

The incident was blocked at this step:

stop zabbix-server

Because Zabbix server did not finish stopping, systemd could not proceed to the later start jobs.

That left MariaDB inactive, the database socket missing, and the Zabbix frontend unable to connect to the database.

Checking systemd jobs

systemctl list-jobs showed that the transaction was still stuck.

The important state was conceptually:

zabbix-server.service stop/restart job was still running
mariadb.service start job was waiting

That means MariaDB was not permanently broken.

It was queued to start again, but systemd had not reached that point yet because the Zabbix stop job had not completed.
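To pick out the queued job for a single unit, the job list can be filtered. This sketch assumes the list-jobs columns are JOB, UNIT, TYPE and STATE, in that order; job_id_for_unit is a hypothetical helper name.

```shell
# Hypothetical helper: print the job ID queued for one unit.
# Reads `systemctl list-jobs --no-legend` style lines on stdin,
# assuming the columns are: JOB UNIT TYPE STATE.
job_id_for_unit() {
    awk -v unit="$1" '$2 == unit { print $1 }'
}

# Look up the job queued for zabbix-server when systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
    systemctl list-jobs --no-legend 2>/dev/null | job_id_for_unit zabbix-server.service || true
fi
```

The numeric ID matters because systemctl cancel identifies jobs by number, not by unit name.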

Checking the journal

Next, I checked the logs around the time when the issue started:

journalctl --no-pager \
  -u mariadb \
  -u zabbix-server \
  -u apt-daily-upgrade \
  --since "2026-06-10 06:30:00" \
  --until "2026-06-10 17:05:00"

The important timeline looked like this:

06:36:08 apt-daily-upgrade.service started
06:36:16 mariadb.service began stopping
06:36:16 zabbix-server received SIGTERM
06:36:18 mariadb.service was stopped

This showed that the problem started during an automatic upgrade.

unattended-upgrade ran automatically. After the package upgrade, needrestart detected that some running services were still using old shared libraries and attempted to restart them.

The restart transaction involved services such as:

mariadb.service
nginx.service
php8.5-fpm.service
zabbix-server.service

That is the important part.

This was not a normal manual restart of only Zabbix server.

It was an automatic bulk restart involving both Zabbix server and its local database.

Zabbix server log

The Zabbix server log showed repeated database connection failures:

[Z3001] connection to database 'zabbix' failed:
Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

database is down: reconnecting in 10 seconds

This explains why the stop job did not finish.

Zabbix server was shutting down, but its local database was already unavailable.

Instead of exiting quickly, Zabbix kept trying to reconnect to the database.

The result was a stuck shutdown path.
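Counting the reconnect messages gives a rough idea of how long the shutdown has been looping. This is a sketch: count_reconnects is a hypothetical helper, and the log path is an assumed packaged default, so check LogFile in zabbix_server.conf before relying on it.

```shell
# Hypothetical helper: count reconnect attempts recorded in a Zabbix server log.
count_reconnects() {
    grep -c 'database is down: reconnecting' "$1"
}

# Assumed default log location; check LogFile in zabbix_server.conf.
log=/var/log/zabbix/zabbix_server.log
if [ -r "$log" ]; then
    echo "reconnect attempts: $(count_reconnects "$log")"
fi
```

At one attempt every 10 seconds, ten hours of a stuck shutdown corresponds to roughly 3,600 such lines.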

Why manual restart usually works

A normal manual restart usually looks like this:

sudo systemctl restart zabbix-server

In that case, MariaDB is still running while Zabbix server shuts down.

So the shutdown path is usually clean:

MariaDB is running
Zabbix server receives SIGTERM
Zabbix server finishes shutdown work
Zabbix server exits
systemd starts Zabbix server again

That is why manually restarting zabbix-server does not necessarily reproduce this issue.

The incident happened because needrestart requested a bulk restart involving multiple related services.

The problematic sequence was:

unattended-upgrade runs automatically
needrestart detects services using old libraries
needrestart requests a bulk restart of multiple services
systemd creates a restart transaction involving MariaDB and zabbix-server
MariaDB is stopped
zabbix-server receives SIGTERM
zabbix-server tries to shut down
zabbix-server can no longer connect to MariaDB
zabbix-server keeps reconnecting to the database
zabbix-server.service remains deactivating / stop-sigterm
mariadb.service start job keeps waiting
the MariaDB socket remains missing
the Zabbix frontend shows a database error

So the issue was not simply:

MariaDB stopped and never started again.

A more accurate explanation is:

MariaDB was stopped and was supposed to start again, but its start job was blocked because zabbix-server.service did not finish stopping.

Why systemd waited so long

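The installed Zabbix systemd unit set an infinite timeout. TimeoutSec= is shorthand that sets both the start timeout and the stop timeout, so the stop timeout was infinite as well: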

TimeoutSec=infinity

This setting is not automatically wrong.

Zabbix may need time to flush data, sync state, or finish internal shutdown work. Killing it too quickly during a normal shutdown could be unsafe.

However, this setting becomes dangerous when Zabbix enters a bad shutdown path.

In this case, Zabbix was trying to shut down while the database was already unavailable. It kept logging:

database is down: reconnecting in 10 seconds

Because the stop timeout was infinite, systemd did not automatically give up and kill the service.

So the service stayed in:

deactivating (stop-sigterm)

for a long time.

In my case, it stayed stuck for more than 10 hours during the original incident.

Reproducing the behavior

To confirm the behavior, I reproduced the failure path in a controlled test.

The test was:

sudo systemctl stop mariadb
timeout 600s sudo systemctl stop zabbix-server

This means:

Stop MariaDB first.
Then try to stop Zabbix server.
Wait up to 10 minutes.

The result was:

stop_rc=124
mariadb: inactive
zabbix-server: deactivating
zabbix-server SubState=stop-sigterm
TimeoutStopUSec=infinity

The timeout command returned 124, which means the stop command did not complete within 600 seconds.

During the test, the Zabbix log continued to repeat:

Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
database is down: reconnecting in 10 seconds

This confirmed the failure path:

If MariaDB is already stopped, zabbix-server may not finish stopping.

Strictly speaking, this does not prove that Zabbix would never stop. But it did not stop within 10 minutes, and with TimeoutSec=infinity, systemd would not interrupt it automatically.
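The stop_rc value in the results can be captured with a small wrapper around timeout. This is a sketch, not the exact script used in the test: run_with_deadline is a hypothetical name, and the demonstration uses sleep so that no service is touched.

```shell
# Hypothetical wrapper: run a command with a deadline and print its exit status
# in the same stop_rc= form used in the results above.
run_with_deadline() {
    deadline=$1
    shift
    rc=0
    timeout "$deadline" "$@" || rc=$?
    echo "stop_rc=$rc"
}

# Harmless demonstration: sleep outlives the 1-second deadline.
run_with_deadline 1s sleep 5    # prints stop_rc=124
```

On a test server, the equivalent call would be run_with_deadline 600s sudo systemctl stop zabbix-server, run after MariaDB has been stopped.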

Restoring the service

After the test, I restored the services with:

# systemctl cancel accepts numeric job IDs from systemctl list-jobs, not unit names.
# With no arguments it cancels all pending jobs.
sudo systemctl cancel
sudo systemctl kill -s KILL zabbix-server.service
sudo systemctl reset-failed zabbix-server mariadb
sudo systemctl start mariadb
sudo systemctl start zabbix-server

Then I verified:

systemctl is-active mariadb zabbix-server nginx php8.5-fpm
systemctl list-jobs
curl -k -I https://localhost/

The services were back to normal:

mariadb: active
zabbix-server: active
nginx: active
php-fpm: active
No jobs running
HTTP 200 OK

Immediate recovery commands

If the server is already stuck in this state, first inspect the current status:

systemctl status mariadb zabbix-server nginx php8.5-fpm
systemctl list-jobs

Also check whether upgrade or restart helper processes are still running:

ps -eo pid,ppid,user,stat,etime,cmd \
  | grep -E 'unattended-upgrade|apt.systemd.daily|needrestart|systemctl|zabbix_server|mariadbd' \
  | grep -v grep

If zabbix-server is stuck in deactivating and MariaDB is inactive or waiting, the recovery is to cancel the stuck jobs, kill the stuck Zabbix service, reset failed states, and then start MariaDB and Zabbix again.

Use these carefully:

# systemctl cancel accepts numeric job IDs from systemctl list-jobs, not unit names.
# With no arguments it cancels all pending jobs.
sudo systemctl cancel

sudo pkill -f 'unattended-upgrade|apt.systemd.daily|needrestart' || true

sudo systemctl kill -s KILL zabbix-server.service || true

sudo systemctl reset-failed apt-daily.service apt-daily-upgrade.service mariadb.service zabbix-server.service

sudo systemctl start mariadb
sudo systemctl restart zabbix-server

Then verify:

systemctl status mariadb zabbix-server nginx php8.5-fpm
systemctl list-jobs

curl -k -I https://localhost/
curl -k https://localhost/ | grep -Ei 'Database error|No such file|Zabbix'

If the frontend is fixed, it should return the Zabbix login page instead of the database error.
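The same check can be turned into a single verdict. This is a sketch with a hypothetical helper, frontend_state, that classifies a response body read from stdin. It tests for the error text first, on the assumption that the error page may also contain the word Zabbix.

```shell
# Hypothetical helper: classify a frontend response body read from stdin.
frontend_state() {
    body=$(cat)
    case "$body" in
        *"Database error"*) echo "database-error" ;;
        *Zabbix*)           echo "zabbix-page" ;;
        *)                  echo "unknown" ;;
    esac
}

# -k skips certificate verification, as in the commands above.
if command -v curl >/dev/null 2>&1; then
    curl -ks --max-time 10 https://localhost/ | frontend_state
fi
```

A result of "unknown" usually means the web layer itself did not answer, which points at NGINX or PHP-FPM rather than the database.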

The correct long-term fix

The correct long-term fix is not to directly edit the packaged Zabbix systemd unit file.

Do not manually edit files like:

/usr/lib/systemd/system/zabbix-server.service

or:

/lib/systemd/system/zabbix-server.service

Those files are owned by packages and may be overwritten during upgrades.

The better production fix is to control when critical services are restarted.

For a monitoring server with a local database, I do not want unattended-upgrade and needrestart to automatically restart Zabbix and MariaDB together at a random time.

There are two parts to my preferred fix:

1. Prevent needrestart from automatically restarting Zabbix and MariaDB together.
2. Add a reasonable systemd stop timeout as a safety net.

Fix 1: prevent automatic needrestart restarts for Zabbix and MariaDB

If unattended upgrades are enabled, I would prevent needrestart from automatically restarting Zabbix server and MariaDB.

Create a drop-in config file:

sudo tee /etc/needrestart/conf.d/90-zabbix-mariadb.conf >/dev/null <<'EOF'
# Do not let needrestart automatically restart Zabbix or its local database.
# Restart these services manually during a maintenance window.

$nrconf{override_rc}{qr(^zabbix-server\.service$)} = 0;
$nrconf{override_rc}{qr(^mariadb\.service$)} = 0;
$nrconf{override_rc}{qr(^mysql\.service$)} = 0;
$nrconf{override_rc}{qr(^mysqld\.service$)} = 0;
EOF

This does not mean those services never need to be restarted.

It only means they should not be restarted automatically as part of an uncontrolled batch restart.

Instead, restart them manually during a maintenance window.

The safe order is:

sudo systemctl stop zabbix-server
sudo systemctl restart mariadb
sudo systemctl start zabbix-server
sudo systemctl restart nginx php8.5-fpm

The important part is:

Stop Zabbix while MariaDB is still available.
Restart MariaDB.
Start Zabbix again.
Restart the web layer if needed.

This avoids the dangerous shutdown path where MariaDB is already gone while Zabbix server is still trying to stop.
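For a maintenance window, the safe order can be wrapped in a small script that stops at the first failing step. This is a sketch under assumptions: run and restart_stack are hypothetical names, and the DRY_RUN switch is my own addition so the sequence can be previewed without touching any service.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restart the stack in the safe order; stop at the first failing step.
restart_stack() {
    run sudo systemctl stop zabbix-server    || return 1
    run sudo systemctl restart mariadb       || return 1
    run sudo systemctl start zabbix-server   || return 1
    run sudo systemctl restart nginx php8.5-fpm
}

# Preview the sequence without touching any service.
DRY_RUN=1
restart_stack
```

Because systemctl stop waits for the unit to finish stopping, MariaDB is not restarted until Zabbix server has actually exited. Set DRY_RUN=0 to run the commands for real.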

Fix 2: add a systemd timeout override for Zabbix server

The second fix is a safety net.

Even if something goes wrong, I do not want zabbix-server.service to block forever.

Use a systemd override:

sudo systemctl edit zabbix-server

Add:

[Service]
TimeoutStopSec=5min

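systemctl edit reloads the systemd configuration when the editor exits, so the daemon-reload below is only a harmless safeguard. Then confirm the effective value: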

sudo systemctl daemon-reload
systemctl show zabbix-server -p TimeoutStopUSec

This creates an override under:

/etc/systemd/system/zabbix-server.service.d/override.conf

This is better than editing the packaged unit file directly.

I prefer 5min instead of a very short value like 30s, because Zabbix may legitimately need time to shut down cleanly.

But waiting forever is not a good failure mode for a production monitoring server.
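On servers managed by scripts or configuration management, the same override can be written without an interactive editor. This is a sketch: write_stop_timeout_dropin is a hypothetical helper, and the demonstration writes into a scratch directory rather than /etc.

```shell
# Hypothetical helper: write a stop-timeout drop-in into the given directory.
write_stop_timeout_dropin() {
    dropin_dir=$1
    stop_timeout=$2
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutStopSec=%s\n' "$stop_timeout" > "$dropin_dir/override.conf"
}

# Demonstration against a scratch directory.
scratch=$(mktemp -d)
write_stop_timeout_dropin "$scratch/zabbix-server.service.d" 5min
cat "$scratch/zabbix-server.service.d/override.conf"
```

On a real server the target directory would be /etc/systemd/system/zabbix-server.service.d, written as root and followed by sudo systemctl daemon-reload, because nothing reloads systemd automatically when the file is created by hand. Once TimeoutStopSec expires, systemd terminates the remaining processes with SIGKILL, which is what keeps later jobs from waiting forever.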
