Fix “JDBC Socket error / Connection reset by peer” in WildFly with MariaDB (Stale Connection Pool)

This post explains a common (and confusing) failure mode when running EJBCA on WildFly with MariaDB/MySQL:

An exception has occurred.
JDBC exception executing SQL … [(conn=32) Socket error]

It often happens after the system has been idle for a while (overnight, after maintenance windows, or simply after long inactivity), and then suddenly the first Admin Web request fails.

The good news: this is usually not an “EJBCA bug”. It’s almost always a stale (half-dead) JDBC connection sitting inside WildFly’s datasource pool.

Symptom

WildFly (Hibernate/JPA) throws something like:

org.hibernate.exception.JDBCConnectionException: JDBC exception executing SQL
... [(conn=32) Socket error]
Caused by: java.sql.SQLNonTransientConnectionException: (conn=32) Socket error
Caused by: java.net.SocketException: Connection reset by peer

On the MariaDB side you may see a warning like:

Aborted connection 32 to db: 'ejbca' user: 'ejbca' host: 'localhost'
(Got timeout reading communication packets)

What this error really means

WildFly uses a connection pool for your datasource (e.g. java:/EjbcaDS).
That pool tries to reuse existing TCP connections to MariaDB for performance.

But MariaDB can (and will) close connections for many reasons, for example:

Idle timeout (wait_timeout) closing inactive sessions after some time.
Network hiccups / TCP resets (even localhost can see resets if the peer closes).
DB restarts / upgrades / maintenance.
Firewall/NAT timeouts (common in real networks, less common on localhost).

When MariaDB closes a connection, WildFly’s pool does not automatically know that the socket is dead unless you enable validation.

So later, when EJBCA borrows that connection and runs a query, the driver discovers the socket is gone and you get Socket error / Connection reset.

This is the “stale connection” trap.
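To see the trap forming, you can look for sessions that have been sleeping for a long time on the MariaDB side. The sketch below is only an illustration: idle_sessions is a hypothetical helper name, and it assumes it is fed the tab-separated output of SHOW PROCESSLIST (columns Id, User, Host, db, Command, Time, ...), for example from the command-line client in batch mode.

```shell
# Hypothetical helper: list sessions that have been idle for at least $1 seconds.
# Reads tab-separated SHOW PROCESSLIST output on stdin and prints "Id<TAB>Time"
# for every session whose Command column is "Sleep".
idle_sessions() {
  awk -F '\t' -v min="$1" \
    '$5 == "Sleep" && $6 + 0 >= min + 0 { print $1 "\t" $6 }'
}
```

A possible invocation is mariadb -B -e 'SHOW PROCESSLIST' | idle_sessions 600, which lists sessions that have been sleeping for ten minutes or more. Depending on the installed version, the client binary may be called mysql instead of mariadb.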

Why the default behavior can cause this (and why it’s not “auto-fixed” by default)

A pool can detect dead connections in two ways:

Validate when borrowing (validate-on-match)
Validate periodically in the background (background-validation)

Both add overhead (extra checks / extra round-trips). Many environments have:

DB timeouts configured very high (or disabled),
keepalives enabled,
stable long-lived connections,
or apps that tolerate a rare first-failure and retry.

Because of that, many application-server defaults favor performance and fewer checks over maximum resilience.

If you run something like EJBCA where the Admin UI must be reliable even after idle periods, you should explicitly configure the pool to be self-healing.

The fix: validate connections and recycle idle ones (so WildFly closes first)

There are two key goals:

Never hand out a dead connection from the pool
Let WildFly recycle idle connections before MariaDB does

Recommended baseline settings

Turn on background validation (periodic health check)
Set idle-timeout-minutes to a value smaller than MariaDB’s wait_timeout
Optionally define a simple validation query (SELECT 1) if you want the most explicit behavior

Tip: In many designs, you choose either validate-on-match or background-validation.
Background validation is usually enough to fix the “overnight stale connection” problem with lower per-request overhead.
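Before changing anything, it can help to see what the datasource is currently set to. The sketch below generates read-attribute requests for the attributes discussed in this post; pool_settings_query is a hypothetical helper name, and the datasource name is passed as an argument (EjbcaDS here).

```shell
# Hypothetical helper: print one jboss-cli read-attribute request per
# validation/idle attribute for the datasource named in $1.
pool_settings_query() {
  for attr in validate-on-match background-validation \
              background-validation-millis idle-timeout-minutes \
              check-valid-connection-sql; do
    echo "/subsystem=datasources/data-source=$1:read-attribute(name=$attr)"
  done
}
```

The output can be piped into the CLI, for example pool_settings_query EjbcaDS | sudo -u wildfly /opt/wildfly/bin/jboss-cli.sh --connect. Attributes that were never set typically come back as undefined.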

Apply the fix (WildFly CLI)

Configure datasource validation + idle recycling

Below is a “low overhead, reliable after idle” profile:

sudo -u wildfly /opt/wildfly/bin/jboss-cli.sh --connect <<'EOF'
batch
/subsystem=datasources/data-source=EjbcaDS:write-attribute(name=validate-on-match,value=false)
/subsystem=datasources/data-source=EjbcaDS:write-attribute(name=background-validation,value=true)
/subsystem=datasources/data-source=EjbcaDS:write-attribute(name=background-validation-millis,value=60000)
/subsystem=datasources/data-source=EjbcaDS:write-attribute(name=idle-timeout-minutes,value=30)
/subsystem=datasources/data-source=EjbcaDS:write-attribute(name=check-valid-connection-sql,value="SELECT 1")
run-batch
reload
EOF

Notes:

If the CLI prints something like Failed to establish connection ... right after reload, that can be normal: the management interface restarts during a reload. Wait a few seconds, then re-run the jboss-cli.sh command that failed.
If you prefer “validate on every borrow” instead, flip the strategy:
- validate-on-match=true
- background-validation=false
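For the flipped strategy, the batch could look like the sketch below. It only prints the CLI commands so they can be reviewed first; validate_on_match_batch is a hypothetical helper name, and the datasource name is passed as an argument.

```shell
# Hypothetical helper: print a jboss-cli batch that switches the datasource
# named in $1 to "validate on every borrow" and turns background validation off.
validate_on_match_batch() {
  ds="/subsystem=datasources/data-source=$1"
  cat <<EOF
batch
${ds}:write-attribute(name=validate-on-match,value=true)
${ds}:write-attribute(name=background-validation,value=false)
run-batch
reload
EOF
}
```

To apply it, pipe the output into the CLI: validate_on_match_batch EjbcaDS | sudo -u wildfly /opt/wildfly/bin/jboss-cli.sh --connect. Expect slightly more per-request overhead, since every borrowed connection is checked.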

Flush existing stale connections in the pool

To drop the connections currently sitting in the pool, run flush-all-connection-in-pool (note the singular “connection” in the operation name):

sudo -u wildfly /opt/wildfly/bin/jboss-cli.sh --connect \
  --commands='/subsystem=datasources/data-source=EjbcaDS:flush-all-connection-in-pool'

Test the datasource

sudo -u wildfly /opt/wildfly/bin/jboss-cli.sh --connect \
  --commands='/subsystem=datasources/data-source=EjbcaDS:test-connection-in-pool'

The operation should return:

"result" => [true]

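If you want to script this check (for a cron job or a monitoring probe), a minimal sketch could look like this. pool_is_healthy is a hypothetical helper name; it assumes the CLI output of test-connection-in-pool arrives on stdin and contains both an "outcome" => "success" line and the "result" => [true] line.

```shell
# Hypothetical helper: succeed (exit status 0) only if the CLI output on stdin
# reports a successful operation and a positive connection test.
pool_is_healthy() {
  output=$(cat)
  printf '%s\n' "$output" | grep -qF '"outcome" => "success"' || return 1
  printf '%s\n' "$output" | grep -qF '"result" => [true]'
}
```

Usage: pipe the output of the test-connection-in-pool command into pool_is_healthy and act on its exit status, for example by appending || echo "EjbcaDS pool check failed".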
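On the MariaDB side, these server variables commonly show up in investigations:

wait_timeout / interactive_timeout: how long (in seconds) the server keeps an idle connection open before closing it
net_read_timeout / net_write_timeout: how long (in seconds) the server waits for more data while reading from, or writing to, a connection before aborting it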

Example:

SHOW VARIABLES LIKE 'wait_timeout';
SHOW VARIABLES LIKE 'net_read_timeout';
SHOW VARIABLES LIKE 'net_write_timeout';

The practical rule

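Set WildFly’s idle-timeout-minutes shorter than MariaDB’s wait_timeout. Note that wait_timeout is expressed in seconds, so convert it first: the MariaDB default of 28800 seconds is 480 minutes.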

So WildFly proactively closes idle connections and replaces them, instead of letting MariaDB kill them invisibly.
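The conversion can be sketched as a small shell function. safe_idle_minutes is a hypothetical helper name, and the "half of wait_timeout" safety margin is an assumption of this sketch, not a WildFly or MariaDB rule; any value comfortably below wait_timeout follows the same idea.

```shell
# Hypothetical helper: turn a wait_timeout value (seconds) into a candidate
# idle-timeout-minutes value. Uses half of wait_timeout as the margin
# (an assumption of this sketch) and never returns less than 1 minute.
safe_idle_minutes() {
  wait_timeout_seconds=$1
  minutes=$(( wait_timeout_seconds / 60 / 2 ))
  if [ "$minutes" -lt 1 ]; then
    minutes=1
  fi
  echo "$minutes"
}
```

For example, safe_idle_minutes 28800 prints 240, so the 30 minutes used earlier in this post sits well inside that limit. If your wait_timeout is much lower (say 600 seconds), the same function suggests 5 minutes.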
