Broken database connections don't get replaced #6724

@mrog

ISSUE TYPE
  • Bug Report
COMPONENT NAME
Database connection
CLOUDSTACK VERSION
4.16 and 4.17 (and likely some older versions, too)
CONFIGURATION

CloudStack 4.16 and MySQL 8.0.28, running on different servers

OS / ENVIRONMENT

CentOS 7

SUMMARY

The database connection became unusable; the root cause is unknown. The bug is that CloudStack detected the broken connection but did nothing to replace it. The following exception appeared repeatedly in the log until we restarted the cloudstack-management service.

2022-09-07 13:31:30,626 ERROR [c.c.u.d.ConnectionConcierge] (ConnectionConcierge-1:ctx-3dd4c5d7) (logid:cda577ad) Unable to keep the db connection for LockController1
java.sql.SQLNonTransientConnectionException: No operations allowed after connection closed.
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:73)
        at com.mysql.cj.jdbc.ConnectionImpl.prepareStatement(ConnectionImpl.java:1659)
        at com.mysql.cj.jdbc.ConnectionImpl.prepareStatement(ConnectionImpl.java:1575)
        at org.apache.commons.dbcp2.DelegatingConnection.prepareStatement(DelegatingConnection.java:301)
        at org.apache.commons.dbcp2.DelegatingConnection.prepareStatement(DelegatingConnection.java:301)
        at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager.testValidity(ConnectionConcierge.java:147)
        at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager$1.runInContext(ConnectionConcierge.java:203)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mysql.cj.exceptions.ConnectionIsClosedException: No operations allowed after connection closed.
        at jdk.internal.reflect.GeneratedConstructorAccessor197.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        ... 22 more

I examined the ConnectionConcierge class in CloudStack. It checks the health of the database connection, and it contains code to reset the connection, but that reset code isn't called from anywhere. I'm guessing this used to work but was broken during a refactoring at some point.
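
To illustrate the behavior I would expect, here is a minimal sketch of a health check that replaces the connection when the check fails. This is not CloudStack's code: `ReconnectingKeeper`, `checkAndRepair`, and the connection factory are hypothetical names I made up for the example, and `fakeConnection` is only a test double so the sketch runs without a database. The only real API it relies on is the standard JDBC `Connection.isValid(int)` call.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical sketch; not CloudStack's ConnectionConcierge. */
final class ReconnectingKeeper {
    private final Supplier<Connection> factory; // assumption: opens a fresh connection
    private Connection current;
    private int replacements = 0;

    ReconnectingKeeper(Supplier<Connection> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Returns true if the connection was healthy, false if it had to be replaced. */
    synchronized boolean checkAndRepair() {
        boolean healthy;
        try {
            healthy = current.isValid(5); // standard JDBC liveness probe, 5 s timeout
        } catch (SQLException e) {
            healthy = false;
        }
        if (healthy) {
            return true;
        }
        // The step that appears to be missing: discard the dead connection and open a new one.
        try {
            current.close();
        } catch (SQLException ignored) {
            // The connection is already broken; nothing more to do with it.
        }
        current = factory.get();
        replacements++;
        return false;
    }

    synchronized Connection get() { return current; }

    synchronized int replacements() { return replacements; }

    /** Test double: a Connection whose isValid() reflects the given flag. */
    static Connection fakeConnection(AtomicBoolean alive) {
        return (Connection) Proxy.newProxyInstance(
            ReconnectingKeeper.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "isValid": return alive.get();
                    case "close": alive.set(false); return null;
                    default: throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    public static void main(String[] args) {
        AtomicBoolean first = new AtomicBoolean(true);
        AtomicBoolean second = new AtomicBoolean(true);
        Iterator<AtomicBoolean> flags = List.of(first, second).iterator();
        ReconnectingKeeper keeper =
            new ReconnectingKeeper(() -> fakeConnection(flags.next()));

        System.out.println(keeper.checkAndRepair()); // true: connection is healthy
        first.set(false);                            // simulate the MySQL restart
        System.out.println(keeper.checkAndRepair()); // false: dead connection replaced
        System.out.println(keeper.checkAndRepair()); // true: replacement is healthy
    }
}
```

In a real fix the factory would presumably borrow a new connection from the existing DBCP pool, and the check would run on the same schedule ConnectionConcierge already uses. I haven't verified how that would interact with the locks held on the old connection.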

STEPS TO REPRODUCE
1. Start MySQL
2. Start the CloudStack management service and let it run for a short time.
3. Stop MySQL, wait a couple of minutes, then start it again.
EXPECTED RESULTS
The CloudStack management service should reconnect to the database.
ACTUAL RESULTS
The CloudStack management service keeps trying to use the broken connections. The management-server.log file shows many exceptions like the one in the summary above.
