fix: reconnect(): respect QoS and fail-safe #254
Conversation
vladak
left a comment
Stashing the topics is fine, to me. I am not comfortable with the broad catch, though.
```python
        while subscribed_topics:
            feed = subscribed_topics.pop()
            self.subscribe(*feed)
    except Exception:
```
I wonder if the broad exception could be reduced to the MQTT exception ?
To be clear, no matter what the exception is, we re-raise it (see the bare raise a few lines below). This is the moral equivalent of a finally or defer clause; we are neither masking nor handling the exception, merely pausing its propagation long enough to make sure our object is left in a sane state. In fact, we could do it with finally, if you'd prefer.
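To illustrate the shape being discussed, here is a minimal sketch of catch, restore, and bare re-raise (`guarded`, `action`, and `cleanup` are hypothetical names, not MiniMQTT code):

```python
def guarded(action, cleanup):
    """Catch broadly, run cleanup, then re-raise the ORIGINAL exception.

    Nothing is masked or handled; propagation is merely paused long
    enough to restore invariants.
    """
    try:
        return action()
    except Exception:
        cleanup()
        raise  # bare raise: type, message, and traceback are untouched


# The caller still sees the original exception type, and cleanup ran first.
restored = []


def failing():
    raise IndexError("partway through a re-subscription")


try:
    guarded(failing, lambda: restored.append("state restored"))
except IndexError:
    pass

print(restored)  # -> ['state restored']
```

Note the asymmetry with `finally`: this cleanup only fires on the failure path, whereas a `finally` clause would also run after success (which is harmless here, since there would be nothing left to restore).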
As I see it, there are three options here, in addition to what I've implemented:

- Track `_original_subscriptions` (or `_remaining_original_subscriptions`) in the object. This costs some space and complexity, but isn't otherwise too terrible. It does introduce an edge case where we would potentially subscribe to a topic twice, but that's probably not too awful.
- Narrow the scope of the exception being caught. This reduces the likelihood that we stomp on someone else's toes, but reintroduces the risk that an error that is not within our caught scope (even something so prosaic as an `IndexError` arising partway through a re-subscription) could cause us to violate our API contract and not fully re-subscribe upon reconnect.
- We could leave it to the caller to identify this situation. This feels like the worst option; it requires the caller either to issue spurious `subscribe()`s, or to look at our private class vars (`_subscribed_topics`). Plus, it means our guarantee of re-subscription upon reconnect cannot be relied upon.
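The first option could look roughly like this (a sketch with invented names; MiniMQTT's real attributes and `subscribe()` signature may differ):

```python
class SketchClient:
    """Option 1: snapshot the full subscription set so a retried
    reconnect() always starts from the complete list. The
    double-subscribe edge case arises when a retry re-subscribes
    topics that already succeeded on the first attempt."""

    def __init__(self):
        self._subscribed_topics = []
        self._original_subscriptions = []

    def subscribe(self, topic, qos=0):
        self._subscribed_topics.append((topic, qos))

    def reconnect(self):
        # Snapshot once; leave it alone so a later retry sees everything.
        if not self._original_subscriptions:
            self._original_subscriptions = list(self._subscribed_topics)
        self._subscribed_topics = []
        for feed in self._original_subscriptions:
            self.subscribe(*feed)
        # Only clear the snapshot once every topic went through.
        self._original_subscriptions = []
```

The extra list is the "costs some space" part; the retained snapshot across a failed call is the "complexity" part.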
Thanks for the detailed information about the thought process, really appreciated.
I think the key question here is whether anything besides MMQTTException being raised from the depths of the library code is expected to be recoverable (in general and also w.r.t. the internal MQTT object state). My take on this is that if there is, it should be wrapped in MMQTTException, i.e. I do not see the need for the broad exception catch.
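The wrapping being suggested might be sketched like so (illustrative only; `MMQTTException` here is a local stand-in rather than the real MiniMQTT import, and `run_reconnect_step` is an invented helper):

```python
class MMQTTException(Exception):
    """Local stand-in for MiniMQTT's exception type."""


def run_reconnect_step(step):
    """Anything unexpected escaping library internals is wrapped, so a
    caller only ever needs to catch MMQTTException."""
    try:
        return step()
    except MMQTTException:
        raise  # already the library's type; pass it through
    except Exception as exc:
        raise MMQTTException(f"unexpected error during reconnect: {exc}") from exc


def bad_step():
    raise IndexError("oops")


try:
    run_reconnect_step(bad_step)
except MMQTTException as e:
    print(type(e.__cause__).__name__)  # -> IndexError
```

Using `raise ... from exc` preserves the original error as `__cause__`, so no debugging information is lost by the wrapping.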
Ah, I see. Thank you for laying that out so clearly!
To work out the best path, I think it's helpful to have a real scenario. One of the lines within the try-catch is:
```python
self.logger.debug("Reconnected with broker")
```

Let us imagine that a custom global logging handler has a PotM bug, and that an exception will be raised from self.logger.debug if it is called at 4:56 A.M. on any Tuesday in March of 2026.
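Such a handler might look like this (a hypothetical sketch; `CustomLoggingException`, `PotMHandler`, and the injectable clock are invented for the scenario):

```python
import logging
import time


class CustomLoggingException(Exception):
    pass


class PotMHandler(logging.Handler):
    """Hypothetical handler with the bug described above: it blows up
    only at 4:56 A.M. on a Tuesday in March 2026. The clock function is
    injectable so the bug can be reproduced on demand."""

    def __init__(self, clock=time.localtime):
        super().__init__()
        self._clock = clock

    def emit(self, record):
        t = self._clock()
        # tm_wday == 1 is Tuesday
        if (t.tm_year, t.tm_mon, t.tm_wday, t.tm_hour, t.tm_min) == (2026, 3, 1, 4, 56):
            raise CustomLoggingException("PotM bug triggered")
```

An exception raised from a custom `emit` propagates out of `logger.debug()`: the stock handlers catch their own errors and route them to `handleError`, but this buggy one deliberately does not.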
I put myself in the position of an engineer (who doesn't control the MiniMQTT library) who became aware of this bug when it triggered last week, but doesn't quite know how to reproduce it yet. Helpfully, the custom logger handler throws a corresponding CustomLoggingException, and my kernel's a nice, clean loop, so I can do something like:
```python
current_state = State()
while True:
    try:
        current_state.run_main_loop_once()
    except CustomLoggingException as e:
        # upload lots of debugging info, then...
        pass
```

And inside of `run_main_loop_once`, we already had something like:

```python
try:
    mqtt_client.ping()
except MMQTTException:
    # Per docs for MMQTTException, "In general, the robust way to recover is to call reconnect()."
    mqtt_client.reconnect()
```
Perfect! Now I've got resilient code that won't crash in the face of a CLE, but will give me lots of debugging info.
The problem is: if we happen to fail our ping right around 4:55 A.M., and the time ticking over to 4:56 A.M. happens to occur partway through the reconnect loop, and the next debug that gets called happens to be the one inside of the MiniMQTT library's reconnect(), then when I resume after that CLE, I will only be subscribed to a fraction of my topics. As you can see, that's quite a difficult scenario to debug.
However, I think the bigger issue is that, if that bug is found a different way that doesn't result in a partial re-subscription, then even given a very skilled programmer who is tasked with working around that bug (and, let us say, is somehow prevented from directly addressing the bug itself), their solution almost certainly would rely on reconnect()'s apparent semantics and thus would introduce a new, far more subtle bug that's incredibly difficult to reproduce. Indeed, even given an omniscient programmer who foresaw how reconnect() would be affected, their only options to handle it cleanly are:
- Reach inside of their client to query `_subscribed_topics` and compare that to a locally-kept complete list, or
- Tear down their client entirely and rebuild it from scratch any time an error occurs in an MQTT function.
Both of these require a lot more (branching) code, and both carry costs in at least two of the three categories of CPU, memory, and/or network traffic. Further, they require a level of defensive coding that seems unreasonable to expect from a consumer of this library.
Put simply, our API contract isn't supposed to require this sort of legwork from our upstream consumer. They were told that the resub_topics parameter worked in a particular way. I think it'd therefore be a bug if reconnect(resub_topics=True) didn't result in a full resubscription if called twice, even if the first call threw an exception of some kind, so long as the second of those calls succeeded. The whole idea of "reconnect and resubscribe" is to restore a known-good state. Let's do that.
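The contract argued for here can be modeled in miniature (a toy sketch; `_subscribed_topics` mirrors the discussion, but none of this is MiniMQTT's actual code, and the injected one-time failure stands in for any mid-resubscribe error):

```python
class ToyClient:
    """Even if the first reconnect() dies partway through, a second
    successful call must restore the FULL subscription set."""

    def __init__(self, topics):
        self._subscribed_topics = list(topics)
        self._fail_once = True  # simulate one mid-resubscribe failure

    def reconnect(self):
        pending = self._subscribed_topics
        self._subscribed_topics = []
        try:
            while pending:
                if self._fail_once and self._subscribed_topics:
                    self._fail_once = False
                    raise RuntimeError("broker hiccup mid-resubscribe")
                self._subscribed_topics.append(pending.pop())
        except Exception:
            # Re-stash everything not yet (re)subscribed so the next
            # call still starts from the complete original set.
            pending.extend(self._subscribed_topics)
            self._subscribed_topics = pending
            raise
```

Called twice, the second reconnect() comes back with the full set, which is exactly the guarantee a `resub_topics=True` caller should be able to rely on.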
Here's a test case, primarily meant for #253, however it can serve as a test for #252 as well:

```diff
diff --git a/tests/test_reconnect.py b/tests/test_reconnect.py
index 52b8c76..f5f73fe 100644
--- a/tests/test_reconnect.py
+++ b/tests/test_reconnect.py
@@ -237,3 +237,71 @@ def test_reconnect_not_connected() -> None:
     assert user_data.get("disconnect") == False
     assert mqtt_client._connection_manager.close_cnt == 0
+
+
+def test_reconnect_subscribe_failure() -> None:
+    """
+    Test reconnect() will not lose previously subscribed topics on subscribe
+    failure inside reconnect().
+
+    This is a bit finicky as it relies on reconnect() calling subscribe() for each
+    topic separately and in reverse order. Also, it checks the internal
+    _subscribed_topics variable and assumes it stores the topics-to-be-subscribed
+    rather than already subscribed topics.
+    """
+    logging.basicConfig()
+    logger = logging.getLogger(__name__)
+    logger.setLevel(logging.DEBUG)
+
+    host = "localhost"
+    port = 1883
+
+    mqtt_client = MQTT.MQTT(
+        broker=host,
+        port=port,
+        ssl_context=ssl.create_default_context(),
+        connect_retries=1,
+    )
+
+    mocket = Mocket(
+        bytearray([
+            0x20,  # CONNACK
+            0x02,
+            0x00,
+            0x00,
+            0x90,  # SUBACK
+            0x03,
+            0x00,
+            0x01,
+            0x00,
+            0x00,
+            0x20,  # CONNACK
+            0x02,
+            0x00,
+            0x00,
+            0x90,  # SUBACK
+            0x02,
+            0x00,
+            0x02,
+            0x00,
+            0x90,  # SUBACK to make subscribe to bar fail
+            0x02,
+            0x00,
+            0x03,
+            0x80,
+        ])
+    )
+    mqtt_client._connection_manager = FakeConnectionManager(mocket)
+    mqtt_client.connect()
+
+    mqtt_client.logger = logger
+
+    topics = [("bar", 0), ("foo", 0)]
+    logger.info(f"subscribing to {topics}")
+    mqtt_client.subscribe(topics)
+
+    with pytest.raises(MQTT.MMQTTException):
+        logger.info("reconnecting")
+        mqtt_client.reconnect()
+
+    assert set(mqtt_client._subscribed_topics) == set(topics)
```
See the two attached issues for more information.
The alternative to the overly-broad `except` would be to burn some RAM on storing a "true" copy separately somewhere in the class. This felt like a reasonable compromise to avoid that.

Closes #252
Closes #253
See #255 for a version of this PR that's more limited in scope, if you'd rather.