@jvsena42
Member

@jvsena42 jvsena42 commented Jan 16, 2026

Fixes #652
Fixes #676
Fixes #662

This PR fixes the "Bitkit keeps stopping" crash that occurs after wallet restoration when LDK's transaction sync encounters an unconfirmed channel funding transaction on Electrum.

Description

The root cause: after wallet restoration, LDK couldn't find the channel's funding transaction on Electrum, because the channel is zero-conf and was restored before its funding transaction confirmed. This caused an infinite retry loop until timeout.
Also, the retry flow was resetting the node state to initialLifecycleState in the repository, even though the node was already running in the service.
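
As a rough illustration of the two points above (this is not the code in this PR, and every name in it is hypothetical), a sync wrapper can return failures instead of throwing, cap the number of retries, and leave the lifecycle state alone when a sync fails:

```kotlin
// Hypothetical sketch: NodeState, NodeRepo and syncWallets are illustrative names,
// not Bitkit's or LDK's actual API.
enum class NodeState { Stopped, Starting, Running }

class NodeRepo(private val syncWallets: () -> Unit) {
    var state: NodeState = NodeState.Stopped
        private set

    fun markRunning() {
        state = NodeState.Running
    }

    /** Runs one sync attempt; a failure is returned, not thrown, and does not touch [state]. */
    fun syncOnce(): Result<Unit> = runCatching { syncWallets() }

    /** Retries sync a bounded number of times instead of looping until a timeout. */
    fun syncWithRetry(maxAttempts: Int = 3): Result<Unit> {
        var last: Result<Unit> = Result.failure(IllegalStateException("sync not attempted"))
        repeat(maxAttempts) {
            last = syncOnce()
            if (last.isSuccess) return last
        }
        return last
    }
}
```

With a sync function that always throws, `syncWithRetry(3)` calls it three times and returns a failed `Result`, and a node previously marked as running stays in `NodeState.Running`.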

Preview

master-with-crash.webm
fixed.webm

Uploading migration.webm…

QA Notes

1. Verify sync error handling

  1. Receive 10K sats on CJIT
  2. Reset and restore the wallet before the channel is confirmed
  3. Verify the app remains responsive and doesn't crash
  4. Perform LN transactions

2. Migration

  1. Install the RN app
  2. Receive 10K sats on CJIT
  3. Install this branch and trigger migration
  4. Verify the app remains responsive and doesn't crash
  5. Perform LN transactions

@jvsena42 jvsena42 changed the title from "fix: sync exception not caught causing app crash" to "fix: sync exception caught causing app crash" Jan 19, 2026
@jvsena42 jvsena42 marked this pull request as ready for review January 19, 2026 16:53
@jvsena42 jvsena42 requested a review from Copilot January 19, 2026 16:55
@jvsena42 jvsena42 marked this pull request as draft January 19, 2026 17:25
@jvsena42 jvsena42 marked this pull request as ready for review January 19, 2026 17:32
@jvsena42 jvsena42 requested a review from ovitrif January 20, 2026 10:00
@jvsena42 jvsena42 self-assigned this Jan 20, 2026
@jvsena42 jvsena42 changed the title from "fix: sync exception caught causing app crash" to "fix: sync exception causing infinite loop on start" Jan 20, 2026
@jvsena42 jvsena42 mentioned this pull request Jan 20, 2026
@jvsena42 jvsena42 enabled auto-merge January 20, 2026 12:35
@piotr-iohk
Collaborator

I managed to crash Bitkit on this branch, though it may be unrelated to these changes or to the original issue.
What I did was:

  • Receive 10K sats on CJIT
  • Reset and restore the wallet (although the CJIT channel was already open)
adb logcat -d | grep -A 50 "01-20 11:48:18.381.*FATAL EXCEPTION: DefaultDispatcher-worker-6"

01-20 11:48:18.381 12278 12322 E AndroidRuntime: FATAL EXCEPTION: DefaultDispatcher-worker-6
01-20 11:48:18.381 12278 12322 E AndroidRuntime: Process: to.bitkit.dev, PID: 12278
01-20 11:48:18.381 12278 12322 E AndroidRuntime: to.bitkit.utils.ServiceError$MnemonicNotFound: Mnemonic not found
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssStoreIdProvider.getVssStoreId(VssStoreIdProvider.kt:23)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2$1$1.invokeSuspend(VssBackupClient.kt:38)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2$1$1.invoke(Unknown Source:8)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2$1$1.invoke(Unknown Source:4)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndspatched(Undispatched.kt:66)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturnIgnoreTimeout(Undispatched.kt:50)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.TimeoutKt.setupTimeout(Timeout.kt:149)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.TimeoutKt.withTimeout(Timeout.kt:44)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.TimeoutKt.withTimeout-KLykuaI(Timeout.kt:72)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2.invokeSuspend(VssBackupClient.kt:34)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2.invoke(Unknown Source:8)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient$setup$2.invoke(Unknown Source:4)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndspatched(Undispatched.kt:66)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:43)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:165)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.BuildersKt.withContext(Unknown Source:1)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient.setup-gIAlu-s(VssBackupClient.kt:32)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.data.backup.VssBackupClient.setup-gIAlu-s$default(VssBackupClient.kt:32)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at to.bitkit.repositories.BackupRepo$startObservingBackups$1.invokeSuspend(BackupRepo.kt:123)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:34)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:100)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:124)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:586)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:820)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:717)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:704)
01-20 11:48:18.381 12278 12322 E AndroidRuntime: 	Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@c714896, Dispatchers.IO]
01-20 11:48:18.385   506 12606 I DropBoxManagerService: add tag=data_app_crash isTagEnabled=true flags=0x2
01-20 11:48:18.394   506   521 W ActivityTaskManager:   Force finishing activity to.bitkit.dev/to.bitkit.ui.MainActivity
01-20 11:48:18.414 12278 12322 I Process : Sending signal. PID: 12278 SIG: 9
01-20 11:48:18.414   506   536 W BroadcastQueue: Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.stats.service.DropBoxEntryAddedReceiver
01-20 11:48:18.415   506   536 W BroadcastQueue: Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.chimera.GmsIntentOperationService$PersistentTrustedReceiver
01-20 11:48:18.417  1064  1064 D MainContentCaptureSession: Flushing 1 event(s) for act:com.google.android.apps.nexuslauncher/.NexusLauncherActivity [state=2 (ACTIVE), disabled=false], reason=FULL
01-20 11:48:18.421   506  1731 W AppSearchIcing: icing-search-engine.cc:217: Error: 5, Message: Document (com.google.android.googlequicksearchbox$OneSearchZeroStateGoogleSuggestions/default, zp) not found.
01-20 11:48:18.426 14758 15605 E OpenGLRenderer: Unable to match the desired swap behavior.
01-20 11:48:18.427  1064  2152 D OneSearchSuggestProvider: Created the binder channel successfully for end point service =com.google.android.apps.search.googleapp.search.suggest.plugins.onesearch.server.OneSearchSuggestService , mChannel=Q0{delegate=L0{logId=557, target=directaddress:///AndroidComponentAddress%5BIntent%20%7B%20act=grpc.io.action.BIND%20cmp=com.google.android.googlequicksearchbox/com.google.android.apps.search.googleapp.search.suggest.plugins.onesearch.server.OneSearchSuggestService%20%7D%5D}} , mOneSearchConnection=y1.J@66c1ae1
01-20 11:48:18.428  1064  2002 E OpenGLRenderer: Unable to match the desired swap behavior.
01-20 11:48:18.430  1665  1942 I AiAiEcho: Predicting[0]: 
01-20 11:48:18.430  1665  1942 I AiAiEcho: Ranked targets strategy: SORT, count: 0, ranking metadata: 
01-20 11:48:18.431  1665  1942 I AiAiEcho: #postPredictionTargets: Sending updates to UISurface lockscreen with targets# 0
01-20 11:48:18.431  1665  1942 I AiAiEcho: #postPredictionTargets: Sending updates to UISurface home with targets# 0
01-20 11:48:18.468   506  2087 W InputMethodManagerService: Got RemoteException sending setActive(false) notification to pid 12278 uid 10232
01-20 11:48:18.469  2087  2087 I binder_alloc: 12278: binder_alloc_buf, no vma
01-20 11:48:18.469  2087  2087 I binder  : 506:2087 transaction failed 29189/-3, size 120-0 line 3371
01-20 11:48:18.469  1543  1543 I GoogleInputMethodService: GoogleInputMethodService.onFinishInput():3227 
01-20 11:48:18.470   506   569 I ActivityManager: Process to.bitkit.dev (pid 12278) has died: fg +50 FGS 
01-20 11:48:18.471   506   538 I libprocessgroup: Successfully killed process cgroup uid 10232 pid 12278 in 0ms

According to AI:
Root Cause: Race condition - VssBackupClient.setup() is being called before the mnemonic is available in the keychain. The VSS store ID provider needs the mnemonic to generate the unique store ID, but BackupRepo.startObservingBackups() is triggered too early in the startup sequence.

AI proposed fix:

Summary of Changes

1. VssBackupClient.kt

  • Changed setup() return type from Unit to Boolean:
      • Returns true if setup succeeded
      • Returns false if mnemonic is not available yet (no crash)
      • Throws on other errors (network issues, etc.)
  • Added idempotency check: If already set up successfully, returns immediately without re-initializing
  • Moved mnemonic check outside runCatching: Now checks for mnemonic availability before attempting setup, preventing the MnemonicNotFound exception

2. BackupRepo.kt

  • Added setupVssClientWithRetry() function that:
      • Retries setup up to 10 times with linear backoff (1s, 2s, 3s, ...)
      • Stops retrying if stopObservingBackups() is called
      • Logs progress and final status
  • Changed startObservingBackups() to call the retry function instead of directly calling setup()

Behavior

  • When the Lightning node starts, backup observation begins
  • VSS client setup is attempted
  • If mnemonic is not available yet (race condition), setup returns false and retries after 1 second
  • Retries continue with increasing delay until mnemonic becomes available (up to 10 attempts)
  • Once mnemonic is available, setup succeeds and backups work normally
  • This prevents the crash while ensuring backups eventually start working once the mnemonic is saved to keychain.
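
For reference, the retry behavior described above could look roughly like the sketch below. It is only an illustration of "up to 10 attempts with linear backoff", not the proposed patch itself: `setupWithRetry`, `attempt`, `isCancelled` and `sleep` are hypothetical names, and the wait is injected as a plain lambda so the sketch runs without kotlinx.coroutines (in the app this would presumably be a suspend function using `delay`).

```kotlin
// Hypothetical sketch: none of these names come from the Bitkit codebase.

/**
 * Calls [attempt] up to [maxAttempts] times with linear backoff (1s, 2s, 3s, ...).
 *
 * [attempt] follows the Boolean contract described above: true means setup succeeded,
 * false means the mnemonic is not available yet. Any exception it throws propagates.
 * [isCancelled] stands in for "stopObservingBackups() was called".
 * [sleep] is injected so the backoff can be observed without real waiting.
 */
fun setupWithRetry(
    attempt: () -> Boolean,
    maxAttempts: Int = 10,
    baseDelayMs: Long = 1_000,
    isCancelled: () -> Boolean = { false },
    sleep: (Long) -> Unit = { Thread.sleep(it) },
): Boolean {
    for (n in 1..maxAttempts) {
        if (isCancelled()) return false
        if (attempt()) return true
        // Wait n * baseDelayMs before the next attempt; no wait after the last one.
        if (n < maxAttempts) sleep(baseDelayMs * n)
    }
    return false
}
```

If `attempt` succeeds on its third call, the function returns true after waiting 1000 ms and then 2000 ms. If it never succeeds, the function returns false after `maxAttempts` calls, so the caller still has to decide what to do when the mnemonic never shows up.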

@jvsena42
Member Author

Thanks! That is a different issue, #553. It was closed because we couldn't reproduce it again.

Collaborator

@ovitrif ovitrif left a comment

utAck

@jvsena42 jvsena42 merged commit 9e6fb8c into master Jan 20, 2026
18 checks passed
@jvsena42 jvsena42 deleted the fix/bitkit-not-responding branch January 20, 2026 14:36
Development

Successfully merging this pull request may close these issues.

ldk-node Error Rocket screen showed again after home screen [Bug] Bitkit keeps stopping Error

4 participants