Skip to content

Conversation

@Andezion
Copy link
Collaborator

Problem

The SQL plugin had several TODO items:

  • No visibility into data freshness - Users couldn't see when tables were last refreshed
  • Missing time_msec field support - Timestamp fields from JSON schemas weren't properly handled
  • Limited pagination - Only chainmoves and channelmoves used pagination, causing performance issues on large datasets

Solution

1. TODO 2: Refresh time tracking in API
Added tracking of table refresh completion times:

  • New fields in table_desc: last_refresh_time and has_been_refreshed
  • Updated listsqlschemas to expose last_refresh_seconds_ago
  • New sqlstatus command to monitor refresh status with detailed info:
    • has_been_refreshed, needs_refresh, refreshing flags
    • last_refresh_seconds_ago - time since last update
    • refresh_duration_seconds - current refresh duration
    • last_created_index - for pagination tracking

2. TODO 8: time_msec field type support
Added FIELD_TIME_MSEC enum type for millisecond timestamps:

  • Maps to INTEGER SQL type for efficient storage
  • Automatically handles JSON time_msec fields
  • Enables SQL datetime functions on timestamp fields

3. TODO 10: General pagination API
Extended pagination from 2 tables to 6 tables using refresh_by_created_index:

  • listhtlcs - now uses pagination with 'htlcs' wait
  • listforwards - now uses pagination with 'forwards' wait
  • listinvoices - now uses pagination with 'invoices' wait
  • listsendpays - now uses pagination with 'sendpays' wait

About TODO 11: Normalize account_id fields.
The problem is - account_id in chainmoves and channelmoves tables is TEXT and highly duplicated (same values like "wallet", "external" repeated thousands of times).

I propose such solution:

  • Create accounts table with unique account_id values
  • Change chainmoves/channelmoves to reference accounts table via INTEGER foreign key
  • Create VIEWs for backward compatibility with current API

That will give us:

  • Reduced storage (account_id stored once, not N times)
  • Faster queries (INTEGER joins vs TEXT comparison)
  • Referential integrity via foreign keys

Implementation may be like this:

-- New table
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    account_id TEXT UNIQUE
);

-- Modify existing tables
ALTER TABLE chainmoves ADD COLUMN account_id_ref INTEGER REFERENCES accounts(id);

-- Backward compatible view
CREATE VIEW chainmoves_compat AS 
SELECT c.*, a.account_id 
FROM chainmoves c 
JOIN accounts a ON c.account_id_ref = a.id;

Should this be automatic migration or optional flag?
Keep old column as deprecated or remove completely?

Changelog

  • Changelog-Added: Added sqlstatus command to monitor table refresh status
  • Changelog-Added: Added time_msec field type support for timestamp fields
  • Changelog-Changed: listsqlschemas now includes last_refresh_seconds_ago to show data freshness
  • Changelog-Changed: listhtlcs, listforwards, listinvoices, and listsendpays now use pagination for better performance

Important

26.04 FREEZE March 11th: Non-bugfix PRs not ready by this date will wait for 26.06.

RC1 is scheduled on March 23rd

The final release is scheduled for April 15th.

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.
  • Important All PRs must consider how to reverse any persistent changes for tools/lightning-downgrade

@rustyrussell
Copy link
Contributor

Hi! Thanks for this patch series; you've chosen an interesting rabbit hole to dig into!

Since it's one your first contributions, I'm going to nitpick. In particular, you should never use merge; we rebase only. This keeps our tree linear and clean (at cost of some rebasing).

Firstly, I apologize for the brevity of these "TODO": they were reminders to myself and not really for external consumption.

  1. The line "refresh time in API". This was a last resort if we cannot get the performance we needed; a query would be able to indicate that it can handle stale data, thus obviating the need to refresh. We haven't needed this yet, and there are other low-hanging fruit.
  2. The line "time_msec" specifically refers to the two numbers in listforwards (received_time and resolved_time). These are the only doubles we use; it might have been cleaner to have had the JSON print them out as integers (msec as suggested here, or nsec if we wanted to maintain current precision). But we didn't, so that ship has probably sailed.
  3. The line "General pagination API" is indeed a change we want, but your current approach is insufficient.

In particular, these tables (unlike chainmoves and channelmoves) can have entries deleted or changed. We need to spot that, and at the minimum, reload them.

Fortunately, such deletions are usually done occasionally by autoclean, so we don't have to be clever here. We can simply use the wait API to ask if there have been any new deletions, and if so, reload the entire table.

Handling changes is harder! We could do the same thing (reload if anything changes), but changes are more common. A first pass might be to implement this, with tests, to make sure that works. Then refine it: use the wait API async to monitor changes. Keep track of which records changed, and reload only those when we go to refresh. Note three complications:

  1. We may see changes for records we haven't seen yet. This is OK!
  2. We don't actually print out the created_index in the wait update return, which would be incredibly useful for this (it's a primary key for the sql plugin!), so we should do that everywhere first.
  3. If changes come too fast, we will get a raw response which doesn't contain any information, so we still need to do a full reload in this case.

Feel free to reach out if you have any questions!

@Andezion Andezion force-pushed the sql-plugin-refresh-time-and-time-msec branch from 1cf1dae to 8011f41 Compare January 19, 2026 11:00
Changelog-Added: Added sqlstatus command to monitor table refresh status
Changelog-Added: Added time_msec field type support for timestamp fields
Changelog-Changed: listsqlschemas now includes last_refresh_seconds_ago to show data freshness
Changelog-Changed: listhtlcs, listforwards, listinvoices, and listsendpays now use pagination for better performance
@Andezion Andezion force-pushed the sql-plugin-refresh-time-and-time-msec branch from 8011f41 to 239a1ab Compare January 19, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants