Skip to content

Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979

Open
pablomh wants to merge 3 commits into
theforeman:developfrom
pablomh:registration/bulk-insert-fact-values
Open

Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979
pablomh wants to merge 3 commits into
theforeman:developfrom
pablomh:registration/bulk-insert-fact-values

Conversation

@pablomh
Copy link
Copy Markdown
Contributor

@pablomh pablomh commented May 7, 2026

Summary

Replace per-fact create! loop with a single insert_all for new facts on persisted hosts, reducing DB round-trips by ~94% (108 INSERTs → 6).

Changes

  • fact_importer.rb: add_new_facts partitions into compose facts (nil values, handled individually) and regular facts (bulk inserted via insert_all)
  • bulk_insert_new_fact_values: new method using FactValue.insert_all with unique_by: [:fact_name_id, :host_id]
  • Counter initialization: @counters = { added: 0, updated: 0, deleted: 0 } prevents nil access

How to test

  1. bundle exec rake test TEST=test/unit/fact_importer_test.rb
  2. Register a host and verify facts are imported correctly

Replace per-fact create! loop with a single insert_all for new facts
on persisted hosts. Compose/parent facts (nil values from
StructuredFactImporter) are excluded and handled individually.

Also initializes @counters with default zero values to prevent
potential nil access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#8823 was the last attempt at improving fact import performance. Have you had a look at that? The benchmark code may also be useful as a reference.

Comment thread app/services/fact_importer.rb
'Ansible'
end

def initialize(host, facts = {})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this doesn't use the parent:

def initialize(host, facts = {})
      # Try to assign these facts to the correct host as per the facts say
      # If that host isn't created yet, the host parameter will contain it
      host = Host.find_by(:name => facts[:ansible_facts][:ansible_fqdn] ||
                                    facts[:ansible_facts][:fqdn]) ||
              host
    super(host, facts[:ansible_facts])
end

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — the duplicated initialization is pre-existing tech debt. Refactoring it to use super would change the host resolution and fact extraction logic, so we kept it out of scope for this PR.

Comment thread app/services/fact_importer.rb Outdated
- Restore comment on add_new_fact explaining the build/create! choice
- Revert counter initialization to {} -- explicit defaults belong in the
  additive import PR (foreman#10980) where they are needed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@pablomh pablomh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key difference is which method each PR targets: #8823 applied upsert_all to update_facts (existing facts changing value), where the current code already uses update_all with no callbacks — so there was no measurable gain. Our PR applies insert_all to add_new_facts (facts being created for the first time), where the current code calls create! per fact with individual INSERTs.

During registration this is the dominant path — all facts are new (100 per RHEL 8 host, 117 per RHEL 9/10 host) — and at concurrent scale the per-fact INSERTs compound. Measured on a 16-CPU Satellite with only this patch applied (no other changes), mixed workload (50% RHEL 8, 25% RHEL 9, 25% RHEL 10):

Concurrency INSERTs/host (before → after) AR P50 (before → after) AR P95 (before → after)
152 100 (el8) / 117 (el9,el10) → 1 2,610ms → 1,297ms (−50%) 8,214ms → 3,350ms (−59%)
304 100 (el8) / 117 (el9,el10) → 1 4,579ms → 2,004ms (−56%) 11,922ms → 6,232ms (−48%)
456 100 (el8) / 117 (el9,el10) → 1 8,492ms → 2,619ms (−69%) 13,036ms → 6,884ms (−47%)
608 100 (el8) / 117 (el9,el10) → 1 8,701ms → 3,215ms (−63%) 13,963ms → 7,566ms (−46%)
760 100 (el8) / 117 (el9,el10) → 1 9,789ms → 3,666ms (−63%) 14,138ms → 7,120ms (−50%)

Comment thread app/services/fact_importer.rb Outdated
Comment thread app/services/fact_importer.rb Outdated
insert_all handles nil values correctly, so compose facts (nil-valued
hierarchy placeholders) can go through the bulk path. No need to
partition and handle them separately.

Also add safe navigation on fact_names[name]&.id per review suggestion.

Tested on a live Satellite with 464 facts (70 compose/nil-valued):
all inserted correctly via insert_all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants