Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979
Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979pablomh wants to merge 3 commits into
Conversation
Replace per-fact create! loop with a single insert_all for new facts on persisted hosts. Compose/parent facts (nil values from StructuredFactImporter) are excluded and handled individually. Also initializes @counters with default zero values to prevent potential nil access. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| 'Ansible' | ||
| end | ||
|
|
||
| def initialize(host, facts = {}) |
There was a problem hiding this comment.
I'm surprised this doesn't use the parent:
def initialize(host, facts = {})
# Try to assign these facts to the correct host as per the facts say
# If that host isn't created yet, the host parameter will contain it
host = Host.find_by(:name => facts[:ansible_facts][:ansible_fqdn] ||
facts[:ansible_facts][:fqdn]) ||
host
super(host, facts[:ansible_facts])
endThere was a problem hiding this comment.
Agreed — the duplicated initialization is pre-existing tech debt. Refactoring it to use super would change the host resolution and fact extraction logic, so we kept it out of scope for this PR.
- Restore comment on add_new_fact explaining the build/create! choice
- Revert counter initialization to {} -- explicit defaults belong in the
additive import PR (foreman#10980) where they are needed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pablomh
left a comment
There was a problem hiding this comment.
The key difference is which method each PR targets: #8823 applied upsert_all to update_facts (existing facts changing value), where the current code already uses update_all with no callbacks — so there was no measurable gain. Our PR applies insert_all to add_new_facts (facts being created for the first time), where the current code calls create! per fact with individual INSERTs.
During registration this is the dominant path — all facts are new (100 per RHEL 8 host, 117 per RHEL 9/10 host) — and at concurrent scale the per-fact INSERTs compound. Measured on a 16-CPU Satellite with only this patch applied (no other changes), mixed workload (50% RHEL 8, 25% RHEL 9, 25% RHEL 10):
| Concurrency | INSERTs/host (before → after) | AR P50 (before → after) | AR P95 (before → after) |
|---|---|---|---|
| 152 | 100 (el8) / 117 (el9,el10) → 1 | 2,610ms → 1,297ms (−50%) | 8,214ms → 3,350ms (−59%) |
| 304 | 100 (el8) / 117 (el9,el10) → 1 | 4,579ms → 2,004ms (−56%) | 11,922ms → 6,232ms (−48%) |
| 456 | 100 (el8) / 117 (el9,el10) → 1 | 8,492ms → 2,619ms (−69%) | 13,036ms → 6,884ms (−47%) |
| 608 | 100 (el8) / 117 (el9,el10) → 1 | 8,701ms → 3,215ms (−63%) | 13,963ms → 7,566ms (−46%) |
| 760 | 100 (el8) / 117 (el9,el10) → 1 | 9,789ms → 3,666ms (−63%) | 14,138ms → 7,120ms (−50%) |
insert_all handles nil values correctly, so compose facts (nil-valued hierarchy placeholders) can go through the bulk path. No need to partition and handle them separately. Also add safe navigation on fact_names[name]&.id per review suggestion. Tested on a live Satellite with 464 facts (70 compose/nil-valued): all inserted correctly via insert_all. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Replace per-fact
create!loop with a singleinsert_allfor new facts on persisted hosts, reducing DB round-trips by ~94% (108 INSERTs → 6).Changes
fact_importer.rb:add_new_factspartitions into compose facts (nil values, handled individually) and regular facts (bulk inserted viainsert_all)bulk_insert_new_fact_values: new method usingFactValue.insert_allwithunique_by: [:fact_name_id, :host_id]@counters = { added: 0, updated: 0, deleted: 0 }prevents nil accessHow to test
bundle exec rake test TEST=test/unit/fact_importer_test.rb