Skip to content

chore: optimize localdns provisioning polling and iptables setup#8401

Open
jingwenw15 wants to merge 5 commits intomainfrom
jingwenwu/localdns-perf-optimizations
Open

chore: optimize localdns provisioning polling and iptables setup#8401
jingwenw15 wants to merge 5 commits intomainfrom
jingwenwu/localdns-perf-optimizations

Conversation

@jingwenw15
Copy link
Copy Markdown
Member

@jingwenw15 jingwenw15 commented Apr 24, 2026

Summary

  • Reduce polling intervals from 1s to 0.1s in start_localdns() and wait_for_localdns_ready() since CoreDNS typically starts in <100ms
  • Batch iptables rules using iptables-restore instead of individual calls to avoid repeated xtables lock acquisition
  • Remove now-unused IPTABLES variable

Estimated total savings: ~1.3–2.7s during node provisioning.

Telescope perf runs:
Screenshot 2026-04-24 at 1 36 06 PM

Reduce polling intervals from 1s to 0.1s in start_localdns() and
wait_for_localdns_ready() since CoreDNS typically starts in <100ms.
Batch iptables rules using iptables-restore instead of individual
calls to avoid repeated xtables lock acquisition. Estimated total
savings: ~1.3–2.7s during node provisioning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 24, 2026 20:34
@jingwenw15 jingwenw15 changed the title perf: optimize localdns provisioning polling and iptables setup feat: optimize localdns provisioning polling and iptables setup Apr 24, 2026
@jingwenw15 jingwenw15 changed the title feat: optimize localdns provisioning polling and iptables setup chore: optimize localdns provisioning polling and iptables setup Apr 24, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets node provisioning latency in the Linux localdns bootstrap path by tightening readiness polling and batching iptables rule installation to reduce xtables lock contention.

Changes:

  • Reduced polling sleep interval in start_localdns() and wait_for_localdns_ready() from 1s to 0.1s, adjusting attempt counts accordingly.
  • Switched pod-traffic conntrack-skip rule installation to a single iptables-restore call instead of multiple iptables invocations.
  • Removed the previously used IPTABLES command wrapper variable.

Comment thread parts/linux/cloud-init/artifacts/localdns.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/localdns.sh
Update add_iptable_rules_to_skip_conntrack_from_pods tests to mock
iptables-restore via PATH instead of the old IPTABLES variable, and
assert on the new iptables-restore input format (*raw, -A rules, COMMIT).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
iptables requires -j (jump target) to be last in the rule. The comment
match module must come before it, otherwise iptables-restore rejects
the rule as invalid syntax.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 24, 2026 20:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh
Comment thread parts/linux/cloud-init/artifacts/localdns.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh
Place -m comment immediately after chain name in iptables-restore input
so that iptables -S displays comment before protocol match, matching the
Cilium eBPF host routing regex. The previous ordering placed comment
after --dport which caused nft backend to display it after the protocol
match extension.

Add ValidateLocalDNSIptablesRules e2e validator that checks:
- localdns.sh uses iptables-restore (batched rules)
- NOTRACK rules exist in both OUTPUT and PREROUTING chains
- Comment tag is present for cleanup logic
- NOTRACK is functional (no conntrack entries for localdns DNS traffic)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drop NOTRACK rules temporarily, do a DNS lookup, and verify conntrack
entries appear. This proves the conntrack check is actually capable of
detecting entries and isn't silently passing. Rules are restored after.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 24, 2026 22:56
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread e2e/validators.go
Comment on lines +1548 to +1568
saved_rules=$(sudo iptables -w -t raw -S | grep "localdns: skip conntrack")
sudo iptables -w -t raw -S | grep "localdns: skip conntrack" | while IFS= read -r rule; do
# Convert -A to -D to delete the rule
sudo iptables -w -t raw $(echo "$rule" | sed 's/^-A/-D/') 2>/dev/null || true
done

# Flush any leftover conntrack entries before the negative test
for ip in 169.254.10.10 169.254.10.11; do
sudo conntrack -D -d "$ip" -p udp --dport 53 2>/dev/null || true
done

# Do a DNS lookup without NOTRACK — this SHOULD create conntrack entries
dig bing.com @169.254.10.10 +short +timeout=2 +tries=1 > /dev/null 2>&1 || true

ct_dns_neg=$(sudo conntrack -L -d 169.254.10.10 -p udp --dport 53 2>/dev/null | wc -l)
echo "Conntrack entries for 169.254.10.10:53 without NOTRACK: $ct_dns_neg"

# Restore NOTRACK rules
echo "$saved_rules" | while IFS= read -r rule; do
sudo iptables -w -t raw $rule 2>/dev/null || true
done
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The negative-test restore path builds iptables commands by expanding the iptables -S output into $rule unquoted (sudo iptables ... $rule). Since iptables -S prints the comment as --comment "localdns: skip conntrack", unquoted expansion will split that into multiple argv tokens (quotes become literal characters), so both deletion and restoration are likely to fail and make this validator flaky (and may leave rules removed). Consider deleting by rule number (using iptables -L --line-numbers like localdns.sh cleanup does) and restoring rules via a known-safe mechanism (e.g., iptables-restore payload or re-adding rules with properly quoted args), and don’t ignore restore errors.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants