Changes a variable name in hack/gen-configmap-data-source.sh to be more descriptive for easier readability.
Updates the README to be easier to read. Mainly rewording to fix some grammatical issues, along with some removal of duplicate information. Log output at the end was removed as the contents didn't seem relevant for a user trying to run the daemon.
…penshift-4.22-linuxptp-daemon OCPBUGS-77572: Updating ose-linuxptp-daemon-container image to be consistent with ART for 4.22
Signed-off-by: Vitaly Grinberg <vgrinber@redhat.com>
Three related issues caused incorrect T-BC behavior during upstream port failover:
1. ptp4l offsets were never reported to the T-BC state machine, so getLargestOffset returned FaultyPhaseOffset (unfilled window) and freeRunCondition could not detect large ptp4l offsets. Fix: add a tunable 1-second averaged ptp4l offset event (sendPtp4lOffsetEvent) fed via a sliding window (ptp4lOffsetEventWindowSize setting), and teach freeRunCondition/getLargestOffset to use it while skipping empty windows.
2. When all PTP ports are lost, sendPtp4lEvent uses a fallback iface, leaving stale LOCKED DataDetails on inactive ports. isSourceLostBC then incorrectly reports source as not lost. Fix: AddEvent now propagates SourceLost to all LOCKED details, not just the event's own interface.
3. downstreamAnnounceIWF runs slow PMC calls in a goroutine. If the BC transitions to FREERUN mid-flight, the goroutine unconditionally overwrites clockClass with stale upstream data. Fix: add context-based cancellation (cancel-on-supersede in updateDownstreamData) plus applyIfLockedBC state guards around data mutations.
Assisted by Cursor AI
Signed-off-by: Vitaly Grinberg <vgrinber@redhat.com>
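The third fix (cancel-on-supersede plus a LOCKED-state guard) can be pictured with a minimal Go sketch. The names `boundaryClock`, `beginUpdate`, `setState`, and `applyClockClass` are hypothetical and only loosely mirror `updateDownstreamData` and `applyIfLockedBC`; the daemon's actual code may be structured differently.

```go
package main

import (
	"context"
	"sync"
)

// Hypothetical clock states, for illustration only.
const (
	stateLocked  = "LOCKED"
	stateFreerun = "FREERUN"
)

// boundaryClock is an assumed stand-in for the T-BC data being guarded.
type boundaryClock struct {
	mu         sync.Mutex
	state      string
	clockClass int
	cancel     context.CancelFunc // cancels the in-flight update, if any
}

// beginUpdate cancels any previous in-flight update (cancel-on-supersede)
// and returns a fresh context for the new one.
func (b *boundaryClock) beginUpdate() context.Context {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.cancel != nil {
		b.cancel()
	}
	ctx, cancel := context.WithCancel(context.Background())
	b.cancel = cancel
	return ctx
}

// setState records a state change; leaving LOCKED also cancels in-flight work.
func (b *boundaryClock) setState(s string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.state = s
	if s != stateLocked && b.cancel != nil {
		b.cancel()
		b.cancel = nil
	}
}

// applyClockClass writes the value only if the update was not superseded
// and the clock is still LOCKED. It reports whether the write happened.
func (b *boundaryClock) applyClockClass(ctx context.Context, class int) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if ctx.Err() != nil || b.state != stateLocked {
		return false
	}
	b.clockClass = class
	return true
}
```

In this sketch a slow goroutine would call `beginUpdate` before its PMC queries and `applyClockClass` afterwards, so a result that arrives after a newer update started, or after a transition to FREERUN, is dropped rather than written.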
Signed-off-by: Vitaly Grinberg <vgrinber@redhat.com>
Assisted-By: cursor
Assisted-by: Cursor
Set pins via dpll
Generated-by: Cursor
…stream_staging_main Upstream to downstream staging main
Add API to enable / disable leap second sources.
By default (and if leapSources is omitted) all satellite
sources are enabled. To disable sources, specify them
under the plugin gnss->leapSources section:
e825:
  devices:
    - eno8703
  gnss:
    disabled: false
    leapSources:
      navic: false
Signed-off-by: Vitaly Grinberg <vgrinber@redhat.com>
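The "enabled unless explicitly set to false" rule for leapSources could be modeled as in the Go sketch below. The `gnssOptions` type, its JSON tags, and the `leapSourceEnabled` helper are assumptions made for illustration, not the plugin's actual types; `navic` comes from the commit message, and any other source key (such as `gps`) is a hypothetical name.

```go
package main

// gnssOptions is a hypothetical Go view of the gnss plugin section.
type gnssOptions struct {
	Disabled    bool            `json:"disabled"`
	LeapSources map[string]bool `json:"leapSources,omitempty"`
}

// leapSourceEnabled returns true unless the source is explicitly set to false.
// A missing map (leapSources omitted) therefore enables every source.
func (g gnssOptions) leapSourceEnabled(name string) bool {
	enabled, listed := g.LeapSources[name]
	return !listed || enabled
}
```

Looking up a key in a nil map is safe in Go, which is why the omitted case needs no special handling here.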
Readme updates
When cloud-event-proxy restarts and linuxptp-daemon re-emits cached port role events, the Raw field does not contain a trailing newline. This causes all port events to be concatenated on the socket, and cloud-event-proxy fails to parse them (strconv.ParseInt: parsing "port": invalid syntax), resulting in missing clock class metrics. Ensure each port role log line has a trailing newline before writing to the event socket. Signed-off-by: Jack Ding <jackding@gmail.com>
OCPBUGS-78296: Allow non-GNSS leap second sources
Extend aws-ci action workflow to save artifacts
…pDevice
- Accept lspci VPD results even when PartNumber is empty (provisional VPD).
- Default LinkSpeed and FEC to unknown when ethtool cannot determine them.
- Skip non-PCI and virtual-function NICs before running ethtool, and lower remaining skip messages to debug level, eliminating log noise from container/virtual interfaces.
- Collect VPD once per NIC via the PTP-exposing port.
Generated-by: Cursor
Fix missing newline in re-emitted port role logs to event socket
…stream_staging_main Upstream to downstream staging main
Add github action to create a sync PR every hour
This commit fixes an issue where if the clock_id is not set in synce4lConf, it is incorrectly set to 0, instead of the correct value previously extracted from the network device list. This issue was due to the clock_id being pulled from the initial value of the config, not the actual object that was mutated.
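The bug pattern described here, reading a value from a copy taken before the mutation, can be reproduced in a few lines of Go. Everything in this sketch (`synceConfig`, `fillClockID`, and the two reader functions) is hypothetical and only illustrates the general mistake, not the daemon's actual code.

```go
package main

// synceConfig is a hypothetical stand-in for the SyncE settings.
type synceConfig struct {
	ClockID uint64
}

// fillClockID sets ClockID from the network device when it is unset.
func fillClockID(conf *synceConfig, deviceClockID uint64) {
	if conf.ClockID == 0 {
		conf.ClockID = deviceClockID
	}
}

// clockIDFromSnapshot reproduces the bug pattern: it reads a copy taken
// before the mutation, so an unset ClockID stays 0.
func clockIDFromSnapshot(conf synceConfig, deviceClockID uint64) uint64 {
	snapshot := conf
	fillClockID(&conf, deviceClockID)
	return snapshot.ClockID
}

// clockIDFromMutated reads the object that was actually updated.
func clockIDFromMutated(conf synceConfig, deviceClockID uint64) uint64 {
	fillClockID(&conf, deviceClockID)
	return conf.ClockID
}
```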
Signed-off-by: Vitaly Grinberg <vgrinber@redhat.com>
Fix VPD collection and ensure LinkSpeed/FEC always reported in NodePtpDevice
OCPBUGS-78711: Fix clock_id being set to 0 in SyncE config
Move builder image to non-docker image so that we do not get hit with pull limits
The issue was that expectWorker did not exit when exp was closed. Instead it kept returning errors, which eventually crashed the process due to too many goroutines.
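A worker loop that exits cleanly once its input is closed might look like the Go sketch below. It uses a plain channel as a stand-in for the expect session, and `drainUntilClosed` is a hypothetical name, so treat it as an illustration of the exit condition rather than the actual expectWorker code.

```go
package main

// drainUntilClosed is a hypothetical worker loop: it handles lines until the
// input channel is closed, then returns so its goroutine can exit.
// It reports how many lines were handled.
func drainUntilClosed(lines <-chan string, handle func(string)) int {
	handled := 0
	for {
		line, ok := <-lines
		if !ok {
			// Closed input: stop here rather than reporting errors forever.
			return handled
		}
		handle(line)
		handled++
	}
}
```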
…mage Update Dockerfile builder image
…ilure If something goes wrong with gpsd (or we're just unlucky with initialization timing races), we can attempt to run the ublox init code before gpsd is actually running. That attempt is silently ignored and can cause both ts2phc and our daemon's GNSS monitoring to fail. This change largely fixes the first part by performing ublox protocol detection as part of the object initialization and returning an error if the initialization fails. The monitoring framework already retries this registration every 1s until it succeeds, so with the error return functioning, we get appropriate retries until gpsd is running and ubxtool can talk to it.
Signed-off-by: Jim Ramsay <jramsay@redhat.com>
Signed-off-by: Jim Ramsay <jramsay@redhat.com>
Generated-by: Cursor
Provides FORK_REMOTE, which allows users to push to a fork and then create the PR against the downstream. Also uses worktrees to better isolate the changes. Users can control where the worktree is created (default /tmp). There is also a --keep-worktree flag if the user wishes to inspect it.
…_fixup-4.22 OCPBUGS-77480: Fix ubxtool initialization race conditions
…process_failure Fix pmc looping when pmc process is killed
…26-03-19 OCPBUGS-77480, OCPBUGS-78711, OCPBUGS-78814: Sync from upstream (19-Mar-2026)
Centralizes the newline suffix check inside writeLogToSocket so all callers get consistent newline termination automatically, instead of each call site handling it independently. Signed-off-by: Jack Ding <jackding@gmail.com>
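A minimal Go sketch of the centralized check follows. `withTrailingNewline` and `frame` are hypothetical helper names, not functions from the daemon; the point is only that normalizing in one place keeps consecutive events separated on a stream socket.

```go
package main

import "strings"

// withTrailingNewline returns line unchanged if it already ends in "\n",
// otherwise it appends one. Hypothetical helper for illustration.
func withTrailingNewline(line string) string {
	if strings.HasSuffix(line, "\n") {
		return line
	}
	return line + "\n"
}

// frame concatenates log lines the way they would appear back to back on a
// stream socket, with each line normalized first.
func frame(lines []string) string {
	var b strings.Builder
	for _, l := range lines {
		b.WriteString(withTrailingNewline(l))
	}
	return b.String()
}
```

Without the normalization, two cached events written back to back would arrive as one line and the reader could not split them.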
When cloud-event-proxy crashes and restarts, clock_class metrics for most ptp4l configs disappear because:
1. EmitClockClassLogs() skipped configs where pmc.parentDS was nil, even though the clock class data was available in clkSyncState. Remove this unnecessary guard since EmitClockClass() already handles missing data gracefully.
2. emitClockClass() used utils.EmitClockClass() which writes directly to the socket without reconnect-and-retry logic. On broken pipe the data was silently lost. Switch to writeLogToSocket() which handles reconnection, consistent with EmitClockSyncLogs, EmitPortRoleLogs, and EmitProcessStatusLog.
3. Clean up now-unused code: utils.EmitClockClass, utils.IsBrokenPipe, signalBrokenPipe, brokenPipeCh, and their associated tests.
Signed-off-by: Jack Ding <jackding@gmail.com>
Fix clock class metrics lost after cloud-event-proxy restart
Three big fixes were necessary to correctly report GM state:
1. GM state should be s1 if ts2phc is in holdover.
2. ts2phc should exit holdover after timeout.
3. dpll should stay PTP_NOTSET upon loss of gnss.
…te_on_gnss_loss Fix reporting of GM state
…sync Improve upstream-sync script
Replace GNU-specific sed and grep syntax with POSIX equivalents so the script works on both macOS and Linux:
- sed multi-line join -> paste -sd ',' + sed
- grep -oP (Perl lookbehind) -> grep -oE with two-stage filter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…stream_staging_main Upstream to downstream staging main
Add GNR-D T-BC documentation
Fix upstream-sync.sh macOS compatibility
Adds utils.CheckMetricSanity to prevent any metric update from being emitted with an empty process or interface name across the daemon. If the labels are missing, the update is dropped and a stack trace is logged to warn developers of the source error. Assisted-by: gemini-2.5-pro Signed-off-by: Jim Ramsay <jramsay@redhat.com>
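The kind of guard described here could look roughly like the Go sketch below. The function name `metricLabelsSane` and its signature are assumptions for illustration; the signature of `utils.CheckMetricSanity` is not shown in this conversation and may differ.

```go
package main

import (
	"log"
	"runtime/debug"
)

// metricLabelsSane reports whether a metric update may be emitted.
// An empty process or interface label causes the update to be dropped,
// and a stack trace is logged so the calling code path can be found.
func metricLabelsSane(process, iface string) bool {
	if process == "" || iface == "" {
		log.Printf("dropping metric update with empty label: process=%q iface=%q\n%s",
			process, iface, debug.Stack())
		return false
	}
	return true
}
```

A caller would wrap each metric update in `if metricLabelsSane(process, iface) { ... }` so bad labels never reach the metrics registry.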
…e_metric OCPBUGS-78552: Generic sanity check for metrics
Configure AWS timeout to 2 hours for long jobs
Two issues caused clock_class metrics to be missing after cloud-event-proxy crashed and recovered:
1. writeLogToSocket returned false when conn was nil (set by another goroutine handling a broken pipe) without attempting reconnection. Concurrent writers like EmitClockClassLogs silently lost data. Fix: call reconnectEventSocket when conn is nil, blocking until any in-progress reconnection completes.
2. clkSyncState was never populated with clock class values in T-BC/HA configurations. The clockClassRequestCh handler and UpdateClockClass set e.clockClass (EventHandler level) but not clkSyncState entries. EmitClockClass and the classTicker rely on clkSyncState for re-emission, so they had nothing to emit. Fix: store clock class in clkSyncState when received via clockClassRequestCh, via new storeClockClassLocked helper.
Signed-off-by: Jack Ding <jackding@gmail.com>
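The first fix (reconnect instead of dropping the write when the connection is nil) can be sketched in Go as follows. The `eventSocket` type, its `dial` hook, and `memConn` are hypothetical and exist only for this illustration; the daemon's `writeLogToSocket` and `reconnectEventSocket` may be organized differently.

```go
package main

import (
	"errors"
	"sync"
)

// conn is the minimal write surface this sketch needs from a socket.
type conn interface {
	Write(p []byte) (int, error)
}

// eventSocket is a hypothetical wrapper around the event socket.
type eventSocket struct {
	mu   sync.Mutex
	c    conn                 // nil after a failed write
	dial func() (conn, error) // assumed reconnect hook
}

// writeLine reconnects when the connection is nil instead of giving up.
// The mutex is held across the dial, so concurrent writers block until the
// in-progress reconnection completes and then reuse the new connection.
func (s *eventSocket) writeLine(line string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.c == nil {
		if s.dial == nil {
			return errors.New("no dial function configured")
		}
		c, err := s.dial()
		if err != nil {
			return err
		}
		s.c = c
	}
	if _, err := s.c.Write([]byte(line)); err != nil {
		s.c = nil // drop the broken connection; the next write reconnects
		return err
	}
	return nil
}

// memConn records writes in memory; it exists only to exercise the sketch.
type memConn struct{ lines []string }

func (m *memConn) Write(p []byte) (int, error) {
	m.lines = append(m.lines, string(p))
	return len(p), nil
}
```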
Fix race condition in writeLogToSocket dropping writes during reconnect
OCPBUGS-78552: Sync from upstream (24-Mar-2026)
Upstream PRs included