Clarify details about incremental copy. #2777

Closed
rodenkew wants to merge 1 commit into MicrosoftDocs:main from rodenkew:patch-1

@rodenkew
Contributor

@rodenkew rodenkew commented Mar 9, 2026

A customer expressed confusion that rows with a NULL value in the incremental column were not being copied during a subsequent load.

This indicates that the documentation could use a bit of clarification.

This PR includes some further clarification.

This PR also moves a paragraph to keep the Incremental Copy information "together."

Thank you for contributing to Microsoft Fabric documentation

Fill out these items before submitting your pull request:

If you are working internally at Microsoft:

Bug 5026148: Docs: User expects CopyJob to copy rows with NULL Incremental Column value

Who is your primary Skilling team contact? @mention them individually and let them review the PR before signing off.

@yexu@microsoft.com

For internal Microsoft contributors, check off these quality control items as you go

  • 1. Check the Acrolinx report: Make sure your Acrolinx Total score is above 80 minimum (higher is better) and with no spelling issues. Acrolinx ensures we are providing consistent terminology and using an appropriate voice and tone, and helps with localization.

  • 2. Successful build with no warnings or suggestions: Review the build status to make sure all files are green (Succeeded).

  • 3. Preview the pages: Click each Preview URL link to view the rendered HTML pages on the review.learn.microsoft.com site to check the formatting and alignment of the page. Scan the page for overall formatting, and look at the parts you edited in detail.

  • 4. Check the Table of Contents: If you are adding a new markdown file, make sure it is linked from the table of contents.

  • 5. #sign-off to request PR review and merge: Once the pull request is finalized and ready to be merged, indicate so by typing #sign-off in a new comment in the Pull Request. If you need to cancel that sign-off, type #hold-off instead. Signing off means the document can be published at any time. Note, this is a formatting and standards review, not a technical review.

Merge and publish

  • After you #sign-off, there is a separate PR Review team that will review the PR and describe any necessary feedback before merging.
  • The review team will use the comments section in the PR to provide feedback if changes are needed. Address any blocking issues and sign off again to request another review.
  • Once all feedback is resolved, you can #sign-off again. The PR Review team reviews and merges the pull request into the specified branch (usually the main branch or a release- branch).
  • From the main branch, the change is merged into the live branch several times a day to publish it to the public learn.microsoft.com site.

@prmerger-automator
Contributor

@rodenkew : Thanks for your contribution! The author(s) and reviewer(s) have been notified to review your proposed change.

@learn-build-service-prod
Contributor

Learn Build status updates of commit ef8abb9:

✅ Validation status: passed

| File | Status | Preview URL | Details |
| --- | --- | --- | --- |
| docs/data-factory/what-is-copy-job.md | ✅ Succeeded | | |

For more details, please refer to the build report.

@rodenkew
Contributor Author

rodenkew commented Mar 9, 2026

#sign-off

@prmerger-automator
Contributor

Invalid command: '#sign-off'. Only the assigned author of one or more files in this PR can sign off. @whhender


Copilot AI left a comment

Pull request overview

Clarifies Copy job incremental copy behavior—especially around watermark-based subsequent loads—and reorganizes content so incremental-copy reset guidance sits with other incremental copy details.

Changes:

  • Expanded the “Incremental copy (CDC, Watermark)” section to describe how subsequent loads determine what to copy for watermark, CDC, and files.
  • Added an explicit note explaining why rows with NULL in the incremental (watermark) column aren’t copied in subsequent loads.
  • Moved the “Reset incremental copy” subsection to sit directly under the incremental copy content.

Typically, an incremental column holds a date/time value or an increasing number.
If your database has CDC enabled, you don’t need to choose an incremental column — Copy job automatically detects the changes.

Note that if you are using a watermark to copy incrementally from a database, subsequent loads do not copy any rows with a "null" value in that column, because the "null" value is considered _less_ than any other value.

Copilot AI Mar 9, 2026

The explanation for why rows with `NULL` values aren't copied is technically incorrect. In standard SQL, `NULL` isn't "less" than other values: a comparison such as `watermark > lastValue` evaluates to UNKNOWN when `watermark` is `NULL`, and a `WHERE` clause keeps only rows for which the predicate is true, so those rows don't match the incremental filter. Consider rephrasing to explain that subsequent loads filter on values greater than the last recorded watermark, and that rows with `NULL` in the incremental column don't satisfy that filter (also format `NULL` as code, not quoted text).

Suggested change
Note that if you are using a watermark to copy incrementally from a database, subsequent loads do not copy any rows with a "null" value in that column, because the "null" value is considered _less_ than any other value.
Note that if you use a watermark column to copy incrementally from a database, subsequent loads don't copy rows where that column is `NULL`, because those rows don't satisfy the filter that selects values greater than the last recorded watermark.
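
To make the `NULL` behavior concrete, here is a minimal sketch using SQLite from Python as a stand-in source database. The table `orders`, the column `modified_at`, and the value of `last_watermark` are hypothetical, and the query is only an assumption about the general shape of a "greater than the stored watermark" filter, not the query Copy job actually issues.

```python
import sqlite3

# Hypothetical source table; `modified_at` plays the role of the watermark column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, modified_at INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, modified_at) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, None)],  # row 3 has NULL in the watermark column
)

# Assumed to have been recorded by the previous successful run.
last_watermark = 100

# Select rows whose watermark is strictly greater than the stored value.
copied_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM orders WHERE modified_at > ? ORDER BY id",
        (last_watermark,),
    )
]

# A comparison against NULL yields NULL (unknown), which WHERE does not treat as true.
null_comparison = conn.execute("SELECT NULL > 100").fetchone()[0]

# Finding the skipped rows requires an explicit IS NULL predicate.
null_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM orders WHERE modified_at IS NULL")
]

print(copied_ids, null_comparison, null_ids)
```

Only row 2 passes the filter. Row 3 is excluded because the comparison is unknown, not because `NULL` sorts below 100.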

Copilot uses AI. Check for mistakes.
- Databases: Only new or updated rows are copied. If Change Data Capture (CDC) is enabled, inserted, updated, and deleted rows are included.
- Storage: Only files with a newer LastModifiedTime are copied.
In incremental copy, every run after the initial full copy (called a "subsequent load") transfers only certain changes. Copy job automatically tracks and manages the state of the last successful run, so it knows what data to copy next.
- When copying from a database using an incremental column ("watermark column"), each subsequent load copies only rows with a value in that column larger than any row previously copied.

Copilot AI Mar 9, 2026

This description of watermark behavior is a bit misleading: Copy job tracks the last successful run’s stored watermark value, so the rule is typically "greater than the last recorded watermark" (not "larger than any row previously copied"). Rewording will better match how the state is tracked and reduce confusion about what value is being compared against.

Suggested change
- When copying from a database using an incremental column ("watermark column"), each subsequent load copies only rows with a value in that column larger than any row previously copied.
- When copying from a database using an incremental column ("watermark column"), Copy job stores the watermark value from the last successful run and, for each subsequent load, copies only rows with a value in that column greater than this stored watermark.
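
The suggested wording can be sketched as a small in-memory model. This is an illustration under stated assumptions, not Copy job's implementation: `Row`, `subsequent_load`, and the rule that the new stored watermark is the maximum value among the copied rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Row:
    id: int
    watermark: Optional[int]  # None models a NULL in the watermark column


def subsequent_load(rows: List[Row], stored_watermark: int) -> Tuple[List[Row], int]:
    """Return the rows to copy and the watermark to store afterwards."""
    # Compare against the stored watermark; rows without a value never match.
    selected = [
        r for r in rows
        if r.watermark is not None and r.watermark > stored_watermark
    ]
    # Assumption: the stored watermark advances to the largest value copied.
    new_watermark = max((r.watermark for r in selected), default=stored_watermark)
    return selected, new_watermark


source = [Row(1, 10), Row(2, 20), Row(3, None)]

# First subsequent load, with 10 stored by the initial full copy.
copied, stored = subsequent_load(source, stored_watermark=10)

# A new row arrives; the next load compares against the stored value 20.
source.append(Row(4, 30))
copied_next, stored_next = subsequent_load(source, stored)

print([r.id for r in copied], stored, [r.id for r in copied_next], stored_next)
```

In this model the comparison is always against one stored value, so a row is picked up exactly once, and the row with no watermark value is never selected by a subsequent load.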

See more details for [Change data capture (CDC) in Copy Job](/fabric/data-factory/cdc-copy-job).

When a copy job fails, you don’t need to worry about data loss. Copy job always resumes from the state of the last successful run. A failure does not change the state managed by copy job.
If a copy job fails, you don’t need to worry about data loss. Copy job always resumes from the end of the last successful run. A failure does not change the state managed by Copy job.

Copilot AI Mar 9, 2026

The sentence says the job "resumes from the end of the last successful run", but earlier/elsewhere the concept is the stored state of the last successful run (watermark/CDC checkpoint). "End" can read like row-level continuation and could be inaccurate. Suggest aligning the wording to "resumes from the state/checkpoint of the last successful run" and also keeping product naming consistent ("Copy job" vs "Copy Job").
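
The difference between "state of the last successful run" and "end of the last successful run" can be shown with a toy model in which the stored state advances only after a write succeeds. Everything here is hypothetical (`run_once`, `failing_write`, and the dictionary used as state); it is a sketch of the general checkpointing idea, not of Copy job internals.

```python
def run_once(state, values, write):
    """Copy values above the stored watermark; update state only on success."""
    pending = sorted(v for v in values if v > state["watermark"])
    write(pending)  # raises on failure, so the state update below is skipped
    if pending:
        state["watermark"] = pending[-1]
    return pending


def failing_write(batch):
    # Simulates a destination outage during the copy.
    raise IOError("simulated destination outage")


destination = []
state = {"watermark": 10}  # state recorded by the last successful run
values = [5, 10, 15, 20]   # watermark values present in the source

# A failed run leaves the stored state untouched.
try:
    run_once(state, values, failing_write)
except IOError:
    pass
watermark_after_failure = state["watermark"]

# The retry starts from the same stored state and copies the same rows.
retried = run_once(state, values, destination.extend)

print(watermark_after_failure, retried, state["watermark"])
```

Because the retry re-reads everything above the stored watermark, nothing from the failed run is skipped, which is the property the "state/checkpoint" wording describes.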


### Reset incremental copy

You have the flexibility in managing incremental copy, including the ability to reset it back to a full copy on the next run. This is incredibly useful when there’s a data discrepancy between your source and destination—you can simply let Copy Job perform a full copy in the next run to resolve the issue, then continue with incremental updates afterward.

Copilot AI Mar 9, 2026

Minor style/consistency issues in this paragraph: add spaces around the em dash (the file uses spaced em dashes elsewhere), and use consistent product casing ("Copy job" vs "Copy Job"). Also consider softening promotional phrasing like "incredibly useful" to keep the tone more neutral and instructional.

@ttorble
Contributor

ttorble commented Mar 9, 2026

@rodenkew did you intend to open this PR in the private repo? It looks like you have access to the private repo, and if so, you should use that for content updates. I'll leave this PR open in case @whhender wants to review it here. If you decide to open a new PR in the private repo, please close this PR. Thanks.

Can you review the proposed changes?

IMPORTANT: When the changes are ready for publication, adding a #sign-off comment is the best way to signal that the PR is ready for the review team to merge.

#label:"aq-pr-triaged"
@MicrosoftDocs/public-repo-pr-review-team

@github-actions

This pull request has been inactive for 14 days, and an inactive label has been added. If you are finished with your changes don't forget to sign off in the comments of the pull request to request a review and merge. If you want to continue working without merging, simply add a comment about why you want to keep this PR open. If this PR is inactive with no comments for 14 more days, it will be closed automatically.
Thank you!
Microsoft Fabric Docs team
PS: Mention us in the comments @MicrosoftDocs/fabric-docs-team if you need assistance.

@github-actions

github-actions bot commented Apr 6, 2026

This pull request has been inactive for 28 days, and an auto-close label has been added. At this time, the system is closing the PR automatically. If you decide to continue working on your changes, that's no problem. At the bottom of the pull request, simply select the Reopen pull request button.
Thank you!
Microsoft Docs team
PS: Mention us in the comments @MicrosoftDocs/fabric-docs-team if you need assistance.

4 participants