[fix][io] Fix loading offset issues in Kafka Adaptor #31

Open
kasparjarek wants to merge 1 commit into apache:master from kasparjarek:fix/kafka-connect-adaptor-offset-converter
@kasparjarek kasparjarek commented May 27, 2026


Fixes #30

The Pulsar Kafka Connect adaptor previously used the same converters for both data and offset storage, which could cause various issues. For example, when the data used AvroConverter, offsets were serialized using a MockSchemaRegistryClient (in-memory only). After a connector restart, the fresh MockSchemaRegistryClient had no schema records, so deserialization failed with "Subject Not Found; error code: 40401" and the connector lost its offset position.

Kafka Connect does not reuse the data converters for offsets; it creates new JSON converters configured with schemas.enable set to false. The adaptor was changed to behave the same way.
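The failure mode and the fix described above can be sketched with a small, stdlib-only simulation. This is not the adaptor's code: `InMemoryRegistry`, `RegistryBackedOffsetConverter`, `SchemalessOffsetConverter`, and `OffsetRestartSketch` are hypothetical names that stand in, respectively, for a mock schema registry client, a registry-dependent converter such as the Avro one, a schemaless JSON converter, and a connector restart. The byte formats are simplified assumptions chosen only to make the dependency on registry state visible.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an in-memory (mock) schema registry. */
class InMemoryRegistry {
    private final Map<Integer, String> schemasById = new HashMap<>();
    private int nextId = 1;

    int register(String schema) {
        int id = nextId++;
        schemasById.put(id, schema);
        return id;
    }

    String lookup(int id) {
        String schema = schemasById.get(id);
        if (schema == null) {
            throw new IllegalStateException("schema id " + id + " not found");
        }
        return schema;
    }
}

/** Hypothetical converter: stored bytes reference a schema id held by the registry. */
class RegistryBackedOffsetConverter {
    private final InMemoryRegistry registry;

    RegistryBackedOffsetConverter(InMemoryRegistry registry) {
        this.registry = registry;
    }

    byte[] serialize(long offset) {
        int id = registry.register("long");
        return (id + ":" + offset).getBytes(StandardCharsets.UTF_8);
    }

    long deserialize(byte[] data) {
        String[] parts = new String(data, StandardCharsets.UTF_8).split(":", 2);
        registry.lookup(Integer.parseInt(parts[0])); // fails if the registry was recreated
        return Long.parseLong(parts[1]);
    }
}

/** Hypothetical schemaless converter: stored bytes are self-contained. */
class SchemalessOffsetConverter {
    byte[] serialize(long offset) {
        return ("{\"offset\":" + offset + "}").getBytes(StandardCharsets.UTF_8);
    }

    long deserialize(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return Long.parseLong(text.substring(text.indexOf(':') + 1, text.length() - 1));
    }
}

class OffsetRestartSketch {
    /** Writes an offset, then reads it back with a fresh registry (simulated restart). */
    static boolean survivesRestartWithRegistry(long offset) {
        byte[] stored = new RegistryBackedOffsetConverter(new InMemoryRegistry()).serialize(offset);
        try {
            new RegistryBackedOffsetConverter(new InMemoryRegistry()).deserialize(stored);
            return true;
        } catch (IllegalStateException e) {
            return false; // the schema record was lost with the old registry
        }
    }

    /** Writes an offset, then reads it back with a new schemaless converter instance. */
    static long restoreSchemaless(long offset) {
        byte[] stored = new SchemalessOffsetConverter().serialize(offset);
        return new SchemalessOffsetConverter().deserialize(stored);
    }

    public static void main(String[] args) {
        System.out.println("registry-backed survives restart: " + survivesRestartWithRegistry(42L));
        System.out.println("schemaless restored offset: " + restoreSchemaless(42L));
    }
}
```

In this sketch the registry-backed offset cannot be read after the simulated restart because the bytes only make sense together with registry state that no longer exists, while the schemaless bytes carry everything needed to decode them.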

This is a breaking change for connectors that previously stored offsets in a non-JSON format: those offsets may not be readable after the upgrade. They probably weren't readable even before this fix (as the Avro converter case shows). But since we cannot be sure which converters users used and how they behaved, this change should probably be included in a major release.

Note: The tests for the adaptor currently do not compile. We are waiting for a matching Pulsar artifact to be published. The fix was verified E2E by manual testing.

…ect Adaptor

The Pulsar Kafka Connect adaptor previously used the same converters for both data and offset storage, which could cause various issues. For example, when the data used AvroConverter, offsets were serialized using a MockSchemaRegistryClient (in-memory only). After a connector restart, the fresh MockSchemaRegistryClient had no schema records, so deserialization failed with "Subject Not Found; error code: 40401" and the connector lost its offset position.

Kafka Connect does not reuse the data converters for offsets; it creates new JSON converters configured with `schemas.enable` set to `false`. The adaptor was changed to behave the same way.

This is a breaking change for connectors that previously stored offsets in a non-JSON format: those offsets may not be readable after the upgrade. They probably weren't readable even before this fix (as the Avro converter case shows). But since we cannot be sure which converters users used and how they behaved, this change should probably be included in a major release.

Note: The tests for the adaptor currently do not compile. We are waiting for a matching Pulsar artifact to be published. The fix was verified E2E by manual testing.

Fixes apache#30

lhotari commented May 27, 2026

Member

> Note: The tests for adaptor currently do not compile. We are waiting for matching pulsar artifact to be published. The fix was verified E2E by manual testing.

What type of problem are you facing with the tests?

@kasparjarek

Author

> What type of problem are you facing with the tests?

@lhotari the tests for the adaptor are currently disabled (code link) with a comment:

// KCA tests depend on pulsar-broker internals that have changed since the
// last released version. Tests will compile once matching pulsar artifacts
// are published. Skip for now to unblock CI.

When I try to run them, the build fails with errors like "class file for org.apache.pulsar.tests.TestRetrySupport not found".

Development

Successfully merging this pull request may close these issues.

[Bug] Kafka Connect Adaptor cannot load offsets when using non-JSON converters