Describe the bug
The error is:
za.co.absa.standardization.ValidationException: A fatal schema validation error occurred. Due to errors: Validation error for column 'DateTime', pattern 'yyyy-MM-dd'T'HH:mm:ss.SSSXXX': For input string: ""
za.co.absa.standardization.Standardization$.validateSchemaAgainstSelfInconsistencies(Standardization.scala:70)
za.co.absa.standardization.Standardization$.standardize(Standardization.scala:43)
Pattern: yyyy-MM-dd'T'HH:mm:ss.SSSXXX
Example value: 2025-07-18T16:07:51.569+02:00
The value can be parsed correctly with both java.time and SimpleDateFormat:
import java.text.SimpleDateFormat
import java.util.Date
val strValue = "2025-07-18T16:07:51.569+02:00"
val formatter = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX")
val date: Date = formatter.parse(strValue) // Outputs: date: java.util.Date = Fri Jul 18 16:07:51 CEST 2025
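For completeness, the java.time side of that claim can be checked in the same way. This is a minimal sketch using the same pattern and value; the object name JavaTimeCheck is only for illustration.

```scala
import java.time.OffsetDateTime
import java.time.format.DateTimeFormatter

// Illustrative wrapper object; the name is not part of any library.
object JavaTimeCheck {
  val pattern = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
  val strValue = "2025-07-18T16:07:51.569+02:00"

  // DateTimeFormatter accepts the same pattern letters used above:
  // SSS is a three-digit fraction of a second, XXX is an offset such as +02:00.
  val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern(pattern)

  // Parses without an exception and keeps both the milliseconds and the offset.
  val parsed: OffsetDateTime = OffsetDateTime.parse(strValue, formatter)
}
```

Here JavaTimeCheck.parsed.getOffset is +02:00 and JavaTimeCheck.parsed.getNano is 569000000, so neither the fraction nor the time zone is lost.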
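I was able to trace the execution to the method private def extractSeconds(value: String): Long in
spark-data-standardization/src/main/scala/za/co/absa/standardization/types/parsers/DateTimeParser.scala
(line 97 in a13a54d).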
The Standardization library tries to:
- Check whether the pattern contains a fraction of seconds.
- If it does, strip the fraction of seconds from both the pattern and the string value.
- However, the stripping is done incorrectly when the pattern also contains a time zone.
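To make the intended stripping step concrete, here is a rough sketch of one way to separate the fraction without relying on its position relative to the end of the string. It is not the library's code and not a proposed patch: FractionSplitSketch and split are hypothetical names, and the sketch assumes the fraction is introduced by the first '.' in the value, so it would not suit patterns such as dd.MM.yyyy.

```scala
// Hypothetical helper for illustration only; not part of the Standardization library.
object FractionSplitSketch {
  // Assumption: the first '.' followed by digits in the value is the fraction of seconds.
  private val FractionInValue = """\.(\d+)""".r
  // Assumption: the fraction appears in the pattern as '.' followed by a run of 'S'.
  private val FractionInPattern = """\.S+""".r

  /** Returns (pattern without fraction, value without fraction, fraction digits if present). */
  def split(pattern: String, value: String): (String, String, Option[String]) = {
    val fraction = FractionInValue.findFirstMatchIn(value).map(_.group(1))
    val strippedPattern = FractionInPattern.replaceFirstIn(pattern, "")
    val strippedValue = FractionInValue.replaceFirstIn(value, "")
    (strippedPattern, strippedValue, fraction)
  }
}
```

For the pattern and value from this report, split yields the pattern yyyy-MM-dd'T'HH:mm:ssXXX, the value 2025-07-18T16:07:51+02:00, and the fraction 569, so the time zone suffix stays attached to the value. Note that XXX is three pattern characters while +02:00 is six value characters, which is why any approach that aligns pattern and value by character position is fragile here.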
To Reproduce
Steps to reproduce the behavior OR commands run:
- Create a CSV file with a timestamp column whose values look like the example value above.
- Try to convert the timestamp string field to the timestamp type using the above pattern.
You can use this unit test to reproduce locally:
test("datetime") {
  val parser = DateTimeParser("yyyy-MM-dd'T'HH:mm:ss.SSSXXX")
  val str = "2025-07-18T16:07:51.569+02:00"
  val result = parser.parseTimestamp(str)
  println(result)
}
Expected behavior
The pattern should be accepted, and the example value should be parsed without a ValidationException.
Business Value
--
Screenshots
--
Additional context
--