Parser Issues Log
Issue 2025-01-14-001: Missing SQL DataTypes
Date: 2025-01-14
Reporter: Semantic Analyzer Team
Severity: Medium
Component: AST DataType enum (include/db25/ast/node_types.hpp)
Description
The parser's DataType enum is missing several SQL data types that are commonly used in production databases. When integrating with the semantic analyzer, we found that the catalog system needs to support additional types that aren't represented in the parser's AST.
Missing Types
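- NUMERIC - Fixed-point decimal type (often aliased to DECIMAL, but semantically distinct: the SQL standard makes NUMERIC precision exact, while DECIMAL may carry more precision than declared)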
- BYTEA - PostgreSQL-style binary data (different from BLOB)
- JSON - JSON text type
- JSONB - Binary JSON type (PostgreSQL)
- UUID - Universally unique identifier
- SERIAL/BIGSERIAL - Auto-incrementing integer types
- MONEY - Currency type
- BIT/VARBIT - Bit string types
- INET/CIDR - Network address types
- MACADDR - MAC address type
- XML - XML data type
Current Workaround
The semantic analyzer currently maps these types as follows:
- NUMERIC → Decimal
- BYTEA → Blob
- JSON/JSONB → Text
- Others → Unknown
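The workaround above amounts to a small lookup, sketched here under assumptions. The `DataType` enum shown is a hypothetical stand-in for the one in include/db25/ast/node_types.hpp (only `Decimal`, `Blob`, `Text`, and `Unknown` are named in this report), and `map_catalog_type` is an illustrative name, not the analyzer's actual function.

```cpp
#include <string_view>

// Hypothetical stand-in for the parser's enum; the real definition in
// include/db25/ast/node_types.hpp may have different members.
enum class DataType { Unknown, Integer, VarChar, Text, Decimal, Blob };

// Illustrative helper (name is an assumption): maps an uppercase catalog
// type name to the closest DataType available today. Only the types missing
// from the enum are handled here; everything else falls through to Unknown.
constexpr DataType map_catalog_type(std::string_view name) {
    if (name == "NUMERIC") return DataType::Decimal;
    if (name == "BYTEA") return DataType::Blob;
    if (name == "JSON" || name == "JSONB") return DataType::Text;
    return DataType::Unknown;  // UUID, MONEY, INET, XML, ...
}
```

With this mapping, `map_catalog_type("JSONB")` and a plain TEXT column both end up as `DataType::Text`, which is the loss of information described under Impact.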
Recommendation
Consider extending the DataType enum to support these additional SQL types for better compatibility with various SQL dialects (PostgreSQL, MySQL, SQL Server, etc.).
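One possible shape for the extension is sketched below. The existing enumerators, the new enumerator names, and the `is_json_type` helper are all assumptions for illustration; the actual enum layout and naming are up to the parser maintainers.

```cpp
#include <cstdint>

// Hypothetical extended enum. The first row stands in for existing members;
// the rest are the additions proposed in this issue.
enum class DataType : std::uint8_t {
    Unknown, Integer, VarChar, Text, Decimal, Blob,
    // Proposed additions (names are assumptions):
    Numeric, Bytea, Json, Jsonb, Uuid,
    Serial, BigSerial, Money, Bit, VarBit,
    Inet, Cidr, MacAddr, Xml,
};

// With distinct enumerators the analyzer could tell JSON columns apart
// from plain text instead of collapsing both into Text.
constexpr bool is_json_type(DataType type) {
    return type == DataType::Json || type == DataType::Jsonb;
}
```

A check such as `is_json_type(column_type)` is not possible under the current workaround, where JSON and JSONB are both mapped to `Text`.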
Impact
- Type-checking accuracy is reduced for these types
- Cannot distinguish between semantically different types (e.g., JSON vs regular TEXT)
- May affect query optimization hints in the future
Issue 2025-01-14-002: DataType Naming Convention Inconsistency
Date: 2025-01-14
Reporter: Semantic Analyzer Team
Severity: Low
Component: AST DataType enum
Description
The DataType enum uses PascalCase names (e.g., Integer, VarChar), while SQL keywords are conventionally written in uppercase. Converting between the AST representation and SQL text therefore requires an explicit name mapping.
Recommendation
Consider using uppercase naming for DataType enum values to match SQL convention:
- Integer → INTEGER
- VarChar → VARCHAR
- SmallInt → SMALLINT
This would make the code more consistent with common SQL style and reduce conversion logic.
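For context, the conversion logic in question looks roughly like the sketch below. The enum members are limited to the three named in this issue plus `Unknown`, and `to_sql_keyword` is a hypothetical helper name, not an existing parser API.

```cpp
#include <string_view>

// Hypothetical subset of the parser's enum, using the current PascalCase names.
enum class DataType { Unknown, Integer, VarChar, SmallInt };

// Illustrative helper (name is an assumption): returns the uppercase SQL
// keyword for a DataType. Each enumerator needs its own case because the
// PascalCase name cannot be emitted as SQL text directly.
constexpr std::string_view to_sql_keyword(DataType type) {
    switch (type) {
        case DataType::Integer:  return "INTEGER";
        case DataType::VarChar:  return "VARCHAR";
        case DataType::SmallInt: return "SMALLINT";
        case DataType::Unknown:  break;
    }
    return "UNKNOWN";
}
```

Renaming the enumerators does not remove this table by itself, since C++ has no built-in enum-to-string conversion, but it would let the enumerator name and the emitted keyword match one-to-one, for example when the table is generated by a macro.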