Missing SQL DataTypes #5

@chiradip

Parser Issues Log

Issue 2025-01-14-001: Missing SQL DataTypes

Date: 2025-01-14
Reporter: Semantic Analyzer Team
Severity: Medium
Component: AST DataType enum (include/db25/ast/node_types.hpp)

Description

The parser's DataType enum is missing several SQL data types that are commonly used in production databases. When integrating with the semantic analyzer, we found that the catalog system needs to support additional types that aren't represented in the parser's AST.

Missing Types

  1. NUMERIC - Fixed-point decimal type (often treated as an alias of DECIMAL, but the SQL standard requires NUMERIC to use exactly the declared precision, while DECIMAL may use more)
  2. BYTEA - PostgreSQL-style binary data (different from BLOB)
  3. JSON - JSON text type
  4. JSONB - Binary JSON type (PostgreSQL)
  5. UUID - Universally unique identifier
  6. SERIAL/BIGSERIAL - Auto-incrementing integer types
  7. MONEY - Currency type
  8. BIT/VARBIT - Bit string types
  9. INET/CIDR - Network address types
  10. MACADDR - MAC address type
  11. XML - XML data type

Current Workaround

The semantic analyzer currently maps these types as follows:

  • NUMERIC → Decimal
  • BYTEA → Blob
  • JSON/JSONB → Text
  • Others → Unknown
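
The fallback mapping above could be sketched roughly as follows. This is a minimal illustration, not the analyzer's actual code: the `DataType` enumerators shown here and the `map_unsupported_type` helper are assumptions, since the real enum in include/db25/ast/node_types.hpp is not reproduced in this issue.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for the parser's enum; the real definition lives in
// include/db25/ast/node_types.hpp and may contain different enumerators.
enum class DataType { Unknown, Integer, VarChar, Text, Decimal, Blob };

// Uppercase a type name so that lookups are case-insensitive.
inline std::string to_upper(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}

// Lossy fallback: collapse catalog types the AST cannot represent onto the
// nearest existing DataType, as listed in the workaround.
inline DataType map_unsupported_type(const std::string& sql_name) {
    const std::string name = to_upper(sql_name);
    if (name == "NUMERIC") return DataType::Decimal;
    if (name == "BYTEA") return DataType::Blob;
    if (name == "JSON" || name == "JSONB") return DataType::Text;
    return DataType::Unknown;  // UUID, SERIAL, MONEY, INET, MACADDR, XML, ...
}
```

Note that the mapping is one-way: once JSONB has become Text, nothing downstream can recover the original type.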

Recommendation

Consider extending the DataType enum to support these additional SQL types for better compatibility with various SQL dialects (PostgreSQL, MySQL, SQL Server, etc.).
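
One possible shape for the extended enum is sketched below. The first group of enumerators is only a guess at what already exists, the second group mirrors the missing-types list in PascalCase, and the two predicates are hypothetical helpers showing what the semantic analyzer could do once the types are distinguishable.

```cpp
#include <cstdint>

// Sketch only: the existing enumerators are assumed, not copied from
// include/db25/ast/node_types.hpp.
enum class DataType : std::uint8_t {
    // Assumed to exist already
    Unknown, Integer, VarChar, Text, Decimal, Blob,
    // Proposed additions from the missing-types list
    Numeric, Bytea, Json, Jsonb, Uuid, Serial, BigSerial,
    Money, Bit, VarBit, Inet, Cidr, MacAddr, Xml
};

// True for both JSON flavours, which currently collapse to Text.
constexpr bool is_json(DataType t) {
    return t == DataType::Json || t == DataType::Jsonb;
}

// True for the auto-incrementing integer pseudo-types.
constexpr bool is_auto_increment(DataType t) {
    return t == DataType::Serial || t == DataType::BigSerial;
}

static_assert(is_json(DataType::Jsonb), "JSONB should count as JSON");
static_assert(!is_json(DataType::Text), "plain TEXT is not JSON");
```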

Impact

  • Type-checking accuracy is reduced for these types
  • Cannot distinguish between semantically different types (e.g., JSON vs regular TEXT)
  • May affect query optimization hints in the future

Issue 2025-01-14-002: DataType Naming Convention Inconsistency

Date: 2025-01-14
Reporter: Semantic Analyzer Team
Severity: Low
Component: AST DataType enum

Description

The DataType enum uses PascalCase (e.g., Integer, VarChar), while SQL keywords are conventionally written in uppercase. This creates an inconsistency when converting between the AST representation and SQL text.

Recommendation

Consider using uppercase naming for DataType enum values to match SQL convention:

  • Integer → INTEGER
  • VarChar → VARCHAR
  • SmallInt → SMALLINT

This would align the enum with conventional SQL keyword casing and reduce conversion logic.

