Review and address ignored ruff lint rules for enhanced code quality #99

@jonpspri

Overview

During the expansion of ruff lint rules in pyproject.toml, we added comprehensive rule coverage but had to ignore 12 rule categories, plus the single rule TID252, because they currently conflict with DataBeak's codebase patterns. This issue tracks the systematic review and resolution of these ignored rules so that the ignore list shrinks over time.

Currently Ignored Rules

Documentation Rules

  • D (pydocstyle) - Docstring formatting and completeness requirements
  • DOC (pydoclint) - Advanced docstring linting and validation

Code Quality Rules

  • C90 (mccabe complexity) - Function complexity limits
  • PL (Pylint) - Comprehensive static analysis rules
  • TRY (tryceratops) - Exception handling best practices

Optimization Rules

  • PERF (Perflint) - Performance optimization suggestions
  • FLY (flynt) - f-string conversion recommendations
  • FURB (refurb) - Code modernization suggestions

Library-Specific Rules

  • NPY (NumPy-specific rules) - NumPy best practices
  • PD (pandas-vet) - pandas usage conventions and best practices
  • PGH (pygrep-hooks) - Pattern-based code analysis

Import/Type Rules

  • TID252 (flake8-tidy-imports) - Absolute vs relative import preferences
  • TC (flake8-type-checking) - TYPE_CHECKING import organization

Implementation Strategy

Phase 1: Documentation Enhancement (Low Risk)

  • Review D (pydocstyle) violations and add missing docstrings
  • Assess DOC (pydoclint) requirements and improve documentation quality
  • Goal: Better API documentation and developer experience
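
As an illustration of what Phase 1 asks for, here is a function documented in a way that should satisfy the usual D and DOC checks (summary line, documented arguments, return value, and raised exception). The function itself is hypothetical and not taken from DataBeak, and the Google docstring convention is an assumption; the project may prefer NumPy style instead.

```python
def parse_bool(text: str) -> bool:
    """Convert a textual flag to a boolean.

    Args:
        text: A spelling such as "true", "yes", "0", or "no".
            Case and surrounding whitespace are ignored.

    Returns:
        True for truthy spellings, False for falsy ones.

    Raises:
        ValueError: If the text is not a recognised spelling.
    """
    normalized = text.strip().lower()
    if normalized in {"true", "yes", "1"}:
        return True
    if normalized in {"false", "no", "0"}:
        return False
    raise ValueError(normalized)
```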

Phase 2: Code Quality Improvements (Medium Risk)

  • Analyze C90 (mccabe complexity) violations and refactor complex functions
  • Review PL (Pylint) suggestions for code quality improvements
  • Implement TRY (tryceratops) exception handling improvements
  • Goal: More maintainable and robust error handling
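
A small sketch of the exception-handling style that the TRY rules tend to push toward: a dedicated exception class that builds its own message (TRY002/TRY003), and the success path moved into an `else` block (TRY300). The names `ColumnNotFoundError` and `column_mean` are hypothetical and only serve the illustration.

```python
class ColumnNotFoundError(KeyError):
    """Raised when a requested column is missing (hypothetical example)."""

    def __init__(self, column: str) -> None:
        # The message is assembled inside the class, not at the raise site.
        super().__init__(f"Column not found: {column}")
        self.column = column


def column_mean(rows: list[dict[str, float]], column: str) -> float:
    """Return the mean of one column; assumes `rows` is non-empty."""
    try:
        values = [row[column] for row in rows]
    except KeyError as err:
        # Chain the original error so the traceback keeps its context.
        raise ColumnNotFoundError(column) from err
    else:
        # Only runs when the lookup succeeded.
        return sum(values) / len(values)
```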

Phase 3: Performance Optimization (Medium Risk)

  • Evaluate PERF (Perflint) suggestions for actual performance benefits
  • Apply FLY (flynt) f-string conversions where appropriate
  • Consider FURB (refurb) modernization suggestions
  • Goal: Better performance and modern Python patterns
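
To make the Phase 3 rules concrete, here is a before/after pair showing the kind of rewrite PERF401 (manual list building) and FLY002 (static `str.join`) suggest. The function is a made-up example, not DataBeak code, and whether such a change is a measurable win would still need to be checked per case.

```python
def summarize_before(name: str, values: list[float]) -> str:
    """Original style: loop-and-append plus a static join."""
    kept = []
    for value in values:
        if value >= 0:
            kept.append(round(value, 2))  # PERF401 would flag this loop
    return " ".join([name, str(kept)])  # FLY002 would flag this join


def summarize_after(name: str, values: list[float]) -> str:
    """Same behaviour with a comprehension and an f-string."""
    kept = [round(value, 2) for value in values if value >= 0]
    return f"{name} {kept}"
```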

Phase 4: Library-Specific Enhancements (Low-Medium Risk)

  • Implement relevant NPY (NumPy) best practices for statistical operations
  • Apply beneficial PD (pandas-vet) optimizations for DataFrame operations
  • Review PGH (pygrep-hooks) patterns for security and correctness
  • Goal: Optimized library usage and better practices
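
For the PGH part of this phase, the most common findings are blanket suppressions such as a bare `# noqa` or `# type: ignore`. The script below is a rough approximation of that idea using two regular expressions; it is not how ruff implements these checks, and `find_blanket_suppressions` is a hypothetical helper for sizing up the cleanup before enabling the rules.

```python
import re

# A bare "# noqa" (no ": CODE" after it) and a bare "# type: ignore"
# (no "[code]" after it). These patterns are simplifications.
BLANKET_NOQA = re.compile(r"#\s*noqa(?!\s*:)")
BLANKET_TYPE_IGNORE = re.compile(r"#\s*type:\s*ignore(?!\[)")


def find_blanket_suppressions(source: str) -> list[tuple[int, str]]:
    """Return (line number, kind) for each blanket suppression comment."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if BLANKET_NOQA.search(line):
            findings.append((lineno, "noqa"))
        if BLANKET_TYPE_IGNORE.search(line):
            findings.append((lineno, "type-ignore"))
    return findings
```

The preferred form names the specific code being suppressed, for example `# noqa: E501` or `# type: ignore[attr-defined]`.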

Phase 5: Import Organization (Low Risk)

  • Evaluate TID252 import style preferences vs DataBeak patterns
  • Implement TC (flake8-type-checking) if beneficial for performance
  • Goal: Cleaner import organization and potentially faster imports
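
The TC rules are about moving imports that are only needed for annotations under `TYPE_CHECKING`, so they are skipped at runtime. A minimal sketch follows; `describe` is a hypothetical function, and `decimal` stands in for whatever heavier module DataBeak might only need for typing.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker only; never imported when the module runs.
    from decimal import Decimal


def describe(value: "Decimal") -> str:
    """Format a number with two decimal places."""
    # The quoted annotation means Decimal is not looked up at runtime.
    return f"{value:.2f}"
```

TID252 is a separate question of style: it flags relative imports (`from . import x`) in favour of absolute ones, so the decision there depends on which form DataBeak wants to standardise on.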

Implementation Guidelines

Priority Assessment

  1. High Priority: Rules that improve security, reliability, or maintainability
  2. Medium Priority: Rules that enhance performance or developer experience
  3. Low Priority: Style-only rules that don't affect functionality

Evaluation Criteria

  • Compatibility: Does the rule conflict with DataBeak's architecture?
  • Value: Does fixing violations provide meaningful benefit?
  • Effort: Is the fix proportional to the improvement gained?
  • Risk: Could changes introduce bugs or break existing functionality?

Implementation Process

  1. Small batches: Address 1-2 rule categories per PR
  2. Test thoroughly: Ensure no regressions in functionality
  3. Document decisions: Record why rules are kept ignored vs implemented
  4. Gradual rollout: Enable rules incrementally as violations are fixed
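
To pick batches and track progress, it helps to know how many violations each ignored selector currently produces. The sketch below tallies them from a JSON report, assuming the report is a list of objects with a `code` field, as produced by something like `ruff check --select D,DOC --output-format json`. The helper name and the longest-prefix matching are my own choices, and the matching is only reliable when the report was generated with just these selectors enabled.

```python
import json
from collections import Counter

IGNORED = ("D", "DOC", "C90", "PL", "TRY", "PERF", "FLY", "FURB",
           "NPY", "PD", "PGH", "TID252", "TC")


def count_by_selector(report_json: str,
                      selectors: tuple[str, ...] = IGNORED) -> Counter:
    """Count diagnostics per selector, using the longest matching prefix."""
    counts: Counter = Counter()
    for diagnostic in json.loads(report_json):
        code = diagnostic.get("code") or ""  # code can be null
        matches = [s for s in selectors if code.startswith(s)]
        # "DOC201" matches both "D" and "DOC"; the longest prefix wins.
        counts[max(matches, key=len) if matches else "other"] += 1
    return counts
```

Sorting the result by count gives a rough ordering for the "small batches" step: selectors with few violations can be enabled first.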

Success Metrics

  • Reduced ignore list: Fewer ignored rule categories over time
  • Maintained quality: Zero new violations in enabled rule categories
  • Better code: Measurable improvements in maintainability/performance
  • Clear documentation: Comprehensive API documentation
  • Team productivity: Easier onboarding and development experience

Related Context

This issue was created following the implementation of:

The enhanced rule set provides an opportunity to further improve DataBeak's already high code quality standards while maintaining the clean architecture achieved in recent PRs.

Timeline

Target: Address 2-3 rule categories per quarter to gradually improve code quality without overwhelming development cycles.

Next Steps: Start with Phase 1 (Documentation) as it has the lowest risk and highest developer experience impact.
