Improve data cleaning heuristics#7

Merged: NeurArk merged 1 commit into main from 6dm95g-NeurArk/am-liorer-le-nettoyage-des-donn-es on May 20, 2025
@NeurArk NeurArk commented May 20, 2025

Summary

  • handle boolean normalization and date parsing more carefully
  • impute numeric values using the median
  • add heuristics-based type inference for converter
  • include messy sample dataset
  • update README and tests
  • allow sample data via .gitignore
  • fix package discovery

Testing

  • ruff check .
  • pytest -q

@NeurArk NeurArk merged commit 301bbd4 into main May 20, 2025
2 checks passed
@NeurArk NeurArk deleted the 6dm95g-NeurArk/am-liorer-le-nettoyage-des-donn-es branch May 20, 2025 10:17