Replace page classifier with dit, add -fpt flag #2404
Merged
Proposed changes
Replace the built-in Naive Bayes page classifier with dit (20 page types, 8 form types, 79 field types). Add the -fpt/-filter-page-type flag for filtering by any page type(s). Deprecate -fep as an alias for -fpt error.
- Replace common/pagetypeclassifier/ with dit.Classifier
- Add -fpt flag (e.g. -fpt login,captcha,parked)
- Deprecate -fep with info message
- Forms with form type and field classifications
Closes #2403
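For reviewers, the flag semantics described in this PR (a comma-separated list of page types, case-insensitive matching, and -fep acting as an alias for -fpt error) can be sketched roughly as below. This is an illustration only, not the code in this PR: the helper names parsePageTypeFilter, matchesFilter and resolveFilter and the deprecation message text are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePageTypeFilter turns a comma-separated flag value such as
// "login,Captcha" into a set of lowercase page types.
// Hypothetical helper; the real implementation may differ.
func parsePageTypeFilter(value string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, part := range strings.Split(value, ",") {
		name := strings.ToLower(strings.TrimSpace(part))
		if name != "" {
			set[name] = struct{}{}
		}
	}
	return set
}

// matchesFilter reports whether a classified page type is in the
// filter set, ignoring case.
func matchesFilter(filter map[string]struct{}, pageType string) bool {
	_, ok := filter[strings.ToLower(pageType)]
	return ok
}

// resolveFilter merges the -fpt value with the deprecated -fep switch.
// When fep is set, "error" is added to the set and a notice is returned.
// The notice wording here is an assumption, not the PR's actual message.
func resolveFilter(fpt string, fep bool) (map[string]struct{}, string) {
	filter := parsePageTypeFilter(fpt)
	notice := ""
	if fep {
		filter["error"] = struct{}{}
		notice = "-fep is deprecated, use -fpt error instead"
	}
	return filter, notice
}

func main() {
	filter, notice := resolveFilter("Login,captcha", true)
	if notice != "" {
		fmt.Println(notice)
	}
	for _, pageType := range []string{"login", "ERROR", "parked"} {
		fmt.Printf("%s matches=%v\n", pageType, matchesFilter(filter, pageType))
	}
}
```

Normalizing to lowercase once at parse time keeps the per-response check a single map lookup, and folding -fep into the same set means both flags share one code path.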
Proof
- httpx -u https://github.com/login -json: KnowledgeBase shows PageType: login + Forms
- -fpt login filters login pages, -fpt error filters error pages
- -fpt login,error filters multiple types, case-insensitive
- -fep backward compat filters error pages + shows deprecation message
- go build ./... and go test ./... pass
Checklist