
Mod04 bias assignment #11

Open
Kingk1342 wants to merge 4 commits into batmandoescalc:main from Kingk1342:mod04_Bias_assignment

Conversation

@Kingk1342

This was completed

Copilot AI review requested due to automatic review settings February 26, 2026 00:18

Copilot AI left a comment


Pull request overview

This PR completes a Module 04 bias assignment by enhancing a bot prediction model with threshold tuning capabilities and adding comprehensive analysis. The changes include hyperparameter optimization of the GradientBoostingClassifier, implementation of a customizable prediction threshold to balance false positives and false negatives, and thoughtful discussion of the model's bias implications.

Changes:

  • Enhanced predict_bot function with optional threshold parameter for custom decision boundaries
  • Added threshold optimization code to find the optimal cutoff that minimizes misclassification rate
  • Tuned GradientBoostingClassifier hyperparameters for better performance (increased n_estimators, reduced learning rate, added early stopping)
  • Completed discussion questions analyzing model confidence, false positive ramifications, and false negative implications
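The threshold-tuning idea summarized above can be sketched in plain Python. This is an illustrative reconstruction with made-up probabilities, not the PR's actual code: the real `predict_bot` wraps a trained GradientBoostingClassifier, and the helper names `apply_threshold` and `best_threshold` are hypothetical.

```python
# Sketch of custom-threshold prediction and threshold search (hypothetical
# helpers; the PR's predict_bot operates on a fitted classifier's probabilities).

def apply_threshold(probs, threshold=0.5):
    """Label a sample as a bot (1) when its predicted probability meets the cutoff."""
    return [1 if p >= threshold else 0 for p in probs]

def best_threshold(probs, labels, candidates):
    """Return the candidate cutoff that minimizes the misclassification rate."""
    def misclass_rate(t):
        preds = apply_threshold(probs, t)
        return sum(p != y for p, y in zip(preds, labels)) / len(labels)
    return min(candidates, key=misclass_rate)

# Toy probabilities and ground-truth labels (illustrative only)
probs = [0.10, 0.40, 0.55, 0.60, 0.90]
labels = [0, 0, 0, 1, 1]
t = best_threshold(probs, labels, [i / 100 for i in range(30, 71)])
```

On this toy data the search lands between the highest human score and the lowest bot score, which is the same kind of cutoff the PR's optimization step selects.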

Reviewed changes

Copilot reviewed 2 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| mod02_test_bot_predictor.ipynb | Added threshold parameter to the prediction function, implemented threshold search optimization, added model execution outputs, and completed all discussion questions with detailed answers |
| mod02_build_bot_predictor.py | Updated GradientBoostingClassifier hyperparameters with more conservative settings, including early stopping and validation monitoring |
| .gitignore | Added .venv/ directory to exclude the virtual environment from version control |
| `__pycache__/mod02_build_bot_predictor.cpython-313.pyc` | Binary compiled Python cache file (should not be committed) |
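For context on the "more conservative settings" mentioned for mod02_build_bot_predictor.py: scikit-learn's GradientBoostingClassifier supports early stopping via `n_iter_no_change` together with `validation_fraction`. The specific values below are placeholders consistent with the description (more estimators, lower learning rate, early stopping), not the PR's actual numbers.

```python
# Hypothetical conservative configuration; the PR's exact values may differ.
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(
    n_estimators=500,         # allow many trees...
    learning_rate=0.05,       # ...each contributing a smaller step
    n_iter_no_change=10,      # stop early when the validation score stalls
    validation_fraction=0.1,  # hold out 10% of training data for that check
    random_state=42,
)
```

With early stopping enabled, the fitted attribute `n_estimators_` reports how many boosting stages were actually used, which is often well below the `n_estimators` ceiling.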


"outputs": [],
"source": [
"y_pred_train = predict_bot(X_train, model)\n",
"y_pred_test = predict_bot(X_test, model, threshold=0.57)"

Copilot AI Feb 26, 2026


The threshold used here (0.57) does not match the best threshold found by the optimization (0.56 shown in the output at line 180). This inconsistency means the predictions are not using the optimal threshold that minimizes misclassification rate. Update this to use threshold=0.56 to match the optimization result.

Suggested change:

```diff
-"y_pred_test = predict_bot(X_test, model, threshold=0.57)"
+"y_pred_test = predict_bot(X_test, model, threshold=0.56)"
```

"id": "54e546c7",
"metadata": {},
"source": [
"With a test misclassification rate of about 10.8%, the model is correct about 9 out of 10 times, so there is moderate confidence in its ability to flag bots. The low false positive rate (~0.8%) means real users are rarely labeled as bots, which helps trust and support. The high false negative rate (~80%) means most bots are still missed, so the model is better at avoiding wrong accusations of humans than at catching every bot. For low-stakes filtering or triage this may be acceptable; for strict moderation or security, the high miss rate on bots would limit how much to rely on it alone.Type your answer here."

Copilot AI Feb 26, 2026


Remove the placeholder text "Type your answer here." at the end of this markdown cell. The answer has been provided, so this remnant text should be deleted.

Suggested change:

```diff
-"With a test misclassification rate of about 10.8%, the model is correct about 9 out of 10 times, so there is moderate confidence in its ability to flag bots. The low false positive rate (~0.8%) means real users are rarely labeled as bots, which helps trust and support. The high false negative rate (~80%) means most bots are still missed, so the model is better at avoiding wrong accusations of humans than at catching every bot. For low-stakes filtering or triage this may be acceptable; for strict moderation or security, the high miss rate on bots would limit how much to rely on it alone.Type your answer here."
+"With a test misclassification rate of about 10.8%, the model is correct about 9 out of 10 times, so there is moderate confidence in its ability to flag bots. The low false positive rate (~0.8%) means real users are rarely labeled as bots, which helps trust and support. The high false negative rate (~80%) means most bots are still missed, so the model is better at avoiding wrong accusations of humans than at catching every bot. For low-stakes filtering or triage this may be acceptable; for strict moderation or security, the high miss rate on bots would limit how much to rely on it alone."
```

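The three rates quoted in this answer use different denominators: the misclassification rate is over all samples, the false positive rate over real users, and the false negative rate over bots. That is how a ~0.8% false positive rate can coexist with a ~80% false negative rate when bots are a small minority of the data. A small sketch with illustrative numbers (not the PR's actual counts):

```python
def error_rates(y_true, y_pred):
    """Return (misclassification, false-positive, false-negative) rates; 1 = bot."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    humans = sum(1 for t in y_true if t == 0)
    bots = sum(1 for t in y_true if t == 1)
    return (fp + fn) / len(y_true), fp / humans, fn / bots

# Illustrative imbalance: 90 humans, 10 bots; 1 human misflagged, 8 bots missed.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 89 + [1] * 3 + [0] * 8
miss, fpr, fnr = error_rates(y_true, y_pred)
```

Here the overall error stays under 10% even though 80% of bots slip through, mirroring the trade-off the answer describes.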