Enhance xGitGuard Scanner with BERT Model for Advanced Secret Detection #34

@adhit-r

Description

Details:
Transformer-based models are a better fit for this problem because they capture the context around each line of code. Random forest models generally do not perform well on high-dimensional data and are better suited to non-sequential data. For sequential data such as source code, the proposed transformer models are expected to work better than the existing models.
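
To make "context around lines of code" concrete, here is a minimal sketch of how a candidate line could be joined with its neighbours before being passed to a sequence model. The function name `context_window` and the `radius` parameter are hypothetical and not part of xGitGuard; the sample lines are made up for illustration.

```python
from typing import List


def context_window(lines: List[str], index: int, radius: int = 2) -> str:
    """Join the line at `index` with up to `radius` neighbours on each side.

    Hypothetical helper: the returned text is what a sequence model would
    see instead of the isolated candidate line.
    """
    start = max(0, index - radius)
    end = min(len(lines), index + radius + 1)
    return "\n".join(lines[start:end])


# Made-up sample file contents, for illustration only.
source = [
    "import requests",
    "",
    "API_URL = 'https://example.invalid/v1'",
    "token = 'abc123'",
    "requests.get(API_URL, headers={'Authorization': token})",
]

# The candidate line (index 3) plus one line of context on each side.
window = context_window(source, index=3, radius=1)
print(window)
```

With a window like this, the model can see that `token` is later sent as an `Authorization` header, which a per-line feature vector would not expose.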

The solution:
We propose to enhance the xGitGuard scanner by integrating a BERT model specifically trained for secret detection.

The steps include:

  1. Training and building models using BERT:
    Develop machine learning models focused on secret detection using the BERT architecture.

  2. Integrating BERT into scanners:
    Seamlessly integrate the trained BERT model into the xGitGuard scanner, enhancing its ability to detect sensitive information with higher accuracy.
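
The integration step could be sketched roughly as follows, assuming the scanner talks to the model through a small scoring interface so that the existing model and a BERT model are interchangeable. Every name here (`SecretClassifier`, `KeywordStub`, `BertClassifier`, `scan`, `Finding`) is hypothetical and not taken from the xGitGuard code base. The `BertClassifier` adapter assumes a binary checkpoint fine-tuned with Hugging Face Transformers whose positive class is named `LABEL_1`; it is not exercised in the example.

```python
from dataclasses import dataclass
from typing import List, Protocol


class SecretClassifier(Protocol):
    """Hypothetical interface the scanner would depend on."""

    def score(self, text: str) -> float:
        """Return the estimated probability that `text` contains a secret."""
        ...


@dataclass
class Finding:
    line_number: int
    line: str
    score: float


class KeywordStub:
    """Stand-in classifier so the sketch runs without a trained model."""

    def score(self, text: str) -> float:
        return 0.9 if "password" in text.lower() else 0.1


class BertClassifier:
    """Hypothetical adapter around a fine-tuned binary checkpoint."""

    def __init__(self, model_dir: str, positive_label: str = "LABEL_1"):
        # Lazy import: transformers is only needed when this class is used.
        from transformers import pipeline

        self._pipe = pipeline("text-classification", model=model_dir)
        self._positive = positive_label

    def score(self, text: str) -> float:
        result = self._pipe(text, truncation=True)[0]
        p = float(result["score"])
        # Assumes exactly two classes, so the other class has probability 1 - p.
        return p if result["label"] == self._positive else 1.0 - p


def scan(lines: List[str], clf: SecretClassifier, threshold: float = 0.5) -> List[Finding]:
    """Score every line and keep those at or above `threshold`."""
    findings = []
    for number, line in enumerate(lines, start=1):
        s = clf.score(line)
        if s >= threshold:
            findings.append(Finding(number, line, s))
    return findings


findings = scan(["user = 'admin'", "password = 'hunter2'"], KeywordStub())
for f in findings:
    print(f.line_number, f.score, f.line)
```

Keeping the scanner dependent only on `score` would let the BERT model be swapped in, or compared against the current model, without touching the scanning loop.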

Alternatives:
Any other pre-trained model, such as PaLM, Gemini, or a GPT model.

Additional context:
Requires considerable training data.
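
Because secrets are usually rare compared with ordinary lines, how the labeled data is split matters as much as how much of it there is. Below is a minimal sketch of a label-stratified train/validation split, assuming examples are stored as `(text, label)` pairs with label 1 meaning "secret". The function name, the data layout, and the synthetic examples are assumptions for illustration, not an existing xGitGuard format.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Example = Tuple[str, int]  # (text, label); label 1 = secret, 0 = not a secret


def stratified_split(
    examples: List[Example], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Example], List[Example]]:
    """Split examples so each label keeps roughly the same share in both sets."""
    by_label: Dict[int, List[Example]] = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Example] = []
    val: List[Example] = []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # At least one validation example per label, even for tiny groups.
        n_val = max(1, round(len(group) * val_fraction))
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Synthetic, made-up examples: 10 positives and 40 negatives.
examples = [(f"key_{i} = 'secret'", 1) for i in range(10)]
examples += [(f"x_{i} = {i}", 0) for i in range(40)]

train, val = stratified_split(examples)
print(len(train), len(val))
```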

Labels: enhancement, help wanted
