-
Notifications
You must be signed in to change notification settings - Fork 3
Containerfile scan #484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Containerfile scan #484
Conversation
…erns - Detect COPY/ADD patterns that include entire build context (secrets, .git, etc.) - Support for wildcards: ., ./, *, ./* patterns - Check for .dockerignore file presence - Detect multistage builds (multiple FROM statements) - Fetch container registry metadata (tag, last update) - Clean output with url, file, content, has_dockerignore, is_multistage, registry info - Supports --owned, --member, --repo, --namespace, --search filters - Output message: 'Identified'
- Added pkg/github/container/ with scanner.go, types.go, patterns.go - Added internal/cmd/github/container/container.go command interface - Registered container command in internal/cmd/github/github.go - Mirrors GitLab container scan functionality - Detects dangerous COPY/ADD patterns that expose secrets - Checks for .dockerignore and multistage builds - Fetches container registry metadata when available
GitHub: - Use Search.Code API to find all Dockerfile/Containerfile files - Support scanning up to 50 Dockerfiles per repository - Search across entire repository tree recursively GitLab: - Added findDockerfiles function with common search paths - Checks root and 12+ common subdirectories (docker/, app/, services/, etc) - Supports up to 50 Dockerfiles per project Both: - Log count of discovered Dockerfiles - Scan all found Dockerfiles instead of just first one - Improved detection for monorepos and multi-service projects
- Use git.Repositories.ListTree() for true recursive file discovery - No hardcoded directory paths - scans entire repository tree - Implements breadth-first search using queue for directory traversal - Efficiently finds all Dockerfiles at any nesting level - Successfully found 8 Dockerfiles in omnibus-mirror/distribution repo - Maintains 50 Dockerfile limit to prevent scanning huge repos Both GitHub and GitLab now use proper recursive discovery: - GitHub: Search.Code API with filename matching - GitLab: Repositories.ListTree with recursive traversal
- New flag to filter for public repositories only - Adds 'is:public' to GitHub search query when flag is used - Can be combined with other flags like --search, --repo - Updated help text to mention --public option - Tested successfully scanning public repos
- Remove all dockerignore file checks from GitLab and GitHub container scanners - Remove HasDockerignore field from shared Finding struct - Optimize registry metadata fetch: use ListRegistryRepositoryTags data directly instead of calling GetRegistryRepositoryTagDetail for each tag - Add registry_last_update back to logging output - Implement recursive tree API for GitLab Dockerfile discovery with depth limit (max 2 levels) - Add timing instrumentation for Dockerfile discovery and registry metadata fetch phases Performance improvements: - Dockerfile discovery: ~122s -> 347-817ms (95%+ faster) - Registry metadata: Skip expensive per-tag detail API calls - Total container scan time: ~2m47s -> ~2.5s (94% faster on tested repos) This eliminates unnecessary API calls and improves security scanning throughput significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements container image scanning functionality for detecting dangerous Dockerfile/Containerfile patterns in both GitLab and GitHub repositories. The feature identifies risky build patterns like COPY . / that may expose sensitive files.
Changes:
- Added container scanning packages for GitLab and GitHub with pattern detection logic
- Implemented shared container scanning utilities including pattern definitions and multistage build detection
- Added comprehensive e2e tests for GitLab container scanning functionality
- Integrated container scan commands into GitLab and GitHub CLI interfaces
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/e2e/gitlab/container/container_test.go | E2e tests for GitLab container scanning with various scenarios |
| pkg/gitlab/container/types.go | ScanOptions struct for GitLab container scanning |
| pkg/gitlab/container/scanner.go | Main scanner implementation for GitLab projects |
| pkg/gitlab/container/patterns.go | Wrapper delegating to shared pattern functions |
| pkg/github/container/types.go | ScanOptions struct for GitHub container scanning |
| pkg/github/container/scanner.go | Main scanner implementation for GitHub repositories |
| pkg/github/container/patterns.go | Wrapper delegating to shared pattern functions |
| pkg/container/types.go | Shared type definitions for container scanning |
| pkg/container/scanner.go | Shared scanning utility functions |
| pkg/container/patterns.go | Default dangerous pattern definitions |
| internal/cmd/gitlab/gitlab.go | Registered container scan command in GitLab CLI |
| internal/cmd/gitlab/container/container.go | GitLab container scan command implementation |
| internal/cmd/github/github.go | Registered container scan command in GitHub CLI |
| internal/cmd/github/container/container.go | GitHub container scan command implementation |
| containers_github_com | Log output file showing error handling |
| if strings.Contains(r.URL.Path, "/repository/files") && strings.Contains(r.URL.Path, "Dockerfile") { | ||
| // Repository tree endpoint - for finding Dockerfiles | ||
| if strings.Contains(r.URL.Path, "/repository/tree") { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable code: line 48 checks for '/repository/files' and 'Dockerfile', but line 50 checks for '/repository/tree'. Since the first condition requires '/repository/files' in the path, the nested check for '/repository/tree' can never be true.
| // Find the most recent version | ||
| var mostRecentVersion *github.PackageVersion | ||
| for _, ver := range versions { | ||
| if ver.GetCreatedAt().Time.After(mostRecentVersion.GetCreatedAt().Time) || mostRecentVersion == nil { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil dereference: The condition checks mostRecentVersion.GetCreatedAt() before verifying if mostRecentVersion is nil. The condition should be reordered to mostRecentVersion == nil || ver.GetCreatedAt().Time.After(mostRecentVersion.GetCreatedAt().Time) to avoid calling methods on a nil pointer.
| if ver.GetCreatedAt().Time.After(mostRecentVersion.GetCreatedAt().Time) || mostRecentVersion == nil { | |
| if mostRecentVersion == nil || ver.GetCreatedAt().Time.After(mostRecentVersion.GetCreatedAt().Time) { |
| repoName := repo.GetName() | ||
|
|
||
| // List container packages for the repository | ||
| packages, _, err := client.Organizations.ListPackages(ctx, owner, &github.PackageListOptions{ |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code assumes the repository owner is an organization, but repositories can also be owned by individual users. For user-owned repositories, this API call will fail. Consider using a check to distinguish between user and organization repositories, or use the appropriate API method for user packages.
| } | ||
|
|
||
| elapsed := time.Since(startTime) | ||
| log.Debug().Str("project", project.PathWithNamespace).Int("found", len(dockerfiles)).Dur("elapsed_ms", elapsed).Msg("Dockerfile search complete") |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field name 'elapsed_ms' suggests milliseconds, but the value is a time.Duration which is in nanoseconds. Either rename the field to 'elapsed' or convert the duration to milliseconds to match the field name.
| log.Debug().Str("project", project.PathWithNamespace).Int("found", len(dockerfiles)).Dur("elapsed_ms", elapsed).Msg("Dockerfile search complete") | |
| log.Debug().Str("project", project.PathWithNamespace).Int("found", len(dockerfiles)).Int("elapsed_ms", int(elapsed.Milliseconds())).Msg("Dockerfile search complete") |
| Str("repo", repo.Path). | ||
| Str("tag", mostRecentTag.Name). | ||
| Str("last_update", metadata.LastUpdate). | ||
| Dur("elapsed_ms", elapsed). |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field name 'elapsed_ms' suggests milliseconds, but the value is a time.Duration which is in nanoseconds. Either rename the field to 'elapsed' or convert the duration to milliseconds to match the field name.
| Dur("elapsed_ms", elapsed). | |
| Int64("elapsed_ms", elapsed.Milliseconds()). |
frjcomp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add tests for the gh scan artipacked command as well.
|
|
||
| // ParseCustomPatterns parses a comma-separated string of patterns into a slice of Pattern objects | ||
| // The patterns are treated as regex strings | ||
| func ParseCustomPatterns(patternsStr string) []Pattern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove custom patterns
| FileName string | ||
| MatchedPattern string | ||
| LineContent string | ||
| PatternSeverity string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed. also remove all references of severity
| return sharedcontainer.DefaultPatterns() | ||
| } | ||
|
|
||
| // ParseCustomPatterns parses a comma-separated string of patterns by delegating to the shared package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed anymore
| } | ||
| } | ||
|
|
||
| func fetchRepositories(ctx context.Context, client *github.Client, patterns []sharedcontainer.Pattern, opts ScanOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not implement these flags again, the gh scan command already implemented it and gh renovate enum as well. reuse existing code
| lines := strings.Split(content, "\n") | ||
|
|
||
| // Check against all patterns | ||
| for _, pattern := range patterns { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is shared logic. move it to the pkg/container package and use it here and in the gitlab implementations.
| return sharedcontainer.IsMultistage(string(decodedContent)) | ||
| } | ||
|
|
||
| func scanDockerfile(git *gitlab.Client, project *gitlab.Project, file *gitlab.File, fileName string, patterns []sharedcontainer.Pattern, isMultistage bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is shared logic mostly, extract and use it from the shared package
| @@ -0,0 +1,5 @@ | |||
| 2026-01-20T09:35:57Z info Log level set to info (default) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this file!
No description provided.