Glob pattern with multiple wildcard segments becomes unresponsive at scale — guidance needed #11621

@VladimirFilipovic

Describe the bug

We're running into an issue where the tail input plugin stops discovering files when using a glob pattern with multiple wildcard segments across a large directory hierarchy.

Pattern that stops working at scale: /srv/syslog/remote/*/2026/*/*/*/auth.log
Pattern that works: /srv/syslog/remote/*/2026/03/25/*/auth.log

Reducing the wildcard segments from four to two (by pinning month and day) restores file discovery. The problem started after the number of first-level directories under remote/ grew to ~100.

We're not sure if this is a bug, a known limitation of glob expansion, or if we're misconfiguring something. Any guidance appreciated.

To Reproduce

  1. Create a directory structure: /srv/syslog/remote/{host1..host100}/2026/{01..03}/{01..28}/{00..23}/auth.log
  2. Configure a tail input with path: '/srv/syslog/remote/*/2026/*/*/*/auth.log' and ignore_older: '65m'
  3. Start Fluent Bit — no files are collected
  4. Change path to /srv/syslog/remote/*/2026/03/25/*/auth.log (pinning month/day)
  5. Restart Fluent Bit — files are collected as expected
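
The steps above can be checked without Fluent Bit by building a scaled-down copy of the tree and counting what each pattern expands to. This is a minimal sketch, not a statement about Fluent Bit internals: the helper names `build_tree` and `count_matches` are hypothetical, Python's `glob` module stands in for whatever expansion the tail plugin performs, and the tree is deliberately small so it runs quickly.

```python
import glob
import os
import tempfile


def build_tree(root, hosts, months, days, hours, year="2026", name="auth.log"):
    """Create root/hostN/YYYY/MM/DD/HH/name for every combination.

    Returns the number of files created.
    """
    created = 0
    for h in range(1, hosts + 1):
        for m in range(1, months + 1):
            for d in range(1, days + 1):
                for hr in range(hours):
                    leaf = os.path.join(
                        root, f"host{h}", year, f"{m:02d}", f"{d:02d}", f"{hr:02d}"
                    )
                    os.makedirs(leaf, exist_ok=True)
                    # Empty file is enough; only the path matters for globbing.
                    open(os.path.join(leaf, name), "w").close()
                    created += 1
    return created


def count_matches(root, pattern):
    """Count paths under root that match a glob pattern relative to root."""
    return len(glob.glob(os.path.join(root, pattern)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        total = build_tree(root, hosts=5, months=3, days=4, hours=6)
        wide = count_matches(root, "*/2026/*/*/*/auth.log")
        pinned = count_matches(root, "*/2026/03/04/*/auth.log")
        print(f"files={total} wide={wide} pinned={pinned}")
```

Raising `hosts`, `months`, `days`, and `hours` toward the values in step 1 shows how quickly the wide pattern's result set grows compared with the pinned one.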

Expected behavior

We expected the glob pattern to continue working as the directory count grew. If there's a practical upper bound on glob expansion, it would be helpful to know what it is so we can design around it.

Environment

  • Fluent Bit version: 4.2
  • Server OS: Ubuntu 24.04 (x86_64)
  • Configuration:
service:
  flush: '1'
  http_listen: '127.0.0.1'
  http_port: '9597'
  http_server: 'on'
  log_level: 'warn'
  parsers_file: 'parsers.conf'
  storage.backlog.mem_limit: '1GB'
  storage.max_chunks_up: '512'
  storage.metrics: 'on'
  storage.path: '/etc/fluent-bit/fluentbit-db'
  storage.sync: 'normal'

pipeline:
  inputs:
    - name: tail
      tag: tail_logs.example
      buffer_chunk_size: '1MB'
      buffer_max_size: '5MB'
      db: '/etc/fluent-bit/fluentbit-db/fluentbit-tail-example.db'
      db.locking: 'true'
      ignore_active_older_files: 'true'
      ignore_older: '65m'
      path: '/srv/syslog/remote/*/2026/*/*/*/auth.log'
      path_key: 'filepath'
      storage.pause_on_chunks_overlimit: 'off'
      storage.type: 'filesystem'
      threaded: 'true'

There are 13 tail inputs with the same glob structure targeting different log files (auth.log, syslog.log, kern.log, daemon.log, cron.log, mail.log, local*.log, user.log, alert.log, ntp.log, audit.log, authpriv.log, invld.log).

  • Filters/Plugins: tail input, http output, parser and nest processors
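
For a sense of scale, the number of paths each pattern can expand to in the reproduction layout is simple multiplication. The sketch below assumes one matching file per hour directory per input (local*.log could match more) and uses a hypothetical helper, `matches_per_scan`. Whether Fluent Bit's scan cost actually grows in proportion to these counts is an assumption, not something confirmed here.

```python
def matches_per_scan(hosts, months, days, hours, inputs=1):
    """Upper bound on glob matches: one file per host/month/day/hour per input."""
    return hosts * months * days * hours * inputs


# Wide pattern: every month, day, and hour under every host.
wide_one_input = matches_per_scan(hosts=100, months=3, days=28, hours=24)
# Pinned pattern: a single month and day, all hours, every host.
pinned_one_input = matches_per_scan(hosts=100, months=1, days=1, hours=24)

# The same structure repeated across the 13 tail inputs.
wide_all_inputs = matches_per_scan(100, 3, 28, 24, inputs=13)
pinned_all_inputs = matches_per_scan(100, 1, 1, 24, inputs=13)

print(wide_one_input)     # 201600
print(pinned_one_input)   # 2400
print(wide_all_inputs)    # 2620800
print(pinned_all_inputs)  # 31200
```

With these numbers the wide pattern produces 84 times as many candidates as the pinned one, and the gap widens with every additional day of data.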

Additional context

  • The directory structure is /srv/syslog/remote/{hostname}/YYYY/MM/DD/HH/*.log, with roughly 100 hostnames under remote/.
  • The pattern used to work when there were fewer hosts. It broke as the host count grew.
  • Pinning month and day, which reduces the wildcard segments from four to two, restores functionality.
  • systemd unit has LimitNOFILE=524288, so file descriptor limits are not the constraint.
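
Since ignore_older is set to 65m, only a small share of the matched files should be recent enough to matter. A quick way to measure that share on the affected host is to expand the pattern and filter by modification time. This is a sketch under stated assumptions: `fresh_paths` is a hypothetical helper that approximates the idea behind ignore_older using mtime, and it is not Fluent Bit's implementation.

```python
import glob
import os
import tempfile
import time


def fresh_paths(paths, max_age_seconds, now=None):
    """Return the paths whose mtime is within max_age_seconds of now."""
    now = time.time() if now is None else now
    fresh = []
    for path in paths:
        try:
            age = now - os.stat(path).st_mtime
        except FileNotFoundError:
            # File rotated away between glob expansion and stat.
            continue
        if age <= max_age_seconds:
            fresh.append(path)
    return fresh


if __name__ == "__main__":
    # Small self-contained demo: one stale file, one recent file.
    with tempfile.TemporaryDirectory() as root:
        stale = os.path.join(root, "stale.log")
        recent = os.path.join(root, "recent.log")
        for path in (stale, recent):
            open(path, "w").close()
        now = time.time()
        os.utime(stale, (now - 2 * 3600, now - 2 * 3600))  # two hours old
        os.utime(recent, (now - 60, now - 60))             # one minute old

        matched = glob.glob(os.path.join(root, "*.log"))
        kept = fresh_paths(matched, max_age_seconds=65 * 60, now=now)
        print(f"matched={len(matched)} fresh={len(kept)}")
```

On the real tree, passing the result of `glob.glob('/srv/syslog/remote/*/2026/*/*/*/auth.log')` to `fresh_paths` would show how many of the expanded paths fall inside the 65-minute window.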
