Add small-allocation fast path in tlsf_malloc #17

Merged: jserv merged 1 commit into master from fastpath on Feb 8, 2026

@jserv jserv commented Feb 8, 2026

For sizes below BLOCK_SIZE_SMALL (the FL=0 range), SL mapping is linear at ALIGN_SIZE granularity. The fast path exploits this: it bypasses log2floor, round_block_size, and mapping by probing t->sl[0] directly and computing the SL index with a shift. It falls through to the generic block_find_free path when the FL=0 bins are empty or the request exceeds the small range.
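
The lookup described above can be sketched roughly as follows. This is a simplified illustration, not the code from the commit: the constant values (8-byte alignment, 16 second-level bins), the `toy_tlsf` struct, and the helpers `lowest_set_bit` and `small_fast_path` are assumptions made for the example, and the real allocator also has to unlink the block and update its bitmaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for a 64-bit build; the real definitions may differ. */
#define ALIGN_SHIFT 3u
#define ALIGN_SIZE (1u << ALIGN_SHIFT)            /* 8 bytes */
#define SL_COUNT 16u                              /* second-level bins per FL */
#define BLOCK_SIZE_SMALL (ALIGN_SIZE * SL_COUNT)  /* 128 bytes */

/* Hypothetical, stripped-down control structure: only the FL=0 bitmap. */
struct toy_tlsf {
    uint32_t sl[1]; /* bit i set => FL=0 bin i has at least one free block */
};

/* Index of the lowest set bit; the caller guarantees x != 0. */
static int lowest_set_bit(uint32_t x)
{
    int i = 0;
    assert(x != 0);
    while (!(x & 1u)) {
        x >>= 1;
        i++;
    }
    return i;
}

/*
 * Return the FL=0 bin index that can serve `size`, or -1 when the caller
 * should fall through to the generic search (request too large, or no
 * suitable small bin is populated).
 */
static int small_fast_path(const struct toy_tlsf *t, size_t size)
{
    /* Round the request up to the alignment granularity. */
    size_t aligned = (size + ALIGN_SIZE - 1) & ~(size_t) (ALIGN_SIZE - 1);
    if (aligned >= BLOCK_SIZE_SMALL)
        return -1;

    /* Linear mapping: each bin covers one ALIGN_SIZE step, so a shift
     * replaces the log2-based first/second-level computation. */
    uint32_t sl = (uint32_t) (aligned >> ALIGN_SHIFT);

    /* Keep only bins whose blocks are at least as large as the request. */
    uint32_t candidates = t->sl[0] & (~UINT32_C(0) << sl);
    if (!candidates)
        return -1;

    return lowest_set_bit(candidates);
}
```

Under these assumptions, a 16-byte request maps to bin 2 (`16 >> 3`). If only a larger bin is populated, the mask selects the next non-empty bin instead, and a return value of -1 stands in for the fall-through to `block_find_free`.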


Summary by cubic

Add a fast path in tlsf_malloc for small allocations (size < BLOCK_SIZE_SMALL) using linear SL mapping in the FL=0 range. This skips log2floor and size rounding, reduces work per call, and speeds up tiny allocations while preserving the existing fallback path.

Written for commit b8fc2bb. Summary will update on new commits.


github-actions Bot commented Feb 8, 2026

WCET Results (x86-64)

TLSF WCET Analysis
==================
Timer:      cycles
Cache:      hot
Pool:       4194304 bytes (4.0 MB)
Iterations: 5000 (warmup: 500)
Sizes:      16 64 256 1024 4096 bytes

--- malloc_worst (small alloc from single huge block) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         73         98         98        122        196        465       94.1       12.4
      64         73         98         98         98        123        269       94.7        9.5
     256         73         98         98         98        172        196       91.4       11.5
    1024         73         98         98         98        196        367       92.5       12.1
    4096         73         98         98         98        172        220       90.5       12.0

--- malloc_best (exact bin hit, no split) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         73         74         74         74        147       69.8        9.0
      64         49         73         74         74         74        147       69.8        9.1
     256         49         74         98         98         98        147       76.5        9.1
    1024         49         74         98         98         98        172       75.2        9.8
    4096         49         74         98         98         98         98       75.3        9.5

--- free_worst (sandwiched between two free blocks) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         74         98         98         98       5414       77.5       76.2
      64         49         74         98         98         98        196       76.8       11.0
     256         49         74         98         98        147        196       75.4       10.1
    1024         49         74         74         98        147        245       74.7        9.6
    4096         49         74         74         98        123      17762       77.9      250.3

--- free_best (no merge (used neighbors)) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         49         74         74        122        221       59.5       12.5
      64         49         49         74         74        123        245       60.2       12.7
     256         49         49         74         74        122      17910       64.8      252.7
    1024         49         73         74         74        123        147       63.5       12.6
    4096         49         49         74         74        123        196       61.2       12.7

--- worst/best ratio (p99) ---
    size     malloc       free
      16      1.26x      1.32x
      64      1.32x      1.65x
     256      1.00x      1.32x
    1024      1.26x      1.32x
    4096      1.26x      1.32x


github-actions Bot commented Feb 8, 2026

WCET Results (arm64)

TLSF WCET Analysis
==================
Timer:      ticks
Cache:      hot
Pool:       4194304 bytes (4.0 MB)
Iterations: 5000 (warmup: 500)
Sizes:      16 64 256 1024 4096 bytes

--- malloc_worst (small alloc from single huge block) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         24         32         32         32         40         48       31.3        2.4
      64         24         32         32         32         40         40       31.1        2.6
     256         16         32         32         32         40         40       31.0        2.8
    1024         16         32         32         32         40       1128       31.3       15.8
    4096         16         32         32         40         40         40       31.1        2.7

--- malloc_best (exact bin hit, no split) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         24         24         32         32         32         40       25.6        3.2
      64         24         24         32         32         32         32       25.8        3.3
     256         16         24         32         32         32         40       25.4        3.1
    1024         16         24         32         32         32         32       25.5        3.1
    4096         16         24         32         32         32       1592       25.8       22.4

--- free_worst (sandwiched between two free blocks) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         24         32         32         32       1560       25.8       21.9
      64         16         24         32         32         32         32       25.4        3.1
     256         16         24         32         32         32       1600       25.6       22.5
    1024         16         24         32         32         32         32       25.5        3.2
    4096         16         24         32         32         32         32       25.7        3.3

--- free_best (no merge (used neighbors)) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         24         24         24         24         32       20.4        4.0
      64          8         24         24         24         24         32       20.3        4.0
     256         16         24         24         24         24         32       20.5        4.0
    1024         16         24         24         24         24         32       20.3        4.0
    4096          8         24         24         24         24       1544       20.7       21.9

--- worst/best ratio (p99) ---
    size     malloc       free
      16      1.00x      1.33x
      64      1.00x      1.33x
     256      1.00x      1.33x
    1024      1.00x      1.33x
    4096      1.00x      1.33x

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 1 file

@jserv jserv merged commit 2760509 into master Feb 8, 2026
10 checks passed
@jserv jserv deleted the fastpath branch February 8, 2026 15:29