Skip to content

Nested timers#2486

Merged
thorstenhater merged 21 commits intoarbor-sim:masterfrom
marvinKaster:hierarchical_timer
Feb 4, 2026
Merged

Nested timers#2486
thorstenhater merged 21 commits intoarbor-sim:masterfrom
marvinKaster:hierarchical_timer

Conversation

@marvinKaster
Copy link
Collaborator

Introducing nested timers. You can now create timed sections within other timed sections. The spent time will be outputted in a hierarchical format.

Code example

PE(first_timer);
//...
PE(second_timer);
//...
PL(second_timer);
//...
PL(first_timer);

Example output of the ring benchmark

REGION                         CALLS       WALL    WALLCHILDS    WALLCAPTURED%    THREAD        %    
arbor                              2    285.048       285.129            100.0    35.631    100.0    
  run                              0    279.246       273.738             98.0    34.906     98.0    
    update                       199    273.264       218.751             80.1    34.158     95.9    
      advance                  19999    218.640       218.333             99.9    27.330     76.7    
        integrate              19999    218.067       135.328             62.1    27.258     76.5    
          cable              3999999     55.818         0.000              0.0     6.977     19.6    
          current_pas        3999999     24.780         0.000              0.0     3.098      8.7    
          ionupdate          3999999      6.131         0.000              0.0     0.766      2.2    
          zero               3999999      5.964         0.000              0.0     0.745      2.1    
          state_hh           3999999      5.259         0.000              0.0     0.657      1.8    
          threshold          3999999      5.157         0.000              0.0     0.645      1.8    
          samples            3999999      4.836         0.000              0.0     0.604      1.7    
          stimuli            3999999      4.567         0.000              0.0     0.571      1.6    
          current_expsyn     3999999      4.297         0.000              0.0     0.537      1.5    
          current_hh         3999999      4.189         0.000              0.0     0.524      1.5    
          state_expsyn       3999999      4.132         0.000              0.0     0.517      1.4    
          state_pas          3999999      3.698         0.000              0.0     0.462      1.3    
          post               3999999      3.518         0.000              0.0     0.440      1.2    
          setup                19999      2.981         0.000              0.0     0.373      1.0    
          event_expsyn            93      0.000         0.000              0.0     0.000      0.0    
        clear                  19999      0.134         0.000              0.0     0.017      0.0    
        sampledeliver          19999      0.099         0.000              0.0     0.012      0.0    
        samplesetup            19999      0.034         0.000              0.0     0.004      0.0    
      spikes                   19999      0.111         0.000              0.0     0.014      0.0    
    enqueue                    19999      0.458         0.354             77.3     0.057      0.2    
      merge_cell_events        19999      0.306         0.123             40.2     0.038      0.1    
        reserve                19999      0.031         0.000              0.0     0.004      0.0    
        tree                   19999      0.029         0.000              0.0     0.004      0.0    
        setup                  19999      0.022         0.000              0.0     0.003      0.0    
        generators             19999      0.022         0.000              0.0     0.003      0.0    
        push_back              19999      0.019         0.000              0.0     0.002      0.0    
      sort                     19999      0.026         0.000              0.0     0.003      0.0    
      clear                    19999      0.022         0.000              0.0     0.003      0.0    
    exchange                     199      0.016         0.013             81.4     0.002      0.0    
      exchange                   199      0.008         0.006             79.5     0.001      0.0    
        gather                   199      0.005         0.000              0.0     0.001      0.0    
        sort                     199      0.001         0.000              0.0     0.000      0.0    
        remote                   199      0.000         0.000              0.0     0.000      0.0    
        post_process             199      0.000         0.000              0.0     0.000      0.0    
      walkspikes                 199      0.003         0.000              0.0     0.000      0.0    
      gatherlocal                199      0.001         0.000              0.0     0.000      0.0    
      spikeio                    199      0.001         0.000              0.0     0.000      0.0    
  init                             0      5.882         5.883            100.0     0.735      2.1    
    group                          0      5.881         5.716             97.2     0.735      2.1    
      factory                     99      5.715         0.000              0.0     0.714      2.0    
      targets_and_sources         99      0.001         0.000              0.0     0.000      0.0    
    sources                        0      0.001         0.000              0.0     0.000      0.0    
    targets                        0      0.001         0.000              0.0     0.000      0.0    
  update                           0      0.000         0.000             74.0     0.000      0.0    
    update_conns                   0      0.000         0.001            367.1     0.000      0.0    
      local                        0      0.000         0.000              0.0     0.000      0.0    
      sort                         0      0.000         0.000              0.0     0.000      0.0    
      remote                       0      0.000         0.000             43.6     0.000      0.0    
        connections                0      0.000         0.000              0.0     0.000      0.0    
        destructure                0      0.000         0.000              0.0     0.000      0.0    
        sort                       0      0.000         0.000              0.0     0.000      0.0    
      collect_gids                 0      0.000         0.000              0.0     0.000      0.0    
      destructure                  0      0.000         0.000              0.0     0.000      0.0    
      index                        0      0.000         0.000              0.0     0.000      0.0    
      raw                          0      0.000         0.000              0.0     0.000      0.0    
      generated                    0      0.000         0.000              0.0     0.000      0.0    
      partition                    0      0.000         0.000              0.0     0.000      0.0    
    generators                     0      0.000         0.000              0.0     0.000      0.0  

@thorstenhater
Copy link
Contributor

Hi @marvinKaster,

you are running into a CI issue that has been fixed in main. Please rebase and it should be gone.

@thorstenhater
Copy link
Contributor

Hi @marvinKaster,

#2490 has a fix for the last failing test. Could you give it a review?

Best,
TH

Copy link
Contributor

@thorstenhater thorstenhater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @marvinKaster,

thanks, that will make life a bit easier. I've left a few suggestions.

@@ -12,12 +12,16 @@
}

// leave a profling region
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly a note for later updates. We could also offer a RAII macro here now, in the style of:

struct AutoProf {
 AutoProf(const auto& label): lbl(label) { enter(lbl); }
 ~AutoProf() { leave(lbl); }
  
  std::string lbl;
};

Even some macro magick could be used to generate unique names based on __FILE__ and __LINE__.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly the allocation of strings is problematic. Anyhow, not needed. I played around with tracy, which is extremely nice to work with, but I recon not really playing well with MPI.

#define PL arb::profile::profiler_leave
#define PL(name) \
{ \
static std::size_t region_id_ = arb::profile::profiler_region_id(#name); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible extension: we could record FILE LINE FUNCTION when we set up the region.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it and AutoProf a try today, and somehow made everything slow. Maybe that is an extension for another PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was not a request to add it here.

void recorder::leave(region_id_type index, const std::vector<std::string>& names) {
if(current_timer_stack[current_timer_stack.size()-1] != index) {
throw std::runtime_error("recorder::leave without matching recorder::enter Trying to leave "+names[index] + " but currently in "+names[current_timer_stack[current_timer_stack.size()-1]]);
throw std::runtime_error("recorder::leave without matching recorder::enter Trying to leave "+names[index] + " but currently in "+names[current_timer_stack[current_timer_stack.size()-1]] + + " TimerStack: " + timer_stack_to_string(current_timer_stack, names));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ + looks off to me?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. Interesting that it compiled with it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It compiles and work like you wanted it to
https://godbolt.org/z/n8jhnoj7h

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason is that C/C++ have unary plus and minus:

int x = 42;
-x; // negate
+x; // same as x
-x + +x; // 0

see https://godbolt.org/z/6Thd4PWMv
That it works for strings might tell us something about C++ and operator overloading ;)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I would guess then the string literal is interpreted as char*, which has an unary +.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically that, yes.

@thorstenhater
Copy link
Contributor

Now CI is running into an error on UTF-8 decode?

@marvinKaster
Copy link
Collaborator Author

@thorstenhater I have no idea how removing the '+' in leave() causes an UTF-8 error in the python call to profiler_summary / print_profiler_summary. I will investigate

@thorstenhater
Copy link
Contributor

Possibly an interaction with MPI?

@marvinKaster
Copy link
Collaborator Author

I tried reproducing the error locally, but it works every time for me. A rerun of the job was also successful. It seems to be hard to reproduce.

I don't think MPI is involved. As far as I understand, the methods don't call MPI at any point.

@thorstenhater
Copy link
Contributor

The suspicion was that the output gets passed through github's interface and two interleaved MPI ranks produced something weird.

@thorstenhater thorstenhater merged commit b890b87 into arbor-sim:master Feb 4, 2026
40 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants