I am profiling a program that contains some logic I have written myself plus calls to the neural-network library Libtorch (the C++ frontend for PyTorch).
The logic I wrote is purely single-threaded, while
Libtorch's functions (e.g. NN inference) are multithreaded.
When I run the profiler, it reports that about 50% of the time is spent in Libtorch's functions.
Does that "50%" mean that half of my program's wall-clock execution is spent in Libtorch?
Or is that "50%" spread across 4 threads, meaning that only around 20% of the wall-clock execution is spent in Libtorch's functions? (Assuming a high degree of parallelism: 50 CPU-units / 4 threads = 12.5 wall-units, and 12.5 / (50 + 12.5) ≈ 20%.)
Note: Libtorch's multithreading works by dividing the problem into 4 equally sized micro-problems and executing one on each thread.
P.S. If you have a suggestion for a better title for this forum thread, please comment it.
I believe it is a pure CPU-time percentage, so half of your *processing effort* was spent in Libtorch.
This does not mean that half of your 'wall clock time' was spent on it. That could be 20% or whatever, depending on how the work was load-balanced and what else the computer was doing in the background and so on.

Most profilers I have seen measure how much time each function spent churning inside the CPU(s). In other words, if you have 2 threads doing equal work running the same function for 1 second each, the profiler will report that the function took 2 seconds; and if the whole program ran for 5 seconds of wall-clock time, the profiler may report 10 seconds' worth of crunching. You can check that: add up the profiler's total times and compare them against how long the program actually ran.