I have created a function F1 in a C++ DLL among many other functions.
When called directly from another C++ program, F1 runs normally and takes a reasonable time (about 1.8 seconds on my PC).
When called from a more complex program, i.e. a C# application managing a C++ DLL which in turn calls my function F1, there are 2 scenarios:
1. if run in parallel processing (OpenMP) : 8.2 seconds
2. if run without parallel processing : 4 seconds.
In both cases, F1 takes longer to execute than in the standalone C++ program.
For info, I am using the std::clock_t tool to measure time.
My question is: why is there this difference in behavior?
Is it due to a DLL calling another DLL, all managed by a C# program, given that many other functions are called the same way?
Or might it be due to another reason ?
Apparently the 'marshalling' system makes unnecessary copies of data and bloats CPU types (e.g. bool) into classes and such. There may be some options to control that. It may be possible to convert your inputs and result into a single thing (like a large string) so it only has to marshal one entity (but then you need to add code to handle the I/O part yourself). It may also be possible to use the 'managed' C++ sublanguage as a go-between.
Basically, it's the compiler/tools 'helping you' that is slowing it down.
I would like to stress, however, that the function F1 is in C++ and is called by another C++ parent DLL.
The time stamp is also taken only in C++.
So the whole time consumption is actually occurring in C++.
I mentioned C# just to say that those C++ DLLs are working within a larger-scale application.
Here is some sample code:
void F_parent() // in the C++ parent DLL
{
    std::clock_t tstart = std::clock(); // start of clock
    F1(); // call to F1, which lives in a C++ DLL
    std::clock_t tend = std::clock(); // end of clock
    std::cout << tend - tstart << std::endl; // elapsed clock ticks; divide by CLOCKS_PER_SEC for seconds
}
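One thing that may be worth ruling out is the clock itself. As far as I know, std::clock() is specified to return processor time, and on POSIX systems that is summed over all threads, so an OpenMP run can report more than the elapsed time; the MSVC runtime is documented to return wall-clock time instead. Here is a minimal sketch that measures one call with both std::chrono::steady_clock and std::clock() so the two numbers can be compared. The names Timing, time_call and busy_work are made up for illustration; busy_work is only a stand-in for F1.

```cpp
#include <chrono>
#include <ctime>

// Both measurements for one call, in seconds.
struct Timing {
    double wall_seconds;  // elapsed real time, from std::chrono::steady_clock
    double cpu_seconds;   // std::clock() difference divided by CLOCKS_PER_SEC
};

// Runs fn once and measures it with both clocks.
template <typename Fn>
Timing time_call(Fn&& fn) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    fn();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_seconds = std::chrono::duration<double>(wall_end - wall_start).count();
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}

// Hypothetical stand-in for F1: some arithmetic the compiler cannot remove.
void busy_work() {
    volatile double x = 0.0;
    for (int i = 0; i < 1000000; ++i) {
        x = x + i * 0.5;
    }
}
```

In F_parent you would call something like Timing t = time_call(F1); and print both fields. If the two values differ a lot in the OpenMP case, the 8.2 seconds may be partly a measurement effect rather than real elapsed time.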
Not from what you are showing. Any chance the C++ parent is loading and unloading the child DLL every time it is called, and doing that in a loop or something? Or is F1 doing something slow every time it is called, like grabbing dynamic memory and releasing it, or whatever? It would have to be something that can be optimized in a local program but not across the library boundary, and I don't know what the linker/compiler can optimize at that level.
I know that isn't very helpful. I would focus first on whether it is doing a DLL load/unload.
I do not remember. I know there is a way to see that in the debugger or somehow, but I can't recall how; you will need to web-search that. All I know is, it's doable.
Look at this page, maybe that listdlls approach?
What you need is some sort of library from Microsoft that lets you see what you have loaded in your own process. You may have to dig around to find something like that, but there should be something out there.
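For what it's worth, one Win32 call that can answer "is this DLL loaded in my process right now" is GetModuleHandleA, which returns NULL when no module with that name is loaded. Below is a rough sketch, not a complete diagnostic; is_module_loaded and report_module are hypothetical helper names, and outside Windows the check is just stubbed out to return false.

```cpp
#include <iostream>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns true if a module with this file name is currently loaded in the
// calling process. Windows only; other platforms are stubbed (assumption).
bool is_module_loaded(const char* module_name) {
#ifdef _WIN32
    // GetModuleHandleA does not load anything; it only looks the module up.
    return GetModuleHandleA(module_name) != nullptr;
#else
    (void)module_name;
    return false;  // not implemented outside Windows in this sketch
#endif
}

// Prints the load state with a label, e.g. "before F1" / "after F1".
void report_module(const char* label, const char* module_name) {
    std::cout << label << ": " << module_name
              << (is_module_loaded(module_name) ? " is loaded" : " is not loaded")
              << '\n';
}
```

Calling report_module("before F1", "child.dll") and report_module("after F1", "child.dll") around the call in F_parent (with the real file name of the child DLL in place of the made-up "child.dll") would show whether the library is being unloaded between calls.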
How many times are you calling F1()? If you call it once, then the time difference seems unusual. If you call it 100 million times and it does something simple, then the overhead of using the DLL might become significant.
Well, you have to unravel why it crashed first, then figure out what the real times are once it is stable. A time measured on a run that crashed has no value.
3 calls ... is very low. Even if it unloaded and reloaded the library that should not take 6 extra seconds to do. Hmm.
You can try statically linking the library into the parent, but that means any change to the child requires recompiling the parent, which is a little ugly. It is manageable, though: you make the parent the 'project' and have it manage compiling both the parent and child together, making sure you get what you need when you have to rebuild.