how use multithread inside a class?

For CPU-bound processing (which I believe you have here), the number of threads started should be no more than one less than the number of CPUs available (one CPU is left for control, other things, etc.). As starting a thread is expensive, you'll often use a thread pool. For MS, see e.g. https://learn.microsoft.com/en-us/windows/win32/procthread/thread-pools . The last thing you want to be doing is repeatedly creating new threads, as that is very expensive. Some of the std:: algorithms can now be run with a parallel execution policy, and it has been shown that simply switching code from the serial to the parallel version of a std:: algorithm can not only give no performance improvement but can actually make the code slower! Obtaining good parallel performance for CPU-bound tasks often requires a rethink of how the algorithm is implemented. If you look at the CPU history graphs, the CPUs used for the parallel (CPU-bound) processing should show uniform usage close to 100%.
Note that NewSetPixel is not thread-safe. It cannot safely be used to set the same pixel from different threads. Either you will have to rewrite your code to prevent it from happening, or you would need to add some kind of "synchronization" (mutex, atomic, etc.) but that will probably reduce the performance even more.

If you are drawing multiple lines, maybe you could speed up the line drawing by drawing some lines in one thread and some lines in another thread, but you will still run into the problem above unless you can guarantee that lines from the different threads do not overlap.
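One way to get the "no overlap" guarantee is to give each thread its own band of rows. The sketch below is only an illustration under assumptions: Frame, FillRows and FillParallel are made-up names, not the image class or NewSetPixel from this thread, and it uses a plain 32-bit buffer instead of a DIB. It also starts one thread fewer than the number of reported cores, as suggested earlier.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical 32-bit framebuffer (row-major); not the image class from this thread.
struct Frame {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;
    Frame(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
};

// Fills rows [rowBegin, rowEnd) with one color. It only touches its own rows,
// so two calls with disjoint row ranges never write the same pixel.
void FillRows(Frame& frame, int rowBegin, int rowEnd, std::uint32_t color) {
    auto first = frame.pixels.begin() + static_cast<std::ptrdiff_t>(rowBegin) * frame.width;
    auto last = frame.pixels.begin() + static_cast<std::ptrdiff_t>(rowEnd) * frame.width;
    std::fill(first, last, color);
}

// Splits the frame into horizontal bands, one band per worker thread.
void FillParallel(Frame& frame, std::uint32_t color) {
    unsigned hw = std::thread::hardware_concurrency();    // may be 0 if unknown
    int workers = hw > 1 ? static_cast<int>(hw) - 1 : 1;  // leave one core free
    workers = std::min(workers, frame.height);
    if (workers < 1) return;                               // empty frame

    int rowsPerWorker = (frame.height + workers - 1) / workers;
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        int begin = w * rowsPerWorker;
        int end = std::min(frame.height, begin + rowsPerWorker);
        if (begin >= end) break;
        threads.emplace_back(FillRows, std::ref(frame), begin, end, color);
    }
    for (auto& t : threads) t.join();  // wait for every band before using the frame
}
```

Because the bands are disjoint, no mutex or atomic is needed. Note that this still creates threads on every call, so for per-frame drawing a pool of long-lived threads would probably be the better fit.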
with multithread i get:
"Process returned 0 (0x0) execution time : 82.786 s
Press any key to continue."
i learned C\C++ by myself: tutorials, books, forums and experience... what i mean is: we can write algorithms, but we don't learn how to speed them up... we only learn, for example, that SetPixel() is much slower than DIBs... but loops and arrays can be slow too... and i don't know how to gain speed. we can learn multithreading... but, as you can see, it is slower.
without multithread, the same code, i get:
"Process returned 0 (0x0) execution time : 0.109 s
Press any key to continue."
but with 40FPS.. and i only draw 1 rectangle with Z. imagine 10 rectangles... it will be far too slow.
and yes, i have asked on 3 more forums (some in Portuguese, my language).
either my code isn't prepared for multithreading or something else is wrong.
Peter87: i can learn more. but i don't see a different algorithm to do the same. C\C++ has several techniques\tips to gain more speed, but i need to learn more
using the #pragma (before the includes), without multithreading, i gain much more speed:
#pragma GCC optimize("Ofast")
#pragma GCC target("avx,avx2,fma")

#include <iostream>
#include <thread>
#include <windows.h>
#include <math.h>

using namespace std;
class image
{
public://.......... 


~160FPS instead of only 40FPS.
but drawing 2 rectangles, i get ~56FPS
You didn't use any compiler optimizations before? Normally you would do this from the command line, makefile or whatever you use to invoke the compiler. It's kind of pointless to try and optimize with compiler optimizations turned off if you ask me...

Perhaps the following questions are stupid, because I'm not familiar with this DIB API and I can't say I have analyzed everything you do in detail, but here we go...

Are you sure the performance bottleneck is in your pixel drawing code and not in the DIB functions (BitBlt, FillRect, etc.)?

You are not creating a new image object every frame, are you?

When drawing a filled rectangle it seems like you are drawing to the same pixel locations multiple times. Just to test, count how many times you call NewSetPixel and keep track of how many unique pixel locations {X, Y} you set when drawing a single rectangle. If you do it correctly I think you should have the same number of calls as you have unique positions (or at least the two numbers should be pretty close) otherwise you'll waste a lot of time.

Are you sure this DIB API is meant for high performance rendering? Setting individual pixels and transferring them to the GPU each frame is always going to be slower than if you can draw it on the GPU directly using something like OpenGL.
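The call-counting test suggested above could be sketched like this. PixelWriteCounter is a hypothetical helper made up for this one-off measurement, not part of the image class; the idea would be to call Record(x, y) at the top of NewSetPixel while drawing a single rectangle and then compare the two numbers.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical helper for a one-off test; not part of the image class.
struct PixelWriteCounter {
    std::size_t calls = 0;                 // every write, including repeats
    std::set<std::pair<int, int>> unique;  // distinct {X, Y} positions

    void Record(int x, int y) {
        ++calls;
        unique.insert({x, y});
    }

    // 1.0 means no pixel was written twice;
    // 3.0 means three writes per pixel on average.
    double Overdraw() const {
        return unique.empty() ? 0.0
                              : static_cast<double>(calls) / unique.size();
    }
};
```

The std::set is slow, so this is only meant for a test build. An Overdraw() value well above 1.0 would point at the fill loop rather than at the DIB functions.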
"Are you sure the performance bottleneck is in your pixel drawing code and not in the DIB functions (BitBlt, FillRect, etc.)?"
i think so... because i did some tests:
1 - no DrawRectangle() and no DrawLine(): i get ~1300FPS;
2 - just drawing a line (DrawLine()): i get more or less 1200FPS;
(these values can drop to 999 or less... in the same loop)
3 - just 1 filled rectangle: i get 118FPS.
all values tested with:
#pragma GCC optimize("Ofast")
#pragma GCC target("avx,avx2,fma") 

without the '#pragma' optimizations:
1 - drawing nothing: more or less 1150FPS;
2 - drawing a line: 1150FPS;
3 - drawing a filled rectangle: 40FPS;
i think DIBs are the fastest of the GDI APIs.

"You are not creating a new image object every frame, are you?"
no.. only an 'auto' variable that i changed now, and it seems better.

"When drawing a filled rectangle it seems like you are drawing to the same pixel locations multiple times. Just to test, count how many times you call NewSetPixel and keep track of how many unique pixel locations {X, Y} you set when drawing a single rectangle. If you do it correctly I think you should have the same number of calls as you have unique positions (or at least the two numbers should be pretty close) otherwise you'll waste a lot of time."
i changed that now:
void DrawRectangle(float PosX, float PosY, float PosZ, float Width, float Height, float Depth, COLORREF Color = RGB(255,0,0), bool Filled = false)
    {
        if(Filled==false)
        {
            DrawLine( PosX, PosY, PosZ,PosX + Width, PosY, PosZ + Depth, Color);
            DrawLine( PosX, PosY, PosZ, PosX, PosY + Height, PosZ, Color);
            DrawLine( PosX + Width, PosY, PosZ + Depth, PosX + Width, PosY+Height, PosZ + Depth, Color);
            DrawLine( PosX, PosY + Height, PosZ, PosX + Width, PosY + Height, PosZ + Depth, Color);
        }
        else
        {
            for(int i = 0; i<Height; i++)
                DrawLine( PosX, PosY + i, PosZ,PosX + Width, PosY +i, PosZ + Depth, Color);

        }
    }

the code in the 'for' is wrong... it only works for a vertical rectangle and not for a horizontal one... i tested it now.. but thinking about speed.... this code is killing the CPU :(
and yes, the rectangle is big (it is drawn with the same DrawRectangle() code shown above).

the best way is to fill it directly, but i don't know a filling algorithm with Z, only for a normal rectangle. what i can do is use just 1 loop to draw the whole rectangle instead of calling DrawLine() several times... maybe with that i can gain more speed.
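For the "normal rectangle" case mentioned above, a direct fill could look roughly like the sketch below. It assumes a flat, screen-aligned rectangle in a row-major 32-bit buffer, so it does not handle the Z\perspective case, and FillRect2D and its parameters are made-up names rather than anything from the image class.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills an axis-aligned rectangle in a row-major 32-bit buffer.
// The rectangle is clipped to the image first, so every pixel inside
// is written exactly once and nothing is written outside the buffer.
void FillRect2D(std::vector<std::uint32_t>& pixels, int imageWidth, int imageHeight,
                int x, int y, int width, int height, std::uint32_t color) {
    int x0 = std::max(x, 0);
    int y0 = std::max(y, 0);
    int x1 = std::min(x + width, imageWidth);
    int y1 = std::min(y + height, imageHeight);
    if (x0 >= x1 || y0 >= y1) return;  // completely outside the image

    for (int row = y0; row < y1; ++row) {
        auto rowStart = pixels.begin() + static_cast<std::ptrdiff_t>(row) * imageWidth;
        std::fill(rowStart + x0, rowStart + x1, color);  // one contiguous run per row
    }
}
```

Clipping once before the loop means there is no bounds test per pixel, and each row is one contiguous write.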
i gained more speed. here's what i learned today from this youtube video ( https://www.youtube.com/watch?v=RSJcnBw0oWY&ab_channel=SurajSharma ):
1 - the 'Build Project' type is 'Release'... much faster than 'Debug';
2 - using '++i' instead of 'i++' seems faster;
3 - before the calculation i just test whether the pixels are on the avatar\surface or not:
//Draw Line using the Steps\ Incrementation and start from their origin:
        float X = X0;
        float Y = Y0;
        float Z = Z0;
        BYTE R = GetRValue(LineColor);
        BYTE G = GetGValue(LineColor);
        BYTE B = GetBValue(LineColor);
        
        //Before the loop we create and initializate the variables:
        int PosX=0;
        int PosY=0;
        float EyeDistance = 500;
        float Perspective=0;
        size_t pixelOffset=0;
        
        //on loop using '++i' instead 'i++' is more faster:
        for(int i =0; i <LineDistance; ++i)
        {
            //for we win speed, 1st we test if the pixel is on Surface\Avatar...
            //if is we add the coordenates steps and we go to the next cycle on the loop:
            if(Z<0 && PosX>=ImageWidth && PosX<0 && PosY>=ImageHeight && PosY<0)
            {
                X+=XSteps;
                Y+=YSteps;
                Z+=ZSteps;
                continue;
            }
            
            //For every steps we calculate the perspective, because the Z is diferent:
            Perspective = EyeDistance/(EyeDistance+Z);

            //The 3D to 2D convertion(i use 300 of eye distance, but we can change it):
            PosX = trunc(X*Perspective);
            PosY = trunc(Y*Perspective);
            
            //the pointer is 1D array, so we calculate the index using the PixelSize and ScanLineSize:
            pixelOffset = (PosY) *scanlineSize + (PosX) *pixelSize;
            
            Pixels[pixelOffset+2]=R;//Red;
            Pixels[pixelOffset+1]=G;//Green;
            Pixels[pixelOffset+0]=B;//Blue.

            //Increment steps(integer results):
            X+=XSteps;
            Y+=YSteps;
            Z+=ZSteps;
        }

using these changes and keeping the '#pragma'.. drawing 2 rectangles i get: ~90FPS.
these changes were in DrawLine().. tomorrow i will try to change DrawRectangle().
thanks for all.
at least, now, i understand more about optimization and what else i can change... with time i will learn more and gain experience.
thank you so much to all
correct me on 1 thing: in the compiler options i added 2 '*.a' libraries; using Release, are they still combined into the exe too? (on Debug they are, on Release, i don't know)
i even added the C++ libs too
1 - the 'Build Project' type is 'Release'... much faster than 'Debug';

"Release" essentially means you enable compiler optimizations.

"Debug" disables (most) compiler optimizations and enables debug-symbols to make it easier to debug your code. It might also enable extra checks that slows down the code but this depends a lot on the compiler/IDE that you use and how it has been configured.

2 - using '++i' instead of 'i++' seems faster;

For built-in types you won't notice any difference. For custom types (e.g. iterator classes) it might sometimes make a small difference but even then it's unlikely that it will be enough to be noticeable.
@Cambalinho

1 - the 'Build Project' type is 'Release'... much faster than 'Debug';


I feel sorry that after 9 years of being a member of this site, you have only just learnt this :+|

Running a program compiled as Debug is like trying to win Formula One in first gear only ..... then trying to micro-tune the aerodynamics (multithreading your code); then using a stopwatch accurate to 1 millionth of a second (1600 FPS ????)

I am sorry, this is all sounding ridiculous right now.
I don't know GCC, but with VS you have optimisation options - eg small/fast code etc
sorry, only now. but that's life.
"I feel sorry that after 9 years of being a member of this site, you have only just learnt this :+|"
honestly i'm learning C\C++, and even CodeBlocks or VS, by myself and not at school.
so it isn't easy to find out that 'Debug' is much slower than 'Release'.
now i need to get more speed in DrawRectangle(), so i must update it ;)
but let's see what i have learned to speed up the code:
1 - using the Release instead of the Debug project type;
2 - using '#pragma' commands:
#pragma GCC optimize("Ofast")
#pragma GCC target("avx,avx2,fma") 

on VS i don't know the commands... and, i believe, the compiler options have their own settings;
3 - we must review the code, because we can gain more speed than we think in places we didn't notice... like i did by changing that 'if' before doing some calculations;
4 - in loops, we can gain more speed... in the video he said that '++i' is faster than 'i++'; honestly, for now, i haven't seen a big difference.
5 - maybe there are more tips\ideas, but i need to learn much more ;)
(it wasn't easy to find that youtube video)
if anyone knows a link to learn more about speed, please tell me.
thanks to all
4). Unless otherwise needed, it's good practice to use pre-inc rather than post-inc, simply to get into the habit of doing it. The issue arises with post-inc because the value returned has to be the value before the inc, which usually means a copy of the object has to be made. This can be negligible (such as for an integral type, which is why you would see little, if any, change here) or expensive, depending upon the requirements of the copy constructor of the object being incremented. Also, pre-inc returns a ref and post-inc returns a value. The same applies to pre/post decrement.
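To make the copy visible, here is a small sketch with a made-up iterator-like type (CountingIter is hypothetical, purely for illustration) that counts how often its copy constructor runs:

```cpp
// Hypothetical iterator-like type that counts how often it is copied.
struct CountingIter {
    int value = 0;
    static int copies;

    CountingIter() = default;
    CountingIter(const CountingIter& other) : value(other.value) { ++copies; }
    CountingIter& operator=(const CountingIter&) = default;

    // Pre-increment: modify in place and return a reference. No copy.
    CountingIter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: must return the old value, so it copies first.
    CountingIter operator++(int) {
        CountingIter old = *this;  // the copy described above
        ++value;
        return old;
    }
};

int CountingIter::copies = 0;
```

After ++it the counter is still 0; after it++ it is at least 1 (the exact number depends on whether the compiler elides the copy on return). For an int there is nothing comparable to copy, which matches the "no visible difference" result for the loop counter.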

Use of trunc(): as the result is assigned to an int, you might try just using a simple cast:

PosX = int(X*Perspective);
PosY = int(Y*Perspective);
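As a quick check that the cast gives the same pixel coordinate, here is a sketch with two made-up helper functions (the names are hypothetical). Both drop the fractional part, i.e. round toward zero, as long as the value fits in an int:

```cpp
#include <cmath>

// Version with trunc(): truncates as a float, then converts to int.
int ProjectWithTrunc(float coordinate, float perspective) {
    return static_cast<int>(std::trunc(coordinate * perspective));
}

// Version with only the cast: float-to-int conversion already truncates.
int ProjectWithCast(float coordinate, float perspective) {
    return static_cast<int>(coordinate * perspective);
}
```

Whether the cast is measurably faster depends on the compiler and flags, so it is worth timing rather than assuming.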


What is this code supposed to do?

if(Z<0 && PosX>=ImageWidth && PosX<0 && PosY>=ImageHeight && PosY<0)
            {
                X+=XSteps;
                Y+=YSteps;
                Z+=ZSteps;
                continue;
            }


If the if condition is true, won't the for loop just spin doing nothing (other than incrementing X, Y, Z - which aren't used past the for loop) until the for condition becomes false? PosX and PosY, which are tested in the if statement, aren't altered in the if body before the continue.

If there is further code later which does use X, Y and Z then why not just change X, Y, Z using a simple calculation outside of the loop?

Possibly something like (NOT tried):

float X = X0;
float Y = Y0;
float Z = Z0;
BYTE R = GetRValue(LineColor);
BYTE G = GetGValue(LineColor);
BYTE B = GetBValue(LineColor);

//Before the loop we create and initializate the variables:
int PosX = 0;
int PosY = 0;
float EyeDistance = 500;
float Perspective = 0;
size_t pixelOffset = 0;
int i = 0;

//on loop using '++i' instead 'i++' is more faster:
for (; i < LineDistance && !(Z < 0 && PosX >= ImageWidth && PosX < 0 && PosY >= ImageHeight && PosY < 0); ++i) {
	//For every steps we calculate the perspective, because the Z is diferent:
	Perspective = EyeDistance / (EyeDistance + Z);

	//The 3D to 2D convertion(i use 300 of eye distance, but we can change it):
	PosX = int(X * Perspective);
	PosY = int(Y * Perspective);

	//the pointer is 1D array, so we calculate the index using the PixelSize and ScanLineSize:
	pixelOffset = PosY * scanlineSize + PosX * pixelSize;

	Pixels[pixelOffset + 2] = R;//Red;
	Pixels[pixelOffset + 1] = G;//Green;
	Pixels[pixelOffset + 0] = B;//Blue.

	//Increment steps(integer results):
	X += XSteps;
	Y += YSteps;
	Z += ZSteps;
}

// Only if X, Y, Z are required later on: advance them by the steps that were skipped

if (i < LineDistance) {
	auto d = LineDistance - i;

	X += XSteps * d;
	Y += YSteps * d;
	Z += ZSteps * d;
}


Once the for loop is entered, will Z, PosX and PosY ever become < 0? If they can be < 0 only before the for loop is entered, then this can be tested first so something like:

if (Z > 0 && PosX > 0 && PosY > 0) {
//on loop using '++i' instead 'i++' is more faster:
	for (; i < LineDistance && !(PosX >= ImageWidth && PosY >= ImageHeight); ++i) {


thanks for all
now i get ~120FPS
correct me on another thing: is using pointers or references for function parameters faster too?
Passing by pointer/ref is faster for objects whose size is noticeably larger than a pointer, or where the object uses dynamic memory (so that a copy would have to allocate).

[PS see my post above as your reply crossed with changes to my post]
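As a rough illustration of the pointer/ref answer above, the sketch below passes a std::vector by value and by const reference and checks whether the function sees the caller's buffer. The struct and the two function names are assumptions made up for the example.

```cpp
#include <vector>

// Simple 3D point, used here only as example payload.
struct Position3D { float X; float Y; float Z; };

// By value: the vector is copied, so the function works on a new heap buffer.
bool SharesBufferByValue(std::vector<Position3D> points, const Position3D* callerData) {
    return points.data() == callerData;
}

// By const reference: nothing is copied, the function sees the caller's buffer.
bool SharesBufferByRef(const std::vector<Position3D>& points, const Position3D* callerData) {
    return points.data() == callerData;
}
```

With a non-empty vector pts, SharesBufferByValue(pts, pts.data()) is false and SharesBufferByRef(pts, pts.data()) is true. For small built-in types such as float or int, passing by value is normally the cheaper choice.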
if(Z<0 && PosX>=ImageWidth && PosX<0 && PosY>=ImageHeight && PosY<0)
            {
                X+=XSteps;
                Y+=YSteps;
                Z+=ZSteps;
                continue;
            }

it avoids calculations outside the image\surface\avatar. doing this i gain more speed. i can change it, and the increment step is done too.
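One note on the quoted condition: as written it can never be true, because PosX cannot be both >= ImageWidth and < 0 at the same time (the same goes for PosY), so nothing is ever skipped. An "outside the image" test needs || between the individual checks. A minimal sketch, with IsOutside being a hypothetical helper and the parameters mirroring the names used in this thread:

```cpp
// True when the projected pixel must be skipped.
// Parameter names mirror the thread; the function itself is hypothetical.
bool IsOutside(float z, int posX, int posY, int imageWidth, int imageHeight) {
    return z < 0
        || posX < 0 || posX >= imageWidth
        || posY < 0 || posY >= imageHeight;
}
```

PosX and PosY have to be calculated for the current step before the test, otherwise the check looks at the values from the previous iteration.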
ok.. i changed it to:
for(int i =-1; i <LineDistance; ++i)
        {
            //for we win speed, 1st we test if the pixel is on Surface\Avatar...
            //if is we add the coordenates steps and we go to the next cycle on the loop:
            if((Z>0 && PosX<ImageWidth && PosX>=0) || (PosY<ImageHeight && PosY>=0))
            {
                //For every steps we calculate the perspective, because the Z is diferent:
                Perspective = EyeDistance/(EyeDistance+Z);

                //The 3D to 2D convertion(i use 300 of eye distance, but we can change it):
                PosX = int(X*Perspective);
                PosY = int(Y*Perspective);

                //the pointer is 1D array, so we calculate the index using the PixelSize and ScanLineSize:
                pixelOffset = (PosY) *scanlineSize + (PosX) *pixelSize;

                Pixels[pixelOffset+2]=R;//Red;
                Pixels[pixelOffset+1]=G;//Green;
                Pixels[pixelOffset+0]=B;//Blue.
            }

            //Increment steps(integer results):
            X+=XSteps;
            Y+=YSteps;
            Z+=ZSteps;
        }


i need another correction: if i use pre-increment, 'i' must start at '-1', right? (so as not to lose the 1st index)
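On the start index: in a for loop the increment expression runs after the body and its result is thrown away, so '++i' and 'i++' visit exactly the same indices. Starting at 0 loses nothing, while starting at -1 adds one extra iteration. A small sketch to check it (the function names are made up):

```cpp
#include <vector>

// Collects the index values the loop body sees when using ++i.
std::vector<int> IndicesPreIncrement(int count) {
    std::vector<int> seen;
    for (int i = 0; i < count; ++i) seen.push_back(i);
    return seen;
}

// Same loop, but with i++.
std::vector<int> IndicesPostIncrement(int count) {
    std::vector<int> seen;
    for (int i = 0; i < count; i++) seen.push_back(i);
    return seen;
}
```

For count = 4 both functions return 0, 1, 2, 3.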
now i'm totally confused :(
i can create a vertical rectangle without problems:
https://imgur.com/ghKyi7J
img.DrawRectangle(0,20,0, 800,500,1000, RGB(255,0,0),true);
img.DrawRectangle(800,20,0, 800,500,1000, RGB(255,0,0),true);

struct Position3D
{
    float X;
    float Y;
    float Z;
};
vector<Position3D> GetLinePoints3D(Position3D Origin, Position3D Destination)
{

    vector<Position3D> LinePoints;
    Position3D LinePoint;
    //Getting Line Distance(float results) coordenates and line:
    float DX = abs(Destination.X - Origin.X);
    float DY = abs(Destination.Y - Origin.Y);
    float DZ = abs(Destination.Z - Origin.Z);
    float LineDistance =sqrt((DX * DX) + (DY * DY) + (DZ * DZ));


    //Getting the Steps incrementation(float results) from their distances and LineDistance:

    float XSteps;
    if(DX==0)
        XSteps =0;
    else
        XSteps = DX/LineDistance;

    float YSteps;
    if(DY==0)
        YSteps = 0;
    else
        YSteps = DY/LineDistance;
    float ZSteps;
    if(DZ==0)
        ZSteps = 0;
    else
        ZSteps = DZ/LineDistance;

    //Draw Line using the Steps\ Incrementation and start from their origin:
    float X = Origin.X;
    float Y = Origin.Y;
    float Z = Origin.Z;
    for(int i =0; i <LineDistance; ++i)
    {
        if (Z>=0)
        {
            LinePoint.X=X;
            LinePoint.Y=Y;
            LinePoint.Z=Z;
            LinePoints.push_back(LinePoint);
        }
        X+=XSteps;
        Y+=YSteps;
        Z+=ZSteps;
    }
    return LinePoints;
}

void DrawRectangle(float PosX, float PosY, float PosZ, float Width, float Height, float Depth, COLORREF Color = RGB(255,0,0), bool Filled = false)
    {
        if(Filled==false)
        {
            DrawLine( PosX, PosY, PosZ,PosX + Width, PosY, PosZ + Depth, Color);
            DrawLine( PosX, PosY, PosZ, PosX, PosY + Height, PosZ, Color);
            DrawLine( PosX + Width, PosY, PosZ + Depth, PosX + Width, PosY+Height, PosZ + Depth, Color);
            DrawLine( PosX, PosY + Height, PosZ, PosX + Width, PosY + Height, PosZ + Depth, Color);
        }
        else
        {
           vector<Position3D> LeftVerticalLine = GetLinePoints3D({PosX,PosY,PosZ} , {PosX,PosY+Height,PosZ});
           vector<Position3D> RightVerticalLine = GetLinePoints3D({PosX+Width,PosY,PosZ+Depth} , {PosX+Width,PosY+Height,PosZ+Depth});
           for(size_t i=0;i<LeftVerticalLine.size(); i++ )
                DrawLine(LeftVerticalLine[i].X,LeftVerticalLine[i].Y, LeftVerticalLine[i].Z, RightVerticalLine[i].X,LeftVerticalLine[i].Y, RightVerticalLine[i].Z,  Color);

        }
    }

but now i'm confused about how to create the floor rectangle :(
the rectangle is drawn from left to right. but i'm missing something :(
i did:
img.DrawRectangle(800,500,0, 800,500,1000, RGB(255,0,0),true);
but i get a runtime error: "Process terminated with status -1073741510 (0 minute(s), 8 second(s))"
i must find this error.
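A hint for reading that status: Windows status codes are documented in hex, so it helps to convert the signed number first. The helper below is a made-up sketch for that conversion:

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Formats a signed process exit status as the hex value used in the
// Windows NTSTATUS documentation.
std::string ExitStatusToHex(std::int32_t status) {
    std::ostringstream out;
    out << "0x" << std::hex << std::uppercase
        << static_cast<std::uint32_t>(status);
    return out.str();
}
```

ExitStatusToHex(-1073741510) gives 0xC000013A, which is STATUS_CONTROL_C_EXIT: the process was ended by Ctrl+C or by closing the console window. So this is probably not a crash inside DrawRectangle() itself. More likely the program was still busy (or stuck) when the window was closed, but that part is a guess.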
finally i did something:
void DrawPlane(float PosX, float PosY, float PosZ, float Width, float Height, float Depth, COLORREF Color = RGB(255,0,0))
    {
        for(int i=-1; i<Width; ++i)
            DrawLine(PosX+i,PosY,PosZ, PosX+Width+i,PosY,Depth, RGB(0,255,0),true);
    }

using this function i can draw a plane that is like a floor\sky or similar.
in time i will add texture ;)
at this moment i have 2 rectangles and 2 planes and i get ~65FPS.
maybe i need more work on speed, but it seems great.. thanks to all
Have you used a profiler on the code to see where the 'bottlenecks' are? As you're using GCC, consider gprof:
http://www.math.utah.edu/docs/info/gprof_toc.html
honestly, what are 'bottlenecks'?
i will look at the link ;)
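A bottleneck is the part of the program where most of the run time goes, so it is the part worth optimizing first. A profiler finds it for you. As a much cruder alternative, you can time individual calls by hand. The sketch below uses std::chrono for that, and MeasureMilliseconds is a made-up helper name:

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double MeasureMilliseconds(Work&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, wrapping the DrawRectangle() call and the BitBlt call in separate lambdas and comparing the two times would show which one dominates a frame. Single measurements are noisy, so averaging over many frames gives a more reliable picture.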