Assignment operator bottlenecking simula

Forum

Forum
General C++ Programming
Assignment operator bottlenecking simula

Assignment operator bottlenecking simulation

I am running a Monte Carlo simulation of a polymer. The entire configuration of the current state of the system is given by the object called Grid. This is my definition of Grid:

class Grid{

public:
    std::vector <Polymer> PolymersInGrid;                    // all the polymers in the grid 
    int x;                                                   // length of x-edge of grid 
    int y;                                                   // length of y-edge of grid 
    int z;                                                   // length of z-edge of grid 
    double kT;                                               // energy factor 
    double Emm_n ;                                           // monomer-solvent when Not aligned 
    double Emm_a ;                                           // monomer-solvent when Aligned
    double Ems;                                              // monomer-solvent interaction
    double Energy;                                           // energy of grid 
    std::map <std::vector <int>, Particle> OccupancyMap;     // a map that gives the particle given the location
    

    Grid(int xlen, int ylen, int zlen, double kT_, double Emm_a_, double Emm_n_, double Ems_): x (xlen), y (ylen), z (zlen), kT (kT_), 
Emm_n(Emm_n_), Emm_a (Emm_a_), Ems (Ems_) {        // Constructor of class
        // this->instantiateOccupancyMap(); 
    };


    // Destructor of class 
    ~Grid(){                                    

    }; 

    // assignment operator that allows for a correct transfer of properties. Important to functioning of program. 
    Grid& operator=(Grid other){
        std::swap(PolymersInGrid, other.PolymersInGrid); 
        std::swap(Energy, other.Energy); 
        std::swap(OccupancyMap, other.OccupancyMap);
        return *this; 
    } 
.
.
.
}

I can go into the details of the object Polymer and Particle, if required.

In my driver code, this is what I am going:
Define maximum number of iterations.
1. Defining a complete Grid G.
2. Creating a copy of G called G_.
3. I am perturbing the configuration of G_.
4. If the perturbance on G_ is accepted per the Metropolis criterion, I assign G_ to G (G=G_).
5. Repeat steps 1-4 until maximum number of iterations is achieved.

This is my driver code:

auto start = std::chrono::high_resolution_clock::now(); 
Grid G_ (G); 
    int acceptance_count = 0; 
    for (int i{1}; i< (Nmov+1); i++){

        // choose a move 
        G_ = MoveChooser(G, v);  

        if ( MetropolisAcceptance (G.Energy, G_.Energy, G.kT) ) {
            // accepted
            // replace old config with new config

			acceptance_count++; 
			std::cout << "Number of acceptances is " << acceptance_count << std::endl;
            G = G_;
        }


        else {
            // continue;
        }

        if (i % dfreq == 0){
            G.dumpPositionsOfPolymers (i, dfile) ;
            G.dumpEnergyOfGrid(i, efile, call) ; 
        }
        // G.PolymersInGrid.at(0).printChainCoords();


    }
    
    
    
    auto stop = std::chrono::high_resolution_clock::now(); 
    
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds> (stop-start); 

    std::cout << "\n\nTime taken for simulation: " << duration.count() << " milliseconds" << std::endl;

This is the interesting part: if I run the simulation using condition that do not have lots of "acceptances" (low temperatures, bad solvent), the simulation runs pretty fast. However, if there are a large number of acceptances, the simulation gets incredibly slow. **My hypothesis is that my assignment operator = is slowing down my simulation.**
I ran some tests:

number of acceptances = 25365, wall-clock time = 717770 milliseconds (!)

number of acceptances = 2165, wall-clock time = 64412 milliseconds

number of acceptances = 3000, wall-clock time = 75550 milliseconds

And the trend continues.
Could anyone advise me on how to possibly make this more efficient? Is there a way to bypass the slowdown I am experiencing, I think, due to the = operator?

I would really appreciate any advice you have for me!

Last edited on

salem c (3723)

> **My hypothesis is that my assignment operator = is slowing down my simulation.**
Don't guess, measure.
https://en.wikipedia.org/wiki/Gprof

number of acceptances = 25365, wall-clock time = 717770 milliseconds (!) => 28.3 mS/acceptance
number of acceptances = 2165, wall-clock time = 64412 milliseconds => 29.75 mS/acceptance
number of acceptances = 3000, wall-clock time = 75550 milliseconds => 25.1 mS/acceptance
Seems like a pretty linear relationship to me.

Maybe you want to focus on your simulation converging on an answer quicker, so it doesn't take 25K iterations to get there.

Like from the profile, if you find computing an acceptance takes 20mS, and the assignment 5mS, then your attempt to "optimise" assignment is going to leave you disappointed.

Before you dig a hole, make sure you're digging in the right place.

Oh, and before you start mauling the code, make sure you have source control in place.
https://en.wikipedia.org/wiki/Git

JLBorges (13770)

This assignment operator Grid& operator=(Grid other){ // ... first makes a full copy of the grid object on the right hand side.
Would it be possible (and faster) to transfer just the relevant properties without having to make a copy?

jonnin (11497)

swap is not doing anything. That may be said above? 'Other' is not passed by reference, so the changes made to it by swap do not 'keep' ... and swap is likely using a 3 way temporary method that is safe for everything, not something cute like registers / call stack / xor etc.

if you need to actually swap 2 objects instead of copy them, do it with pointers and swap the pointers. Not saying dynamic memory, just a reference pointer to your real storage in a working container.

just assign the values if you don't need the swap, and whether you need the swap or not, as said above pass it in by reference, see if those 2 ideas help. But, as noted, this may not be the actual main bottleneck.

Last edited on

seeplus (6653)

swap in this context is a way of doing the assignment for those 3 member variables. As other is passed by value (ie copy constructor), and then it's values swapped with this values as required, then when other destructor is called the old values of the class will be freed.

A timing issue is that the copy copies 2 vectors and a map. This can be expensive if these have a large number of members...

The performance issue is likely L15 in driver - G = G_;
As G_ is not used after, then this could be a move assignment rather than a copy assignment. This would avoid a potentially expensive copy.

I suggest that you add a move assignment to class Grid and then change driver L15 to:

G = std::move(G_);

mencecpp (10)

@seeplus, thank you for your response.
You are right. I am not using G_.

I have written a move operator and copy constructor as follows:

    Grid (const Grid &g){
        this->x = g.x; 
        this->y = g.y; 
        this->z = g.z;
        this->kT = g.kT; 
        this->Emm_n = g.Emm_n; 
        this->Emm_a = g.Emm_a; 
        this->Ems = g.Ems; 
        this->Energy = g.Energy; 
        this->OccupancyMap = g.OccupancyMap; 
    }

    // assignment operator that allows for a correct transfer of properties. Important to functioning of program. 
    Grid& operator=(Grid&& other){
        // swap creates unnecessary copies which take time to write 
        // std::swap(PolymersInGrid, other.PolymersInGrid); 
        // std::swap(Energy, other.Energy); 
        // std::swap(OccupancyMap, other.OccupancyMap);
        this->PolymersInGrid = other.PolymersInGrid;  
        this->OccupancyMap = other.OccupancyMap;
        this->Energy = other.Energy; 
        return *this; 
    }

However, this is bringing along an ocean of errors. Could you advice me on how to go about this?

seeplus (6653)

No, that isn't a move assignment as it's doing assignments.

Thinking about it, it's probably better to keep your original operator=() and provide a move constructor for the class. Something like (not tried):

// Move constructor as part of class
Grid(Grid&& g) noexcept: PolymersInGrid(std::move(g.PolkymersInGrid)), 
 OccupancyMap(std::move(g.OccupancyMap)), energy(g.energy) {}

...

// This will call copy or move constructor as appropriate
Grid& operator=(Grid other) noexcept {
    std::swap(PolymersInGrid, other.PolymersInGrid); 
    std::swap(Energy, other.Energy); 
    std::swap(OccupancyMap, other.OccupancyMap);
    return *this;

Note when using move with standard classes (vector, map etc), your move class functions needs to be marked as noexcept - otherwise it's done as a copy and not a move.

Have you got a copy constructor?

Last edited on

seeplus (6653)

PS. Hang on a minute. There's no dynamic memory usage in the class - so why provide a destructor? The default copy/move constructor/assignment will do just fine. So why provide a copy/move assignment? Why not simply:

class Grid {
public:
	std::vector <Polymer> PolymersInGrid;                    // all the polymers in the grid 
	int x;                                                   // length of x-edge of grid 
	int y;                                                   // length of y-edge of grid 
	int z;                                                   // length of z-edge of grid 
	double kT;                                               // energy factor 
	double Emm_n;                                           // monomer-solvent when Not aligned 
	double Emm_a;                                           // monomer-solvent when Aligned
	double Ems;                                              // monomer-solvent interaction
	double Energy;                                           // energy of grid 
	std::map <std::vector <int>, Particle> OccupancyMap;     // a map that gives the particle given the location

	Grid(int xlen, int ylen, int zlen, double kT_, double Emm_a_, double Emm_n_, double Ems_) : x(xlen), y(ylen), z(zlen), kT(kT_),
		Emm_n(Emm_n_), Emm_a(Emm_a_), Ems(Ems_) {}
//...
};

and driver L15 as:

G = std::move(G_);

will automatically generate the required move assignment.

See:
https://en.cppreference.com/w/cpp/language/move_assignment
re implicitly declared move assignment

https://en.cppreference.com/w/cpp/language/copy_assignment
re implicitly declared copy assignment

Last edited on

mencecpp (10)

Thank you @seeplus, this helped quite a bit! I appreciate your advice!

lastchance (6980)

Why are you trying to make entire copies of a "grid" every time? It's going to be costly shunting containers around unnecessarily.

It's not entirely clear what you are doing (other than using the Metropolis algorithm). Surely there only needs to be one master "occupancy" map and collection of polymers? Your acceptance or otherwise in the next step of the chain depends only on the energy levels.

Last edited on

dhayden (5799)

When copying a Grid, is it really valid to copy some of the members and not others?

mencecpp (10)

@lastchance:
I am performing a perturbation on my Polymers using MoveChooser, then checking if that perturbation is good or not using Metropolis, and then assigning the new perturbed configuration. Yes, the only thing that is changing is "PolymersInGrid" and as a result, "OccupancyMap".

Would you say that using some operator for Grid is pretty unnecessary, and it would be most efficient if I simply exchanged positions of polymers, and updated only the OccupancyMap?

mencecpp (10)

maybe instead of G = std::move(G_), the easiest thing to do might be

1
2

G.PolymersInGrid = G_.PolymersInGrid 
G.OccupancyMap = G_.OccupancyMap

Do you think this makes sense @lastchance?

lastchance (6980)

You aren't giving much detail of what you are actually doing - a reference would help. What is in your large containers PolymersInGrid and OccupancyMap that you need multiple versions of them?

The only things that you need to propose and check a potential next step in the chain (your "acceptance" criteria) are ACCESS to a master grid (e.g. to check that you don't self-intersect), possibly physical position, and the energy levels (to evaluate the probability exp(-dE/k.T) that the change is physically possible at that temperature T).

At the moment I can't see any good reason to either copy (which is very expensive) or move (which is unnecessary) the large containers PolymersInGrid and OccupancyMap. It looks like you simply need access to master versions, not creating new versions in every step.

Also, are you sure that some of the entities you are referring to as "polymers" aren't actually their constituent parts ("monomers")? You could do with supplying some detail to explain what you are actually doing.

Last edited on

salem c (3723)

Still waiting to see some profile data.

Otherwise, you're just playing pin the tail on the donkey.

seeplus (6653)

maybe instead of G = std::move(G_), the easiest thing to do might be

G.PolymersInGrid = G_.PolymersInGrid
G.OccupancyMap = G_.OccupancyMap

No. Here you're doing a copying of the dynamic containers - rather than a move as with std::move(). It's copying of the dynamic containers that are expensive. That's why you move them as after the move the old values are not required.

Last edited on

Topic archived. No new replies allowed.

C++

Forum

Assignment operator bottlenecking simulation