OpenMP parallelisation - how to make variables that last for the lifetime of each thread

Hi all,

I am trying to parallelise a for loop in C++ with OpenMP. In each iteration of the outer loop, a vector is populated with values from various calculations.

std::vector<double> data(max_len);

#pragma omp parallel for
for (size_t i=0; i < max_len; ++i) {
  for (size_t j=0; j < max_len; ++j) {
    // heavy calculation -> result
    data[j] = result; //error!!
  }
}


Now the problem is that if I define the vector outside of the loop, it is shared between the threads. That does not work, because each thread needs the whole vector for itself.

I can define the vector inside the for loop, making it private:
#pragma omp parallel for
for (size_t i=0; i < max_len; ++i) {
  std::vector<double> data(max_len);
  for (size_t j=0; j < max_len; ++j) {
    // heavy calculation -> result
    data[j] = result; // works but slow
  }
}


But this means the vector is created and destroyed in every iteration of the outer loop, which is quite expensive.

What I want is that when each thread starts, it gets its own copy of the vector<double> data, and that copy stays with it until the thread ends. That way there is no repeated creation and destruction, only assignment into the same vector.

I have heard about openmp threadprivate, but I am not sure if and how it would work in this case.

How do I make sure that each thread gets its own vector, and that it lasts for the lifetime of that thread? (Any other advice for optimization is also appreciated!!)
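One common way to get a per-thread vector, sketched here under assumptions: split the combined directive into `#pragma omp parallel` and `#pragma omp for`, and declare the vector inside the parallel region but outside the loop. The vector is then constructed once per thread and reused across all the iterations that thread executes. The names `heavy_calculation`, `process_rows` and `row_sums` are hypothetical stand-ins, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the "heavy calculation" in the question.
static double heavy_calculation(std::size_t i, std::size_t j) {
    return static_cast<double>(i) * 0.5 + static_cast<double>(j);
}

// For each outer index i, fills a scratch vector and stores the sum of its values.
std::vector<double> process_rows(std::size_t max_len) {
    std::vector<double> row_sums(max_len, 0.0);

    #pragma omp parallel
    {
        // Constructed once per thread when the parallel region starts,
        // destroyed once per thread when the region ends.
        std::vector<double> data(max_len);

        // The iterations are divided among the threads of the enclosing region.
        #pragma omp for
        for (std::size_t i = 0; i < max_len; ++i) {
            // The scratch buffer is never resized, so operator[] stays in bounds.
            assert(data.size() == max_len);
            for (std::size_t j = 0; j < max_len; ++j) {
                data[j] = heavy_calculation(i, j);
            }
            double sum = 0.0;
            for (double v : data) sum += v;
            row_sums[i] = sum;  // each index i is written by exactly one thread
        }
    }
    return row_sums;
}
```

If the file is compiled without OpenMP enabled, the pragmas are ignored and the same code runs serially with a single scratch vector.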
"I am not using vector.push_back() because I have heard that it is much slower than the [] operator."

The two operators do different operations. push_back adds elements to the end of a vector, operator[] gives random access to already existing elements.
http://www.cplusplus.com/reference/vector/vector/push_back/
http://www.cplusplus.com/reference/vector/vector/operator[]/

Apples and oranges.

operator[] doesn't prevent going out of bounds with a vector, or any other container that has [] access. If you want that protection consider std::vector::at.
http://www.cplusplus.com/reference/vector/vector/at/

at() performs the same action as operator[] (it accesses the specified element) but adds a bounds check, which makes it slower than []. If the index is out of bounds, at() throws a std::out_of_range exception, which you have to catch.
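A minimal sketch of the difference, assuming a hypothetical helper named `at_throws`: at() reports an out-of-range index by throwing std::out_of_range, whereas operator[] with the same index would be undefined behaviour and is therefore not exercised here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(index) throws std::out_of_range, false otherwise.
bool at_throws(const std::vector<double>& v, std::size_t index) {
    try {
        (void)v.at(index);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small self-check: valid indices for a vector of size 3 are 0, 1 and 2.
void demo_at() {
    std::vector<double> v(3);
    assert(!at_throws(v, 2));  // last valid index
    assert(at_throws(v, 3));   // one past the end
}
```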
In terms of performance, push_back is microscopically slower than [] on a preallocated vector because it also updates the size and has to check that there is reserved capacity left. It has to do more work, so it can never be as fast as [], which is effectively a single pointer access just like a C array.
Where push_back is terribly slow is when the container is not preallocated, because it then has to reallocate and move the elements as it grows.
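To illustrate the preallocation point, here is a small sketch that counts how often the capacity changes while appending. The function name `count_reallocations` is hypothetical, and the exact count without reserve() depends on the standard library's growth policy, so only "zero" versus "more than zero" is relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector's capacity changes while n elements are appended.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<double> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<double>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}

// Small self-check: reserving first means push_back never has to grow the buffer.
void demo_reserve() {
    assert(count_reallocations(1000, true) == 0);
    assert(count_reallocations(1000, false) > 0);
}
```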
Thanks for the replies, but my question is not really about vector.push_back(), vector.at() or the [] operator. I researched all of that and decided to use [] because it is the fastest of the options for populating a vector. Bounds checking is unnecessary in this case: the loop can never reach an out-of-bounds index, because the vector is created with the same size as the loop limit. (I removed that part from the question.)

My question is really about how to use OpenMP parallel for while ensuring that each thread gets its own copy of the variable, which lasts for the lifetime of the thread. Any input on that would be really helpful.
20 seconds of internet searching yields
#pragma omp threadprivate(identifiers)
https://www.openmp.org/spec-html/5.0/openmpsu104.html
@mbozzi Thank you, I also found threadprivate while searching for it, but I am not sure how exactly I should use it in the program.

For example, should I write this?

std::vector<double> data(max_len);
#pragma omp threadprivate(data)

#pragma omp parallel for
for (size_t i=0; i < max_len; ++i) {
  for (size_t j=0; j < max_len; ++j) {
    // heavy calculation -> result
    data[j] = result;
  }
}


Now, when I initialize the vector with std::vector<double> data(max_len), it reserves the necessary amount of memory and initializes every member of the vector to 0. This is absolutely crucial because I will be using the [] operator on the vector after this, which has no bounds checking, or memory reserve checking, so the vector has to be properly initialized.

The OpenMP manuals say that threadprivate variables are private copies for each thread. What I don't understand is whether the values assigned before the parallel region starts are still there when the copies are made.

I have also heard about the firstprivate clause. I am not sure which one would be the most appropriate for my code.
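For comparison, a hedged sketch of the firstprivate route. As far as the OpenMP specification goes, firstprivate is a clause on a directive rather than a directive of its own, and each thread's private copy is copy-constructed from the original, so the size and contents set before the parallel region are carried over. threadprivate, by contrast, applies only to variables with static storage duration (file scope, namespace scope or static), and its per-thread copies do not pick up the primary thread's current values unless a copyin clause is used. The names `heavy_calculation`, `process_rows_firstprivate` and `last_values` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the "heavy calculation" in the question.
static double heavy_calculation(std::size_t i, std::size_t j) {
    return static_cast<double>(i) * 0.5 + static_cast<double>(j);
}

// For each outer index i, fills a scratch vector and records its last element.
std::vector<double> process_rows_firstprivate(std::size_t max_len) {
    std::vector<double> last_values(max_len, 0.0);
    std::vector<double> data(max_len);  // sized and zero-initialised once, up front

    // Each thread gets its own copy of data, copy-constructed from the original,
    // which it keeps for all the iterations it executes.
    #pragma omp parallel for firstprivate(data)
    for (std::size_t i = 0; i < max_len; ++i) {
        // The private copy has the same size as the original.
        assert(data.size() == max_len);
        for (std::size_t j = 0; j < max_len; ++j) {
            data[j] = heavy_calculation(i, j);
        }
        last_values[i] = data[max_len - 1];  // each index i is written by one thread
    }
    return last_values;
}
```

Compared with declaring the vector inside a `#pragma omp parallel` block, this costs one extra copy of the original per thread, which is usually negligible next to the work in the loop.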