OpenMP Seg Fault

I have a program that finds similar strings in a database consisting of a million strings, all of the same length m. The input to my program is a string of length n (n >> m), and the program checks each of its substrings of length m. It is implemented in C++ and uses OpenMP, but I am getting a segmentation fault in the following piece of code. The strings contain only four characters: A, B, C and D. For each substring, the program retrieves the similar strings and, if the Hamming distance is less than a given value, counts the occurrences of each character (stored in ARR).
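
For reference, the hamming_distance helper is not shown in this post. A minimal sketch of what it is assumed to compute (the number of positions at which two equal-length strings differ) could look like this; the signature is a guess based on how the function is called in the code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the hamming_distance helper, which the post
// does not show. Counts the positions at which two strings of equal
// length differ.
int hamming_distance(const std::string& a, const std::string& b)
{
    // The Hamming distance is only defined for strings of equal length.
    assert(a.length() == b.length());

    int mismatches = 0;
    for (std::size_t i = 0; i < a.length(); ++i) {
        if (a[i] != b[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

For example, hamming_distance("ABCD", "ABDD") evaluates to 1, since the two strings differ only at index 2.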



std::map<char, int> mapp = {{'A', 0}, {'B', 1}, {'C', 2}, {'D', 3}};
int **ARR = (int**) malloc(sizeof(int*)*n);
for(int z=0; z<n; z++)
  ARR[z] = (int*) malloc(sizeof(int)*4);
for(int i=0; i<n; i++)
  for(int j=0; j<4; j++) ARR[i][j] = 0; 

std::vector<std::string> V;
#pragma omp parallel for private(V) shared(ARR)
for(int i=0; i<str.length()-m-1; i++){
  std::string substr = str.substr(i, m);
  for(int j=0; j<N; j++){  // N number of databases to search
    V = retrieve(substr);  // find similar strings
    for(int k=0; k<(int)V.size(); ++k) {
      int hd = hamming_distance(substr, V[k]); // no. of mismatches between substr and kth element of V
      if(hd < minHD){
        for(int l=0; l<m; l++){
          char c = V[k][l];
          #pragma omp critical
          {
            ARR[j+l][mapp.at(c)]++;
          }
        }
      }
    }
  }
}


I am getting a segmentation fault with the above code. Any help would be appreciated.
Hi, cpp82.
Your problem (a large number of strings built from only 4 characters) looks like something related to DNA.
I’m sorry I can’t help you much, mainly because I haven’t understood what you are trying to do.
I suspect this is the line that tries to access memory beyond your array limits:
ARR[j+l][mapp.at(c)]++;
I think the simplest thing would be to replace your C-style arrays with vectors, which you already use: their at() member function throws std::out_of_range on a bad index (operator[] does not check), which is a lot more informative than a segmentation fault.
Plus, with vectors, code becomes much easier to write (and to read).
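
To make the point about at() concrete, here is a small sketch. The names CountTable, make_counts and checked_increment are made up for illustration; the idea is just that a vector of four-element rows can replace the malloc'ed int** table, and that an out-of-range index is reported instead of silently corrupting memory:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One row of four counters (for A, B, C, D) per position.
// Hypothetical replacement for the malloc'ed int** ARR in the question.
using CountTable = std::vector<std::array<int, 4>>;

// Builds a table with 'positions' rows, every counter set to zero.
CountTable make_counts(std::size_t positions)
{
    return CountTable(positions, std::array<int, 4>{});
}

// Increments one counter through at(), which checks both indices.
// Returns false (instead of crashing) when either index is out of range.
bool checked_increment(CountTable& counts, std::size_t row, std::size_t col)
{
    try {
        counts.at(row).at(col)++;
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

One caveat if you use this inside an OpenMP loop: an exception has to be caught inside the parallel region where it was thrown (as checked_increment does), it must not escape from it.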

Since I couldn’t understand your code, partly because I don’t know what a “Hamming distance” is, I can’t come up with a directly helpful example; anyway, I hope the following one can give you some hints:
#include <iostream>
#include <limits>
#include <map>
#include <string>
#include <vector>

std::size_t makeAllStringsSameLength(std::vector<std::string> phrase);

int main()
{
   std::map<char, int> charintmap;
   std::vector<std::vector<int>> myvec;
   for(char c='A'; c<='Z'; c++) {
      charintmap.emplace(std::make_pair(c, int((c-'A'+1))));
      myvec.emplace_back(std::initializer_list<int>{c, 0});
   }
   for(char c='a'; c<='z'; c++) {
      charintmap.emplace(std::make_pair(c, int((c-'a'+'Z'-'A'+2))));
      myvec.emplace_back(std::initializer_list<int>{c, 0});
   }

   for(const auto& couple : charintmap) {
      std::cout << "charintmap.at(" << couple.first
                << ") == " << couple.second << '\n';
   }

   std::cout << "\nPress ENTER to continue...\n";
   std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

   for(size_t row{0}; row<myvec.size(); row++) {
      std::cout << "myvec.at(" << row << ").at(0) == "
                << char(myvec.at(row).at(0))
                << "; myvec.at(" << row << ").at(1) == "
                << myvec.at(row).at(1) << '\n';
   }

   std::cout << "\nPress ENTER to continue...\n";
   std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

   std::vector<std::string> words = { "next", "time", "please",
                                      "provide", "a", "compilable",
                                      "piece", "of", "your",
                                      "code", "along", "with",
                                      "examples", "of", "the",
                                      "data", "you", "are",
                                      "using" };

   // 'wordlen' would inform you about how long strings are, if needed.
   int wordlen = makeAllStringsSameLength(words);
   for(const auto& word : words) {
      for(const auto& ch : word) {
         int index{0};
         if('A' <= ch && ch <= 'Z') index = int(ch - 'A');                 // rows 0..25 hold 'A'..'Z'
         if('a' <= ch && ch <= 'z') index = int(ch - 'a' + 'Z' - 'A' + 1); // rows 26..51 hold 'a'..'z'
         myvec.at(index).at(1) += charintmap.at(ch);
      }
   }

   for(size_t row{0}; row<myvec.size(); row++) {
      std::cout << "myvec.at(" << row << ").at(0) == "
                << char(myvec.at(row).at(0))
                << "; myvec.at(" << row << ").at(1) == "
                << myvec.at(row).at(1) << '\n';
   }

   return 0;
}

std::size_t makeAllStringsSameLength(std::vector<std::string> phrase)
{
   std::size_t longest{0};
   for(auto& word : phrase) {
      if(word.length() > longest) {
         longest = word.length();
      }
   }

   for(auto& word : phrase) {
      std::size_t spaces = longest - word.length();
      switch (spaces) {
         case 0:
            break;
         case 1: // 1 / 2 == 0 --> integer division
            word.insert(0, 1, ' ');
            break;
         default:
            word.insert(0, spaces/2, ' ');
            // word.length() has changed, so recompute the right-hand padding
            word.insert(word.length(), longest-word.length(), ' ');
            break;
      }

   }

   for(auto& word : phrase) {
      std::cout << word << ": " << word.length() << '\n';
   }
   std::cout << "\nPress ENTER to continue...\n";
   std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

   return longest;
}
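
Coming back to the loop in the question, here is a rough sketch of how it might look with vectors. It rests on several assumptions that I cannot verify from the post: that the row index was meant to be i+l (position in the input string) rather than j+l, that the number of windows is n-m+1, and that retrieve() is safe to call from several threads. retrieve() and hamming_distance() below are hypothetical stand-ins, and the loop over the N databases is left out because j was not passed to retrieve() in the original. Note that str.length()-m-1 in the original is an unsigned expression, so it wraps around to a huge value if the string is shorter than m+1; the sketch checks for that first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using CountTable = std::vector<std::array<int, 4>>;

// Hypothetical stand-in for the poster's retrieve(): it simply returns the
// query itself. The real function is assumed to be thread-safe.
std::vector<std::string> retrieve(const std::string& query)
{
    return {query};
}

// Hypothetical stand-in: number of mismatching positions (equal lengths).
int hamming_distance(const std::string& a, const std::string& b)
{
    assert(a.length() == b.length());
    int mismatches = 0;
    for (std::size_t i = 0; i < a.length(); ++i) {
        if (a[i] != b[i]) ++mismatches;
    }
    return mismatches;
}

// Maps 'A'..'D' to 0..3 and anything else to -1 (no std::map lookup needed).
int base_index(char c)
{
    return (c >= 'A' && c <= 'D') ? c - 'A' : -1;
}

CountTable count_similar(const std::string& str, std::size_t m, int minHD)
{
    CountTable counts(str.length(), std::array<int, 4>{});
    if (m == 0 || str.length() < m) return counts; // avoids unsigned wrap-around

    // A string of length n has n - m + 1 windows of length m.
    const long long windows = static_cast<long long>(str.length() - m + 1);

    #pragma omp parallel for
    for (long long i = 0; i < windows; ++i) {
        const std::size_t start = static_cast<std::size_t>(i);
        const std::string window = str.substr(start, m);
        // Declared inside the loop, so each thread has its own copy.
        const std::vector<std::string> similar = retrieve(window);
        for (const std::string& s : similar) {
            if (s.length() != m || hamming_distance(window, s) >= minHD) continue;
            for (std::size_t l = 0; l < m; ++l) {
                const int col = base_index(s[l]);
                if (col < 0) continue; // skip unexpected characters
                // start + l <= (n - m) + (m - 1) = n - 1, so the row is in range.
                int& cell = counts[start + l][static_cast<std::size_t>(col)];
                #pragma omp atomic
                cell++;
            }
        }
    }
    return counts;
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel; without that flag the pragmas are ignored and the loop runs serially, which is a handy way to check whether the crash is related to threading at all.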
