using count() question

Pages: 12

I'm trying to use the count() to see if 2(or more) elements from row [a] are in row[b]. If false, then increment b and increment the counter. The rows are in a 2d vector called d5. I've been researching this for a few hours and don't see what I'm doing wrong. I looked it up in the c++Reference, but don't have something correct.
Basically:
a = 0
layer_rep_start = 1

        int two_counter {0};
        for ( size_t b {layer_rep_start} ; b < d5.size() ; ++b, ++two_counter ){
            size_t two_num = std::count ( d5[a].begin(), d5[a].end(), d5[b].begin() );
            if ( two_num >= 2 )
                break;
        }

Edit: The 2d vector called d5 contains 5 integers per row and has close to 12 million rows.
Besides the for loop above, I'm also gonna create one for: >=3 and also one for: >=4.

Last edited on

ne555 (10692)

> and don't see what I'm doing wrong
¿what's the problem?
¿your code doesn't compile? paste the error message verbatim
¿the output is incorrect? provide a testcase

Brian845 (54)

Sorry about that. Here is the error message that was produced.

C:/Program Files/mingw-w64/mingw64/lib/gcc/x86_64-w64-mingw32/8.1.0/include/c++/bits/predefined_ops.h:241:17: error: no match for 'operator==' (operand types are 'int' and 'const __gnu_cxx::__normal_iterator<int*, std::vector<int> >')
{ return *__it == _M_value; }
~~~~~~^~~~~~~~~~~

No output, cause it wouldn't build without an error.

dutch (2548)

You need to look up the functions you use and give them the proper types. The third parameter to std::count isn't an iterator but a value. Try d5[b][0] if that's what you mean, although I still have no idea what you're up to. For instance, are the integers in a row in ascending order? PM me a description of your project if you can.

Last edited on

Brian845 (54)

I just tried d5[b][0]. I also changed the size_t two_num to int two_num. It didn't give me any errors, but the output was incorrect. Output was last row number of the d5 file.

Maybe I'm not using count for the right reason.

dutch (2548)

Here's a possible solution:

    const int MaxValue = 43;
    const int ValuesPerRow = 5;

    int a = 0;
    int layer_rep_start = 1;
    int two_counter = 0;

    // An array with one element per possible value where elements for values in row 'a'
    // are set to 1 and all other elements are 0.
    int n[MaxValue + 1] {};
    for (size_t i = 0; i < ValuesPerRow; ++i)
        n[v[a][i]] = 1;

    for (size_t b = layer_rep_start; b < d5.size(); ++b, ++two_counter)
    {
        int cnt = 0;
        for (size_t i = 0; i < ValuesPerRow; ++i)
            cnt += n[v[b][i]];
        if (cnt >= 2)
            break;
    }

Last edited on

Brian845 (54)

Thanks dutch! I will give it a try.

Once again, for those who are not sure of what I'm trying to accomplish.
Using a 2d vector, 12 million rows and 5 ints per row(ints can be from 1 to 43 inclusive).
I want to see when 2 integers from row "a"=[row index 0] appear in row "b"[row index 1]. If there is no match, ++b and ++counter. "b" will keep incrementing until >=2 numbers from row "a" appear in row "b". Hope that explains it better for others.

keskiverto (10365)

Lottery?

That task sounds more like set_intersection than count (except that you need only the size of the result set):
http://www.cplusplus.com/reference/algorithm/set_intersection/

Brian845 (54)

Lottery?

Huh? No, not sure what you mean by Lottery?

Although, I will look into http://www.cplusplus.com/reference/algorithm/set_intersection/ . Thanks for that!! Its not easy being self taught and no one to personally talk to. Thanks again!

dutch (2548)

I was thinking it was some kind of hare-brained lottery analysis, too, since it's a bunch of 5-number sets with values from 1 to 43. But it turns out it isn't.

Although this is obviously a set intersection problem, I don't think that std::set_intersection will be better than the (optimized!) solution I've given.

EDIT: Now that I think about it, set_intersection may not be that bad. It would look something like this:

    const int ValuesPerRow = 5;
    int a = 0;
    int layer_rep_start = 1;
    int two_counter = 0;

    for (size_t b = layer_rep_start; b < d5.size(); ++b, ++two_counter)
    {
        int w[ValuesPerRow];
        int* e = set_intersection(v[a].begin(), v[a].end(), v[b].begin(), v[b].end(), w);
        if (e - w >= 2)
            break;
    }

Last edited on

Brian845 (54)

dutch, I will try your solution. I just have a few other things to take care of this morning. I will post back on the results. Thanks again!!

keskiverto (10365)

A downside is that set_intersection requires sorted ranges, but sorting is simple:

1
2
3

for ( auto& row : d5 ) {
  std::sort( row.begin(), row.end() );
}

One can tweak the example implementation to do just the count:

template <class InputIterator1, class InputIterator2>
  int count_intersection (InputIterator1 first1, InputIterator1 last1,
                           InputIterator2 first2, InputIterator2 last2)
{
  int result {};
  while ( first1!=last1 && first2!=last2)
  {
    if (*first1<*first2) ++first1;
    else if (*first2<*first1) ++first2;
    else {
      ++result; ++first1; ++first2;
    }
  }
  return result;
}

[Edit] A version that returns bool:

template <class InputIterator1, class InputIterator2>
  bool count_intersection (InputIterator1 first1, InputIterator1 last1,
                           InputIterator2 first2, InputIterator2 last2,
                           int required)
{
  int result {};
  while ( result < required && first1!=last1 && first2!=last2)
  {
    if (*first1<*first2) ++first1;
    else if (*first2<*first1) ++first2;
    else {
      ++result; ++first1; ++first2;
    }
  }
  return required <= result;
}

Last edited on

dutch (2548)

It turns out that the 5 values in each row are already in ascending order.
You can think of them as monotonic timing values.
That's a good idea to just do the count to avoid storing the matched values.
I think this is likely to be more efficient than my original idea.
So with count_intersection the code becomes:

    int a = 0;
    int layer_rep_start = 1;
    int two_counter = 0;
    for (size_t b = layer_rep_start; b < d5.size(); ++b, ++two_counter)
        if (count_intersection(v[a].begin(), v[a].end(), v[b].begin(), v[b].end()) >= 2)
            break;

Brian845 (54)

Thanks keskiverto!

A downside is that set_intersection requires sorted ranges, but sorting is simple:
for ( auto& row : d5 ) {
std::sort( row.begin(), row.end() );
}

The rows are in a sorted order. That's just the way the timings are. Yes!!! Each row consists of 5 different timings. Sorry, but that is all I can divulge about the data file called "d5". Thanks for your help!! the set_intersection is a very interesting read. At first glance, it looks confusing to me. I will understand it a little better the more I learn. The 'template' stuff I haven't learned yet in my udemy course. Right now I'm learning about pointers....UGH! I understand the concept (ok, only a little), but not sure why use pointers instead of just accessing the elements directly. I'll get there though.

Thanks Again!

Brian845 (54)

WOW!! Thanks dutch!!! I just saw your count_intersection post. I just tried your other version "The one with creating an array. It worked perfectly! Now I will see how the count intersection works. I'll do some <chrono> timings on each and see which one is fastest. I'll be back with some results.

I really want to convey a special thanks to everyone who has helped me in this forum!! I'm very grateful for the help and inspiration to not give up. I have learned a LOT being a part of this forum. THANK YOU ALL!!!!!

dutch (2548)

You can use this little Clock class to time the code.

#include <iostream>
#include <chrono>

class Clock
{
    using Clk = std::chrono::steady_clock;
    Clk clk;
    Clk::time_point mStart, mEnd;
public:
    Clock() { mStart = clk.now(); }
    Clk::time_point get_start() const { return mStart; }
    Clk::time_point get_end()   const { return mEnd;   }
    void start()  { mStart = clk.now(); }
    void end()    { mEnd   = clk.now(); }
    void end_ms() { end();  print_ms(); }
    void end_us() { end();  print_us(); }
    Clk::duration dur() const { return mEnd - mStart; }
    template<typename D>
    auto ticks() const
        { return std::chrono::duration_cast<D>(dur()).count(); }
    void print_ms() const
        { std::cout << ticks<std::chrono::milliseconds>() << "ms\n"; }
    void print_us() const
        { std::cout << ticks<std::chrono::microseconds>() << "us\n"; }
};

int main()
{
    Clock clk; // starts clock (you can later say clk.start() to restart)

    // do something
    for (volatile int i = 0; i < 100000000; ++i)
        ;

    clk.end_us(); // print time in microseconds (or clk.end_ms() for milliseconds)
}

Brian845 (54)

Ok, will use that. One question...Is steady_clock better than the high_resolution_clock? Some of the videos I saw on YouTube say that the high res clk is used for better precision. I don't doubt you, was just curious. Thanks for the clock class!!! I don't quite understand "classes" yet, but I will, eventually. I'm learning more and more by being here!!
THANKS!!

keskiverto (10365)

One of core features of C++ is function overloading. C distinguishes functions by name only. C++ uses name and types of parameters. Hence you can have many functions with identical name.
For example, while C has abs, labs, fabs, fabsf, and fabsl C++ has just abs and fabs.
All these do essentially same operation; only the type is different.

However, writing overloads manually for all types would be boring. Not to mention impossible. An author of algorithm library cannot possibly know all the types that someone in future could apply that algorithm to.
Enter the compiler. Rather than writing an overload for new type, you write a template and the compiler writes the actual function (with concrete types) on the spot.

A template of an algorithm is generic, yet keeps C++ strongly typed.

not sure why use pointers instead of just accessing the elements directly.

1
2

for ( size_t b = start; b < last; ++b )
  cout << d5[b];

Lets say that the code above is in a function. What do you need to pass to that function?
d5, start, and last
func( Container& d5, size_t start, size_t last );
That is just one more parameter than two pointers/iterators.
Look at set_intersection. It takes five iterators. With container+indices it would need three containers and five indices. Eight parameters.

dutch (2548)

Some of the videos I saw on YouTube say that the high res clk is used for better precision.

In a perfect world, yes, but read the Notes here: https://en.cppreference.com/w/cpp/chrono/high_resolution_clock

Brian845 (54)

Dutch,
I'm throwing high res clk out the window! I can't believe I missed that in my previous read. I'm going with your "clock class" from this point forward. I have learned so much just today alone!! I have to re-write some of my notes, they are a mess right now.

keskiverto,
Thanks for the explanation. If I understand you correctly, then using pointers would be more efficient and easier to write? It may be more clearer to me after I get through the pointers section of my course. I'm going to printout your explanation above and refer to it after completion of the section in my course. Hopefully, I'll get back to you with a better understanding.

THANKS to both of you!!!

It's time for me to take a break and go over some notes. I'm enjoying the challenge of learning c++!

Pages: 12

C++

Forum

using count() question