Here I scan a very large vector, which has more than 1 << 16 elements.
vector<vector<vector<uint64_t>>> table_portions: a vector of tables (here a vector of 2 tables; these tables are just parts of a bigger table, so they have the same schema)
vector<vector<uint64_t>>: a single table with 13 columns
vector<uint64_t>: a single column of size 1 << 16
vector<vector<vector<uint64_t>>> table_portions(2, vector<vector<uint64_t>>(13));
for (auto &vec : table_portions) {
    for (auto &v : vec) {
        v.resize(1 << 16);
    }
}
Problem:
I just want to scan this vector with some restrictions and measure the time needed, but the measured duration is always 0. I cannot understand why. I did turn on the -O3 flag while compiling.
Without -O3 the scans take some time.
The important part:
auto start_time = std::chrono::high_resolution_clock::now();
for (size_t scan = 0; scan < scans; ++scan) {
    uint64_t count_if = 0;
    for (auto &vec_vec : table_portions) {
        for (size_t i = 0; i < vec_vec[0].size(); ++i) {
            if (vec_vec[10][i] >= l_shipdate_left && vec_vec[10][i] < l_shipdate_right
                && vec_vec[6][i] >= l_discount_left && vec_vec[6][i] <= l_discount_right
                && vec_vec[4][i] < l_quantity) ++count_if;
        }
        // std::cout << count_if << std::endl;
    }
}
auto end_time = std::chrono::high_resolution_clock::now();
auto time_vec = std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time).count();
-- TPC-H Query 6
select
count(*)
from
lineitem
where
l_shipdate >= date '1994-01-01'
and l_shipdate < date '1995-01-01'
and l_discount between 0.06 - 0.01 and 0.06 + 0.01
and l_quantity < 24
Wow, indeed.
The compiler can do the job at compile time.
Hmm, I debugged for a long time and found that the flag is the cause!
Hi salem c.
But I have thought about this again.
I want to measure how long it takes to scan 2000 times.
How can I do that?
Using cout would have a negative impact on the performance.
Put any I/O outside of the start/end time measurements of the portion of code you're trying to measure (unless it's the I/O that you're specifically trying to measure).
Ask yourself, why are you calculating the value of count_if if you never use the result of the value?
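To make that concrete: because count_if is never read, an optimizing compiler is allowed to delete the whole scan. Below is a minimal sketch of one way to keep the result alive, by storing it into a volatile variable after each scan. The names Q6Bounds, count_matches, scan_all and benchmark_sink are made up for illustration and are not from the original post. Note that this only forces the final store to happen; since the data does not change between scans, a compiler is still permitted in principle to reuse work across scans, so check the numbers for plausibility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Table = std::vector<std::vector<std::uint64_t>>;

// Hypothetical bundle of the filter bounds used in the question.
struct Q6Bounds {
    std::uint64_t shipdate_left, shipdate_right;   // [left, right)
    std::uint64_t discount_left, discount_right;   // [left, right]
    std::uint64_t quantity;                        // strictly less than
};

// Counts the rows of one table portion that pass the filter.
std::uint64_t count_matches(const Table &table, const Q6Bounds &b) {
    assert(table.size() > 10);  // columns 4, 6 and 10 must exist
    std::uint64_t count = 0;
    for (std::size_t i = 0; i < table[0].size(); ++i) {
        if (table[10][i] >= b.shipdate_left && table[10][i] < b.shipdate_right
            && table[6][i] >= b.discount_left && table[6][i] <= b.discount_right
            && table[4][i] < b.quantity) ++count;
    }
    return count;
}

// A store to a volatile object is an observable side effect,
// so the compiler has to produce the value that is stored.
volatile std::uint64_t benchmark_sink = 0;

// Runs the scan `scans` times and publishes each result to the sink.
void scan_all(const std::vector<Table> &table_portions, const Q6Bounds &b,
              std::size_t scans) {
    for (std::size_t scan = 0; scan < scans; ++scan) {
        std::uint64_t count = 0;
        for (const auto &table : table_portions) count += count_matches(table, b);
        benchmark_sink = count;  // the result is now "used"
    }
}
```

The start/end timestamps would go around the call to scan_all, and nothing is printed inside the timed region.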
You can avoid most of cout's time penalty by redirecting the output to a file.
You can also factor it out: run the scan 2000 times, report the time for the total (not per iteration), and do the cout after the timed portion is completely done, so it has no effect on the measurement.
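The "time the total, print afterwards" idea could look roughly like the sketch below. ScanTiming and time_scans are hypothetical names, and scan_once stands for any callable that performs one full scan and returns its match count. I use std::chrono::steady_clock here because it is monotonic; the high_resolution_clock from the question would work the same way.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: what was counted and how long it took in total.
struct ScanTiming {
    std::uint64_t total_matches;
    long long total_microseconds;
};

// Calls scan_once() `scans` times and times the whole loop, with no I/O inside.
template <typename ScanFn>
ScanTiming time_scans(std::size_t scans, ScanFn scan_once) {
    std::uint64_t total = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t s = 0; s < scans; ++s) {
        total += scan_once();  // accumulate so the result is needed later
    }
    const auto end = std::chrono::steady_clock::now();
    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    return {total, static_cast<long long>(us)};
}
```

After it returns, print both total_matches and total_microseconds with cout. Printing the count is what makes the result "used", and dividing the total by the number of scans gives the per-scan time if you need it.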
As far as running it in a tight loop goes, be warned: running something 2000 times in a loop is very different from running it 2000 times between other code calls in the real program. If there is a bunch of stuff between the calls, you lose some of the efficiency that loops have (caches and registers that don't have to get swapped out, memory access patterns, and more come into play).
Time it in a loop to tweak it and beat its run time per iteration down.
Run it in the real code alongside the other real code to see how it really performs and whether it's good enough.