finding sum

How do you find the minimum number of elements required to form a sum >= x, in O(n log n)?

e.g. {29, 17, 33, 21, 13, 15, 24, 20, 23}
x = 100

do the elements have to be consecutive? e.g. first four 29+17+33+21 = 100
or can it be any order, e.g. 33+24+20+23 = 100
No, the elements do not have to be consecutive.
I'm not sure that's possible in O(n log n). If there were more restrictions, for example using exactly two elements, then sorting followed by another pass would fit.

oh wait, a sum equal to or greater than x? Can't you just sort and then cumulatively sum from the back until the total is >= x?
I agree, sorting and then accumulating a sum from the back seems like it should work: O(n log n) to sort, plus one O(n) traversal, which of course doesn't affect the complexity.
you can do this in ~O(N) I think if I understand the question.
set up buckets for values >= x/1, >= x/2, >= x/3, >= x/4 ... (I am not sure where to stop; I will come back to this in a bit when I can think)
and then iterate the list, adding +1 to every bucket each value qualifies for.
the first bucket whose count reaches its denominator is the answer.
for example, if you have 3 values bigger than x, the answer is 1: one value is sufficient. If you have 3 values in the x/2 bucket, the answer is 2, and so on.

It does not tell you which numbers nor the sum, but it tells you the smallest number of elements needed to get >= x. Is that useful? It won't work for very large x ... eventually you run out of memory. There might be a better way to use this idea that is more space efficient.

you need some way to know when to stop early, too, e.g. if you have 2 values > 1/3 and 2 values > 1/4 you can stop. Hmm. I'm trying to make it simple, but this is looking like more and more work as I think about it... I can't do this without writing something down, so it will have to wait until after work. I'm trying to find a way to do it with only one bucket changed per input; as I have it, you have to poke every bucket for every value, which is very expensive, and then check all the buckets after processing each value to see if we are done yet, also expensive. But I think there may be a way to do it.

edit: I think you only need 100 buckets, as they can represent percents of X.
btw, in case you did want a basic recursive search for all combinations adding up to exactly sum X:
#include <iostream>
#include <vector>
#include <string>

using namespace std;

void SearchSum(const vector<int>& v, unsigned i, int sum, string solution)
{
    if (sum == 0)                   // Exact sum reached: print this combination
    {
        cout << solution << '\n';
        return;
    }
    if (sum < 0 || i == v.size())   // Overshot, or no elements left
        return;

    SearchSum(v, i+1, sum, solution);   // Exclude v[i]
    SearchSum(v, i+1, sum - v[i],       // Include v[i]
              solution.empty() ? to_string(v[i])
                               : solution + " + " + to_string(v[i]));
}

void SubsetSum(const vector<int>& v, int sum)
{
    SearchSum(v, 0, sum, "");
}

int main() 
{
    vector<int> v {29, 17, 33, 21, 13, 15, 24, 20, 23};
    SubsetSum(v, 100);

    return 0;    
}

33 + 24 + 20 + 23
17 + 21 + 15 + 24 + 23
29 + 13 + 15 + 20 + 23
29 + 33 + 15 + 23
29 + 17 + 21 + 13 + 20
29 + 17 + 33 + 21
thanks everyone
I've played with it off and on since you asked.
my approach does work, but it is *woefully* bad for small arrays like your example.
the best I have so far involves a fixed loop from 100 down to 1, as percents, with an inner loop that executes N times in total (not N per iteration; N total across all 100)

looks like
for(100 down to 1)
for(count of items at that percentage down to 0) // if there are 10 items at 5%, this loops 10 times, but there are only N items total, so the inner loop executes N times across all 100 buckets

I am looking at moving the percent counts to a map, but that is slower for large arrays where most percentages have a value; it's only a win for tiny arrays. So I think the full array is better for large input arrays, but it will be terrible for arrays with all the same value, like a million inputs of '1' ... nothing seems to fit all the cases well (so far).

the other thing is that while the double loop only really does the *work* N times, it performs an awful lot of comparisons.

I am not happy with it. I know it can be better, but I don't see it yet.

the million 1s case is also a problem ... if you wanted to produce 1 million from an array of 1 million ones, it needs extra logic to handle the <1% values. sigh.
It sounds like it just isn't a general solution... I wouldn't bother.
that's what is bugging me :) It seems like it should be doable in O(N), but I can't quite get the edge cases. Grumble.
Topic archived. No new replies allowed.