I'm wondering if this is an acceptable way of doing things...

To put it simply, I have a small program with these structs:

#include <string>
using std::string;

struct Student
{
    int id;
    string name;
};

struct Course
{
    int id;
    string name;
};


struct Grade
{
    int studentId;
    int courseId;
    bool passed;
};


I have an array of Students (let's call it studentsArray), an array of Courses (coursesArray), and an array of Grades (gradesArray) to store a collection of each struct.

Grades will tell if a specific student passed a specific course or not in any given year. BUT there can be students who took no courses at all that year. That is, not all of the students in the studentsArray will be linked to an element in the gradesArray.

So I thought of two different options to do this:

First option:
a) I use a bubble sort to sort the gradesArray so all entries with the same studentId are grouped together. Then I proceed like this:
b) read the first studentId in the sorted gradesArray and store it in a temporary variable. I also start a counter at 1.
c) I keep iterating through the gradesArray while the studentId in the current element is the same as in the temp variable, incrementing the counter.
d) When it changes I know I won't find any other elements linking that student so I can show the counter.
e) Assign the newly found studentId to the temp variable and restart the counter.

Second option:
This uses a nested iteration.
a) I iterate through the studentsArray.
b) For each student id, I start a counter at 0 and iterate through the whole gradesArray to see if that student id is linked to any Grade element.
c) If it is, I increment the counter.
d) When I finish iterating through the whole gradesArray, I print the counter if it's greater than 0.
e) Then I move to the next studentsArray item and repeat from b).


Is the second option hopelessly inefficient and a no-no? Or are both valid, considering this is a beginner exercise? I mean, even though it's a beginner exercise, I still want to follow good programming practices.
Ok, so you have a set of students, a set of courses and a relation Grade between them.

Your solution is similar to the strategy used in relational databases. They have the same issue you are describing: how to quickly find entries in a relation?

Well, it all depends on what you want, how flexible you want your program to be, and how much effort you want to put into it.

Obviously, the version with sorting is going to be much faster than linear search if the number of grades is sufficiently high.

On the other hand, if you are expecting to have less than a million entries in the Grades array, and you don't have to make a search several times each second, as a database would, then, who cares?

So, which solution you pick is up to you.

Let's discuss the faster solutions.

The first possibility is to sort the Grades array. This will be fast on query, but it will be terrible when inserting new grades.

The way databases solve this issue, and it is not very hard to do in C++, is to use a std::multimap for the Grades instead of an array. Then you can both insert and query quickly.

It is also possible to build a second std::multimap for Grades indexed by courseId. This would mimic the relational-database feature of 'building an index' on the courseId column. However, since you don't seem to need to query by courseId, it might be unnecessary.

Well, I think I have written enough by now for you to have a good exercise.
Thanks, Kevin!

Actually, I'm not doing any sorting when inserting new elements, only when querying. And for now I'm limited to arrays (no multimap for me yet, hehe).

So, let's see if I get it: if I have less than a million elements in the gradesArray, then both algorithms are pretty much the same in terms of efficiency?

And if there are at least a million entries, then the sorting (first option) is much faster when querying, right?

Also, how can you tell that the second option implements a linear search? I believe it has something to do with algorithmic complexity, but I'm not sure how exactly you arrive at that conclusion and would love to know :)
So, let's see if I get it: if I have less than a million elements in the gradesArray, then both algorithms are pretty much the same in terms of efficiency?

No, the version with sorting is significantly more efficient almost always, but if you have less than a million elements, the difference in speed between the two would probably be around 0.05 seconds. Does it matter?

Also, how can you tell that the second option implements a linear search?
a) I iterate through the studentsArray.
b) For each student id, I start a counter in 0 and iterate through the whole gradesArray
Ah, now I realize that you want to find the number of grades of EACH student.

The second solution is actually quadratic, which is much worse than linear. But there is no solution faster than linear for this task.

But it is easy to do it with linear search:
- make the array gradeCounts with one element for each student and each element equal to 0. The indices of this array represent studentIds
- iterate through the grades array. For each studentId in grades, increment the corresponding gradeCount.