vector as an argument

@xoreaxeax

I think the problem is that you are trying to use C++ like you do other languages, particularly functional type languages.

It is not and was never designed to be like that and your ideas about design in C++ don't apply. It is a different language for a different purpose.

NUMA is a red herring. It solved the copy problem in hardware, not software.

If you are worried about the container being modified outside the function, then it should be const before ever calling the function. That is what const is for. In fact const is the only mechanism C++ has for immutability.
Const does not protect from the problem of modifying the list by the caller afterwards.


How exactly do you modify the list if it is const? Or by problem do you mean "they need a throwaway copy" that they can modify and not actually have reflected in the original list?

Pass by ref makes my code easily breakable by changes in other places of code, not directly coupled with my code.


Example please.

It breaks encapsulation and simply doesn't scale up. This is why most modern languages use immutable strings as a basic data structure for text and not mutable ones like in STL.


Example please. I don't see how it "breaks encapsulation".

Passing by const ref forces me to do costly O(n) copy whenever I need to change something in the vector/list.


Your intended meaning is "have a throwaway-copy" basically, right? Doesn't an immutable string (for example) basically have to construct a new "basic string" inside of itself if you try to change it?
closed account (oz10RXSz)

I think the problem is that you are trying to use C++ like you do other languages, particularly functional type languages.


The problem is much more general: mutable state shared between many units is evil, regardless of whether it is C++ or not. C++ just has poor tools to deal with this problem, so it is you, the PROGRAMMER, who should care. I only warned.


NUMA is a red herring. It solved the copy problem in hardware, not software.


True, as long as you do not want to mutate the shared data. If you do, you can say goodbye to scalability.


How exactly do you modify the list if it is const?
Example please. I don't see how it "breaks encapsulation".


It is not const from the beginning. It is const only when passed by const-ref.
So it is temporarily const. After the function exits it can still be modified.

vector<int> vec;
vec.push_back(...);

someObject.setVector(vec);
vec.push_back(...);  // modifies the vector that setVector received as const and stored a reference to; possibly breaks the someObject invariant


So, the consistency of someObject is at the mercy of the calling code. If the calling code does a thing like that, the program may crash several lines further in the code of someObject class. And who will be the victim? The programmer of the someObject class.

So, the protection is to:
1. pass by val
2. never store a reference (pointer) to an object passed by ref; always create a copy first. Never trust any references from outside.

To summarize: either you get loose coupling or you get performance, but never both.
In a purely immutable style of programming, when you change some part of the code you can be sure the change only affects code that runs afterwards. But if you use refs/pointers everywhere, your changes can affect code that runs both before and after.
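
The two protections listed above can be sketched roughly as follows. This is only an illustration, not code from the thread: `Holder`, `set_vector`, and `caller_changes_are_isolated` are hypothetical names, and `std::move` assumes C++11 or later.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class that keeps its own copy of the caller's data.
class Holder {
public:
    // Taking the parameter by value makes the copy at the call site;
    // std::move then hands that copy to the member without copying again.
    void set_vector(std::vector<int> v) { data_ = std::move(v); }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;  // owned by Holder, not reachable by the caller
};

// Returns true if later changes by the caller leave the stored copy untouched.
bool caller_changes_are_isolated() {
    std::vector<int> vec;
    vec.push_back(1);

    Holder h;
    h.set_vector(vec);  // Holder now has an independent copy
    vec.push_back(2);   // modifies only the caller's vector

    return h.size() == 1 && vec.size() == 2;
}
```

The price is the O(n) copy made on every call with an lvalue argument, which is exactly the cost being complained about in this post.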


Your intended meaning is "have a throwaway-copy" basically, right? Doesn't an immutable string (for example) basically have to construct a new "basic string" inside of itself if you try to change it?


Making throwaway copies in C++ is extremely costly - you need to copy the whole structure. If it was immutable, you could just copy a small part of it, saving lots of memory and CPU. Additionally making a copy constructor for a complex mutable data structure is non-trivial and very often a source of difficult to debug bugs.





Last edited on
It is not const from the beginning. It is const only when passed by const-ref.
So it is temporarily const. After the function exits it can still be modified.


As it should be.

So yes, in this case you should pass by value. Obviously if you intend to have your own copy of a value you should make a copy of it.

Making throwaway copies in C++ is extremely costly - you need to copy the whole structure. If it was immutable, you could just copy a small part of it, saving lots of memory and CPU.


Ah, I see what you are getting at then. Although what exactly is the problem with this?

void some_class::set_string(const std::string& new_val) {
    this->mystr = new_val;
    //std::string operator = doesn't copy/change stuff that doesn't need to be changed
}


Additionally making a copy constructor for a complex mutable data structure is non-trivial


Could you give an example of a non-trivial copy ctor?

and very often a source of difficult to debug bugs.


I don't understand this part... it is no more difficult to make hard-to-detect bugs in a copy ctor than in a normal ctor or any function really. The only time I can think of where you'd need to write a custom copy ctor is for something that manages raw memory of some type. Other than that, the compiler-generated copy ctor should be fine.
xoreaxeax wrote:
The problem is much more general: mutable state shared between many units is evil, regardless of whether it is C++ or not. C++ just has poor tools to deal with this problem, so it is you, the PROGRAMMER, who should care. I only warned.


So your argument boils down to C++ not being designed to be used in the way in which you want to use it. That is what I said. Your 'design philosophy' is not right for C++.

Your 'design philosophy' may be great for people studying 'computer language theory', but you can't take those 'ideal' language concepts and force them into every language you use. C++ was designed with different ideals in mind.

If I want to use a functional language, I won't choose C++. While I am using C++ I will use good design practice in C++.

Copying everything isn't it.
closed account (oz10RXSz)
Ok, so how do you protect from mutable shared state problems other than by defensive copying?


Your 'design philosophy' may be great for people studying 'computer language theory'


Actually this design philosophy is used in practice very often also in imperative languages (in C, C++, Java, etc.) and proved itself superior to "pass pointers to mutable data everywhere". But please don't turn this discussion into a programming-paradigm flame war. I'm just searching for better ways to solve this problem in C++. The problem exists for real; it is not my imagination.
xoreaxeax wrote:
Actually this design philosophy is used in practice very often also in imperative languages (in C, C++, Java, etc.) and proved itself superior to "pass pointers to mutable data everywhere".


Once again, no one has ever suggested you should "pass pointers to mutable data everywhere".

When you want the data to change, you use mutable data. C++ has various language tools to mitigate the dangers of changing data. But the bottom line is that programs change data. It's what they do.
closed account (oz10RXSz)

Once again, no one has ever suggested you should "pass pointers to mutable data everywhere"


And who said there is nothing insecure about passing a reference?
I just warned against possible maintainability problems - I've never said that pass by ref is evil everywhere. You just need to be much more careful than in pass by val.


When you want the data to change, you use mutable data. (...) But the bottom line is that programs change data. It's what they do.


This is entirely incorrect. It is possible to write any algorithm with no mutable data at all.


C++ has various language tools to mitigate the dangers of changing data


OK, which ones, apart from passing by val, making copies, and const (which is imperfect in many cases)?


Could you give an example of a non-trivial copy ctor?
I don't understand this part... it is no more difficult to make hard-to-detect bugs in a copy ctor than in a normal ctor or any function really. The only time I can think of where you'd need to write a custom copy ctor is for something that manages raw memory of some type. Other than that, the compiler-generated copy ctor should be fine.


In most cases the autogenerated constructor is wrong. It provides only a shallow copy. Copying a complex structure e.g. a graph of other complex structures can be difficult. It is very easy to forget to copy something deeply and then you get a partially copied structure that shares some of its data with the original. I do not want to be a person that debugs such programs.
xoreaxeax wrote:
This is entirely incorrect. It is possible to write any algorithm with no mutable data at all.


Yes, and it will run like a pig on a spit.

xoreaxeax wrote:
Ok, which, except passing by val and making copies?


C++ gives you tools to control who and what gets to modify data. So data hiding and const qualifiers play their part in this.

What you are suggesting is that we use C++ in a way that is far more appropriate to other programming languages. What you should be doing is advising people to use other programming languages where that makes sense.

But this is a C++ forum and people want good advice on how to use C++. You are not giving it.

Trying to teach people to use C++ without modifying any data, but creating new data for every result is bad advice. You will never write a game like that, you will never write a driver like that, you will never write an OS module like that, you will never write a million things that people use C++ for like that.
@xoreaxeax
There's no point in using vague terms all the time, that's a typical example of a discussion that will never get anywhere.
There needs to be a concrete example of what you're talking about.

In general, before you start implementing anything, you need to know:
a) what kind of different data do you have?
b) who needs to read the data and who needs to write to it?
c) for those that need to read from it: do they need to keep a view on the data as it was when they first learned about it? Or do they just need a view on the current (but consistent) state of the data?
d) do any of the objects involved operate from different threads?

Once this is known, one can start looking for suitable solutions.
closed account (oz10RXSz)

Yes, and it will run like a pig on a spit.


Why do you think so?

I've seen some benchmarks: Haskell and OCaml actually run no more than 2-4x slower than hand-optimized C++ code. The worst-case slowdown is logarithmic, but it almost never happens in practice. This is a negligible difference in 99% of applications (I work for a database vendor and we have a simple performance measure: < 2x slowdown = very good, < 10x = acceptable, >= 10x = unacceptable).

Programs running like a pig on a spit are always a fault of crappy coding, almost never a fault of the language (unless you use interpreted Ruby or Python :D).


Once this is known, one can start looking for suitable solutions.

Agreed.

Last edited on
In most cases the autogenerated constructor is wrong.


Only if for some reason you want to use raw pointers. You should almost never have to use them in classes unless the class is some sort of smart pointer.
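
A small sketch of that claim, with a hypothetical `Record` type whose members are all standard value types: the compiler-generated copy constructor copies each member with that member's own copy constructor, so the copy shares nothing with the original.

```cpp
#include <string>
#include <vector>

// No user-written copy constructor: the compiler-generated one copies
// each member using that member's own copy constructor.
struct Record {
    std::string name;
    std::vector<int> scores;
};

// Returns true if changing the copy leaves the original untouched.
bool copies_are_independent() {
    Record a;
    a.name = "first";
    a.scores.push_back(1);

    Record b = a;           // memberwise copy; string and vector copy their contents
    b.name = "second";
    b.scores.push_back(2);

    return a.name == "first" && a.scores.size() == 1 &&
           b.name == "second" && b.scores.size() == 2;
}
```

The picture changes only when a member is a raw pointer or a handle with shared-ownership semantics, since copying those copies the handle and not the object behind it.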

I think your issues with C++ stem from the fact that you don't have clear object ownership. For example, in this code you gave:

vector<int> vec; //who owns this? no-one in particular
vec.push_back(...);

someObject.setVector(vec); //does someObject own this now?
vec.push_back(...);


The vector<int> should most likely be a member of someObject, not some arbitrary vector.
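
As a rough sketch of what owning the vector could look like: the class below is hypothetical (`SortedInts`, `add`, and `values` are not from the thread), and its invariant, "the values are always sorted", is chosen only for illustration. `std::is_sorted` in the usage note assumes C++11.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical class whose invariant is "values_ is always sorted".
class SortedInts {
public:
    // The only way to add data goes through the class, so outside code
    // cannot break the invariant by pushing onto the vector directly.
    void add(int value) {
        values_.insert(
            std::upper_bound(values_.begin(), values_.end(), value), value);
    }

    // Read-only view: callers can inspect the data but not modify it.
    const std::vector<int>& values() const { return values_; }

private:
    std::vector<int> values_;
};
```

After `add(3)`, `add(1)`, `add(2)` the stored sequence is 1, 2, 3, and `std::is_sorted` over `values()` holds no matter what the calling code does with its own vectors.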
closed account (oz10RXSz)

Only if for some reason you want to use raw pointers.


If I use smart pointers, the autogenerated constructor is in most cases also wrong.


I think your issues with C++ stem from the fact that you don't have clear object ownership.


Agreed. But I've already said that in large projects written by many authors you usually don't have it.

Last edited on
If I use smart pointers, the autogenerated constructor is in most cases also wrong.


I think you failed here...the point of a smart pointer is to manage the data for you, which means they usually have a manually created constructor, copy ctor, = operator, destructor. Ex. std::auto_ptr.

But I've already said that in large projects written by many authors you usually don't have it.


That's what documentation/being a team is all about. If you don't have good documentation or a team that can work well together, C++ can't help you.
closed account (oz10RXSz)

I think you failed here...the point of a smart pointer is to manage the data for you, which means they usually have a manually created constructor, copy ctor, = operator, destructor. Ex. std::auto_ptr.


And how does this default copy constructor of a universal smart pointer know whether it should copy the pointed-to data or not? :D The point of a smart pointer is just to manage memory (in a quite inefficient way, AFAIK) and avoid leaks and dangling pointers, not to control copying semantics.


That's what documentation/being a team is all about. If you don't have good documentation or a team that can work well together, C++ can't help you.


Agreed, but it works only in theory. Actually documentation sucks in 99% or more of projects and developers often come and go. ;) OK, anyway, you should not program complex systems in low-level languages, so maybe I'm exaggerating this problem. It is like complaining that text processing is difficult and error-prone in pure C ;)

BTW: I didn't notice that earlier:
 
//std::string operator = doesn't copy/change stuff that doesn't need to be changed 


How so? Do you mean it is copy-on-write? If so, then it must be slow as a pig. COW is one of the most inefficient techniques for handling small objects.



Last edited on
And how does this default copy constructor of a universal smart pointer know whether it should copy the pointed-to data or not? :D The point of a smart pointer is just to manage memory (in a quite inefficient way, AFAIK) and avoid leaks and dangling pointers, not to control copying semantics.


Also, you wouldn't be "copying" data. You use smart ptrs like this:
std::auto_ptr<some_type> my_ptr(new some_type);

As for memory management, it would depend on the smart pointer. For example, std::auto_ptr states that it is *the* owner of the data, and when you copy it to other auto_ptrs it NULLs itself. Other pointers like boost::shared_ptr know how many other shared_ptrs point to that object and only delete it when the final shared_ptr is destroyed. And also, how exactly are they inefficient? They are no less inefficient than putting the new/delete calls in there yourself, and in addition they are exception-safe.
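
A minimal sketch of the two ownership models described above, assuming C++11 or later. `std::auto_ptr` was deprecated in C++11 and removed in C++17, so the sketch swaps in `std::unique_ptr`, where the transfer of ownership is an explicit `std::move`, and uses `std::shared_ptr` in place of `boost::shared_ptr`. The function names are hypothetical.

```cpp
#include <memory>
#include <utility>

// Sole ownership: moving a unique_ptr leaves the source empty, which is
// the explicit form of what auto_ptr did silently on copy.
bool unique_owner_transfers() {
    std::unique_ptr<int> a(new int(42));
    std::unique_ptr<int> b = std::move(a);
    return a == nullptr && b != nullptr && *b == 42;
}

// Shared ownership: use_count() reports how many shared_ptrs currently
// refer to the same object; the object is deleted when the last one goes.
long shared_owner_count() {
    std::shared_ptr<int> p = std::make_shared<int>(7);
    std::shared_ptr<int> q = p;  // copies the pointer, not the int
    return p.use_count();
}
```

Note that in both cases copying or moving the smart pointer never copies the pointed-to object, which is the behaviour the two sides of this thread are arguing about.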

I think the point I am trying to make here is that your class should own the data it needs to function. The class's constructor should set up its invariants, and not depend on some outside source (like someone else's code) to set up random internal elements for it.

Actually documentation sucks in 99% or more of projects and developers often come and go. ;)


And your point is? I don't think you understand that you can write bad code in any language..."C++ doesn't try to make it impossible for bad programmers to write bad programs; it enables reasonable developers to create superior software."

How so? Do you mean it is copy-on-write? If so, then it must be slow as a pig. COW is one of the most inefficient techniques for handling small objects.


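Maybe. COW is an implementation choice made by the standard library's std::string, not something the compiler applies, so whether operator= shares the buffer depends on the implementation. If it's not efficient for the objects I'm using, I would hope the implementation wouldn't use it.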

Also, I went and looked up some information about immutable objects, and I found this on Wikipedia: "In C++, a const-correct implementation of Cart would allow the user to declare new instances of the class as either const (immutable) or mutable, as desired"
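
What that quote describes might look roughly like this. The `Cart` below is a hypothetical stand-in (the member names are illustrative, not taken from the Wikipedia article), and the constructor's `std::move` assumes C++11.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical Cart; the member names are illustrative only.
class Cart {
public:
    Cart() {}
    explicit Cart(std::vector<int> items) : items_(std::move(items)) {}

    // const member function: callable on both const and non-const carts.
    std::size_t count() const { return items_.size(); }

    // Non-const member function: the compiler rejects a call to this
    // on a cart that was declared const.
    void add(int item) { items_.push_back(item); }

private:
    std::vector<int> items_;
};
```

With this, `const Cart fixed(std::vector<int>{1, 2});` behaves as an immutable cart (`fixed.add(3)` does not compile), while a plain `Cart` stays mutable. The const applies to a particular object, though, not to the type, which is the limitation raised earlier in the thread.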
closed account (oz10RXSz)

And also, how exactly are they inefficient?


They keep reference counts. Every pointer assignment = a test + 2 interlocked increment / decrements. On multiprocessors slow as a pig. Proper GC is much, much faster.


Also, you wouldn't be "copying" data. You use smart ptrs like this


I know how to use smart pointers, but imagine an object of the class:

class X {
  private:
    smart_ptr<Y> member;
   //...
};


If this object owns member, then it should copy the member when it is copied (a deep copy). The default copy constructor would only copy the pointer, not the pointed object of class Y. So, you must provide your own ctor. When there are lots of such members it is easy to forget to copy something and nothing guards you against this.
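
A sketch of the deep-copy case described above, under assumptions that are not from the thread: `smart_ptr` is replaced by `std::unique_ptr` (C++11), `Y` is given a single `int` field, and the accessors are hypothetical.

```cpp
#include <memory>

struct Y {
    int value;
};

// X owns its Y, so copying X has to copy the Y as well.
class X {
public:
    explicit X(int v) : member_(new Y{v}) {}

    // Deep copy: allocate a new Y initialized from the other object's Y.
    X(const X& other) : member_(new Y(*other.member_)) {}

    X& operator=(const X& other) {
        if (this != &other) {
            member_.reset(new Y(*other.member_));
        }
        return *this;
    }

    int value() const { return member_->value; }
    void set_value(int v) { member_->value = v; }

private:
    std::unique_ptr<Y> member_;
};
```

One side effect of choosing `std::unique_ptr` here: it cannot be copied, so without the hand-written copy constructor the implicit one is deleted, and an attempt to copy an `X` fails to compile instead of silently producing a shallow copy. It does not help with the other worry, though: a hand-written copy constructor can still forget a member.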

 
it enables reasonable developers to create superior software." 


Superior to what? And which programmers do you find reasonable? Some great programmers stay away from C++: see http://www.google.pl/search?q=Coders+at+Work+on+C%2B%2B&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:pl:official&client=firefox-a
They keep reference counts. Every pointer assignment = a test + 2 interlocked increment / decrements. On multiprocessors slow as a pig. Proper GC is much, much faster.


GC doesn't work well with C/C++. Example:
int* p = new int;
p += 10;
//What if GC runs here?
p -= 10;
*p = 5; //!! 


GC isn't that much faster, it just (randomly) will take up time searching for memory you aren't referencing. It also takes up more memory since you don't delete the unused memory immediately, you must wait for the GC to do it for you.

I know how to use smart pointers, but imagine an object of the class...


Erm, the point of smart pointers is that they have defined their own copy ctors etc so you don't have to do anything special when they are part of a class.

Superior to what? And which programmers do you find reasonable? Some great programmers stay away from C++: see http://www.google.pl/search?q=Coders+at+Work+on+C%2B%2B&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:pl:official&client=firefox-a


Superior to stuff created in Java for example. I can't just run a "compiled" Java program, I have to go get a Java VM (which is basically an OS) to run it. For C++ I can just run the .exe and have it work, nothing extra required (unless I'm using some dynamically linked stuff, but that applies to all languages). And just because some people decide against using C++ for their projects doesn't mean it's bad: http://www2.research.att.com/~bs/applications.html
closed account (oz10RXSz)

I can't just run a "compiled" Java program


That you can't do something doesn't imply it is not possible.
See GCJ or Excelsior JET for explanation.
But we are getting offtopic now.


GC isn't that much faster, it just (randomly) will take up time searching for memory you aren't referencing


You obviously don't know how modern GCs work.
But in one case you are right: having even a moderate GC for C++ is impossible. Same as having a good optimizing compiler.


Erm, the point of smart pointers is that they have defined their own copy ctors etc so you don't have to do anything special when they are part of a class...


...thus introducing semantic bugs in your program.


And just because some people decide against using C++ for their projects doesn't mean it's bad


And what if these are probably some of the smartest programmers of our time? These are not just some random people. They know what they are saying.

Ok, back to the topic: so what is your approach to shared mutable data? If it is only clear object ownership, then what is your approach to making 99% of documentation NOT suck?

My approach is to have no shared mutable data, by having shared immutable data and local mutable data. This approach does not work well in C++, because the STL lacks immutable data structures. So I have to fall back to the "copy everything" approach, which is inefficient.
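
For concreteness, the "shared immutable data, local mutable data" approach could be approximated in C++ roughly like this. It is a sketch assuming C++11, and `SharedInts`, `make_shared_ints`, and `with_appended` are hypothetical names.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Handle to data that nobody can modify through the handle.
typedef std::shared_ptr<const std::vector<int> > SharedInts;

// Build the data once, then hand out read-only handles to it.
SharedInts make_shared_ints(std::vector<int> values) {
    return std::make_shared<std::vector<int> >(std::move(values));
}

// A consumer that needs to change the data takes a full local copy
// and leaves the shared original untouched.
std::vector<int> with_appended(const SharedInts& shared, int extra) {
    std::vector<int> local(*shared);  // O(n) copy of the whole vector
    local.push_back(extra);
    return local;
}
```

Sharing a `SharedInts` is cheap, but every modification still pays for a full copy, because `std::vector` has no structural sharing. That is the inefficiency this post is pointing at.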



Same as having a good optimizing compiler.


You offer no support for this point. Thus, it is invalid.

...thus introducing semantic bugs in your program.


No support again... and this makes no sense anyway. It only would if you had zero documentation (and by documentation I mean a one-line comment that says "copy ctor deep-copies pointers" or "doesn't" or whatever), and if you have no documentation then you will be screwed by other things anyway.

And what if these are probably some of the smartest programmers of our time? These are not just some random people. They know what they are saying.


When did I say it matters who they are? If I am trying to write a quick program and I don't particularly care about efficiency, I'll use something like Python because it's fast/simple. I could equally say that some of the smartest programmers also use C++, as pointed out by that link I gave you.

Ok, back to the topic: so what is your approach to shared mutable data? If it is only clear object ownership, then what is your approach to making 99% of documentation NOT suck?


So you mean something like one thread adding customers to a vector and another one doing stuff to them? In that case, you just have to use a mutex, like all other threading situations with shared data, really. Or if you could give a more specific example, that would be nice.
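
A minimal sketch of the mutex approach, assuming C++11 `<thread>` and `<mutex>` (Boost.Thread offers equivalents for older compilers). `CustomerQueue` and `fill_from_two_threads` are hypothetical names, and customers are reduced to plain `int` ids.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical container that guards its vector with a mutex.
class CustomerQueue {
public:
    void add(int customer_id) {
        std::lock_guard<std::mutex> lock(mutex_);  // released at end of scope
        customers_.push_back(customer_id);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return customers_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so that size() can stay const
    std::vector<int> customers_;
};

// Two threads add 1000 customers each; returns the final size.
std::size_t fill_from_two_threads() {
    CustomerQueue queue;
    auto producer = [&queue](int base) {
        for (int i = 0; i < 1000; ++i) {
            queue.add(base + i);
        }
    };
    std::thread a(producer, 0);
    std::thread b(producer, 1000);
    a.join();
    b.join();
    return queue.size();
}
```

Because the vector is private and every access takes the lock, the locking discipline does not depend on the calling code remembering to do anything.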

My approach is to have no shared mutable data, by having shared immutable data and local mutable data. This approach does not work well in C++, because the STL lacks immutable data structures. So I have to fall back to the "copy everything" approach, which is inefficient.


Could you give an example of a situation like that? I just want to see more clearly what the problem is.