I've seen people saying to prefer values (value semantics) over X, where X once happened to be "smart pointers".
I understand the main benefit of using smart pointers over raw pointers: clearly, we avoid memory leaks. A possible disadvantage, I suppose, is that wrapping heap-allocated objects in other objects may add overhead, small or large depending on how many objects are created.
My questions are:
1. In which situations is it better to use plain objects created on the stack in conjunction with value semantics, i.e. copying these objects around instead of passing pointers, rather than objects created on the heap and managed by smart pointers?
2. In which situations is it better to use smart pointers rather than value semantics?
I would appreciate a general overview of cases, and concrete examples would also be nice.
I've realized I've possibly mixed two concepts, but they are related anyway!
It's not a choice that should come up. The standard smart pointers do not exist to remove memory leaks. They exist to provide a standard way to express two popular ownership schemes without having to write your own RAII classes. It is the use of RAII that eliminates memory (and other resource) leaks.
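To illustrate the point about RAII, here is a minimal sketch comparing a hand-written RAII owner with std::unique_ptr. The names Tracked, HandRolledOwner and g_live are hypothetical and exist only for this illustration; the counter is there so the release can be observed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter of live Tracked objects, for illustration only.
int g_live = 0;

// Hypothetical resource type that records its own lifetime.
struct Tracked {
    Tracked() { ++g_live; }
    ~Tracked() { --g_live; }
};

// Hand-written RAII owner: acquire in the constructor, release in the destructor.
class HandRolledOwner {
public:
    HandRolledOwner() : p_(new Tracked) {}
    ~HandRolledOwner() { delete p_; }
    // Unique ownership: copying is disabled to avoid a double delete.
    HandRolledOwner(const HandRolledOwner&) = delete;
    HandRolledOwner& operator=(const HandRolledOwner&) = delete;
private:
    Tracked* p_;
};

// Returns the number of live Tracked objects after the owner's scope ends.
int live_after_hand_rolled_scope() {
    {
        HandRolledOwner owner;
        assert(g_live == 1);
    }   // destructor runs here and releases the object
    return g_live;
}

// Same ownership scheme, expressed with the standard class instead.
int live_after_unique_ptr_scope() {
    {
        auto owner = std::make_unique<Tracked>();
        assert(g_live == 1);
    }   // unique_ptr's destructor releases the object
    return g_live;
}
```

In both functions the cleanup comes from a destructor running at scope exit; unique_ptr just saves you from writing the owner class yourself.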
plain objects created on the stack in conjunction with value semantics
Being on the stack (automatic storage) and having value semantics are not related: std::vector<T> creates T objects on the heap (unless customized), but has value semantics. A T* can be pointing at an object on the stack, but has, obviously, pointer semantics.
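A small sketch of that distinction, with hypothetical function names chosen for this example: copying a vector copies the elements even though they live on the heap, while copying a pointer only copies the address even though the pointee is on the stack.

```cpp
#include <cassert>
#include <vector>

// Value semantics with heap storage: the copy owns its own elements.
bool vector_copy_is_independent() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b = a;   // copies every element
    b[0] = 42;                // does not affect a
    return a[0] == 1 && b[0] == 42;
}

// Pointer semantics with stack storage: both pointers alias the same int.
bool pointer_copy_aliases() {
    int x = 1;                // automatic storage
    int* p = &x;
    int* q = p;               // copies the address, not the int
    *q = 42;                  // visible through p and through x
    return x == 42 && *p == 42;
}
```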
A plain object on the stack is, as you put it, "plain". The simplest thing you can do. Moreover, disregarding statics and thread-locals, any object in a properly written C++ program has to be either on the stack (or a temporary) or managed by an object that is (perhaps after a few more steps) on the stack (or a temporary) - a T in a vector<T> or inside a unique_ptr<T> is on the heap, but the vector<T> or unique_ptr<T> itself is on some stack. Or perhaps that vector<T> or unique_ptr<T> is a member of an object that itself is inside a vector, and that vector is on a stack.
The stack is your first choice. If it can't be on the stack, then consider the heap.
For example, it can't be on the stack if it has to outlive a function call and not move back to the caller: std::vector::push_back places an object on the heap (unless customized) under its own management. Some YourCoolDataStructure::insert(T) might place one on the heap as a unique_ptr<T> internally.
Another case where it can't go on the stack is when its type (and therefore size) is not known until runtime: when dealing with dynamic polymorphism, you may have to create objects on the heap using factory functions returning unique_ptr<Base> (which themselves land on the stack of the caller).
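A minimal sketch of such a factory, assuming a made-up Shape hierarchy and a make_shape function that are not from the question: the concrete type is chosen at runtime, the object lives on the heap, and the unique_ptr that owns it is an ordinary value in the caller.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical polymorphic base; the virtual destructor lets
// unique_ptr<Shape> destroy derived objects correctly.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

struct Square : Shape {
    std::string name() const override { return "square"; }
};

// The concrete type is only known at runtime, so the object goes on the heap.
// The returned unique_ptr is the sole owner; it is empty for an unknown kind.
std::unique_ptr<Shape> make_shape(const std::string& kind) {
    if (kind == "circle") return std::make_unique<Circle>();
    if (kind == "square") return std::make_unique<Square>();
    return nullptr;
}
```

A caller would write something like auto s = make_shape("circle"); and the shape is destroyed when s goes out of scope, with no explicit delete.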
It's a mistake to just assume smart pointers never leak memory. (They're not actually smart, we just say that :-)) Specifically, owning references will leak whenever cycles appear, or whenever owning references are shared across modules.
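The cycle case can be shown with a short sketch. StrongNode and LinkedNode are hypothetical types for this example, and a weak_ptr is used only as an observer to check whether the nodes are still alive. The first function breaks the cycle by hand afterwards so the demonstration itself cleans up.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node whose link is an owning reference.
struct StrongNode {
    std::shared_ptr<StrongNode> next;
};

// Hypothetical node with an owning forward link and a non-owning back link.
struct LinkedNode {
    std::shared_ptr<LinkedNode> next;
    std::weak_ptr<LinkedNode> prev;
};

// Two nodes owning each other stay alive after every outside handle is gone.
bool strong_cycle_outlives_its_handles() {
    std::weak_ptr<StrongNode> watch;   // observes without owning
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->next = b;
        b->next = a;                   // cycle: a owns b, b owns a
        watch = a;
    }                                  // handles gone, nodes still alive
    const bool still_alive = !watch.expired();
    // Break the cycle by hand so this demo does not actually leak.
    if (auto a = watch.lock()) a->next.reset();
    return still_alive;
}

// With a weak back link there is no ownership cycle, so both nodes are freed.
bool weak_back_link_outlives_its_handles() {
    std::weak_ptr<LinkedNode> watch;
    {
        auto a = std::make_shared<LinkedNode>();
        auto b = std::make_shared<LinkedNode>();
        a->next = b;                   // a owns b
        b->prev = a;                   // b only observes a
        watch = a;
    }
    return !watch.expired();
}
```

The first function returns true and the second returns false; in a real program nobody would be holding the weak_ptr needed to break the cycle, so the strong version would simply leak.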
1. You should prefer value semantics whenever possible, and use reference semantics only when you must.
- By default, most "values" don't leak. It's difficult to leak an object with automatic storage duration, unless you specifically ask for something stupid (e.g., you incorrectly invoke std::exit()) or your program exhibits undefined behavior.
- There's no such thing as a "dangling value". This isn't true for any sort of reference; there is no guarantee that a reference necessarily refers to a living, valid referent.
- You get values by default. As a result, any program that uses references for no reason is more complicated than it should be.
- Values are free. There is no additional overhead for doing computation on values.
It's worth pointing out now that by "reference" I mean anything that exhibits "reference semantics" --- which includes smart pointers, C++ references, reference_wrapper, array indices, raw pointers and any other ways a program expresses indirection. Similarly, by "owning reference" I refer to any reference that claims "ownership" of its referent.
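As a rough sketch of the "dangling" point, using an array index as the reference so that nothing here is undefined behavior (the function names are made up for this example): a copy taken by value survives the destruction of its source, while an index can easily outlive the element it referred to.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A value copied out of a container is unaffected by what happens to it later.
std::string value_survives_source() {
    std::vector<std::string> names{"ada", "grace"};
    std::string copy = names[1];   // independent copy
    names.clear();                 // the source elements are gone
    return copy;                   // still "grace"
}

// An index is a reference in the broad sense, and it can dangle:
// after clear() it no longer designates any element.
bool index_still_valid_after_clear() {
    std::vector<std::string> names{"ada", "grace"};
    std::size_t idx = 1;           // refers to "grace" for now
    names.clear();
    return idx < names.size();     // false: the referent is gone
}
```

With a raw pointer or a C++ reference instead of an index, the same mistake would be undefined behavior rather than a failed bounds check.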
2. This is almost the same question as "why are pointers required".
References (in the general sense*) are required so that you can write std::cin >> my_variable; and give my_variable a value. Other things too -- @Cubbi mentioned quite a few of them.
Smart pointers are raw pointers, except they express some ownership -- unique_ptr expresses unique ownership; shared_ptr expresses shared ownership. As a best practice, you should prefer smart pointers over plain raw pointers whenever you own the resource you have a pointer to. unique_ptr will cover a solid 80% of the cases where a pointer is required, and shared_ptr will cover almost all of the remaining cases. You should prefer unique_ptr and shared_ptr in that order.
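A brief sketch of the two ownership schemes, with hypothetical function names: unique ownership can only be transferred, while shared ownership is counted.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// unique_ptr: exactly one owner at a time; ownership moves, it is never copied.
bool unique_ownership_moves() {
    auto a = std::make_unique<int>(7);
    std::unique_ptr<int> b = std::move(a);   // a gives up ownership
    return a == nullptr && b != nullptr && *b == 7;
}

// shared_ptr: copying adds an owner; the object is freed when the last one goes.
long shared_owner_count() {
    auto a = std::make_shared<int>(7);
    auto b = a;                              // second owner of the same int
    return a.use_count();                    // 2 while both a and b exist
}
```

The extra bookkeeping in shared_ptr is one practical reason to reach for unique_ptr first.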
*Actually in both senses (the reference-to-int int& sense, too) because operator overloads are the main reason that C++ built-in reference types exist. Or so I recently heard.