Not really a beginners question but.....

Every variable is a reference to a memory address until it's destroyed. The difference between passing by reference and by value is that the by-value argument is a copy that gets its own memory address, hence any changes to it don't reflect on the original variable.

I agree 100%. Nothing I said conflicts with this understanding. I was arguing that references and pointers are totally distinct; neither is even a subset of the other. They are distinct but similar things with somewhat overlapping use cases.
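
A minimal sketch of the two points above (a by-value parameter is a copy; a pointer and a reference behave differently even when they refer to the same int). All function names are made up for illustration, and this only shows language-level behaviour, not what a particular compiler emits:

```cpp
#include <cassert>

// By value: n is a copy with its own storage, so the caller's variable is untouched.
void set_by_value(int n) { n = 99; }

// By reference: n is another name for the caller's variable.
void set_by_reference(int& n) { n = 99; }

int after_by_value()     { int x = 1; set_by_value(x);     return x; }  // x stays 1
int after_by_reference() { int x = 1; set_by_reference(x); return x; }  // x becomes 99

// Pointer versus reference: a pointer is an object that can be reseated,
// a reference is bound once and then behaves as the object it names.
bool reseat_demo() {
    int a = 1, b = 2;
    int* p = &a;   // p holds the address of a
    p = &b;        // reseated: p now holds the address of b
    *p = 20;       // writes b
    int& r = a;    // r is bound to a at initialisation
    r = b;         // not a rebinding: copies the value of b (20) into a
    return a == 20 && b == 20 && &r == &a && p == &b;
}
```

Calling `after_by_value()` and `after_by_reference()` from `main` shows the copy versus alias difference; `reseat_demo()` returns true when the pointer was reseated and the reference was not.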

Remember that & is multiple things in C++. It's the binary (bitwise) AND (don't care today), it's the reference declarator, it's the address-of operator, and probably one or two more things. You may be mixing up "address of" and "reference to" in that pseudo code example; it's hard for me to unravel exactly what you meant there, but that can be a source of confusion.
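
To make the three meanings concrete, here is a small sketch; `ampersand_demo` is a hypothetical name and the values are arbitrary:

```cpp
#include <cassert>

// Returns true when all three uses of '&' behave as described in the comments.
bool ampersand_demo() {
    int a = 6;
    int masked = a & 3;   // binary operator: bitwise AND, 6 & 3 == 2
    int& ref = a;         // in a declaration: ref is a reference (another name for a)
    int* ptr = &a;        // unary operator in an expression: address-of a

    ref = 7;              // writes through the reference, so a becomes 7
    return masked == 2 && a == 7 && *ptr == 7 && &ref == ptr;
}
```

Note that `&ref` yields the address of `a`: taking the address of a reference gives the address of the object it names.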

I think you are close to getting it, or perhaps you do get it and we both need to be careful with the words we use at this point -- something I freely admit to being careless around.

TheIdeasMan (6856)

@jonnin

Congrats on your 10K posts btw :+)

Not sure if your last was directed at the OP, or me, or both, but we are both saying the same things.

"Reference to a memory address" could be potentially confusing. I know what you mean, but I prefer to say "Contains a memory address" because of the specific meaning of reference in C++, that I described earlier. Sorry for being pedantic :+)

markyrocks (346)

Take this ungodly bit of code that I just wrote. I'm not sure how we can agree that a reference is basically a dereferenced pointer and still come to the end of the conversation at odds on some points.

    for (auto i = bd.pptr; i < (bd.pptr + size); i++) {
        if (i && *i && **i) {
            int* type_ptr = &(**i);   // same address as *i
            type = *(type_ptr + 3);   // read the int three elements further on
            break;
        }
    }

As I mentioned earlier, the reason this is necessary is bc the int*'s in the int** aren't sequential, and their addresses aren't the same as where the actual data is stored. They just hold the value of the address where the data is stored.

The above works and needs to be that way bc I don't actually need the data stored at **i, I need the actual address, bc I want the data at the address 3 houses down.
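
A self-contained sketch of the same walk, under stated assumptions: `slots` stands in for `bd.pptr`, the records and their contents are invented, and the redundant `i &&` check is dropped because `i` is never null inside the loop. It is not the original program's data layout, only an illustration of the pointer arithmetic:

```cpp
#include <cassert>
#include <cstddef>

// Returns the int stored three elements past the start of the first
// non-empty record, or -1 if no slot qualifies.
int find_type(int** slots, std::size_t size) {
    for (int** i = slots; i < slots + size; ++i) {
        if (*i && **i) {             // slot is non-null and its first int is non-zero
            int* type_ptr = *i;      // same address as &(**i)
            return *(type_ptr + 3);  // the int "three houses down"
        }
    }
    return -1;
}

int find_type_demo() {
    int empty_record[4] = {0, 0, 0, 0};
    int record[4] = {5, 6, 7, 42};
    // The slots are contiguous, but the records they point at need not be.
    int* slots[3] = {nullptr, empty_record, record};
    return find_type(slots, 3);
}
```

Here `find_type_demo()` skips the null slot and the record whose first int is zero, then reads index 3 of the remaining record.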

The reason why my pov changed slightly is bc I'm actually looking directly at the memory itself bc I honestly have no other choice. Look at it for yourself.

Reference by definition is just 2 objects that are the same. You change one and the other changes bc they're the same thing. The underlying compiled code makes no distinction between a pointer and a reference bc there is no difference; it just sends the resulting address and the bytes are moved into the registers etc.

Think about it this way. There can only be one thing in a memory address. That being said, a value that a variable holds can't be held and still know its own address... So if you have an int a=4; that 4 gets stored somewhere but it lives somewhere else than where a lives.

in memory address 1234[ 4 ], somewhere else address 4321[1234] is representative of int a;

So if you want to get uber detailed &a == 1234 a == 4

int* x = &a; x value address = abcd[4321]

&x would be bbbb [abcd]

There's an extra layer in there that no one thinks about. I never thought about it either.

the key point being a particular memory address can't save more than one thing and you have 2 things... the variable itself and the value it holds.

doing something like int a = 4;

int& b = a;

the reason b can be a reference to a and still be its own thing is because of something like

1234[4] <~~~4 address

a == 4321[1234] a address and value

b == 3333[1234] ....

To clarify, I'm using values in the square brackets bc, again, that address can only hold one thing; it can't hold 1234 things etc.

Now don't get me wrong, the address of a variable is just a temporary reference and if it's not needed it could definitely be removed by the compiler, but that's a whole other discussion.

In the situation I'm dealing with there are no optimizations to interpreting a script (at least not in the language), so those temp references are all there. Not very temporary.

mbozzi (3945)

C++ programmers rarely distinguish variable names from variables, variables from values, and so on. The distinctions are not very relevant to C and C++ programming, but they are more important in dynamic languages.

The general term "reference" shouldn't be confused with the C++ feature. I think what @markyrocks means is that a variable name or expression may be associated with some memory at a particular address and therefore the variable name / expression is a "reference" to the object at that memory address.

I think it will help OP to introduce or learn some more terminology, ideally the terminology used by the project he's memory-hacking / reverse-engineering.

For example, borrowing from Lisp, some source code might contain a variable name. The name is a string, an actual data structure in the computer's memory, associated with a symbol, another data structure. The mappings between symbols and names are bijective and correspond to these two functions:

symbol name_to_symbol(string name);
string symbol_to_name(symbol sym);

Also a symbol can be associated with a variable

variable symbol_to_variable(symbol sym);

And a variable can have a value. The association between "variable" and "value" is maintained in an "environment", which is conceptually just a hash table:

// lookup the value associated with `var` in `env`
data_object variable_to_value(environment env, variable var);

The value could be another symbol, a function, nothing at all, or an arbitrary data structure (i.e., the value is a variant member).

This is more-or-less how many dynamic languages work.
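
A rough C++17 sketch of the environment idea, with simplifications that are assumptions of this example rather than part of the description above: a symbol is represented directly by its name, the symbol and variable layers are merged, and the value variant only has three alternatives. All type and function names are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <variant>

using symbol = std::string;   // simplification: a symbol is identified by its name
using data_object = std::variant<std::monostate, int, std::string>;  // "nothing", number, or text
using environment = std::unordered_map<symbol, data_object>;         // conceptually a hash table

// Look up the value bound to sym in env; an unbound symbol yields "nothing".
data_object lookup(const environment& env, const symbol& sym) {
    auto it = env.find(sym);
    return it == env.end() ? data_object{} : it->second;
}

int lookup_demo() {
    environment env;
    env["a"] = 4;                         // bind the variable a to the value 4
    env["greeting"] = std::string("hi");  // values of different types share one table
    data_object v = lookup(env, "a");
    return std::holds_alternative<int>(v) ? std::get<int>(v) : -1;
}

bool unbound_demo() {
    environment env;
    return std::holds_alternative<std::monostate>(lookup(env, "missing"));
}
```

The point of the sketch is that in such a language the name, the variable and the value are separate run-time data, whereas in C++ a name like `a` is gone by the time the program runs.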

markyrocks (346)

I'm not referring to a name or even an id of some type. Really it's the difference between an lvalue and an Rvalue.

An Rvalue can't be changed bc it has no reference. Even in the scenario where a value is pushed into the registers to be part of a function call, that value lives at a memory address somewhere, and when it gets pushed into the registers it's using the address, or a temp reference. In this instance the CPU knows where to read to get the value. Hence that value exists wherever it is actually stored, and the CPU has some kind of reference to it, whether it's just a number pushed in there (a hard coded copy) or an address. Even in the instance of the hardcoded copy it would still exist where ever it was written into the code and have an associated instruction that knows where it lives.... Maybe the instruction is the reference... I think about this stuff too much. Even a hard coded value still gets loaded into memory, and whenever something accesses that memory it has to know where to look, hence the address. It's that simple.

JLBorges (13770)

> Even in the instance of the hardcoded copy it would still exist where ever it was written into the code
> and have an associated instruction that knows where it lives....

Provided it is not optimised away.

For example:

#include <numeric>
#include <iterator>

int foo( int v )
{
    int result = 0 ;
    for( int i = 0 ; i < 10 ; ++i ) result += i ; // result = 9 * 10 / 2 == 45
    if( result > 40 ) result += 55 ; // result == 100
    return result + v ; // return 100 + v
}

int bar()
{
    int arr[] { foo(30), foo(70), foo(100) } ; // 130, 170, 200
    for( int& v : arr ) v *= 2 ; // 260, 340, 400
    return std::accumulate( std::begin(arr), std::end(arr), 0 ) ; // return 1000
}

https://gcc.godbolt.org/z/E4vKdWE4M

mbozzi (3945)

All operands, all data is stored, even if only for a moment within a CPU register.
If the stored data is needed again its location must be known.

The lvalue and rvalue stuff is not fundamental at all, it just was written down in the C++ standard to help define the language.

markyrocks (346)

The lvalue and rvalue stuff is not fundamental at all, it just was written down in the C++ standard to help define the language.

The devs who create this stuff do everything for a very specific reason. It exists bc it's a way to put the way the hardware works into perspective for other developers. They want to stay as close to bare metal as they can and still improve the language efficiently. I guarantee you that the Lvalue and Rvalue stuff is very fundamental to the basic operation of the system bc it's a direct translation to how things function at the hardware level.

Like I said, this is all relatively new to me, but after this foray things like Lvalues and Rvalues make a whole lot more sense. As I said previously, I'm looking at the program I'm dealing with through Cheat Engine's memory view and it took a bit to get a grasp on all this. I'm telling you that there's an automatic dereference happening, and that's why this seems so foreign, bc you never actually see it happen. Honestly, the way it's most likely optimized away is by simply keeping a reference count (that could potentially have one) for every variable that isn't allocated on the heap. When the stack frame ends, every variable that doesn't have a reference is destroyed.

Very similar to the behavior of COM objects.

TheIdeasMan (6856)

An Rvalue can't be changed bc it has no reference.

https://en.cppreference.com/w/cpp/language/value_category

This link mentions prvalues with and without result objects.

cppreference wrote:
An rvalue may be used to initialize an rvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
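
A short sketch of the lifetime extension described in the quote; `make_name` and `lifetime_demo` are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string make_name() { return "temp"; }  // the call expression is a prvalue

std::size_t lifetime_demo() {
    std::string&& r = make_name();       // the temporary now lives as long as r does
    r += "orary";                        // r is a named variable, so it can be modified
    const std::string& c = make_name();  // a const lvalue reference extends lifetime too
    return r.size() + c.size();          // "temporary" has 9 characters, "temp" has 4
}
```

So the object produced by an rvalue expression can be changed once something refers to it; what an rvalue lacks is a name, not storage.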

I am sorry for being totally pedantic, but we need to be careful with terminology here: there are lvalue references and rvalue references. You seem to be using the term reference when you mean address. Maybe you meant this part:

cppreference wrote:
Address of an rvalue cannot be taken by built-in address-of operator: &int(), &i++, &42, and &std::move(x) are invalid.

So if you have an int a=4; that 4 gets stored somewhere but it lives somewhere else than where a lives.

No, the value 4 is stored at the address of a. The prvalue expression 4 is used to initialise the lvalue a.
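
As far as the language is concerned, this can be checked directly. The sketch below (with the hypothetical name `storage_demo`) says nothing about what the compiler keeps in registers or removes entirely; it only shows what a program can observe:

```cpp
#include <cassert>

bool storage_demo() {
    int a = 4;       // the prvalue 4 initialises the object named a
    int* p = &a;     // p holds the address of that object
    int& b = a;      // b is another name for the same object

    bool value_at_address = (*p == 4);  // reading through the address of a yields 4
    bool same_object = (&b == &a);      // the reference has no separate address to observe
    b = 5;                              // writing through b changes a
    return value_at_address && same_object && a == 5 && *p == 5;
}
```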

Maybe there is confusion between what is in memory, and what is contained in the lookup tables in the executable?

mbozzi (3945)

I guarantee you that the Lvalue and Rvalue stuff is very fundamental to the basic operation of the system bc it's a direct translation to how things function at the hardware level.

Value category is a concept invented to specify the behavior of C++ programs. It doesn't have a particularly strong correspondence to the hardware.

Value category is used to specify (at least) when the compiler should emit code to
- use move constructors and move assignment operators;
- elide copies; and
- perform return value optimization.
In addition, value category is used to specify what your program means during compilation:
- it controls which functions are called via function overloading;
- it affects which template arguments are substituted into templates;
- it affects the behavior of decltype

Value category isn't fundamental because, for example, the behavior of decltype has nothing to do with the CPU. A similar argument could be made using each of the list items above.
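
Two of the list items can be shown in a few lines. This is a sketch with made-up names (`category`, `g_x`, `overload_demo`); the `static_assert`s are evaluated entirely at compile time, which is the point: nothing here involves the CPU.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Overload resolution picks a function based on the argument's value category.
int category(int&)  { return 1; }  // chosen for lvalues
int category(int&&) { return 2; }  // chosen for rvalues (prvalues and xvalues)

int g_x = 0;  // file-scope variable used only by the static_asserts below

// decltype encodes the value category of an expression in the type it yields.
static_assert(std::is_same<decltype(g_x), int>::value, "plain name: the declared type");
static_assert(std::is_same<decltype((g_x)), int&>::value, "lvalue expression: T&");
static_assert(std::is_same<decltype(std::move(g_x)), int&&>::value, "xvalue expression: T&&");
static_assert(std::is_same<decltype(g_x + 1), int>::value, "prvalue expression: T");

int overload_demo() {
    int n = 5;
    // lvalue selects 1, prvalue selects 2, xvalue selects 2
    return category(n) * 100 + category(n + 1) * 10 + category(std::move(n));
}
```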

More simply, there are no "objects" in the CPU, just instructions and operands.

JLBorges (13770)

I think the concepts of lvalues and rvalues are semantic properties of expressions in general; these concepts exist (whether explicitly stated or not) in all programming languages. These can affect the code that is generated because compilers can safely assume that the result of evaluating an rvalue expression is purely logical (does not necessarily occupy storage).

For instance, even without the as-if rule:

extern int i ; 
extern int j ;
extern int k ;

void foo()
{
     ++i ; // i is an lvalue: some storage location needs to be updated
     k = j + 1 ; // j+1 is a prvalue; the statement may be evaluated as 1. store value of j in k 2. then increment k
                 // ie. the evaluation of the prvalue expression j+1 or its result need not appear directly anywhere in the generated code
}

seeplus (6653)

First there was lvalue and rvalue and the world was good. Then came prvalue, xvalue, glvalue and upheaval came from the calm...

https://en.cppreference.com/w/cpp/language/value_category

TheIdeasMan (6856)

@seeplus

I have the opposite view: the invention of the extra categories solved problems with the way the compiler deals with expressions. IIRC it all started with move semantics. Somewhere on the web is Bjarne's explanation of how they came about. Found it:

https://www.stroustrup.com/terminology.pdf

seeplus (6653)

Yeah - I know that these solved problems and were needed, but BM life was so much simpler... :) :)
