I want to write a function which, depending on the encodingType (consider only the ints 0, 1, 2, 3), casts the base_pointer to a different uint pointer type. Then I can do some pointer arithmetic (my aim).
Is that possible?
void cast(int encodingType, uint8_t *base_pointer) {
    switch (encodingType) {
        case 0:
            uint8_t* ptr = reinterpret_cast<uint8_t*>(base_pointer);
            break;
        case 1:
            uint16_t* ptr = reinterpret_cast<uint16_t*>(base_pointer);
            break;
        case 2:
            uint32_t* ptr = reinterpret_cast<uint32_t*>(base_pointer);
            break;
        case 3:
            uint64_t* ptr = reinterpret_cast<uint64_t*>(base_pointer);
            break;
    }
    /// then use ptr like ptr + 1 ...
    /// ...
    /// ...
} /// End of this function
It's not possible to do exactly that since ptr needs to have a single definite type, not four different types.
If I knew exactly what you want to do with it I could tell you the best thing to do, but all I can go on is a vague mention of "pointer arithmetic". So, guessing, one possible solution is to not cast at all and instead use a "size" variable like so:
const size_t size = 1u << encodingType; // 0 <= encodingType <= 3, so size is 1, 2, 4 or 8 bytes
// Keep the pointer as a plain uint8_t* and let `size` play the role of the element type.
uint8_t *ptr = base_pointer;
ptr += size; // moving by `size` bytes, i.e. by one element
// And of course I have no idea what you want to do with it.
Thank you dutch.
I think in my case I still have to keep using reinterpret_cast. The aim is to use the __m256i registers and the SIMD/AVX instructions. (I also thought about the fake type and the bit shifting, but that can cost a bit of performance, which I want to avoid.)
My current solution is simple and dumb: write 4 different functions. They work well with good performance, but they are very similar (except for the pointer types). I want to simplify things and fold these 4 functions into one compact one.
Do you think it is still impossible? If so, I can drop this attempt, because I just want to reduce the lines of code and KEEP the current performance. (I think my current 4 functions are very well optimized by the compiler and work well.)
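(For reference, the width-dependent AVX intrinsics can themselves be selected at compile time, so a single templated loop is at least possible. A minimal sketch, assuming C++17 and AVX2; broadcast and cmpeq are made-up helper names, not anything from the real code.)

#include <immintrin.h>
#include <cstdint>

// Broadcast one value of width sizeof(T) into all lanes of a __m256i.
template <typename T>
__m256i broadcast(T v) {
    if constexpr (sizeof(T) == 1)      return _mm256_set1_epi8(static_cast<char>(v));
    else if constexpr (sizeof(T) == 2) return _mm256_set1_epi16(static_cast<short>(v));
    else if constexpr (sizeof(T) == 4) return _mm256_set1_epi32(static_cast<int>(v));
    else                               return _mm256_set1_epi64x(static_cast<long long>(v));
}

// Lane-wise equality compare, again selected by element width.
template <typename T>
__m256i cmpeq(__m256i a, __m256i b) {
    if constexpr (sizeof(T) == 1)      return _mm256_cmpeq_epi8(a, b);
    else if constexpr (sizeof(T) == 2) return _mm256_cmpeq_epi16(a, b);
    else if constexpr (sizeof(T) == 4) return _mm256_cmpeq_epi32(a, b);
    else                               return _mm256_cmpeq_epi64(a, b);
}

// A templated loop body can then use broadcast<T>/cmpeq<T> together with
// _mm256_loadu_si256 on reinterpret_cast<const __m256i*>(ptr + i); each
// instantiation compiles down to the width-specific instructions.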
I still don't know the details of what you are doing. You should post the code that uses the pointer so we can see what can be done. My idea is to only use the uint8_t pointer and multiply by the size of the object wherever necessary. JLB's idea is a variant type, which is potentially a more general, and very C++, solution.
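To make the variant idea concrete, a minimal sketch (assuming C++17; ColumnPtr and make_ptr are illustrative names only):

#include <cstdint>
#include <variant>

// One variant that can hold any of the four pointer types.
using ColumnPtr = std::variant<uint8_t*, uint16_t*, uint32_t*, uint64_t*>;

ColumnPtr make_ptr(int encodingType, uint8_t *base) {
    switch (encodingType) {
        case 1:  return reinterpret_cast<uint16_t*>(base);
        case 2:  return reinterpret_cast<uint32_t*>(base);
        case 3:  return reinterpret_cast<uint64_t*>(base);
        default: return base;  // encodingType == 0
    }
}

// std::visit instantiates the lambda once per pointer type, so the body is
// written once but compiled (and optimized) separately for each width:
//     std::visit([&](auto *ptr) { /* ... use ptr + i ... */ },
//                make_ptr(encodingType, base_pointer));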
template <>
void count<EncodingType::byte1, CompareType::EQUAL>(const Predicate &p,
                                                    const uint32_t dataLength,
                                                    uint8_t *column_base_pointer,
                                                    std::vector<uint32_t> &col_count) {
    SMA* sma_ptr = reinterpret_cast<SMA*>(column_base_pointer);
    const auto [min, max] = sma_ptr->getSMA_min_max();
    const uint64_t value = p.val;
    /// 2 SMA + Padding = 32B
    column_base_pointer += 32;
    const uint8_t differ = value - min;
    if (col_count.size() == 0) {
        /// Case for first scan => full loop over all data needed
        /// Initial match with val not in range => then skip
        if (value < min || value > max) return;
#ifdef SCALAR
        for (size_t in = 0; in < dataLength; in++) {
            if (*(column_base_pointer + in) == differ) {
                col_count.push_back(in);
            }
        }
#endif
    } else {
        /// Case for not the first scan (col_count already holds some matches)
        if (value < min || value > max) {
            /// Initial match with val not in range => all invalid
            std::fill(col_count.begin(), col_count.end(), UINT32_MAX);
        } else {
            for (size_t in = 0; in < col_count.size(); ++in) {
                if (col_count[in] != UINT32_MAX /* not UINT32_MAX: there is (was) a match */
                    && *(column_base_pointer + (size_t)col_count[in]) != differ) {
                    /// Not equal, then marked as INVALID
                    col_count[in] = UINT32_MAX;
                }
            }
        }
    }
}
template <>
void count<EncodingType::byte2, CompareType::EQUAL>(const Predicate &p,
                                                    const uint32_t dataLength,
                                                    uint8_t *column_base_pointer,
                                                    std::vector<uint32_t> &col_count) {
    SMA* sma_ptr = reinterpret_cast<SMA*>(column_base_pointer);
    const auto [min, max] = sma_ptr->getSMA_min_max();
    column_base_pointer += 32;
    uint16_t* ptr = reinterpret_cast<uint16_t*>(column_base_pointer);
    const uint64_t value = p.val;
    uint16_t differ = value - min;
    if (col_count.size() == 0) {
        /// Case for first scan => full loop over all data needed
        /// Initial match with val not in range => then skip
        if (value < min || value > max) return;
#ifdef SCALAR
        /// in < (dataLength >> 1), i.e. in < dataLength / 2
        /// Reason: each tuple is a 2-byte uint16_t
        for (size_t in = 0; in < dataLength >> 1; in++) {
            if (*(ptr + in) == differ) {
                col_count.push_back(in);
            }
        }
#endif
    } else {
        /// Case for not the first scan (col_count already holds some matches)
        if (value < min || value > max) {
            /// Initial match with val not in range => all invalid
            std::fill(col_count.begin(), col_count.end(), UINT32_MAX);
        } else {
            for (size_t in = 0; in < col_count.size(); ++in) {
                if (col_count[in] != UINT32_MAX /* not UINT32_MAX: there is (was) a match */
                    && *(ptr + (size_t)col_count[in]) != differ) {
                    /// Not equal, then marked as INVALID
                    col_count[in] = UINT32_MAX;
                }
            }
        }
    }
}
I don't think my idea works very well after all. It's easy enough to point to the beginning of an n-byte object at some position in the array, but doing the comparisons requires byte-wise comparisons in a loop, which is not great for performance. And then, since differ uses the largest size (uint64_t), endianness comes into play in the byte-wise equality comparison. Little-endian (like our Intel/AMD CPUs) works well, since the "little end" comes first, but big-endian would need an offset from the start of differ. Doable, but fiddly.
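(For what it's worth, the byte-wise extraction can also be phrased as a memcpy of `size` bytes into a zero-initialized uint64_t, which on little-endian yields the value directly. A minimal illustrative sketch, not the code referred to below; load_element is a made-up name.)

#include <cstddef>
#include <cstdint>
#include <cstring>

// Load the i-th element of width `size` (1, 2, 4 or 8 bytes) as a uint64_t.
// On little-endian the low-order bytes come first, so the zero-padded copy
// compares equal to the widened `differ` value; big-endian would need the
// byte-order handling discussed above.
inline uint64_t load_element(const uint8_t *base, size_t i, size_t size) {
    uint64_t v = 0;
    std::memcpy(&v, base + i * size, size);
    return v;
}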
And I'm not sure what to do about the EncodingType::byte1/byte2 difference. What exactly is that for?
Anyway, this is incomplete. I left EncodingType::byte1 at the top. It won't work on big-endian. And obviously there could be other things wrong with it. I didn't even try compiling this.
If all you're trying to do is save a little code repetition, I don't think this is the way to do it.
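If saving the repetition is the main goal, the more usual route is a single function template over the element type, which the compiler still instantiates and optimizes separately for each width. A rough, untested sketch against the code posted above; scan_equal is a made-up name, and the SMA/Predicate details (and the includes the posted code already has) are assumed from that code:

template <typename T>
void scan_equal(const Predicate &p, const uint32_t dataLength,
                uint8_t *column_base_pointer,
                std::vector<uint32_t> &col_count) {
    SMA* sma_ptr = reinterpret_cast<SMA*>(column_base_pointer);
    const auto [min, max] = sma_ptr->getSMA_min_max();
    const uint64_t value = p.val;
    column_base_pointer += 32;                   /// 2 SMA + Padding = 32B
    T* ptr = reinterpret_cast<T*>(column_base_pointer);
    const T differ = static_cast<T>(value - min);
    const size_t n = dataLength / sizeof(T);     /// bytes -> tuples

    if (col_count.size() == 0) {
        if (value < min || value > max) return;
        for (size_t in = 0; in < n; in++) {
            if (ptr[in] == differ) col_count.push_back(in);
        }
    } else if (value < min || value > max) {
        std::fill(col_count.begin(), col_count.end(), UINT32_MAX);
    } else {
        for (size_t in = 0; in < col_count.size(); ++in) {
            if (col_count[in] != UINT32_MAX && ptr[col_count[in]] != differ) {
                col_count[in] = UINT32_MAX;
            }
        }
    }
}

/// Each explicit specialization then just forwards, e.g.:
/// template <>
/// void count<EncodingType::byte2, CompareType::EQUAL>(const Predicate &p,
///         const uint32_t dataLength, uint8_t *column_base_pointer,
///         std::vector<uint32_t> &col_count) {
///     scan_equal<uint16_t>(p, dataLength, column_base_pointer, col_count);
/// }

Since each instantiation's hot loop is the same code the hand-written versions contain, there is no obvious reason to expect a performance difference.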