Strings

There are two strings with dynamic size. I need to write a function that returns true if the strings have at least one character in common and false if they don't. The maximum size and the strings should be read from a file. <cstring> can't be used.

I wrote something, but it's a mess: the bool function returns the wrong value, and there's an error "33 22 [Error] no matching function for call to 'getline(std::ifstream&, std::vector<std::basic_string<char> >&)'". I'd be so grateful if you could help me fix it.

#include <iostream>
#include <vector>
#include <fstream>
#include <string>

const char* ERROR_FILE = "Error: file is not opened";

bool findSame(std::vector<std::string> str1, std::vector<std::string> str2) 
{
	int i, j;
	for(i = 0; i <= str1.size(); i++)
 {
	for ( j= 0; str1[i] != str2[j]; j++); 
    return false;
 }
  return true;
}

int main()
{
   int line_size;
   std::vector<std::string> str1, str2;
  
   std::ifstream file ("input1.txt");
   if(!file.is_open())
  {
  	throw ERROR_FILE;
  }
  while(!file.eof())
  {
  	file >> line_size;
  	
  	getline(file, str1);
  	str1_push.back(str1);
  	
  	getline(file, str2);
  	str1.push_back(str1);
  }
    file.close();
    bool findSame( str1,  str2);
    
    str1.clear();
    str2.clear();
    
	return 0;
}


kbw (9492)

Write some examples on paper and work through them. You'll eventually come up with a method. Once you have that, we can help you code it if you're still stuck.

seeplus (6655)

What are you using for a definition of 'string' - std::string, c-style null terminated ??

jonnin (11497)

Is 'a' the same as 'A' here?
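
If the answer turns out to be yes, one possible approach is to lowercase both strings before comparing. This is only a sketch: the names to_lower_copy and has_common_char_nocase are made up for illustration, and std::tolower in the default "C" locale only folds ASCII letters.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercased copy of s (ASCII letters only in the "C" locale).
std::string to_lower_copy(std::string s)
{
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Hypothetical helper: true if a and b share a character, ignoring case.
bool has_common_char_nocase(const std::string& a, const std::string& b)
{
    const std::string la = to_lower_copy(a);
    const std::string lb = to_lower_copy(b);
    // find_first_of returns npos when no character of lb occurs in la.
    return la.find_first_of(lb) != std::string::npos;
}
```

The cast to unsigned char before std::tolower matters: passing a negative char value is undefined behaviour.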

JLBorges (13770)

> What are you using for a definition of 'string' - std::string, c-style null terminated ??

Doesn't matter; std::string_view accommodates both.

#include <iostream>
#include <string_view>
#include <string>
#include <limits>

bool has_common_char( std::string_view str_a, std::string_view str_b ) // case sensitive
{
    bool found[ std::numeric_limits<unsigned char>::max() + 1 ] {} ;
    for( unsigned char c : str_a ) found[c] = true ;
    for( unsigned char c : str_b ) if( found[c] ) return true ;
    return false ;
}

int main()
{
    const std::string str = "abcdefgh" ;
    const char cstr[] = "ijklmnopqrstduvwxyz" ;
    std::cout << std::boolalpha << has_common_char( str, cstr ) << '\n' ;
}

http://coliru.stacked-crooked.com/a/314e6f50df14a982

Things are more difficult if these are multibyte strings.
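
A small sketch of why: in UTF-8, two different characters can share a byte, so a byte-wise check reports a false match. The helper name has_common_byte and the two sample strings are chosen here purely for illustration.

```cpp
#include <string>

// Byte-wise check: true if the two strings share at least one byte value.
bool has_common_byte(const std::string& a, const std::string& b)
{
    return a.find_first_of(b) != std::string::npos;
}

// U+00E9 (e with acute accent), encoded in UTF-8 as the bytes C3 A9.
const std::string e_acute = "\xC3\xA9";
// U+00E8 (e with grave accent), encoded in UTF-8 as the bytes C3 A8.
const std::string e_grave = "\xC3\xA8";
```

Here has_common_byte( e_acute, e_grave ) is true because both encodings begin with the lead byte 0xC3, even though the two strings have no code point in common.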

Duthomhas (13310)

Eh, if we’re doing homework, I want in.

Multibyte strings just take a UTF-8 iterator. You can get one from a number of different sources:

  • Boost has one (hidden in its regexp modules).
  • ICU has one which is easy enough to wrap.
  • Qt’s string class can do it.
  • There are a gazillion UTF-8 handling classes on Github.

Or you can roll your own, like this one:

#include <ciso646>

template <typename Iterator>
struct utf8_iterator
{
  enum { UREPLACEMENT_CHAR = 0xFFFD };

  utf8_iterator( Iterator begin, Iterator end ) : begin(begin), end(end) { }
  utf8_iterator( const utf8_iterator& that ) : begin(that.begin), end(that.end) { }

  char32_t operator * () const
  {
    // Decodes UTF-8, Modified UTF-8, CESU-8
    static const unsigned char nbytes[] =
    {
      1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
      0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 3, 3, 4, 0
    };
    static const unsigned char masks[] = { 0, 0x7F, 0x1F, 0x0F, 0x07 };

    auto s = begin;
    if (s == end) return UREPLACEMENT_CHAR;

    unsigned char n = nbytes[ (unsigned char)*s >> 3 ];
    if (!n) return UREPLACEMENT_CHAR;

    char32_t c = (unsigned char)*s++ & masks[ n ];
    while (--n and (s != end)) c = (c << 6) | ((unsigned char)*s++ & 0x3F);

    return n ? UREPLACEMENT_CHAR : validate( c );
  }

  utf8_iterator& operator ++ ()
  {
    if (begin != end) ++begin;
    while ((begin != end) and (((unsigned char)*begin & 0xC0) == 0x80)) ++begin;  // skip continuation bytes (10xxxxxx) only
    return *this;
  }

  utf8_iterator operator ++ (int)
  {
    auto result = utf8_iterator( begin, end );
    operator ++ ();
    return result;
  }

  bool operator == ( const utf8_iterator& that ) const { return this->begin == that.begin; }
  bool operator != ( const utf8_iterator& that ) const { return this->begin != that.begin; }

private:

  Iterator begin, end;

  char32_t validate( char32_t c ) const
  {
    return (((c & 0xFFFF) > 0xFFFD) or (c > 0x10FFFF)) ? UREPLACEMENT_CHAR : c;
  }
};
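
To make the decoding step easier to follow, here is the same idea as a standalone function: the lead byte gives the sequence length and the first payload bits, and each continuation byte contributes six more bits. This is a simplified sketch, not a replacement for the iterator; decode_utf8 is a made-up name, and it does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-8 byte string into code points.
// A malformed or truncated sequence yields U+FFFD and skips one byte.
std::vector<char32_t> decode_utf8(const std::string& s)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size())
    {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t n = 0;   // total bytes in this sequence (0 = invalid lead)
        char32_t c = 0;      // payload bits collected so far
        if      (lead < 0x80)           { n = 1; c = lead; }
        else if ((lead & 0xE0) == 0xC0) { n = 2; c = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { n = 3; c = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { n = 4; c = lead & 0x07; }

        bool ok = (n != 0) && (i + n <= s.size());
        for (std::size_t k = 1; ok && k < n; ++k)
        {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80) ok = false;     // not a 10xxxxxx byte
            else c = (c << 6) | (cont & 0x3F);         // append six payload bits
        }

        if (ok) { out.push_back(c);      i += n; }
        else    { out.push_back(0xFFFD); i += 1; }
    }
    return out;
}
```

For example, the bytes 61 C3 A9 E2 82 AC decode to the three code points U+0061, U+00E9 and U+20AC.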

It might help to have my little ranger:

#include <utility>

template <typename Iterator>
struct ranger : public std::pair <Iterator, Iterator>
{
  ranger( Iterator begin, Iterator end = Iterator() ) : std::pair <Iterator, Iterator> { begin, end } { }
  Iterator begin() { return this->first;  }
  Iterator end  () { return this->second; }
};

Which you can turn into a UTF-8 ranger:

#include <string>

struct utf8_ranger : public ranger <utf8_iterator <std::string::const_iterator> >
{
  utf8_ranger( const std::string& s ) : ranger(
    utf8_iterator( s.begin(), s.end() ),
    utf8_iterator( s.end(),   s.end() ) )
  { }
};

All you need then is a hash-map (set) for a membership predicate (like JLBorges uses with his found array example):

#include <unordered_set>

bool has_common_codepoints( const std::string& a, const std::string& b )
{
  std::unordered_set <char32_t> cs;
  for (auto c : utf8_ranger( a )) cs.insert( c );
  for (auto c : utf8_ranger( b )) if (cs.count( c )) return true;
  return false;
}

And an example of use:

#include <iostream>

int main()
{
  std::string a, b;
  std::cout << "First string:  ";  getline( std::cin, a );
  std::cout << "Second string: ";  getline( std::cin, b );
  std::cout << std::boolalpha << has_common_codepoints( a, b ) << "\n";
}

UTF-8 is easy. :O)
