string replace with unicode

Dec 6, 2010 at 10:19am
I'm having real trouble translating standard C string-replace functions to their wide-char equivalents. On top of that, everything breaks when I try to use the secure (_s) functions.

What would be the wchar equivalent of this code:
char* replace(const char *str, const char *oldstr, const char *newstr, int *count)
{
   const char *tmp = str;
   char *result;
   int   found = 0;
   int   length, reslen;
   int   oldlen = strlen(oldstr);
   int   newlen = strlen(newstr);
   int   limit = (count != NULL && *count > 0) ? *count : -1; 

   tmp = str;
   while ((tmp = strstr(tmp, oldstr)) != NULL && found != limit)
      found++, tmp += oldlen;
   
   length = strlen(str) + found * (newlen - oldlen);
   if ( (result = (char *)malloc(length+1)) == NULL) {
      fprintf(stderr, "Not enough memory\n");
      found = -1;
   } else {
      tmp = str;
      limit = found; /* Countdown */
      reslen = 0; /* length of current result */ 
      /* Replace each old string found with new string  */
      while ((limit-- > 0) && (tmp = strstr(tmp, oldstr)) != NULL) {
         length = (tmp - str); /* Number of chars to keep untouched */
         strncpy(result + reslen, str, length); /* Original part kept */ 
         strcpy(result + (reslen += length), newstr); /* Insert new string */
         reslen += newlen;
         tmp += oldlen;
         str = tmp;
      }
      strcpy(result + reslen, str); /* Copies last part and ending nul char */
   }
   if (count != NULL) *count = found;
   return result;
}


BTW, my source string is a global LPCWSTR, so I don't need the first parameter in my function, and I do NOT want a counter.
Last edited on Dec 6, 2010 at 10:23am
Dec 6, 2010 at 10:40am
You'd replace the strxxx functions with wcsxxx.
And you'd replace malloc(length+1) with malloc(2*(length+1)).
And of course, change chars to wchar_ts.

You should note that string lengths are unsigned, and these days are of type size_t.
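
To make that advice concrete, here is a minimal sketch of the wide-char translation, without the counter parameter. The name wreplace is hypothetical, lengths are size_t, the allocation uses sizeof(wchar_t), and the copies use wmemcpy with explicit lengths. It is an illustration under those assumptions, not the CMyString code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// Hypothetical name. Returns a newly malloc'ed copy of `str` with every
// `oldstr` replaced by `newstr`, or nullptr if `oldstr` is empty or the
// allocation fails. The caller releases the result with free().
wchar_t* wreplace(const wchar_t* str, const wchar_t* oldstr, const wchar_t* newstr)
{
    const std::size_t oldlen = std::wcslen(oldstr);
    const std::size_t newlen = std::wcslen(newstr);
    if (oldlen == 0)
        return nullptr; // an empty search string would never advance

    // Pass 1: count the occurrences.
    std::size_t found = 0;
    for (const wchar_t* p = str; (p = std::wcsstr(p, oldstr)) != nullptr; p += oldlen)
        ++found;

    // Subtract before adding so the unsigned arithmetic never wraps.
    const std::size_t length = std::wcslen(str) - found * oldlen + found * newlen;
    wchar_t* result = static_cast<wchar_t*>(std::malloc(sizeof(wchar_t) * (length + 1)));
    if (result == nullptr)
        return nullptr;

    // Pass 2: copy the untouched pieces and the replacements.
    wchar_t* out = result;
    const wchar_t* hit;
    while ((hit = std::wcsstr(str, oldstr)) != nullptr) {
        const std::size_t keep = static_cast<std::size_t>(hit - str);
        std::wmemcpy(out, str, keep);      // text before the match
        out += keep;
        std::wmemcpy(out, newstr, newlen); // the replacement
        out += newlen;
        str = hit + oldlen;                // continue after the match
    }
    std::wmemcpy(out, str, std::wcslen(str) + 1); // tail plus terminating L'\0'
    return result;
}
```

Calling wreplace(L"This is a test", L"is", L"WOW") should yield L"ThWOW WOW a test"; the result must be freed by the caller.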
Dec 6, 2010 at 11:29am
My code compiles, but it doesn't work. BTW: is the C char replace function above correct? I didn't test it.

Please check my code
////////////////////////////////////////////////////////////////////////////////
bool CMyString::Replace(LPCWSTR pszOldStr, LPCWSTR pszNewStr) 
{
	TRACE_FUNC("Entering CMyString::Replace");

	if (!ISVALIDSTRING(pszOldStr) || !ISVALIDSTRING(pszNewStr) || GetLength() == 0)
		return false;
	
	LPCWSTR		pszString	= NULL;
	LPCWSTR		pszTemp		= m_StrBuff.Ptr();
	LPWSTR		pszResult	= NULL;
	DWORD		dwFound		= 0;
	DWORD		dwOldLen	= (DWORD)wcslen(pszOldStr);
	DWORD		dwNewLen	= (DWORD)wcslen(pszNewStr);
	DWORD		dwLen, dwResLen;

	while (NULL != (pszTemp=wcsstr(pszTemp, pszOldStr)))
		++dwFound, pszTemp+=dwOldLen;
   
	dwLen = ((DWORD)wcslen(m_StrBuff.Ptr()))+dwFound*(dwNewLen - dwOldLen);

	if (NULL == (pszResult=(LPWSTR)malloc(2*(dwLen+1))))
	{
		return false; // not enough memory
	} 
	else
	{
		pszTemp=m_StrBuff.Ptr();
		dwResLen=0; // length of result
      
		// replace old strings with new strings
		while ((pszTemp=wcsstr(pszTemp, pszOldStr)) != NULL) 
		{
			dwLen=(pszTemp-m_StrBuff.Ptr()); // number of chars to keep 
			//wcsncpy(pszResult+dwResLen, m_StrBuff.Ptr(), dwLen);
			wcsncpy_s(pszResult+dwResLen, dwLen, m_StrBuff.Ptr(), (DWORD)wcslen(m_StrBuff.Ptr())); // original part
			//wcscpy(pszResult+(dwResLen+=dwLen), pszNewStr);
			wcscpy_s(pszResult+(dwResLen+=dwLen), dwLen, pszNewStr); // insert new 
			dwResLen+=dwNewLen;
			pszTemp+=dwOldLen;
			pszString = pszTemp;
		}
		//wcscpy(pszResult + dwResLen, pszTemp);
		wcscpy_s(pszResult + dwResLen, dwLen, pszString); // copy last part 
		CreateCopy(pszResult); // writes content from pszResult in m_StrBuff
	}
	return true;
}

Dec 6, 2010 at 12:02pm
LPCWSTR pszTemp = m_StrBuff.Ptr(); What's this?

You haven't done a straight conversion, have you? If you have a debugger, step through the code to see what's wrong. If not, use trace statements.


Dec 6, 2010 at 12:54pm
m_StrBuff is the internal global buffer that holds my source string. Ptr() gives me its address as an LPCWSTR.

I stepped through it and got this runtime error:
Microsoft Visual C++ Debug Library
---------------------------
Debug Assertion Failed!

Program: C:\Windows\system32\MsiExec.exe
File: f:\dd\vctools\crt_bld\self_x86\crt\src\tcsncpy_s.inl
Line: 62

Expression: (L"Buffer is too small" && 0)

For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)
---------------------------
Abort   Retry   Ignore   
---------------------------

This happens on line 36 of the code in my post above (the wcsncpy_s call).
Last edited on Dec 6, 2010 at 12:56pm
Dec 6, 2010 at 2:54pm
It worked fine with the "unsafe" functions.

So I guess I'm using the *_s functions wrong...

Can somebody correct the syntax for me?
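
For what it's worth, the "Buffer is too small" assertion is consistent with the arguments of the posted wcsncpy_s call. The second parameter is the number of elements available at the destination pointer, and the fourth is the maximum number of characters to copy. The posted call passes dwLen as the destination size and the length of the whole source as the count, so the copy cannot fit. Below is a minimal sketch of tracking the remaining capacity instead. The helper append_bounded is hypothetical (it is not part of CMyString), and outside MSVC it falls back to wmemcpy because the _s functions are not available everywhere.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: copies `count` wide chars from `src` to dst + offset
// and terminates the result. `capacity` is the TOTAL number of wchar_t
// elements in dst. `src` must hold at least `count` characters.
// Returns false, leaving dst untouched, when the copy would not fit.
bool append_bounded(wchar_t* dst, std::size_t capacity, std::size_t offset,
                    const wchar_t* src, std::size_t count)
{
    if (offset >= capacity)
        return false;
    // This is the value the _s functions expect as their size argument:
    // the room left at the destination pointer, not the copy length.
    const std::size_t remaining = capacity - offset;
    if (count >= remaining) // need room for count chars plus L'\0'
        return false;
#ifdef _MSC_VER
    return wcsncpy_s(dst + offset, remaining, src, count) == 0;
#else
    std::wmemcpy(dst + offset, src, count);
    dst[offset + count] = L'\0';
    return true;
#endif
}
```

In the posted loop this would correspond to passing (dwLen + 1) - dwResLen style values, computed from the size given to malloc, rather than dwLen.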
Dec 6, 2010 at 4:05pm
Since you're using C++, why don't you use strings instead of char arrays? Then you don't have to worry about this kind of thing and the syntax is simpler.

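#include <string>
using std::wstring;

wstring Replace(const wstring& orig,const wstring& fnd, const wstring& repl)
{
    wstring ret = orig;
    if(fnd.empty())  // an empty search string would match forever
        return ret;

    size_t pos = 0;

    while(true)
    {
        pos = ret.find(fnd,pos);
        if(pos == wstring::npos)  // no more instances found
            break;

        ret.replace(pos,fnd.size(),repl);  // replace old string with new string (2nd argument is a length, not an end position)
        pos += repl.size();  // continue searching after the inserted text
    }

    return ret;
}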


Usage:
wstring foo = L"This is a test.  Is this a test?";

foo = Replace(foo,L"is",L"WOW");  //replace "is" with "WOW"

// foo is now L"ThWOW WOW a test.  Is thWOW a test?" 
Dec 6, 2010 at 4:35pm
Yep, the standard library would be nice.
But, sadly, I'm not allowed to use it; almost the whole project is coded from scratch.
Dec 6, 2010 at 5:19pm
almost the whole project is coded from scratch


But you are using the C standard lib... so why can't you use the C++ standard lib?

Is this some kind of school thing?

EDIT: Also...

kbw wrote:
And you'd replace malloc(length+1) with malloc(2*(length+1)).


You should actually use malloc(sizeof(wchar_t)*(length+1)); because sizeof(wchar_t) might be larger than 2 (it's 4 on *nix, for example).
Last edited on Dec 6, 2010 at 6:53pm
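
A small sketch of that allocation rule, assuming a hypothetical helper named alloc_wide. It sizes the buffer with sizeof(wchar_t) and also rejects lengths whose byte count would overflow size_t, which is an extra precaution not discussed in the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocates room for `length` wide characters plus
// the terminating L'\0'. Returns nullptr on overflow or allocation failure.
// The caller releases the buffer with free().
wchar_t* alloc_wide(std::size_t length)
{
    // Guard the multiplication below against wrapping around.
    if (length >= SIZE_MAX / sizeof(wchar_t))
        return nullptr;
    wchar_t* p = static_cast<wchar_t*>(std::malloc(sizeof(wchar_t) * (length + 1)));
    if (p != nullptr)
        p[0] = L'\0'; // start out as a valid empty string
    return p;
}
```

Starting the buffer as an empty string also means that a later wcslen on it is well defined, which is not the case for memory that comes straight from malloc.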
Dec 7, 2010 at 2:35pm
It's not school. I have to work with what somebody else created before me (the CMyString class),

so I had to write my own replace function.

BTW: How can I replace with "" (nothing), in other words delete the substring?

Can no one tell me the correct usage of the _s functions in the above example, please?
It works with the deprecated functions, but I need the _s versions to suppress the VS 2010 deprecation warnings.
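
On the question of replacing with "": the two-pass algorithm needs no special case, because a replacement of length zero simply copies nothing. The only thing to watch is the length arithmetic, since newLen - oldLen is negative and the lengths are unsigned. Below is a minimal sketch with a hypothetical helper replaced_length. Whether ISVALIDSTRING accepts an empty string is not shown in the thread, so that check may need relaxing for pszNewStr.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: length in wchar_t (without the terminator) that
// `src` has after every `oldStr` is replaced by `newStr`.
std::size_t replaced_length(const wchar_t* src, const wchar_t* oldStr,
                            const wchar_t* newStr)
{
    const std::size_t srcLen = std::wcslen(src);
    const std::size_t oldLen = std::wcslen(oldStr);
    const std::size_t newLen = std::wcslen(newStr);
    if (oldLen == 0)
        return srcLen; // nothing to search for

    std::size_t found = 0;
    for (const wchar_t* p = src; (p = std::wcsstr(p, oldStr)) != nullptr; p += oldLen)
        ++found;

    // Subtract first, then add: no intermediate value drops below zero,
    // so this also holds when newStr is empty (pure deletion).
    return srcLen - found * oldLen + found * newLen;
}
```

For example, deleting "--" from "a--b--c" gives a length of 3, so the result buffer needs 4 elements including the terminator.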
Dec 7, 2010 at 3:46pm
Well, you're using those _s functions in a correct way.

But look at line 38:

wcscpy_s(pszResult+(dwResLen+=dwLen), dwLen, pszNewStr); // insert new

You add dwLen to dwResLen there, and later on line 39 you do it again. I guess that's a problem, though I don't understand everything you did...
Dec 13, 2010 at 1:05pm
With the following code the replacement works, I can step through it, no error

////////////////////////////////////////////////////////////////////////////////
bool CMyString::Replace(LPCWSTR pszOldStr, LPCWSTR pszNewStr) 
{
	TRACE_FUNC("Entering CMyString::Replace");

	if (!ISVALIDSTRING(pszOldStr) || !ISVALIDSTRING(pszNewStr) || GetLength() == 0)
	{
		SetLastError(ERROR_INVALID_PARAMETER);
		return false;
	}
	
	LPCWSTR		pszString	= NULL;
	LPCWSTR		pszTemp		= CMyString::m_StrBuff.Ptr();
	LPWSTR		pszResult	= NULL;
	DWORD		dwFound		= 0;
	DWORD		dwOldLen	= (DWORD)wcslen(pszOldStr);
	DWORD		dwNewLen	= (DWORD)wcslen(pszNewStr);
	DWORD		dwLen, dwResLen;

	while (NULL != (pszTemp=wcsstr(pszTemp, pszOldStr)))
		++dwFound, pszTemp+=dwOldLen;
   
	dwLen = ((DWORD)wcslen(m_StrBuff.Ptr()))+dwFound*(dwNewLen-dwOldLen);

	if (NULL == (pszResult=(LPWSTR)malloc((sizeof(wchar_t))*(dwLen+1))))
	{
		return false; // not enough memory
	} 
	else
	{
		pszTemp=m_StrBuff.Ptr();
		dwResLen=0; // length of result
      
		// replace old strings with new strings
		while (NULL != (pszTemp=wcsstr(pszTemp, pszOldStr))) 
		{
			dwLen=(pszTemp-m_StrBuff.Ptr()); // number of chars to keep 
			//wcsncpy(pszResult+dwResLen, m_StrBuff.Ptr(), dwLen);
			wcsncpy_s(pszResult+dwResLen, wcslen(pszResult), m_StrBuff.Ptr(), dwLen); // original part
			//wcscpy(pszResult+(dwResLen+=dwLen), pszNewStr);

			wcscpy_s(pszResult+(dwResLen+=dwLen), wcslen(pszResult), pszNewStr); // insert new 
			dwResLen+=dwNewLen;
			pszTemp+=dwOldLen;
			pszString=pszTemp;
		}
		//wcscpy(pszResult+dwResLen, pszString);
		
		wcscpy_s(pszResult+dwResLen, wcslen(pszResult), pszString); // copy last part 
		CreateCopy(pszResult); // writes content from pszResult in m_StrBuff
	}

	return true;
}



BUT when my code has finished, it gives me this error:

Microsoft Visual C++ Debug Library
---------------------------
Debug Assertion Failed!

Program: C:\WINDOWS\system32\MsiExec.exe
File: f:\dd\vctools\crt_bld\self_x86\crt\src\dbgheap.c
Line: 1322

Expression: _CrtIsValidHeapPointer(pUserData)

For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)
---------------------------
Abort   Retry   Ignore   
---------------------------


Could some expert please give me a hint?
Dec 13, 2010 at 2:26pm
Use Disch's Replace function to implement yours. There's no point asking for advice and then not taking it. Just because the advice is free, doesn't mean it's incorrect.
Dec 13, 2010 at 2:44pm
Please read carefully: as I said before, I'm not allowed to use the std string class.
Dec 13, 2010 at 5:12pm
I did see that, but I can't see why you're not using the C++ standard library, but happily use the C standard library when implementing a C++ object.
Dec 13, 2010 at 7:48pm
can't see why you're not using the C++ standard library, but happily use the C standard library when implementing a C++ object


+1 to this.

The only reason I can fathom for standard lib code to be forbidden would be if this was a school assignment, which you already said it wasn't.

But whatever. *shrug* If you want to do it the hard way, you can. But then you have to wrestle with problems/bugs like the one you're having now.
Dec 13, 2010 at 8:59pm
What about line 42 (now) and that ominous (dwResLen+=dwLen)?

I'd say that's good for a crash.
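
Reading the last posted version, three things stand out as possible causes of the heap assertion. wcslen(pszResult) is evaluated on memory that comes straight from malloc, so the size handed to the _s functions is arbitrary. Each pass copies dwLen characters starting from m_StrBuff.Ptr() rather than from the end of the previous match, so a second match copies the whole prefix again and can run past the allocation. And pszString stays NULL when nothing matches. Below is a minimal sketch that avoids all three by tracking the used and remaining capacity explicitly. The function replace_into and its parameters are hypothetical, and it writes into a caller-provided buffer instead of CMyString's internals.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical sketch: writes `src` with every `oldStr` replaced by `newStr`
// into `dst`, whose capacity is given in wchar_t including the terminator.
// Returns false if oldStr is empty or the result does not fit; in that
// case the contents of dst are unspecified.
bool replace_into(wchar_t* dst, std::size_t capacity, const wchar_t* src,
                  const wchar_t* oldStr, const wchar_t* newStr)
{
    const std::size_t oldLen = std::wcslen(oldStr);
    const std::size_t newLen = std::wcslen(newStr);
    if (oldLen == 0 || capacity == 0)
        return false;

    std::size_t used = 0;      // characters written so far, always < capacity
    const wchar_t* tail = src; // start of the part not copied yet, never NULL
    const wchar_t* hit;
    while ((hit = std::wcsstr(tail, oldStr)) != nullptr) {
        // Keep only the text between the previous match and this one.
        const std::size_t keep = static_cast<std::size_t>(hit - tail);
        if (keep + newLen >= capacity - used)
            return false;      // would leave no room for the terminator
        std::wmemcpy(dst + used, tail, keep);
        used += keep;
        std::wmemcpy(dst + used, newStr, newLen);
        used += newLen;
        tail = hit + oldLen;   // continue after the match
    }
    const std::size_t rest = std::wcslen(tail);
    if (rest >= capacity - used)
        return false;
    std::wmemcpy(dst + used, tail, rest + 1); // tail plus terminating L'\0'
    return true;
}
```

With no match at all, the loop body never runs and the final copy transfers the whole source, so the NULL-pointer case does not arise. The (dwResLen+=dwLen) expression itself mirrors the original C code and looks harmless; the size arguments and the copy source are the more likely culprits.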