Replace Substring

Straight C/C++ Win32 API. No MFC, WinForms, etc. No classes. Just straight C with a possible small C++ convention, if necessary.

I've done this before using strstr() and other like functions, and I can figure out how to do it again, but I always learn when I ask for help, so let me put this out there...

Here is the string (without the quotes) - "KJV Jhn 3:16"

I want to replace the "Jhn" portion with "Joh" (without quotes)

I can of course use strstr() to instantly find "Jhn"

How would you go on from there?

That is, instead of "KJV Jhn 3:16" I want to end up with "KJV Joh 3:16" (without the quotes).

Here's what I've done. I would like to know if there is a better way to do this.

In the following code, szVer[]="KJV Jhn 3:16" without the quotes. The following code effectively replaces that and ends up being "KJV Joh 3:16" without the quotes --

int len = strlen(szVer);

if(strstr(szVer,"Jhn"))
{
    for(int i = 0; i < len; i++)
    {
        if(szVer[i]=='J')
        {
            szTransfer[i]='J';
        }
        else if(szVer[i]=='h')
        {
            szTransfer[i]='o';
        }
        else if(szVer[i]=='n')
        {
            szTransfer[i]='h';
        }
        else
        {
            szTransfer[i]=szVer[i];
        }
    }
    strcpy_s(szVer,MAX_PATH,szTransfer);
}


As I said, this works fine, but I'm wondering if there is a faster or more efficient way to do it.
I presented the code above just for clarity's sake. Here is the actual code in my program, which is as tight as I can make it --

char* pdest=strstr(szVer,"Jhn");

if(pdest != NULL)
{
    int len = strlen(szVer);

    for(int i = 0; i < len; i++)
    {
        if(szVer[i]=='h')
        {
            szTransfer[i]='o';
        }
        else if(szVer[i]=='n')
        {
            szTransfer[i]='h';
        }
        else
        {
            szTransfer[i]=szVer[i];
        }
    }
    strcpy_s(szVer,MAX_PATH,szTransfer);
}

if(pdest != NULL)
{
    *(++pdest) = 'o';   // pdest points at 'J'; step to 'h' and overwrite it
    *(++pdest) = 'h';   // step to 'n' and overwrite it
}
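
The pointer trick above works because "Jhn" and "Joh" are the same length, so nothing after the match has to move. A minimal sketch of the same idea as a small function follows; ReplaceSameLength is a made-up name, not something from the Win32 API or the C runtime, and it only handles the first match. Unlike the loop version in the question, it touches only the bytes of the match itself, so an 'h' or 'n' elsewhere in the string is left alone.

```cpp
#include <string.h>

// Hypothetical helper (not from the thread): overwrite the first occurrence
// of `find` in `text` with `repl`, in place. Only valid when both have the
// same length, so the rest of the string stays where it is.
// Returns 1 if a replacement was made, 0 otherwise.
int ReplaceSameLength(char *text, const char *find, const char *repl)
{
    size_t findLen = strlen(find);

    if (findLen == 0 || findLen != strlen(repl))
        return 0;                    // refuse: empty pattern or lengths differ

    char *match = strstr(text, find);
    if (match == NULL)
        return 0;                    // pattern not present

    memcpy(match, repl, findLen);    // copy the bytes, not the terminating '\0'
    return 1;
}
```

With szVer holding "KJV Jhn 3:16", calling ReplaceSameLength(szVer, "Jhn", "Joh") should leave "KJV Joh 3:16" in the same buffer, with no transfer buffer and no strcpy_s.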
Interesting. Since I almost never use plain C I never realized there was no Find() and Replace().

Your approach, although practical for the current task (not to mention simple), is not reusable. Since I have the heart of a software architect, I cannot conceive programming that into an application. I must have a generic Replace() function.

Browsing the C string functions, I decided I should use strncmp() to create a Find() function. Then I used this Find() to implement a Replace() function. See http://ideone.com/fBzmQ .

A few notes, though:

1. The function as it is right now first finds all matches, then replaces them all. It buffers the matching pointers in a dynamic array, but this could become a large array, so this function may not scale well.
2. The result of Replace() is a dynamically-allocated string that requires deletion. It should instead receive the already-allocated buffer to write to and an in/out parameter describing such buffer and the end state of the buffer (on exit).
3. I forgot I should have used malloc() and free() since it was an all-C exercise. My bad.
4. I think I need to null-terminate in line 64, but I could be wrong (I think I'm not wrong, though).
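
For what it's worth, notes 1 and 2 could be addressed along these lines. This is only a sketch and not the code behind the ideone link: the name ReplaceAll, its parameter order, and the meaning of the in/out size are all assumptions made for illustration. It replaces in a single pass (no array of match pointers) and writes into a buffer the caller already owns.

```cpp
#include <string.h>

// Hypothetical signature, not webJose's code. Copies `src` into `dst`,
// replacing every occurrence of `find` with `repl`.
// On entry *dstSize is the capacity of dst in bytes; on exit it is the
// number of bytes the full result needs, including the terminating '\0'
// (0 if `find` is empty).
// Returns 1 on success, 0 if the buffer was too small or `find` is empty.
int ReplaceAll(const char *src, const char *find, const char *repl,
               char *dst, size_t *dstSize)
{
    const size_t capacity = *dstSize;
    const size_t findLen = strlen(find);
    const size_t replLen = strlen(repl);
    size_t needed = 0;                  // bytes required so far, without '\0'

    if (findLen == 0) { *dstSize = 0; return 0; }

    while (*src != '\0')
    {
        const char *match = strstr(src, find);
        // Length of the unmatched run before the next match (or to the end).
        size_t plain = match ? (size_t)(match - src) : strlen(src);

        if (needed + plain < capacity)  // strict '<' keeps room for '\0'
            memcpy(dst + needed, src, plain);
        needed += plain;
        src += plain;

        if (match != NULL)
        {
            if (needed + replLen < capacity)
                memcpy(dst + needed, repl, replLen);
            needed += replLen;
            src += findLen;             // skip the text that was replaced
        }
    }

    *dstSize = needed + 1;              // report the size actually required
    if (needed + 1 > capacity)
    {
        if (capacity > 0) dst[0] = '\0';  // do not leave a half-written string
        return 0;
    }
    dst[needed] = '\0';
    return 1;
}
```

Calling it with dst set to NULL and *dstSize set to 0 returns 0 but still reports the size needed, so the caller can allocate with malloc() and call again.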
Thanks to both of you. While I would never have come up with the pointer code above, because I still don't really understand pointers as I should (I do understand pointers in general, but not like I should), I agree with webJose on the reusability.

I'm going to actually take your code, webJose, and toy with it. There's some stuff in there I'm not familiar with, so it will be a good exercise.
Good. Post new questions if you have to.

Another note: The sizes of the input strings are constant (after all, the parameters are const). To gain further performance, calculate the sizes of these strings once, store them in unsigned int (or size_t) variables, and use those throughout the code, as opposed to calling strlen() every time the size is needed. This could be a significant performance improvement over the code shown there for large input strings.
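
To make the "measure once" idea concrete, here is a small sketch. CountMatches and ReplacedSize are made-up names, not part of the posted code. Each strlen() is called exactly once, and the cached lengths are reused to work out how big the output buffer has to be before any copying starts.

```cpp
#include <string.h>

// Hypothetical helper: count non-overlapping occurrences of `find` in `text`.
// `findLen` is passed in so the pattern is measured only once by the caller.
size_t CountMatches(const char *text, const char *find, size_t findLen)
{
    size_t count = 0;

    if (findLen == 0)
        return 0;                       // an empty pattern never matches here

    for (const char *p = strstr(text, find); p != NULL;
         p = strstr(p + findLen, find))
    {
        ++count;
    }
    return count;
}

// Bytes needed to hold `text` with every `find` replaced by `repl`,
// including the terminating '\0'.
size_t ReplacedSize(const char *text, const char *find, const char *repl)
{
    const size_t textLen = strlen(text);   // each length computed once
    const size_t findLen = strlen(find);
    const size_t replLen = strlen(repl);
    const size_t matches = CountMatches(text, find, findLen);

    return textLen - matches * findLen + matches * replLen + 1;
}
```

For "KJV Jhn 3:16" with "Jhn" replaced by "John", ReplacedSize() gives 14: thirteen characters plus the terminator.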
Just out of curiosity, I added this to one of my Win32 Windows apps --

#include <cstring>
#include <string>

string szTest;


and it won't compile. I have the character set set to Multi-Byte instead of Unicode. Could that be it?
Error message?
1>c:\programming\win32 apps\boss\boss 2.25\boss 2.25\boss_option_functions.cpp(14): error C2146: syntax error : missing ';' before identifier 'szTest'
1>c:\programming\win32 apps\boss\boss 2.25\boss 2.25\boss_option_functions.cpp(14): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
1>c:\programming\win32 apps\boss\boss 2.25\boss 2.25\boss_option_functions.cpp(14): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
Strange. See if reversing the order of the #includes helps.
Nope. I also added #include <iostream> just to see if that might help, even though I know it shouldn't, and it didn't.

I wonder whether it would work if I changed the character set to Unicode?

Except my project is WAY too big to do that. Maybe I'll try it with a Unicode Win app when I get time.
Don't really know what to tell you, except: Copy and paste your code into ideone.com. Does it compile? If yes, you might have corrupted header files.
No, I'm sure my header files aren't corrupted because of other stuff I'm doing. I think it has to do with either the Multi-byte setting or some other setting I've got that is preventing it.
FYI, I had a bug in the function, and while messing with it at ideone.com I deleted the code, so here's the new URL in case you are depending on its online availability: http://ideone.com/Kz0nx .

FYI, I was zeroing the string one byte beyond the allocated memory.
Regarding the <string>, <cstring> thing, it's my understanding that <string.h> and <cstring> are simply two different headers for the same C string functions, while <string> is a separate header for the C++ string class. For example, if one wishes C compilation using the old C-style include convention, one would do this...

#include <string.h>

For modern C++ one would do this...

#include <cstring>

...and one would be able to use all the string primitives just like in C (after all, ISO C++ includes the C Standard Library too), with the names declared in namespace std. Access to the standard C++ library string class (std::string, a typedef of basic_string<char>) comes from a different header, #include <string>.

For my C++ coding I typically do a #include <string.h>, even though it is deprecated (I don't know if that's the right word). The reason I do this is because I almost always use my own string class, which I developed over the course of a good many years, and it requires the string primitives in string.h, i.e., strlen(), strcpy(), etc.

Hope you don't mind if I add my own two cents here, Lamblion. The kind of low-level C buffer manipulation you are doing will kill your productivity. I wholeheartedly approve of knowing how to do all that kind of stuff with pointers, but I have to say that it will grind you right into the dirt. I wish I had back some of the time I spent on stuff like that over the course of my life. Actually, the reason I learned C++ was because stuff like that was eating me alive.

I program in C, C++, and PowerBASIC, and I've run into many occasions where I'd spend five or ten minutes doing some kind of string manipulation in PowerBASIC that, when I tried to translate it to C, took me a whole morning or afternoon. That's why I wrote my own string class.

Anyway, here is a PowerBASIC 32 bit console program with its output that does what you are asking...

#Compile Exe
#Dim All

Function PBMain() As Long
  Local strLine As String

  strLine = "KJV Jhn 3:16"
  Print "strLine = " strLine
  Replace "Jhn" With "Joh" In strLine
  Print "strLine = " strLine
  Con.Waitkey$

  PBMain=0
End Function

'strLine = KJV Jhn 3:16
'strLine = KJV Joh 3:16


The above compiles to 11K, which would be somewhere in the neighborhood of 40-60K less than a C++ program using the Standard C++ Library string class (it's faster too).

So I'm in agreement with WebJose. That stuff will eat you alive!

However, I'm very sympathetic to what you are doing, because I do it all the time. I prefer to write tight compiled code, and I like small program size. If my string manipulation needs aren't too great in a program, I'll not include the C++ string class, because, like I said, it will add about 40-60K to your program. For most folks who use class frameworks, that's no big deal, but I do straight SDK, so it's a big deal to me.

I know why it wouldn't compile... I forgot to do this before the string declaration --

using namespace std;

In other words, the compiler couldn't find the unqualified name string, because the class lives in namespace std and I hadn't added the "using" directive. When I added this, it compiled perfectly.
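
For anyone landing here later: writing the name as std::string also works without any using directive, and the string class makes the original task quite short. The sketch below is only an illustration; ReplaceAllInPlace is a made-up helper name, while find(), replace(), length(), empty() and npos are standard std::string members.

```cpp
#include <string>

// Hypothetical helper: replace every occurrence of `find` in `text`.
// std::string is qualified explicitly, so no using-directive is needed.
void ReplaceAllInPlace(std::string &text, const std::string &find,
                       const std::string &repl)
{
    if (find.empty())
        return;                          // nothing sensible to search for

    std::string::size_type pos = 0;
    while ((pos = text.find(find, pos)) != std::string::npos)
    {
        text.replace(pos, find.length(), repl);
        pos += repl.length();            // continue after the inserted text
    }
}
```

Starting from std::string szTest = "KJV Jhn 3:16", a call to ReplaceAllInPlace(szTest, "Jhn", "Joh") should leave "KJV Joh 3:16" in szTest, and the replacement is free to be longer or shorter than the text it replaces.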