soundex function question

closed account (zwA4jE8b)
This is my soundex code. It works great. My question is...

In the soundexerize function, I create the soundex code for the name but also pad the name with 0's if need be. Is it 'proper' to have one function perform one duty, not two? Should I create a separate function for pad zero's?

I know the code is short in this case, so maybe it does not matter as much as a more complex program.


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>

using namespace std;

void start(char *_fname, ifstream& fin, string& _msg);
void soundexerize(ofstream& outs, string _input);
void output(ofstream& outs, string _output, string _in);

int main()
{
	ifstream _fin;
	string _message;
	char *_fname = new char;
	cout << "Welcome to the SoundEx conversion suite..." << endl;

	do
	{
		cout << "Enter the filename...";
		cin >> _fname;
		_fin.open(_fname);
	}
	while (!_fin);

	start(_fname, _fin, _message);

	cout << _message << endl;
	cin.get();

	return 0;
}

void start( char *_fname, ifstream& _fin, string& _msg)
{
	string _in;
	ofstream outs("out.txt");

	outs << left;

	while (!_fin.eof() && !_fin.fail())
	{
		_fin >> _in >> ws;
		soundexerize(outs, _in);
	}
	if (!_fin.fail())
		_msg = "File processing completed...";
	else
		_msg = "Error processing file...";
}

void soundexerize(ofstream& outs, string _input)
{
	char _ch;
	string _temp = "";
	_temp += toupper(_input[0]);

	for (int i = 1; i < _input.length(); i++)
	{
		switch(tolower(_input[i]))
		{
			case 'b': case 'f':
			case 'p': case 'v':
				_ch = '1';
				break;
			case 'c': case 'q':
			case 'g': case 's':
			case 'j': case 'x':
			case 'k': case 'z':
				_ch = '2';
				break;
			case 'd': case 't':
				_ch = '3';
				break;
			case 'l':
				_ch = '4';
				break;
			case 'm': case 'n':
				_ch = '5';
				break;
			case 'r':
				_ch = '6';
				break;
			default:
				_ch = '0';
		}
		if (_input[i] != _input[i-1] || _ch != _temp[_temp.length()-1])
			if(_ch != '0')
				_temp += _ch;
	}
	if (_temp.length() < 4)
		for(int i = _temp.length(); i < 5; i++)
			_temp += '0';
	_temp = _temp.substr(0, 4);
	output(outs, _temp, _input);
}

void output(ofstream& outs, string _output, string _in)
{
	outs << "Name: " << setw(25) << _in << " SoundEx Code: " << _output << endl;
}


Thanks,
Mike
Last edited on
It makes sense that soundexerize() does what it does. The real question is should soundexerize() call another function to pad or embed the code directly. You can do this with essentially one or two lines of code, which kind of makes the question moot. I'd just embed these lines directly in soundexerize():

1
2
while( _temp.length() < 4 )
    _temp += '0';


(Note: not the most efficient of implementations.)
closed account (zwA4jE8b)
I like your while statement.

what do you mean by
not the most efficient of implementations
? The whole thing, or just part of it.


EDIT:

Do you mean, instead of having the output function, just do it in soundexerize() after i pad the zero's?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
void soundexerize(ofstream& outs, string _input)
{
	char _ch;
	string _temp = "";
	_temp += toupper(_input[0]);

	for (int i = 1; i < _input.length(); i++)
	{
		switch(tolower(_input[i]))
		{
			case 'b': case 'f':
			case 'p': case 'v':
				_ch = '1';
				break;
			case 'c': case 'q':
			case 'g': case 's':
			case 'j': case 'x':
			case 'k': case 'z':
				_ch = '2';
				break;
			case 'd': case 't':
				_ch = '3';
				break;
			case 'l':
				_ch = '4';
				break;
			case 'm': case 'n':
				_ch = '5';
				break;
			case 'r':
				_ch = '6';
				break;
			default:
				_ch = '0';
		}
		if (_input[i] != _input[i-1] || _ch != _temp[_temp.length()-1])
			if(_ch != '0')
				_temp += _ch;
	}
	while (_temp.length() < 5)
			_temp += '0';
	_temp = _temp.substr(0, 4);

	outs << "Name: " << setw(25) << _input << " SoundEx Code: " << _temp << endl;
}
Last edited on
operator+= could potentially reallocate (and copy) the string each time it executes.

Since you know the string returned can never can never be larger than 4 characters, I'd do

_temp.reserve( 4 );

at the top, then break out of the for() loop when _temp.size() becomes 4 (or when the other condition is true).
closed account (zwA4jE8b)
Oh, thank you. Now that you point it out, stopping the loop at 4 seems so obvious.
Topic archived. No new replies allowed.