Writing a program that supports multiple languages

Since basically 90% of my programming is random experimentation with ideas that interest me but I'll probably never use, I've written a very simple class that allows you to translate your program's output into as many languages as you want/can.

LanguageConverter.hpp
#ifndef _LANGUAGECONVERTER_HPP
#define _LANGUAGECONVERTER_HPP

#include <cctype>
#include <exception>
#include <fstream>
#include <iostream>
#include <string>

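/**
 * \brief	Looks up a string in one language file and returns the string
 *		found on the same line number of another language file
 */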
class LanguageConverter {
private:
	/** Languages directory */
	std::string	Directory;
private:
	/**
	 * \brief	Creates a correct path from 'Directory' and 'filename'
	 *		'Directory + "/" + filename' would result in problems
	 *		if Directory contained backslashes ('\\') instead of
	 *		forward. Unless the Directory path contains forward and
	 *		backward slashes, this function avoids that problem
	 * \param filename Name of the file to get a path to
	 * \return	A path built from 'Directory' and 'filename'
	 */
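	std::string CreatePath(std::string filename)
	{
		/* An empty Directory means the current directory; without this
		 * check, Directory[Directory.size() - 1] would be out of range */
		if (Directory.empty())
			return filename;
		if (Directory.find("\\") != std::string::npos) {
			if (Directory[Directory.size() - 1] == '\\')
				return Directory + filename;
			return Directory + "\\" + filename;
		}
		if (Directory[Directory.size() - 1] == '/')
			return Directory + filename;
		return Directory + "/" + filename;
	}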

	/**
	 * \brief	Convert src to lower case
	 * \param src	Source string
	 * \return	The string with any upper case characters converted to
	 *		lower case
	 */
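	std::string SToLower(std::string src)
	{
		/* tolower() needs a value representable as unsigned char;
		 * passing a negative char is undefined behaviour */
		for (std::string::iterator it = src.begin(); it != src.end(); ++it)
			*it = static_cast<char>(std::tolower(static_cast<unsigned char>(*it)));
		return src;
	}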
public:
	/**
	 * \brief	Constructor
	 * \param path	Path to languages directory
	 */
	LanguageConverter(std::string path)
	: Directory(path)
	{
	
	}

	/**
	 * \brief	Translate srcString from srcLang into dstLang
	 * \param srcLang Source language (may be case-sensitive)
	 * \param dstLang Destination language (may be case-sensitive)
	 * \param srcString Source string (case insensitive)
	 */
	std::string Convert(std::string srcLang, std::string dstLang, std::string srcString)
	{
		/*
		 * Find srcString in srcLang
		 */
		std::string path = CreatePath(srcLang);
		std::ifstream ifile(path.c_str());
		int i = -1;	/* zero-based index of the matching line, -1 if none */
		if (ifile.is_open()) {
			std::string line;
			int index = 0;
			while (std::getline(ifile, line)) {
				if (SToLower(line) == SToLower(srcString)) {
					i = index;
					break;
				}
				++index;
			}
			if (i < 0) {
				/* Reached EOF before finding srcString */
				std::cerr << "The source string \""
					  << srcString
					  << "\" was not found in \""
					  << path << "\"" << std::endl;
				return "";
			}
			ifile.close();
			/* Reset eofbit/failbit so the stream can be reused below */
			ifile.clear();
		} else {
			/* Couldn't open file */
			std::cerr << "Could not open file \"" << path << "\""
				  << std::endl;
			return "";
		}
		/*
		 * Now get the string with the same index in dstLang
		 * If the files are correct, this will be the correct translation
		 */
		path = CreatePath(dstLang);
		ifile.open(path.c_str());
		if (ifile.is_open()) {
			std::string line;
			int j = 0;
			bool found = false;
			while (std::getline(ifile, line)) {
				if (j == i) {
					found = true;
					break;
				}
				++j;
			}
			if (!found) {
				/* Reached EOF before reaching line i */
				std::cerr << "File \"" << path
					  << "\" does not contain a translation for \""
					  << srcString << "\"" << std::endl;
				return "";
			}
			/* Success */
			return line;
		} else {
			/* Couldn't open file */
			std::cerr << "Could not open file \"" << path << "\""
				  << std::endl;
			return "";
		}
		/*
		 * Control never reaches this point, it's just to stop the
		 * compiler from complaining about no return value.
		 */
		return "";
	}
};

#endif /* ! _LANGUAGECONVERTER_HPP */ 


main.cpp
#include "LanguageConverter.hpp"

int main()
{
	LanguageConverter converter("translations");
	std::string s = "hello, world";
	std::cout << s << " (English) = "
		  << converter.Convert("english", "german", s) << " (German)"
		  << std::endl;

	return 0;
}

Output:
hello, world (English) = hallo, welt (German)


Before running, create a folder in the current working directory called "translations". Then make a file called "english" and a file called "german" in that folder. In the file 'english', put the text "hello, world" and in the file 'german', put the text "hallo, welt". Then run the program, and see it converted (hopefully).

It's not 100% bullet-proof or robust but it does work. I've tested it with 1 and 2 lines. I imagine that in a large project with lots of output text it would be very slow, because every call to Convert reopens both files and scans them line by line, so each lookup is O(n) in the number of lines (space complexity should be O(1) in the number of lines because it only stores one line at a time). I might have gotten those wrong because I don't fully understand big-O notation, but I think those are right.

I spent like an hour at most on this, so there are definitely some holes (I don't know how it'll react if the source string contains a newline, for example) and probably some bugs, but like I said, it does work. The biggest flaw is if the files are wrong: the translations have to be on the same line number. If "hello, world" is line 2 in the English language file but line 2 in the German language file is the translation for something else, then the wrong translation will be printed.
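
One possible way to soften both the per-lookup scan and the line-number flaw described above, as a minimal sketch and not what the class currently does: read both language files once, pair them up line by line in a hash map, and refuse to continue if the line counts differ. The names `ToLowerCopy`, `Catalog`, `LoadCatalog` and `Lookup` are hypothetical, and the sketch assumes C++11 for `std::unordered_map`. It takes `std::istream` so it works with `std::ifstream` as well as string streams. Note that it only catches files of different lengths, not lines that are merely in the wrong order.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical helper: lower-case a copy of s (single-byte characters only).
inline std::string ToLowerCopy(std::string s)
{
	for (std::string::size_type k = 0; k < s.size(); ++k)
		s[k] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[k])));
	return s;
}

typedef std::unordered_map<std::string, std::string> Catalog;

// Hypothetical helper: pair line N of src with line N of dst.
// Throws std::runtime_error if the streams have different line counts.
inline Catalog LoadCatalog(std::istream& src, std::istream& dst)
{
	Catalog catalog;
	std::string from, to;
	while (true) {
		const bool haveFrom = static_cast<bool>(std::getline(src, from));
		const bool haveTo = static_cast<bool>(std::getline(dst, to));
		if (haveFrom != haveTo)
			throw std::runtime_error("language files have different line counts");
		if (!haveFrom)
			break;
		// Keys are lower-cased so that lookups are case-insensitive
		catalog[ToLowerCopy(from)] = to;
	}
	return catalog;
}

// Average O(1) lookup; returns "" when there is no translation,
// mirroring what Convert() returns on failure.
inline std::string Lookup(const Catalog& catalog, const std::string& text)
{
	Catalog::const_iterator it = catalog.find(ToLowerCopy(text));
	return it == catalog.end() ? std::string() : it->second;
}
```

The trade-off is memory: the whole pair of files lives in the map, so space grows with the number of strings instead of staying constant.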

TODO:
- Add the ability to define srcLang and dstLang ahead-of-time (at instantiation) or just-in-time (when Convert is called; currently this is the only supported method)
- Store the language files in an archive rather than having a folder full of plain-text files
- Correctly handle all strings (the program would probably trip up if a string had a newline in it, for example)
- Implement caching
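
For the newline item in the list above, one option (an assumption, not something the class does today) is to keep every string on a single line of the language file by escaping newlines and backslashes when writing, and undoing that after reading. `EscapeLine` and `UnescapeLine` are hypothetical names.

```cpp
#include <string>

// Hypothetical helper: make text safe to store on one line of a language
// file by writing a newline as the two characters '\' 'n' and a backslash
// as '\' '\'.
inline std::string EscapeLine(const std::string& text)
{
	std::string out;
	for (std::string::size_type k = 0; k < text.size(); ++k) {
		if (text[k] == '\n')
			out += "\\n";
		else if (text[k] == '\\')
			out += "\\\\";
		else
			out += text[k];
	}
	return out;
}

// Hypothetical helper: reverse EscapeLine(). An unknown escape or a
// trailing lone backslash is copied through unchanged.
inline std::string UnescapeLine(const std::string& line)
{
	std::string out;
	for (std::string::size_type k = 0; k < line.size(); ++k) {
		if (line[k] == '\\' && k + 1 < line.size()) {
			const char next = line[k + 1];
			if (next == 'n') {
				out += '\n';
				++k;	// skip the 'n'
				continue;
			}
			if (next == '\\') {
				out += '\\';
				++k;	// skip the second backslash
				continue;
			}
		}
		out += line[k];
	}
	return out;
}
```

With something like this, Convert would compare against `EscapeLine(srcString)` and return `UnescapeLine(line)`, so the one-string-per-line layout of the files stays intact.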


Anyway, thanks for reading :)
Nice job! This could be very helpful for people like me who wouldn't want to hire a translator ^-^

EDIT:

Although, does it use the other languages' grammar as well?
@strongdrink: At first glance, it doesn't actually seem to translate; it just searches for a line in the English file and prints the same line of the German file. For it to actually translate anything, you'd have to keep a full dictionary, and even then it would be a literal translation (i.e. simply replacing the words; grammar is ignored).
closed account (10oTURfi)
Yeah; you should make the program connect to translate.google.com, spit out the word and grab the output. That would be a hell of a job tho xD.
Would be a lot more work to use than simply going to translate.google.com and using it there.
Yeah, it doesn't do the translation for you, the purpose of this class is just to make it easy to output in a different language (once you've translated all the output first).

I've now split the code up into a .hpp and a .cpp file (rather than just a .hpp) to make it easier to read, and I've put it inside its own namespace. I've also added a new constructor that lets you set srcLang and dstLang ahead of time, as well as a rudimentary caching system.

LanguageConverter.hpp: http://codepad.org/7SYZ8mGh
LanguageConverter.cpp: http://codepad.org/0s6DkH9K
main.cpp: http://codepad.org/HYDxSJw7

Space complexity will be higher because by default it uses up to 1 MiB for the cache, but the time complexity should be somewhat lower. I think 1 MiB is a reasonable amount; every modern computer should easily have enough free memory.
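
The linked code isn't reproduced here, so the following is only a sketch of what a size-bounded cache could look like, not the caching system from the links. The class name `BoundedCache`, the byte accounting (key size plus value size) and the oldest-first eviction are all assumptions; the default budget is the 1 MiB mentioned above. It assumes C++11 for `std::unordered_map`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical cache: remembers translations until a byte budget is
// exceeded, then evicts the oldest entries first (FIFO).
class BoundedCache {
public:
	explicit BoundedCache(std::size_t budgetBytes = 1024 * 1024)
	: Budget(budgetBytes), Used(0)
	{
	}

	// Returns true and fills 'value' on a hit, false on a miss.
	bool Get(const std::string& key, std::string& value) const
	{
		std::unordered_map<std::string, std::string>::const_iterator it = Entries.find(key);
		if (it == Entries.end())
			return false;
		value = it->second;
		return true;
	}

	// Stores key -> value. Entries larger than the whole budget are
	// skipped, and keys that are already cached are left untouched.
	void Put(const std::string& key, const std::string& value)
	{
		const std::size_t cost = key.size() + value.size();
		if (cost > Budget || Entries.count(key) != 0)
			return;
		while (Used + cost > Budget) {
			// Evict the oldest entry to make room
			assert(!Order.empty());
			const std::string oldest = Order.front();
			Order.pop_front();
			Used -= oldest.size() + Entries[oldest].size();
			Entries.erase(oldest);
		}
		Entries[key] = value;
		Order.push_back(key);
		Used += cost;
	}

	std::size_t BytesUsed() const { return Used; }

private:
	std::size_t Budget;	// maximum number of bytes to keep
	std::size_t Used;	// bytes currently held
	std::unordered_map<std::string, std::string> Entries;
	std::deque<std::string> Order;	// insertion order, oldest at the front
};
```

A Convert-style function would call `Get` first, fall back to the file scan on a miss, and then `Put` the result, so only the first lookup of each string pays for reading the files.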

[edit]
@Krofna,
That's an interesting idea but I don't know how I'd enter the search phrase and get the result because I think Google Translate uses Flash or something. Also I imagine it would be extremely slow to connect to the Internet and download several HTML pages every time you wanted to output something.
Maybe you can use Microsoft's API (http://www.microsofttranslator.com/). It looks pretty easy to use: http://www.dotnetcurry.com/ShowArticle.aspx?ID=357
That's interesting, but it would still be very slow.
OK, so this is a localization system for an app where output will be known at compile time?
@chrisname: Are those special comment tags for Doxygen? If not, which IDE are they for?
@naraku9333,
That's the intention but theoretically nothing has to be known at compile-time.

@Catfish,
Yes, it's for Doxygen. I don't use an IDE for C++ because I haven't found one I like yet. The only IDE I've used that I liked was MonoDevelop for C# (though I think it supports C++ now so I may try it with C++).
the purpose of this class is just to make it easy to output in a different language

This seems somewhat like a reinvention of message catalogs, as in GNU gettext(), POSIX catgets() or C++ std::messages. While it's a fun exercise, it may be worth documenting the differences in scope and function between those approaches and yours.
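
For comparison, this is roughly what a lookup through the standard `std::messages` facet looks like. Treat it as a sketch under assumptions: `TranslateOrDefault` is a hypothetical helper, the catalog name passed to it is up to the caller, and how `open` locates a catalog and what the set and message ids mean are implementation-defined (as far as I know, libstdc++ builds on gettext and uses the default string as the lookup key, while other implementations may wrap catgets). If no catalog or no message is found, the default string comes back unchanged.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: look 'text' up in the named message catalog via the
// std::messages<char> facet of 'loc', falling back to 'text' itself.
inline std::string TranslateOrDefault(const std::string& catalogName,
				      const std::string& text,
				      const std::locale& loc = std::locale())
{
	const std::messages<char>& facet = std::use_facet<std::messages<char> >(loc);
	const std::messages<char>::catalog cat = facet.open(catalogName, loc);
	if (cat < 0)
		return text;	// no such catalog: nothing to translate with
	// Set 0 and message id 0 are placeholders; their meaning is
	// implementation-defined.
	const std::string result = facet.get(cat, 0, 0, text);
	facet.close(cat);
	return result;
}
```

The main difference from the class in this thread is that the catalog format, its location and the choice of language all come from the locale machinery rather than from a directory of plain-text files with matching line numbers.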
I don't know what those are, but from what I've read they don't sound like they do what my class does. I don't know, though, so maybe they are the same thing.
If I understand correctly, aren't you more or less reinventing the standard Linux localization method, gettext?

http://en.wikipedia.org/wiki/Gettext

(I've just registered Cubbi's post. It does seem like you're reinventing gettext, which maps strings to strings, rather than catgets, which is id to string based)