String anagram checker - How would you have done it

Nov 30, 2011 at 9:55pm
Hey folks.

So yesterday I was browsing through interview questions regarding programming and one came up that I did.

It was, create a function that determines whether one string is an anagram of another (can't remember the exact wording).

I put myself in a 'only have a minute or two to answer' situation and the first thing I came up with to convert the string characters to ints, sort the list then compare each entry. This is shown in the s_compare code below. Now today I thought of the problem again and came up with another method I think is faster.

Within the st_compare function, we create an array of size 96 which (might be wrong here), should cover all possible characters since the first 32 wouldn't matter (http://www.asciitable.com/). Then cycle through the first string and for each entry in this string, convert it to an integer, subtract 32 to make it fit within the array limits, and increase the arrays entry by 1.

Then go through the second string, do the same conversion and if the array entry is 0, we know there is an entry in the second string that is not in the first. If not, subtract one from the array entry. If you reach the end of the string and haven't hit a zero in the array, you know the second string is an anagram of the first.

I think the complexity of the first method is O(n log(n)) which is the complexity of sort, and I think the complexity of the second is O(n). I was just wondering if anyone else had any ideas on how to do this, whether my second method was a decent idea and whether there's a much faster method.

In the code below, s_compare is the first method and st_compare the second.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
#include <iostream>
#include <algorithm>

using namespace std;

bool s_compare (string& a, string& b)
{
    if (a.length() != b.length())
    {
        return false;
    }
    int *array_a = new int [a.length()];
    int *array_b = new int [b.length()];
    for (unsigned int i = 0; i < a.length(); i++)
    {
        array_a[i] = a[i] - '0';
        array_b[i] = b[i] - '0';
    }

    sort(array_a, array_a + a.length());
    sort(array_b, array_b + a.length());

    for (unsigned int j = 0; j < a.length(); j++)
    {
        if (array_a[j] != array_b[j])
        {
            delete[] array_a;
            delete[] array_b;
            return false;
        }
    }

    delete[] array_a;
    delete[] array_b;
    return true;
}

bool st_compare (string& a, string& b)
{
    if (a.length() != b.length())
    {
        return false;
    }
    int array[96] = {0};
    for (unsigned int i = 0; i < a.length(); i++)
    {
        array[a[i] - '0' - 32] += 1;
    }
    for (unsigned int j = 0; j < a.length(); j++)
    {
        if (array[b[j] - '0' - 32] == 0)
        {
            return false;
        }
        else
        {
            array[b[j] - '0' - 32] -= 1;
        }
    }
    return true;

}

int main()
{
    string in, in2;
    cout << "Enter first string: ";
    getline(cin, in);
    cout << "Enter second string: ";
    getline(cin, in2);
        if (s_compare(in, in2))
        {
            cout << in << " is an anagram of " << in2 << endl;
        }
        else
        {
            cout << in << " is not an anagram of " << in2 << endl;
        }
        if (st_compare(in, in2))
        {
            cout << in << " is an anagram of " << in2 << endl;
        }
        else
        {
            cout << in << " is not an anagram of " << in2 << endl;
        }
    return 0;
}
Last edited on Nov 30, 2011 at 9:57pm
Nov 30, 2011 at 10:15pm

How about just comparing the sorted strings?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include<string>
#include<algorithm>
#include<iostream>

int main() {
  std::string a, b;

  std::cout << "Enter word one: " << std::endl;
  std::cin >> a;
  std::cout << "Enter word two: " << std::endl;
  std::cin >> b;

  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());

  if (a==b)
    {
    std::cout << "They are anagrams of each other." << std::endl;
    }
  else
    {
    std::cout << "They are not anagrams of each other." << std::endl;
    }
  return 0;
}

Last edited on Nov 30, 2011 at 10:17pm
Nov 30, 2011 at 10:41pm
Is that not O(n log(n)) since that's what sort is?
Nov 30, 2011 at 11:08pm
It is (commonly). I don't know how string equality compares to doing it manually char by char.
Last edited on Nov 30, 2011 at 11:08pm
Nov 30, 2011 at 11:49pm
Likely similar.

I suppose I was looking for O(n) methods (is that what my second method is?). I think I saw this question in a google interview question thing although it seems a bit easy compared to their normal stuff.

I guess if the question was, write an efficient algorithm to check if one string is an anagram of another, or to do it in O(n) time, what would you do?
Dec 1, 2011 at 12:02am
++Moschops; though I don't like the inconsistent brace style ;)
Dec 1, 2011 at 12:25am
Except it doesn't work :-(

If string a contains a letter not found in string b, it won't catch it and will return true.

Jim
Dec 1, 2011 at 12:34am
Which method are you referring to jim80y?
Dec 1, 2011 at 12:36am
witch one isn't working? Both Muckle's functions check the length so I don't think there is a problem, except if the strings contains some characters like + or space.
Dec 1, 2011 at 12:36am
st_compare()

If you pass it "xello" and "oleHl", for example, it will return true.

Jim
Dec 1, 2011 at 12:38am
@Peter87

I think + and space are OK. If you check they ASCII tables they are contained in the last 96 values. I just tried it with a random sequence of symbols and it seemed to work.
Dec 1, 2011 at 12:45am
Yeah Jims right. It's because the ASCII table in the link I used doesn't match with the one in C++ (I think?) H is 24 for example when you do 'H' - '0'.

Extend the array to 128 and remove the -32 parts and xello is not an anagram of oleHl.

Likely more mistakes in there somewhere, will need to see full list of char to int conversions.
Dec 1, 2011 at 12:46am
I was confused, too - it's because capitals underrun the array.

'H' - '0' - 32 is -8

Oops!
Topic archived. No new replies allowed.