Hello. I am writing a chatbot and I am having trouble thinking of a function that would return a percentage of similarity for two strings. At first I thought it would be easy: I would use a for loop to iterate through each character of the two strings, count the number of matches and convert it into a percentage. But then I ran into this problem. Let's say someone accidentally misses out one letter, for example:
String 1: Hello World!
String 2: Helo World!
Match (y = yes, n = no): YYYNNNNNNNNN (25% match)
Equally, if someone accidentally adds a letter:
String 1: Hello World!
String 2: Helllo World!
Match (y = yes, n = no): YYYYNNNNNNNNN (31% match)
In terms of human reading, these two strings are a lot more similar than just 25% or 31%, so I was wondering if someone could help me come up with an idea for a function that would let me check the similarity of two strings without a difference in length (as in the examples) significantly affecting the percentage.
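For reference, the position-by-position comparison described above could be sketched roughly like this. Python and the name positional_similarity are assumptions made for illustration; the post names neither a language nor a function.

```python
def positional_similarity(a: str, b: str) -> float:
    """Percentage of positions where both strings hold the same character.

    Positions that exist in only one of the strings count as mismatches,
    so the total is the length of the longer string.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # assumption: two empty strings count as identical
    # zip stops at the shorter string; the leftover positions are mismatches
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / longest


print(round(positional_similarity("Hello World!", "Helo World!")))  # 25
```

One missing letter shifts every later character by one position, which is why almost nothing after that point lines up.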
Match (y = yes, n = no): [Hello/Helo]
H = H; Y
E = E; Y
L = L; Y
L = O; N
O = N/A; N (maybe you could discount this as a comparison, improving the overall score from 10/12 = 83% to 10/11 = 91%)
3/5
whitespace matches, Y
1/1
Match (y = yes, n = no): [World!/World!]
W = W; Y
O = O; Y
R = R; Y
L = L; Y
D = D; Y
! = !; Y
6/6
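The word-by-word tally above could be sketched like this in Python. The function name word_similarity and the skip_missing flag are made up for illustration, and splitting on whitespace with a regular expression is an assumption about how the words get separated.

```python
import re
from itertools import zip_longest


def word_similarity(a: str, b: str, skip_missing: bool = False) -> float:
    """Compare two strings word by word, then character by character.

    Runs of whitespace are kept as their own tokens, so the gap between
    words is compared like any other token.
    If skip_missing is True, positions that exist in only one of the two
    tokens are left out of the total instead of counting as mismatches.
    """
    tokens_a = [t for t in re.split(r"(\s+)", a) if t]
    tokens_b = [t for t in re.split(r"(\s+)", b) if t]
    matches = total = 0
    # A token with no partner (different word counts) is paired with ""
    for x, y in zip_longest(tokens_a, tokens_b, fillvalue=""):
        matches += sum(1 for c, d in zip(x, y) if c == d)
        total += min(len(x), len(y)) if skip_missing else max(len(x), len(y))
    if total == 0:
        # Nothing was comparable: identical only if the strings are equal
        return 100.0 if a == b else 0.0
    return 100.0 * matches / total


print(round(word_similarity("Hello World!", "Helo World!")))                     # 83
print(round(word_similarity("Hello World!", "Helo World!", skip_missing=True)))  # 91
```

Because each word is compared on its own, a typo only affects the word it occurs in. The words after it still line up. Note that with skip_missing=True a whole extra word in one string is ignored completely, which may or may not be what you want.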
^^ no problem, glad I could help. It still isn't perfect, because Helo World! vs Hello World! seems a lot more similar than 83~91% to humans, but it should do for the moment.
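If the score needs to get closer to that human impression later on, one widely used option is edit distance (Levenshtein distance). It is a different technique from the word-by-word count in this thread: it counts how many single-character insertions, deletions or substitutions turn one string into the other. A minimal sketch in Python, with function names chosen only for illustration:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the fewest single-character insertions,
    deletions or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Turn the distance into a percentage of the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # assumption: two empty strings count as identical
    return 100.0 * (1 - edit_distance(a, b) / longest)


print(round(edit_similarity("Hello World!", "Helo World!")))    # 92
print(round(edit_similarity("Hello World!", "Helllo World!")))  # 92
```

Dividing by the longer length is just one way to turn the distance into a percentage. Either way, a single missing or extra letter costs exactly one edit, no matter where in the string it happens.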