Fuzzy Matches in Google Sheets

When trying to build our curriculum inventory I needed to match thousands of strings in Google Sheets. Doing this manually would have been tedious and prone to error. The EQ function would fail if there was one character different, forget about objectives that had words shifted around. Here’s my script that outputs the percentage of string A that is present in string B averaged with the percentage of string B that is in string A. I found that anything over 40% seems to be semantic match.

function howMuchMatches(string1, string2) {
 
  // clean both strings
  var array1 = string1.trim().toLowerCase().replace(/[^\w\s\d]/g,"").split(" ");
  var array2 = string2.trim().toLowerCase().replace(/[^\w\s\d]/g,"").split(" ");
  
  // how much of string1 is in string2
  var numberOfMatches12 = 0;
  
  for (i=0; i < array1.length; i++) {
    if (array2.indexOf(array1[i])>=0)
      { numberOfMatches12++; }

  // how much of string2 is in string1
  var numberOfMatches21 = 0;

  for (i=0; i < array2.length; i++) {
    if (array1.indexOf(array2[i])>=0)
      { numberOfMatches21++; }

  // convert those to percentages
  var percentMatch12 = numberOfMatches12/array1.length;
  var percentMatch21 = numberOfMatches21/array2.length;

  // return the average the two
  return (percentMatch12 + percentMatch21)/2; 
}

2 Comments

  1. jfeder says:

    Hello Rahul
    Very nice piece of code. Got the following when trying to use:
    “Missing ) after condition. (line 11, file “Code”)”

    1. RP says:

      Hi Jfeder. Looks like WordPress cleaned my code such that greater-than signs (“>”) with & g t ; . If you replace & g t ; with “>” it should work. Sorry about that, and thanks for catching it.. I’ll fix it when I get back to my laptop.

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s