Thursday, January 22, 2015

Determine if two strings match


Two texts are considered to "match" if they have a common substring of length at least n. Describe an algorithm to determine whether two strings match.

This is another FB interview question. I am writing this post because many people suggest using DP, i.e., finding the longest common substring and comparing its length with n. However, I find that unnecessary and more expensive. I implemented both methods and ran some performance tests.


//hashset method (requires: import java.util.HashSet; import java.util.Set;)
 public static boolean isMatch(String s1, String s2, int n) {
  if (s1 == null || s2 == null || s1.length() < n || s2.length() < n)
   return false;
  // collect every substring of length n from s1
  Set<String> substrings = new HashSet<>();
  for (int i = 0; i <= s1.length() - n; i++) {
   substrings.add(s1.substring(i, i + n));
  }
  // the strings match as soon as one length-n substring of s2 is in the set
  for (int i = 0; i <= s2.length() - n; i++) {
   if (substrings.contains(s2.substring(i, i + n)))
    return true;
  }
  return false;
 }
 //DP method
 public static boolean isMatch2(String s1, String s2, int n) {
  if (s1 == null || s2 == null || s1.length() < n || s2.length() < n)
   return false;
  int lcs = longestCommonSubstring(s1, s2);
  return lcs >= n;
 }
 // returns the length of the longest common substring of s1 and s2
 public static int longestCommonSubstring(String s1, String s2) {
  if (s1 == null || s2 == null)
   throw new NullPointerException("Null string(s)!");
  // lcs[i][j] = length of the longest common suffix of s1[0..i) and s2[0..j)
  int[][] lcs = new int[s1.length() + 1][s2.length() + 1];
  int max = 0;
  for (int i = 1; i <= s1.length(); i++) {
   for (int j = 1; j <= s2.length(); j++) {
    if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
     lcs[i][j] = lcs[i - 1][j - 1] + 1;
    }
    max = Math.max(lcs[i][j], max);
   }
  }
  return max;
 }




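Let m and k be the lengths of the two strings (n is already taken by the substring length). The hashset method performs O(m + k) set operations, while DP fills a table of O(mk) cells. Strictly speaking, each set operation builds and hashes a substring of length n, so the hashset method costs O((m + k) * n) character operations in the worst case, which is still much cheaper than O(mk) when n is small compared to the strings. I found the memory usage to be almost the same.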
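For anyone who wants to repeat the comparison, here is a rough sketch of how the two methods could be cross-checked and timed. It is not the harness I used for the numbers above; the class name MatchComparison, the helper randomString, the input sizes and the seed are all made up for illustration, and the n >= 1 guard is an extra assumption that the methods above do not have. A single System.nanoTime measurement is only a crude indication, not a proper benchmark.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical harness: class and helper names are illustrative only.
class MatchComparison {

    // Hash set approach: store every length-n window of s1, then probe with windows of s2.
    static boolean isMatchHash(String s1, String s2, int n) {
        // Assumption: n is at least 1 (guard added for this sketch only).
        if (s1 == null || s2 == null || n < 1 || s1.length() < n || s2.length() < n)
            return false;
        Set<String> windows = new HashSet<>();
        for (int i = 0; i + n <= s1.length(); i++)
            windows.add(s1.substring(i, i + n));
        for (int i = 0; i + n <= s2.length(); i++) {
            if (windows.contains(s2.substring(i, i + n)))
                return true;
        }
        return false;
    }

    // DP approach: compute the longest common substring length, then compare it with n.
    static boolean isMatchDp(String s1, String s2, int n) {
        if (s1 == null || s2 == null || n < 1 || s1.length() < n || s2.length() < n)
            return false;
        int[][] lcs = new int[s1.length() + 1][s2.length() + 1];
        int max = 0;
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    lcs[i][j] = lcs[i - 1][j - 1] + 1;
                    max = Math.max(max, lcs[i][j]);
                }
            }
        }
        return max >= n;
    }

    // Builds a random lowercase string of the given length.
    static String randomString(Random rnd, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++)
            sb.append((char) ('a' + rnd.nextInt(26)));
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so runs are repeatable
        String a = randomString(rnd, 2000);
        String b = randomString(rnd, 2000);
        int n = 5;

        long t0 = System.nanoTime();
        boolean byHash = isMatchHash(a, b, n);
        long t1 = System.nanoTime();
        boolean byDp = isMatchDp(a, b, n);
        long t2 = System.nanoTime();

        System.out.println("hashset: " + byHash + " in " + (t1 - t0) / 1000 + " us");
        System.out.println("DP:      " + byDp + " in " + (t2 - t1) / 1000 + " us");
        System.out.println("results agree: " + (byHash == byDp));
    }
}
```

As a small sanity check, "interview" and "overview" share the substring "erview", so both methods should report a match for n = 6 and no match for n = 7.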
