Thursday, January 8, 2015

Anagrams and good hash key


Given an array of strings, return all groups of strings that are anagrams.
Note: All inputs will be in lower-case.
Anagrams, are two collections of words that are consisted with same characters. It is not hard to check if two strings are anagrams (use a hashset to store all characters of one and check the other), however, how can we check a group of strings that contains different anagrams? Naturally we will think of using a map to store each group of anagrams with a hash key, then the next problem is, how to determine a good hash key?

At first, I thought that if two strings are consisted of same characters, the sum of all characters should be the same. That is true, however, it is possible that two strings composed of different words also have the same sum of characters. (e.g., "duh", "ill"). Here is a good one I found online: 

private int getKey(String s) {
        int key = 0;
        int a = 378551;
        int b = 63689;
        int[] count = new int[26];
        for (int i = 0; i < s.length(); i++) {
            int tmp = s.charAt(i) - 'a';
            count[tmp]++;
        }
        for (int num : count) {
            key = key * a + num;
            a = a * b;
        }
        return key;
    }


The rest of the method is trivial.


 public List anagrams(String[] strs) {
        if (strs == null)
            throw new NullPointerException("Null string array!");
        List rst = new ArrayList ();
        if (strs.length <= 1)
            return rst;
        Map> map = new HashMap> ();
        for (String s: strs) {
            int key = getKey(s);
            if (!map.containsKey(key)) {
                map.put(key, new ArrayList());
            }
            map.get(key).add(s);
        }
        for (ArrayList a : map.values()) {
            if (a.size() > 1)
                rst.addAll(a);
        }
        return rst;
        
    }
    private int getKey(String s) {
        int key = 0;
        int a = 378551;
        int b = 63689;
        int[] count = new int[26];
        for (int i = 0; i < s.length(); i++) {
            int tmp = s.charAt(i) - 'a';
            count[tmp]++;
        }
        for (int num : count) {
            key = key * a + num;
            a = a * b;
        }
        return key;
    }

1 comment:

  1. The development of artificial intelligence (AI) has propelled more programming architects, information scientists, and different experts to investigate the plausibility of a vocation in machine learning. Notwithstanding, a few newcomers will in general spotlight a lot on hypothesis and insufficient on commonsense application. IEEE final year projects on machine learning In case you will succeed, you have to begin building machine learning projects in the near future.

    Projects assist you with improving your applied ML skills rapidly while allowing you to investigate an intriguing point. Furthermore, you can include projects into your portfolio, making it simpler to get a vocation, discover cool profession openings, and Final Year Project Centers in Chennai even arrange a more significant compensation.


    Data analytics is the study of dissecting crude data so as to make decisions about that data. Data analytics advances and procedures are generally utilized in business ventures to empower associations to settle on progressively Python Training in Chennai educated business choices. In the present worldwide commercial center, it isn't sufficient to assemble data and do the math; you should realize how to apply that data to genuine situations such that will affect conduct. In the program you will initially gain proficiency with the specialized skills, including R and Python dialects most usually utilized in data analytics programming and usage; Python Training in Chennai at that point center around the commonsense application, in view of genuine business issues in a scope of industry segments, for example, wellbeing, promoting and account.

    ReplyDelete