Friday, January 2, 2015

Substring with Concatenation of All Words

You are given a string, S, and a list of words, L, that are all of the same length. Find all starting indices of substrings in S that are a concatenation of each word in L exactly once, without any intervening characters.
For example, given:
S: "barfoothefoobarman"
L: ["foo", "bar"]
You should return the indices: [0,9].
(order does not matter).

Two maps are used: one counts the words in the list, and the other counts the words seen in the current substring of S. I used two hash sets at first because I didn't realize that duplicates are allowed in the list. For example, "aaa" and ["a", "a"].
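To make the duplicate case concrete, here is a minimal sketch comparing a set with a count map for the list ["a", "a"]. The class name WordCounts and the helper countWords are made up for this illustration and are not part of the solution below.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, used only for this illustration.
class WordCounts {
    // Counts how many times each word appears in the list.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"a", "a"};
        Set<String> asSet = new HashSet<>(Arrays.asList(words));
        System.out.println(asSet.size());       // prints 1: the duplicate is lost
        System.out.println(countWords(words));  // prints {a=2}: the duplicate is kept
    }
}
```

With a set, "a" on its own would look like a complete match; the count map records that "a" has to appear twice.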

Another mistake I made was advancing the index in S by the word length. The index has to advance by 1, since the characters that are not part of any word in the list can form a run of any length. For example, with "afoobar" and ["foo", "bar"], the match starts at index 1.
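A small sketch of why the step matters for this brute-force scan: it lists the start indices the outer loop would try for "afoobar" (length 7) with two words of length 3. The class StartIndices and the method candidateStarts are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only for this illustration.
class StartIndices {
    // Lists every start index the outer loop visits for the given step size.
    static List<Integer> candidateStarts(int sLen, int wordLen, int wordCount, int step) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + wordLen * wordCount <= sLen; i += step) {
            starts.add(i);
        }
        return starts;
    }

    public static void main(String[] args) {
        // "afoobar" has length 7; the words "foo" and "bar" have length 3.
        System.out.println(candidateStarts(7, 3, 2, 1)); // prints [0, 1]
        System.out.println(candidateStarts(7, 3, 2, 3)); // prints [0]
    }
}
```

Stepping by the word length never visits index 1, which is exactly where "foobar" begins.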

The inner loop is used to check if the substring is a concatenation of all words in the list. Thus, index i becomes the starting index of the substring, which will be added into the result list if all conditions are satisfied.

If any condition is not satisfied, e.g., the current word cannot be found in the list, or it occurs more times in the substring than it does in the list, we break the inner loop and start from the next index.

The map for the string is cleared at the beginning of every iteration of the outer loop. If an index is found, or if a concatenation cannot be found from the last index, everything inside the map is useless. We need to start all over again. :)

Update: 2015-01-15
Walk through:
1. Store all strings in the list in a map, together with the number of times each one occurs.
2. Since all strings in the list need to be included, the last start index to check is S.length() - L.length * L[0].length().
3. Use another loop to check for the concatenation of the words in the list. Use count to check if all strings in the list are included.
    1. If the next substring of word length is not in the list map, break;
    2. If the number of occurrences of any string in the substring exceeds the number of that string in the list, break;
4. Add the start index if count == L.length.
5. Increment index, clear map.


import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubstringWithConcatenation {
    public List<Integer> findSubstring(String S, String[] L) {
        if (S == null || L == null)
            throw new NullPointerException("Null string or list!");
        List<Integer> rst = new ArrayList<Integer>();
        if (S.length() == 0 || L.length == 0)
            return rst;
        int len = L[0].length();
        if (len > S.length())
            return rst;
        // Count how many times each word occurs in the list.
        Map<String, Integer> L_map = new HashMap<String, Integer>();
        for (int i = 0; i < L.length; i++) {
            if (!L_map.containsKey(L[i]))
                L_map.put(L[i], 1);
            else
                L_map.put(L[i], L_map.get(L[i]) + 1);
        }
        // Counts of the words seen in the current candidate substring.
        Map<String, Integer> hm = new HashMap<String, Integer>();
        for (int i = 0; i <= S.length() - L.length * len; i++) {
            hm.clear();
            int count = 0;
            for (count = 0; count < L.length; count++) {
                int pos = i + count * len;
                String tmp = S.substring(pos, pos + len);
                // The word is not in the list at all.
                if (!L_map.containsKey(tmp))
                    break;
                if (!hm.containsKey(tmp)) {
                    hm.put(tmp, 1);
                }
                else {
                    hm.put(tmp, hm.get(tmp) + 1);
                    // The word occurs more often than the list allows.
                    if (hm.get(tmp) > L_map.get(tmp))
                        break;
                }
            }
            // All L.length words matched, so i starts a concatenation.
            if (count == L.length)
                rst.add(i);
        }
        return rst;
    }
}
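For quick experimentation, here is a compact, self-contained variation of the same brute-force idea that runs the three examples from this post. It is only a sketch: the class name FindSubstringDemo is made up, it assumes non-null inputs, and it uses Map.merge (Java 8+) in place of the containsKey/put pairs above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; assumes non-null inputs and words of equal length.
class FindSubstringDemo {
    static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        if (s.isEmpty() || words.length == 0) return result;
        int len = words[0].length();

        // How many times each word must appear.
        Map<String, Integer> need = new HashMap<>();
        for (String w : words) need.merge(w, 1, Integer::sum);

        // How many times each word has been seen from the current start index.
        Map<String, Integer> seen = new HashMap<>();
        for (int i = 0; i + len * words.length <= s.length(); i++) {
            seen.clear();
            int count = 0;
            while (count < words.length) {
                int pos = i + count * len;
                String word = s.substring(pos, pos + len);
                Integer allowed = need.get(word);
                if (allowed == null) break;                             // not in the list
                if (seen.merge(word, 1, Integer::sum) > allowed) break; // seen too often
                count++;
            }
            if (count == words.length) result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findSubstring("barfoothefoobarman", new String[]{"foo", "bar"})); // prints [0, 9]
        System.out.println(findSubstring("aaa", new String[]{"a", "a"}));                    // prints [0, 1]
        System.out.println(findSubstring("afoobar", new String[]{"foo", "bar"}));            // prints [1]
    }
}
```

The second and third calls cover the two mistakes described earlier: duplicate words in the list, and a match that does not start at a multiple of the word length.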
