public class RepeatedDNA {
private static final Map dna = new HashMap ();
static {
dna.put('A', 0);
dna.put('C', 1);
dna.put('G', 2);
dna.put('T', 3);
}
private final int base = 29;
public List findRepeatedDnaSequences(String s) {
if (s == null)
throw new NullPointerException("Null String!");
//avoid duplicate sequence
List rst = new ArrayList ();
if (s.length() == 0)
return rst;
Set sequence = new HashSet ();
long hashS = 0;
for (int i = 0; i < s.length(); i++) {
if (i > 9) {
hashS -= (long)Math.pow(base, 9) * dna.get(s.charAt(i - 10));
}
hashS = hashS * base + dna.get(s.charAt(i));
if (i > 8 && !sequence.add(hashS)) {
if (!rst.contains(s.substring(i - 9, i + 1)))
rst.add(s.substring(i - 9, i + 1));
}
}
return rst;
}
}
AdSense
Monday, February 9, 2015
Repeated DNA Sequences
The tricky part of this problem is that we cannot simply use a set to store or substrings: it will exceed memory limit. A smarter way is to use integers instead of strings. I use the rolling hash method, which is also used in the implement strStr() problem (see that problem for detail).
Labels:
LeetCode
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment