Saturday, November 19, 2016

Design interview general approach -- tiny url example

1. Understand the requirements:
Start by understanding the basic functionality. For example, if we need to design a tiny url service, we know we want to map a long url to a shorter one.

2. Figure out what data we need to store:
In the tiny url example, it's quite simple: we need to store the actual url and the shorter one (ID).

3. Figure out your database:
Different data calls for different databases. It is usually easier to start with a relational database and then compare it with a NoSQL DB.

In this example, the structure is simple and there are few relations between records, so a NoSQL key-value store may be a good choice. Here, the key can be our generated ID (the tiny url) and the value can be the original url.

4. Think about your API:
How do we read data from the DB? How do we modify it? How does the API connect to the front end, and what do we display to the user?
Let's walk through our tiny url example. When we receive a long url, our TinyUrlService interface calls the createId(URL url) method to create a new ID that serves as the tiny url. There are multiple ways to implement this method. The easiest is an incremental ID. However, an incremental ID written in decimal only uses the digits 0-9, so a 5-character ID allows at most 100 thousand urls (0 - 99999), which is hard to scale. One way to optimize is to use Base 64 encoding: with 64 symbols per character, 5 characters give 64^5 = 2^30 (about 1.07 billion) IDs. The implementation can be found in my previous post, and you can take a look if you are interested.
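
To make the 64^5 count concrete, here is a minimal sketch of turning an incremental ID into a 5-character code. It is not the implementation from the previous post. The alphabet (A-Z, a-z, 0-9, "-", "_"), the fixed length of 5, and the names encode_id / decode_id are my assumptions. Note that this is base-64 positional notation for integers, which is what the 64^5 figure counts, rather than the byte-oriented Base64 used for binary data.

```python
import string

# Assumed URL-safe alphabet of 64 symbols; the post does not specify one.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
BASE = len(ALPHABET)   # 64
CODE_LENGTH = 5        # 64**5 == 2**30 possible codes


def encode_id(n: int) -> str:
    """Write the integer n as a fixed-width base-64 code."""
    if not 0 <= n < BASE ** CODE_LENGTH:
        raise ValueError("id does not fit in %d characters" % CODE_LENGTH)
    chars = []
    for _ in range(CODE_LENGTH):
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(chars))


def decode_id(code: str) -> int:
    """Inverse of encode_id: turn a code back into the integer id."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this alphabet, ID 0 becomes "AAAAA" and ID 64 becomes "AAABA", and decode_id reverses the mapping.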

Now when the client requests a page with the tiny url, our TinyUrlService interface calls getId(ID id) to retrieve the original url from the DB. If the ID matches an entry, we redirect the client to the original url; otherwise we return code 404.
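
The create and lookup flow can be sketched as follows. This is only an illustration under assumptions: a plain dict stands in for the NoSQL store, the ID is a bare decimal counter, create_id and get_url play the roles of createId(URL url) and getId(ID id), and handle_request with its 302 status code is a hypothetical helper (the post only says "redirect").

```python
from typing import Dict, Optional, Tuple


class TinyUrlService:
    """In-memory stand-in for the service; a dict replaces the real DB."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}  # tiny id -> original url
        self._next_id = 0                 # simple incremental counter

    def create_id(self, url: str) -> str:
        """Store the long url and return its newly generated tiny id."""
        tiny_id = str(self._next_id)
        self._next_id += 1
        self._store[tiny_id] = url
        return tiny_id

    def get_url(self, tiny_id: str) -> Optional[str]:
        """Return the original url, or None if the id is unknown."""
        return self._store.get(tiny_id)


def handle_request(service: TinyUrlService, tiny_id: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target) for a tiny url request."""
    url = service.get_url(tiny_id)
    if url is None:
        return 404, None   # no match: not found
    return 302, url        # match: redirect to the original url
```

A real service would put this behind an HTTP framework and a persistent store; the point here is just the two operations and the 404 branch.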

5. Now think about scalability, reliability and reducing latency:
We have already mentioned Base 64 encoding as a way to allow more tiny urls, which is one way to scale. Using NoSQL for fast queries is another when you have lots of data.

Another common approach is to use a cache. In the tiny url example, a Least Frequently Used (LFU) eviction policy can be a good one, since it keeps the most requested urls in memory. You can also use an in-memory store (e.g., Memcached) for the actual caching implementation.
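
To show what LFU eviction means, here is a small sketch. The class name LFUCache and its get / put methods are my own, and eviction scans all entries (O(n)), which is fine for illustration but not what a production cache would do.

```python
from typing import Dict, Optional


class LFUCache:
    """Tiny LFU cache: when full, evict the least frequently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._values: Dict[str, str] = {}
        self._hits: Dict[str, int] = {}   # key -> number of accesses

    def get(self, key: str) -> Optional[str]:
        """Return the cached value and count the access, or None on a miss."""
        if key not in self._values:
            return None
        self._hits[key] += 1
        return self._values[key]

    def put(self, key: str, value: str) -> None:
        """Insert or update a key, evicting the least used key if full."""
        if key not in self._values and len(self._values) >= self.capacity:
            # Linear scan for the smallest access count.
            victim = min(self._hits, key=self._hits.get)
            del self._values[victim]
            del self._hits[victim]
        self._values[key] = value
        self._hits[key] = self._hits.get(key, 0) + 1
```

In the tiny url service, the key would be the tiny ID and the value the original url, with the DB consulted only on a cache miss.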

If we have lots of data, we need to think about sharding. Here, since we already have a key, we can just shard by this key. Moreover, think about consistent hashing (check this post) so that adding more machines only remaps a small share of the keys. (scalability!)
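
Here is a rough sketch of sharding by key with consistent hashing. It is not the code from the linked post. The class name ConsistentHashRing, the use of SHA-256 as the ring hash, and the 100 virtual points per machine are all assumptions for illustration.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _ring_hash(value: str) -> int:
    """Map a string to a position on the ring (SHA-256 is an assumed choice)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Each node owns several points on a ring; a key goes to the next point."""

    def __init__(self, nodes: Iterable[str], replicas: int = 100) -> None:
        self._replicas = replicas
        self._ring: List[Tuple[int, str]] = []   # sorted (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `replicas` virtual points for the node on the ring."""
        for i in range(self._replicas):
            bisect.insort(self._ring, (_ring_hash("%s#%d" % (node, i)), node))

    def get_node(self, key: str) -> str:
        """Return the node owning the first point at or after the key's hash."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        index = bisect.bisect_left(self._ring, (_ring_hash(key), ""))
        if index == len(self._ring):
            index = 0   # wrap around to the start of the ring
        return self._ring[index][1]
```

When a machine is added, the only keys that change owner are the ones the new machine takes over; everything else stays on its current shard.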

Now if we only have an incremental key (with base 64 encoding), it has a security issue: consecutive IDs are easy to guess, so anyone can enumerate the stored urls. So we need to use some hashing mechanism so that the actual key doesn't look like the one shown to the client.
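
One possible way to do this, sketched below, is a reversible scramble of the counter instead of a true hash, so that two IDs can never collide. This is my own substitution, not something from the post: the constants MULTIPLIER and MASK and the names scramble / unscramble are hypothetical, and the scheme only hides the sequence from casual guessing. It is not cryptographically secure.

```python
ID_SPACE = 64 ** 5        # 2**30 values, matching five base-64 characters

# Hypothetical constants. Any odd multiplier is invertible modulo 2**30,
# which is what makes the mapping a one-to-one shuffle of the id space.
MULTIPLIER = 0x2545F491
MASK = 0x1B873593

# Modular inverse of the multiplier (three-argument pow needs Python 3.8+).
MULTIPLIER_INV = pow(MULTIPLIER, -1, ID_SPACE)


def scramble(n: int) -> int:
    """Map the internal sequential id to the public-facing id."""
    if not 0 <= n < ID_SPACE:
        raise ValueError("id out of range")
    return ((n * MULTIPLIER) % ID_SPACE) ^ MASK


def unscramble(m: int) -> int:
    """Recover the internal sequential id from the public-facing id."""
    if not 0 <= m < ID_SPACE:
        raise ValueError("id out of range")
    return ((m ^ MASK) * MULTIPLIER_INV) % ID_SPACE
```

The scrambled number would then be base 64 encoded as the tiny url, and unscrambled on lookup to get back the stored key. If the IDs must be truly unguessable, a random ID or a keyed cryptographic construction would be the safer choice.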
