  • Hello everyone! Welcome to codeKarle. In this video we'll be talking about

  • possibly one of the most common system design interview questions.

  • So let's see how we can design a URL shortening service, something very similar to tinyurl.com.

  • Let's start with the functional requirements (FRs) and non-functional requirements (NFRs).

  • This is a fairly straightforward problem. You have these two functional requirements -

  • given a long URL, you get a short URL. So, when somebody wants to shorten the URL,

  • they'll give you a long URL and you need to return a shorter URL for that.

  • On the other side, when somebody hits a short URL then you basically redirect the

  • person to the longer URL, which is the second point.

  • On the non-functional side, this service needs to be highly available and it should work at a very

  • low latency. Because, let's just say we were building it for a Facebook or Twitter kind of scenario,

  • it would just be a bad experience if it takes a long time to respond.

  • And this being very much a platform kind of component,

  • it needs to be always available and respond back in a very short time.

  • The very first thing to answer is - What should be the length of our short URL?

  • That depends upon the scale that we need to build this system for.

  • If we just need to handle a couple of hundred URLs, maybe 2-3 characters are more than enough.

  • But if you are building it at a Google scale or a Facebook scale,

  • then we need to think over what should be the length of our short URL.

  • Because we should not run out of the short URLs.

  • Another option is - we build a system agnostic of the length and we start from somewhere

  • and it keeps on incrementing.

  • But let's begin with a fixed length.

  • The very first question you need to ask your interviewer is - What is the traffic you are looking at?

  • How many unique URLs will come to you for shortening?

  • Probably per day, or per month - some number.

  • And for what duration do you need to support that?

  • Let's just assume that -

  • Once a URL is converted into a shorter URL, you need to save it for at least the next ten years.

  • Sounds a bit too much, but let's just take it as a number, okay.

  • With that in mind, let's look at how we can come up with the potential length.

  • So, let's just say that your interviewer tells you that there are 'X'

  • number of requests that come to you every second.

  • So, (X * 60) is the number of requests that come to you in a minute. These are basically the number of short

  • URLs that you have to generate.

  • Multiplying by 60 again, (X * 60 * 60) is the number of requests that come to you in an hour.

  • Multiplying by 24 gives you the number of requests in a day.

  • Multiplying by 365 gives you the number of requests in a year, and multiplying by 10 gives the total for ten years.

  • Whatever this number would be, you should be able to build a system which can handle

  • this number of unique requests.
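As a quick illustration of this back-of-the-envelope math, here is a minimal Python sketch, assuming a hypothetical X of 1,000 new URLs per second (plug in whatever number your interviewer gives you); the total it computes is the Y the video refers to next:

```python
# Back-of-the-envelope estimate of the total number of unique short URLs to support.
# X = 1,000 new URLs/second is an assumed, illustrative number.
X = 1_000                          # new URLs per second (assumption)
per_minute = X * 60
per_hour   = per_minute * 60
per_day    = per_hour * 24
per_year   = per_day * 365
Y          = per_year * 10         # keep every URL for 10 years

print(f"URLs per day: {per_day:,}")   # 86,400,000
print(f"Y (10 years): {Y:,}")         # 315,360,000,000 (~315 billion)
```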

  • Let's just say this number is 'Y'. Now, the next question to ask is -

  • Which characters can you include in the short URL? Should it be just numbers? Could it be letters?

  • Could it be uppercase and lowercase letters? What all is allowed?

  • Generally, people take [A to Z], [a to z] and numbers.

  • So maybe we can stick to that, but you could ask your interviewer and then come up with

  • a character set that is allowed for short URLs.

  • For now, let us assume that we'll take [a to z], [A to Z] and numbers [0 to 9].

  • These are 62 characters which you can use in the short URL.

  • Now, you need to come up with a length which can handle

  • this number of URLs, i.e. this number of unique combinations of these 62 characters.

  • Let's just say if the length was 1, then you can support 62 URLs.

  • Because there are 62 possibilities for that single character.

  • If the length of the short URL is 2, then you can support (62 ^ 2), which is 3,844.

  • Basically, you need to come up with a number where (62 ^ n) is greater than this number Y.

  • You solve this equation and you get the answer for n: n would be log62(Y),

  • the ceiling of it.

  • Let's say, if this number comes out to be 4.5, then you take n = 5, so you will create short URLs of length 5.

  • As a general number, (62 ^ 6) is somewhere around 57 billion

  • and (62 ^ 7) is somewhere around 3.5 trillion. Both of these are very massive numbers.

  • Depending upon what is your use case, you can come up with some length.

  • Let's just say, for our use case, we will stick to 7 characters.

  • Generally, I have seen it's standard for these short URLs to use 7 characters, maybe more at times.

  • But we will stick to 7 characters and we will be able to support 3.5 trillion unique URLs.

  • That I am assuming is more than this number (Y).

  • If it so happens that this Y is even more, then you can probably take 8 characters,

  • 9 characters or whatever.
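To make the length calculation concrete, here is a minimal sketch that derives n as the ceiling of log62(Y), using the hypothetical Y from the earlier traffic estimate:

```python
import math

# How long must the short code be so that 62^n unique codes cover Y URLs?
ALPHABET_SIZE = 62                 # [a-z] + [A-Z] + [0-9]
Y = 315_360_000_000                # hypothetical total from the earlier estimate

n = math.ceil(math.log(Y, ALPHABET_SIZE))
print(n)                           # 7

print(f"{ALPHABET_SIZE ** 6:,}")   # 56,800,235,584      (~57 billion)
print(f"{ALPHABET_SIZE ** 7:,}")   # 3,521,614,606,208   (~3.5 trillion)
```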

  • Now, let's look at one of the potential architectures that can solve this problem.

  • This is not a final version. We will iterate over it.

  • So, let's just say, there is this UI, which takes the long URL as input and gives back the short URL.

  • So, it will call the Short URL Service. There'll be multiple instances of this service, which somehow

  • generate a short URL, store it in the database and return the short URL to the customer.

  • On the other side, this could also be used when somebody hits a short URL wherein the Short URL Service

  • would fetch the long URL from the database and then redirect the user to the longer URL.

  • This looks nice.

  • But then we have not covered one important point - how does this service really generate a short URL?

  • Even though we have those 62 characters, for all further discussion we'll just assume

  • we'll just use numbers, so that it's easy to calculate and reason about.

  • Let's just say, there are multiple instances of this service and all of them are getting some requests.

  • One instance could generate the number 111 and another instance could also generate the number 111,

  • and then we will end up with a collision.

  • If two service instances generate the same short URL for two different requests, then we have a problem.

  • In technical terms it's called a collision but for us it's a problem

  • because we cannot have one short URL pointing to two long URLs.

  • You could say that a possible solution is that we first check in the database and then, you know,

  • kind of retry. That would be fine, but the problem is that's not really efficient.

  • What we need is a predictable way to generate a short URL knowing that there would be no collisions at all.

  • One very simple way to do that is using one of the features of Redis - its atomic increment.

  • So let's just say, we use a Redis cluster and all of these services request a number from Redis.

  • Redis makes sure that it will always return a unique number. So basically what it will do is -

  • it will start the counting from 1 all the way up to billions, trillions, whatever.

  • And each time it gets a request, it increments the counter and responds back.

  • This makes sure that all of these instances are getting unique numbers, and from that

  • unique number in base 10, they can convert to base 62 and generate the short URL.
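As a rough sketch of this approach (not the video's actual code), the snippet below uses Redis's atomic INCR to hand out unique numbers and converts them to base 62; the key name `url_counter` and the local Redis address are assumptions. Because INCR is atomic on the Redis server, two service instances can never receive the same number.

```python
import redis  # pip install redis

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def to_base62(num: int) -> str:
    """Convert a base-10 counter value into a base-62 short code."""
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, rem = divmod(num, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

# Hypothetical local Redis; INCR atomically increments and returns the counter.
r = redis.Redis(host="localhost", port=6379)

def shorten(long_url: str) -> str:
    unique_id = r.incr("url_counter")   # 1, 2, 3, ... unique across all callers
    short_code = to_base62(unique_id)
    # store(short_code, long_url)       # persist the mapping (not shown here)
    return short_code
```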

  • Now that would work fine but then we have a problem.

  • The problem is that first of all, everybody now is connecting to Redis.

  • So, we have Redis under a huge amount of load.

  • Second problem is - remember one of the key things in any system design interview:

  • you should never create a single point of failure.

  • What is happening is - for all the requests that have to generate a short URL, this Redis becomes a single point of failure

  • and there is no backup to it. If this goes down then there is no way we can recover, which is a problem.

  • You can argue that we can have multiple Redis instances, but that's also tricky.

  • Moreover, if the scale goes beyond the capacity of one Redis machine, then we are in a much bigger problem.

  • Why? Because if one Redis is not able to meet the latency requirements that we wanted,

  • this will again start choking the whole system.

  • Another alternative is that we keep multiple Redis instances. That will give us high availability and it will give us

  • better performance, because now there are multiple machines and some of the service

  • machines will talk to one Redis while, let's say, the other machines talk to the second Redis.

  • This would work just fine as long as these Redis instances don't start generating duplicate numbers.

  • If both Redis instances start from the same index, again they'll start generating duplicates.

  • What we need to do is - somehow make sure that these two Redis instances do not generate the same number.

  • One very simple way is to give one series to the first Redis and another series to the second Redis to start with.

  • That would work fine. But what if you have to add a third Redis?

  • Now it becomes complicated. Now you need somebody to manage what series are being given to whom.

  • Plus once you have that management piece, we might as well try to get away from Redis.

  • Let's try to look at a slightly more advanced system which can generate unique URLs without using a Redis,

  • without having a single point of failure.

  • Let's look at one of the possible solutions which can solve this problem.

  • Remember, what we need. Whenever we get a long URL and we need to convert it to a short URL,

  • we basically need to make sure that each service machine is able to generate a unique number,

  • which will never get repeated by any other instance of the service.

  • So, each of these service instances should generate a number which the other ones will never generate.

  • Convert that number from base 10 to base 62 and return back. This is the flow that we need.

  • What we've done is - again going over the UI,

  • so this green thing is basically the Long to Short UI,

  • from where somebody can give a long URL and expect to get a short URL as an output.

  • It talks to a load balancer and behind it are multiple instances of Short URL Service.

  • What I am trying to now do is - I have introduced a Token Service.

  • The idea is that all of these instances need to get numbers,

  • while making sure that none of the other machines generate the same number.

  • One of the simple ways to do that is - assign ranges to each of the machines.

  • I can say that - this Short URL Service will request a token range from Token Service.

  • Each time Token Service gets a request, it runs on a single-threaded model

  • and gives a range to the requesting service instance.

  • Let's say, it says that, I am giving a range (1 - 1,000) to this machine.

  • Now, this machine on startup will also call Token Service.

  • So, all of these services call Token Service either when they start up

  • or when they are running out of the range.

  • Now, let's say, this service got a range from (1001 - 2000).

  • Similarly, other instances of the same service would also get some range.

  • This Token Service can be built on top of a MySQL, because it runs at a very low scale.

  • It will get a call once in a couple of hours and it will simply return the ranges.

  • I have taken these numbers as (1 - 1,000). Realistically, it will be a much larger range

  • so that at least it can run for an hour or so before asking again.

  • So, now Token Service has given numbers to each of these services.

  • Now, the flow is very simple.

  • Let's say, a request reaches this service. It takes the first number 1001, converts that into base 62 and

  • returns the output. Obviously after saving it.

  • The next request comes, it increments it. It takes 1002, saves it in Cassandra, returns back.

  • Next request comes in, it uses 1003.

  • Let's say, it somehow gets lots of requests and gets to 2000. At that point in time, or maybe a bit before that,

  • when we are nearing the end of the range, it will query Token Service and get the next range.

  • Let's say the next range that this service got is (5001 - 6000).

  • So, now it will continue processing this one.

  • Now, when (5001 - 6000) is assigned to this service, and another service makes a request,

  • it can possibly get (6001 - 7000), but it will never get this (5001 - 6000) range. Why?

  • Because that thing is maintained by Token Service and the way it would probably work is -

  • it will keep a range as a record and keep a flag, whether it is assigned or not.

  • In a simple MySQL, on a transactional basis, you lock a record. When you get a request, you take the

  • first unassigned token range, mark it as assigned, and return it.

  • And because it sits on top of a MySQL database, transactions make sure the ranges handed out are always unique.
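Here is a minimal sketch of how the Token Service allocation and the service-side counter could work, assuming a hypothetical `token_ranges` MySQL table and the `mysql-connector-python` client; the video doesn't prescribe an exact schema, so the details are illustrative:

```python
import mysql.connector  # pip install mysql-connector-python

# Assumed schema for this sketch:
#   CREATE TABLE token_ranges (
#     range_start BIGINT PRIMARY KEY,
#     range_end   BIGINT NOT NULL,
#     assigned    BOOLEAN NOT NULL DEFAULT FALSE
#   );

def allocate_range(conn):
    """Hand out the first unassigned range and mark it as taken, atomically."""
    cur = conn.cursor()
    conn.start_transaction()
    # FOR UPDATE locks the row, so two concurrent callers cannot grab the same range.
    cur.execute(
        "SELECT range_start, range_end FROM token_ranges "
        "WHERE assigned = FALSE ORDER BY range_start LIMIT 1 FOR UPDATE"
    )
    start, end = cur.fetchone()
    cur.execute("UPDATE token_ranges SET assigned = TRUE WHERE range_start = %s", (start,))
    conn.commit()
    return start, end

class TokenCounter:
    """Service-side counter: consumes its range in memory, re-fetches when exhausted.
    Unused tokens are simply lost if the process dies, as discussed below."""

    def __init__(self, conn):
        self.conn = conn
        self.current, self.end = allocate_range(conn)

    def next_id(self) -> int:
        if self.current > self.end:                      # range exhausted
            self.current, self.end = allocate_range(self.conn)
        value, self.current = self.current, self.current + 1
        return value
```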

  • So, this becomes kind of our write flow (the long URL to short URL path).

  • Now, let's say, there is a massive amount of traffic.

  • All we need to do, at most, is keep multiple instances of Token Service.

  • So, there are multiple ways to scale it. First of all, we can increase the length of the range

  • so that it doesn't get bombarded too often. So, instead of a thousand tokens at a time, we can allocate

  • millions of tokens at once so it doesn't get too many requests.

  • And anyway, this service will be distributed across multiple geographies and multiple data centers

  • so that this doesn't become a single point of failure.

  • There are some problems with this.

  • Possible problems are - What if this service got this (5001 - 6000) range, started working for a while,

  • used a couple of tokens and then it kind of got shut down and died?

  • Let's say, there was an Out Of Memory error and this process got killed. What happens to those tokens?

  • So, those tokens go away. Go away as in, there is no record of which of those tokens have been used.

  • All that is happening is - the service is iterating over that range in memory and assigning one token to each request.

  • If this service dies, it will probably get started up again on the same machine

  • and the next time it will get another token range.

  • Let's say, third time, it gets a token range of (9001 - 10000).

  • Now, all the unused tokens of this (5001 - 6000) range, there is no record of them.

  • So, they are kind of gone forever.

  • If you look at it, even if this happens a couple of times in a day, overall how many tokens do you lose?

  • With a length of 7 for the short URL, we are able to support 3.5 trillion unique numbers. That is a very massive number.

  • If you lose a couple of thousand tokens out of this range, it is like taking a bucket out of an ocean.

  • It doesn't really matter. So, even if you lose out some tokens, it's okay!!

  • But, if you start tracking those tokens, then it will become a fairly complicated system.

  • Just to make it easy for us to develop, we will say that - okay! when the system shuts down,

  • all those tokens go away. We don't need them. We will get a new range of tokens and work out of that.

  • That is one thing that we could do. This is basically the long URL to short URL path.

  • On the other side, when a short URL is hit, all you do is - you get a request with the short URL, you hit your database,

  • you get the longer URL and send it back to this service. This service does a redirect

  • and the user gets redirected to the original URL.
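A minimal sketch of this redirect path, assuming a hypothetical Cassandra table `urls(short_code text PRIMARY KEY, long_url text)` and a Flask handler; the real service would sit behind the load balancer from the diagram:

```python
from cassandra.cluster import Cluster      # pip install cassandra-driver
from flask import Flask, redirect, abort   # pip install flask

app = Flask(__name__)

# Hypothetical keyspace/table: urls(short_code text PRIMARY KEY, long_url text)
session = Cluster(["127.0.0.1"]).connect("url_shortener")
lookup = session.prepare("SELECT long_url FROM urls WHERE short_code = ?")

@app.route("/<short_code>")
def resolve(short_code: str):
    row = session.execute(lookup, [short_code]).one()
    if row is None:
        abort(404)
    # 302 sends the user's browser on to the original long URL.
    return redirect(row.long_url, code=302)
```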

  • One thing I have not answered is - Why did I use a Cassandra?

  • Ideally, I could have used any database here. All we need is a database which can handle

  • this many unique URLs.

  • A MySQL, at this number of records, would start giving some problems.

  • We can shard it probably and make it work.

  • Cassandra will, for sure, work. So, that's the reason I've kept Cassandra.

  • You can try using a MySQL here. With enough sharding and all of that, that would possibly also work fine.

  • That's not that big of a problem.

  • Plus this Token service is also using a MySQL. So, that could be shared across both the services as well.

  • Whatever we have built till now is good enough from a functional and non-functional standpoint

  • and it does what we were supposed to build.

  • But is it good enough? Definitely not!!

  • Why? Because this system, as we have built it, does not give us any metrics about how it is being used -

  • which URLs are the most used, what kind of geographies people come from -

  • none of the kinds of metrics that we would want to have out of this system.

  • For example, whenever a person is generating a short URL, wouldn't it be nice if we could tell them

  • what kind of geography their users come from, or what kind of hits they are getting,

  • or what kind of user agents or devices people are connecting from?

  • All of those things would be valuable information for any person who's generating a short URL.

  • Similarly for us, let's say, if we have multiple data centers.

  • Let's say, for example, we have 4 data centers and we want to choose any 2 of them as primary

  • and 2 of them as standby data centers - and they are spread across the globe.

  • If you have to decide which ones should be primary and which ones should be secondary,

  • normally, what companies do is - they just make a decision based on somebody's gut feeling.

  • But we could make a data-driven decision over here, depending upon what kind of traffic we are getting,

  • from what kind of geographies and wherever we are getting most amount of traffic from,

  • the data centers closer to those geographies could be made as primary data centers.

  • For doing all of these things, it will be good to have a decent amount of analytics that this system emits.

  • Let's look at how we can do that.

  • Let's see how the analytics is being built.

  • The normal flow, the way it works is - whenever we get a request with a short URL,

  • let's say, a client sends us a request. We query our database, get the longer URL from the database

  • based on the short URL and return it back to the client, which then redirects the user to the longer URL.

  • Instead of just simply returning the URL, what we'll essentially do is - each time the request comes to us,

  • that request will come with a lot of attributes. It will give us some origin header saying which platform

  • this request has come from. So, let's just say, we posted this link on a Facebook or a LinkedIn

  • kind of a platform, it will have that information. It will also have information about the user agent

  • which is calling - let's say, whether it's an Android application, an iOS application or a browser.

  • It will have that kind of information as well, and it will also have a source IP address.

  • Before sending out the response, we will also put all of this information into a Kafka,

  • which will be used to then power the Analytics.

  • Based on the IP address, we'll also be able to come up with what country it is.

  • So, all of those things can be taken care of on the Analytics side.

  • But if we start writing to Kafka synchronously on each request we get,

  • it will impact our non-functional requirement of latency.

  • Why? Because, now we are doing an additional step in the process and that will increase the latency.

  • So, I would not recommend doing that.

  • What we could instead do, as a fairly nice solution, is make that Kafka request a parallel call.

  • So, you can have a different thread in which you send the Kafka write, and return back to the user.

  • Asynchronously, the event gets written to Kafka.
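A minimal sketch of this fire-and-forget write, using `kafka-python`'s producer, which buffers records and sends them from a background I/O thread so the request thread is not blocked; the topic name, broker address and event fields are assumptions for illustration:

```python
import json
from kafka import KafkaProducer   # pip install kafka-python

# KafkaProducer.send() is asynchronous: it enqueues the record and a background
# thread ships it to the broker. Broker address and topic name are assumptions.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)

def publish_click_event(short_code, origin, user_agent, source_ip):
    event = {
        "short_code": short_code,
        "origin": origin,          # e.g. Facebook / LinkedIn referrer
        "user_agent": user_agent,  # Android app, iOS app, browser, ...
        "source_ip": source_ip,    # resolved to a country on the analytics side
    }
    # Fire-and-forget: if the write fails, the event is simply lost,
    # which we have decided is acceptable for analytics.
    producer.send("url-click-events", value=event)
```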

  • What is the problem in doing an asynchronous thing?

  • There is a potential possibility that the Kafka write could fail, for whatever reason, after

  • you have returned back to the user, so you'll miss certain analytics events.

  • Now, in this kind of a system, because payment and all is not involved, it's just some simple analytics,

  • we should be okay if we are losing out on certain events. So, I think it's okay if we build it this way also.

  • Worst case, we'll just lose out on some analytics, which is fine.

  • But, can we improve it even further?

  • So, the problem with this approach is - each time we write to Kafka, it involves some I/O (input/output).

  • There's CPU involved, there's I/O involved, there's a network transfer involved.

  • Can we avoid doing all of this on each call?

  • Probably! So, what we could do is - instead of writing into Kafka on each request,

  • we could maybe aggregate that information locally in a machine.

  • So, maybe you can have a queue or some kind of a data structure in which

  • you record that we got a request for this short URL with count 1.

  • The request comes in again, you increment the count to 2.

  • You could implement something of that sort, or you could simply keep a queue of all the requests.

  • Whenever the size of the data structure crosses some threshold,

  • or you could do it on a time basis also, saying every 10 seconds you'll flush that.

  • So, on whatever condition, you flush the data from that data structure into Kafka in one call.

  • Basically you are reducing the I/O that you are doing with each request and doing it as a batch write.
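Here is one way such a local aggregator could look - a sketch, not the video's implementation - that flushes counts to Kafka either when a threshold is crossed or every few seconds; the class name, topic and thresholds are assumptions:

```python
import threading
import time
from collections import Counter

class ClickAggregator:
    """Counts hits per short code in memory and flushes them to Kafka in one
    batch, either every `interval` seconds or once `threshold` events pile up.
    If the process dies, the not-yet-flushed counts are lost - acceptable here."""

    def __init__(self, producer, topic="url-click-counts", threshold=1000, interval=10):
        self.producer = producer      # e.g. the KafkaProducer from the earlier sketch
        self.topic = topic            # hypothetical topic name
        self.threshold = threshold
        self.interval = interval
        self.counts = Counter()
        self.lock = threading.Lock()
        threading.Thread(target=self._flush_periodically, daemon=True).start()

    def record(self, short_code: str) -> None:
        with self.lock:
            self.counts[short_code] += 1
            if sum(self.counts.values()) >= self.threshold:
                self._flush()

    def _flush(self) -> None:
        # Caller must hold self.lock.
        if not self.counts:
            return
        batch, self.counts = dict(self.counts), Counter()
        # One Kafka write for the whole batch instead of one per request.
        self.producer.send(self.topic, value=batch)

    def _flush_periodically(self) -> None:
        while True:
            time.sleep(self.interval)
            with self.lock:
                self._flush()
```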

  • The benefit you get is - you can now drive much more performance out of that single machine,

  • thereby helping us in the non-functional requirements of high availability, low latency, all of that.

  • And secondly, it will improve the overall performance of the system.

  • Again, the drawback here is - now you can lose not just one event but a lot of events.

  • Based on your conversation with your interviewer, you can decide on - if that's okay or not.

  • Personally, I would say it's a fairly okay thing if you lose out on 10, 20, 30 events. That's perfectly fine.

  • Now, we have put all the events into Kafka. Now what happens?

  • So, there are a bunch of ways you can implement Analytics.

  • One very simple approach is - you can dump all of this data into a Hadoop cluster and then build some

  • Hive queries, or some other kind of queries on top of it, which will generate aggregate results for you.

  • Alternatively, you could also have a Spark Streaming job running, which comes up every 10 minutes, let's say,

  • takes all the data from the last 10 minutes and does some aggregate analysis on top of it -

  • which URL was hit, how many times, from which geographies people came, how many times -

  • something of that sort, and dumps the aggregate information into a data store,

  • which can then be used to power the various kinds of analytics that the user can see.
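As a sketch of the Spark Streaming alternative (using PySpark Structured Streaming, with assumed topic and field names), a windowed aggregation might look like this; a real job would write to a serving store such as Cassandra instead of the console:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Requires the spark-sql-kafka connector package on the classpath.
spark = SparkSession.builder.appName("short-url-analytics").getOrCreate()

# Click events are assumed to be JSON with short_code and country fields.
clicks = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "url-click-events")
    .load()
    .select(
        F.get_json_object(F.col("value").cast("string"), "$.short_code").alias("short_code"),
        F.get_json_object(F.col("value").cast("string"), "$.country").alias("country"),
        F.col("timestamp"),
    )
)

# Count hits per short URL and country over 10-minute windows.
aggregates = clicks.groupBy(
    F.window("timestamp", "10 minutes"), "short_code", "country"
).count()

# Console output keeps the sketch self-contained; swap in a real sink in production.
query = aggregates.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```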

  • So, you could implement it in either way you want, based on the conversation with your interviewer.

  • So yeah!! I think this should be it for a URL shortening system.
