Latency vs Throughput

In today's blog we are going to discuss the difference between latency and throughput. These are two important metrics when analysing the performance of our systems.

So let's just get right into it!

Latency

In more formal terms, latency is the amount of time it takes data to travel from one part of your system to another. When we think about latency, we usually picture a client-server architecture, where latency is the amount of time it takes for a client to get a response from the server.

But there are other forms of latency as well, such as the time it takes to read a piece of data from memory or from disk.
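To make this concrete, here is a minimal sketch (in Python) of measuring the latency of a single request from the client's side. The URL is just a placeholder; the point is simply to time how long one round trip takes.

```python
import time
import urllib.request

def fetch(url):
    # One round trip to the server: this is the latency the client experiences.
    with urllib.request.urlopen(url) as response:
        return response.read()

start = time.perf_counter()
fetch("https://example.com")  # placeholder URL
elapsed = time.perf_counter() - start
print(f"Request latency: {elapsed * 1000:.1f} ms")
```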

In an ideal scenario, you want to minimize the latency of your system, because it makes a big difference on the client side. The less time a client has to wait for a response, the happier they are.

There might be some tuning needed in order to lower our latency. For instance, using high-speed networks, making use of mechanisms such as caching or reading from memory (instead of disk), or having geographically distributed systems (for example, a game with one server in Europe and another in America, so each player connects to the closer one).
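As a rough illustration of the caching point, here's a small sketch using Python's functools.lru_cache. The load_profile function and its 0.1 s sleep are just stand-ins for a slow disk or database read.

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def load_profile(user_id):
    # Pretend this is a slow read from disk or a database.
    time.sleep(0.1)
    return {"id": user_id, "name": f"user-{user_id}"}

for attempt in range(2):
    start = time.perf_counter()
    load_profile(42)
    print(f"Attempt {attempt + 1}: {(time.perf_counter() - start) * 1000:.1f} ms")
# The second attempt is served from memory, so its latency is far lower.
```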

Throughput

Throughput is the amount of work a machine can do in a given period of time. An example of this would be the number of requests a server can handle per second.
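To put a number on "requests per second", here's a tiny sketch that measures the throughput of a handler function. handle_request is a placeholder for whatever work your server actually does per request.

```python
import time

def handle_request():
    # Stand-in for the real work a server does for one request.
    sum(range(10_000))

n_requests = 1_000
start = time.perf_counter()
for _ in range(n_requests):
    handle_request()
elapsed = time.perf_counter() - start
print(f"Throughput: {n_requests / elapsed:.0f} requests/second")
```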

In an ideal scenario, we want our system to serve as many requests as possible, meaning high throughput. If you run a business, the more clients you can serve, the more revenue you will make.

Things you can do to increase your system's throughput include having one super powerful server or having multiple servers handle the requests. There is quite a debate on whether to use one powerful server or several, but if your system serves a lot of traffic, chances are you're better off with multiple decent servers.
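To give a feel for how adding servers can raise throughput, here's a sketch that simulates "multiple servers" with a thread pool. Each worker thread stands in for one server, and the 0.05 s sleep stands in for an I/O-bound request; the numbers are made up, and this trick only helps when requests spend most of their time waiting rather than using the CPU.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(_):
    # Stand-in for an I/O-bound request (e.g. waiting on a database).
    time.sleep(0.05)

def measure(workers, n_requests=100):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(handle_request, range(n_requests)))
    elapsed = time.perf_counter() - start
    return n_requests / elapsed

for workers in (1, 4, 16):
    # With more workers, more requests are in flight at once,
    # so the measured requests/second goes up.
    print(f"{workers:>2} worker(s): {measure(workers):.0f} requests/second")
```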

Notes

There is one important aspect we must take into consideration. We can't conclude anything about latency or throughput just by looking at the other one. Let me give you a couple of examples to clarify this.

First scenario: does low latency mean high throughput?

  • The answer is no. For example, our server may be a really slow one (low throughput), but it may not be serving any requests at the moment. This means our request will be served immediately by the server, resulting in low latency.

Second scenario: does low throughput mean high latency?

  • Again, the answer is no. This example is tied to the one above. Just because our server is slow (low throughput) doesn't mean we will necessarily have high latency. Our server might only handle a couple of requests at a time, yet serve each of them really fast. The sketch below puts some rough numbers on both scenarios.
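Here's a tiny back-of-the-envelope sketch, assuming a made-up server that needs 0.5 s per request (so its throughput tops out at 2 requests/second):

```python
service_time = 0.5                   # seconds the slow server needs per request
max_throughput = 1 / service_time    # only 2 requests/second

# Scenario 1: the slow server is idle, so a lone request is served right away.
idle_latency = service_time
print(f"Latency on an idle server: {idle_latency:.1f} s, despite low throughput")

# Latency only blows up once requests start queueing behind each other.
queued_requests = 10
worst_case_latency = queued_requests * service_time
print(f"Latency at the back of a {queued_requests}-request queue: {worst_case_latency:.1f} s")
```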

😁 I hope this has helped!

That's everything for now! Thank you so much for reading.

If you have any questions, please feel free to drop them below. Follow me if you want to read more about these kinds of topics!

Social Media Links: LinkedIn Twitter Github