Experimenting with distributed queries to servers in Golang

One day, I came accross the Go Concurrency Patterns talk made by Rob Pike (one of the creators of Golang) and I found it fascinating. After this talk, I wanted to explore a bit more the concept of the Google Search code given at the end of the talk.

The goal is to find a behaviour that could be used by a search engine to handle a search query. We have got 3 services (web, images and videos – no ads ahah!) and we want to perform a search on each service according to the query. The goal is to respond as fast as possible.

Architecture

We have got multiple instances of each service. We are going to send the search query in parallel to available instances of web servers, images servers and videos servers. For each server we will take the first returned search result, to meet our goal to respond as fast as possible.

Hyperparameters

We will assume that each server answers a query in a time that follows a normal distribution (the mean is explicit given and is referred to as latency, the standard derivation is inferred from the latency). A search has also a timeout which represents the number of milliseconds we are willing to wait to have search results before exiting (it is possible that search results from all the services have not yet arrived). This is referred to as the timeout parameter.

Finally, we can control how many instances of each service we have available. This is referred to as the replicas parameter.

Execution samples

To test how the variation of the different parameters influence the number of results and when they are returned, you can find below some executions and their results:

# High latency but large number of replicas
./fake-google -timeout 20 -replicas 200 -latency 28
[
  {Search result: `res for: test query` from: `image95` in 18.695281ms}
  {Search result: `res for: test query` from: `web129` in 17.11128ms}
  {Search result: `res for: test query` from: `video13` in 19.058285ms}
]

# High latency but normal number of replicas
./fake-google -timeout 20 -replicas 100 -latency 28
[
  {Search result: `res for: test query` from: `web90` in 19.499019ms}
]

# High latency, very low number of replicas
./fake-google -timeout 20 -replicas 10 -latency 25
[]

# Latency is the same as the timeout and we've got enough replicas
./fake-google -timeout 20 -replicas 100 -latency 20
[
  {Search result: `res for: test query` from: `web90` in 12.735776ms}
  {Search result: `res for: test query` from: `image63` in 12.727817ms}
  {Search result: `res for: test query` from: `video26` in 13.02499ms}
]

Nothing unexpected in these results, this can all be verified by computing probabilities on multiple independent normal laws.

Ameliorations

The existing code is super simple and is definitely not ready for a real life scenario. We could for instance, improve the following points:

I assume that all replicas are always available. The notion of an available replica is hard to define. We don’t want to send requests to replicas that are not healthy, down or are already overwhelmed
I assume that the number of replicas is the same for each service
I assume that the response time of every replica follows a normal law, and is query independent

And countless other things I didn’t think of in a 2-minute window.

Code

Putting aside all the ameliorations I just listed, I find the existing code still interesting because it shows how to use advanced concurrency patterns in Go. The code is available on GitHub, and the main logic resides in the file core/core.go.