When to Use Ruby Threads?

August 14, 2017

software development ruby threads mri global interpreter lock gil concurrency

When should you use Ruby threads? This question came up recently at work, so I ran a small experiment to understand Ruby threads better. In this experiment, I found Ruby threads to be useful for remote operations, such as making multiple HTTP requests, and not useful for computationally heavy local operations.

In MRI, Ruby threads are governed by the Global Interpreter Lock (GIL). This means that only one thread can execute Ruby code at any given time. To illustrate this point, I have created a demo script which executes 2 types of operations:

HTTP requests to webservers which take time to respond
Local executions which perform a counting operation

In both cases, there are 2 similar tasks, and they are executed both with and without threads. We will then compare the total time taken to complete the tasks.

In the HTTP requests test, I have set up 2 instances of a web server, each of which simply delays its response by 5 seconds. In the first scenario, the requests to each web server are made sequentially, without using threads. In the second scenario, each request is made in its own thread, so the requests can overlap.

def api_sequential
  p "REMOTE CALL WITHOUT THREADS"
  measure do
    URLS
      .map { |url| Faraday.get url }
      .map(&:status)
  end
end

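def api_threaded
  p "REMOTE CALL WITH THREADS"
  measure do
    URLS
      # Start one thread per URL; all requests are in flight at once.
      .map { |url| Thread.new { Faraday.get(url) } }
      # Thread#value waits for the thread to finish and returns its result.
      .map(&:value)
      .map(&:status)
  end
end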

I found that with threads, the entire operation took almost half the time, which means that the 2 requests were performed almost in parallel.

"REMOTE CALL WITHOUT THREADS"
200
200
Total time: 10.064001

"REMOTE CALL WITH THREADS"
200
200
Total time: 5.009129
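
Both functions above rely on a `measure` helper and a `URLS` constant that are not shown in this post. `URLS` is presumably an array holding the addresses of the 2 web servers. As for `measure`, here is a minimal sketch of what such a helper might look like, assuming it prints each result followed by the total time, as in the output above. This is my guess at its shape, not necessarily the code in the repository.

```ruby
# Hypothetical helper: the real `measure` used in the experiment is not shown.
# It times the given block, prints each result and the elapsed time,
# and returns the elapsed time in seconds.
def measure
  # A monotonic clock is not affected by system clock adjustments.
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  Array(result).each { |item| puts item }
  puts format("Total time: %.6f", elapsed)
  elapsed
end
```

It could be called as `measure { [200, 200] }`, which prints each status on its own line followed by the total time.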

For the local execution tests, I have set up a function which simply increments a counter up to 100,000,000. Similar to the previous test, I executed 2 of these operations, first sequentially and then using threads.

def op_sequential
  p "CODE EXECUTION WITHOUT THREADS"
  measure do
    2.times.map do
      start_counting
    end
  end
end

def op_threaded
  p "CODE EXECUTION WITH THREADS"
  measure do
    2.times.map do
      Thread.new { start_counting }
    end.map(&:value)
  end
end

In this case, there was no significant improvement in the total operation time.

"CODE EXECUTION WITHOUT THREADS"
100000000
100000000
Total time: 9.074766

"CODE EXECUTION WITH THREADS"
100000000
100000000
Total time: 9.069736
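
The `start_counting` function is also not shown in this post. A minimal sketch, assuming it is a plain loop that increments a counter and returns the final count (the `target` parameter is my own addition, so that the function can be tried with a smaller number):

```ruby
# Hypothetical implementation: the original `start_counting` is not shown.
# Pure Ruby work like this never waits on I/O, so the thread running it
# needs the interpreter the whole time.
def start_counting(target = 100_000_000)
  counter = 0
  counter += 1 while counter < target
  counter
end
```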

The reason for this difference in behaviour between a remote call and a local execution goes back to the GIL. The GIL ensures that only one thread executes Ruby code at any given time. In the case of the HTTP requests, a thread that is waiting for a response releases the lock, so after the first thread has sent its HTTP request, the interpreter is available to execute the second thread and send the second HTTP request. The waiting periods overlap, which reduces the total time for the HTTP requests to complete. In the counting scenario, however, both threads need the interpreter for the whole duration of their work, so they take turns and the total time stays the same even with threads.
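
The waiting behaviour can be reproduced without any web server. The sketch below is not part of the original experiment: it uses `sleep` as a stand-in for a slow HTTP response, on the assumption that a sleeping thread, like one blocked on I/O, lets other threads run. The names `elapsed_seconds` and `fake_request` are made up for this illustration.

```ruby
# Returns the wall-clock time taken by the block, in seconds.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Stand-in for a request to a slow server (an assumption for this sketch):
# the thread only waits, it does no Ruby work during the delay.
def fake_request(delay)
  sleep(delay)
  200
end

delay = 0.2 # seconds, kept short so the demo runs quickly

# One wait after the other: roughly 2 * delay in total.
sequential = elapsed_seconds { 2.times.map { fake_request(delay) } }

# Both waits overlap: roughly 1 * delay in total.
threaded = elapsed_seconds do
  2.times.map { Thread.new { fake_request(delay) } }.map(&:value)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

Replacing the body of `fake_request` with a counting loop should bring the two timings close together again, as in the local execution test.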

The code for this experiment can be found in this repository.

For more info on Ruby threads:

https://buildingvts.com/threading-in-mri-ruby-for-fun-and-performance-34a0e1bc6c70
