99. Parallel processing functions and loops with dask ‘delayed’ method

For a full SciPy conference video on dask see: SciPy 2018

Dask is a Python library that allows parts of program to run in parallel in separate cpu threads to speed up the program.

Here we will look at using dask to run a normal function in parallel when we need to call the function more than once in one part of a program. We will mimic a slow function by using the Python sleep() method to make the function take on second each time it is run. Normally it would take 3 seconds to run this function 3 times, but here we will see that with dask all three calls to the function will be complete in one second (assuming you have at least a dual core, 4-thread cpu).

We will first import our required libraries.

from time import time # to time the program
from time import sleep # to mimic a slow function
from dask import delayed # to allow parallel computation

Next we define a normal function (there is no use of dask at this this point). Here we will write a function that returns the square of the number passed to the function, but we’ll add a 1 second sleep in it to mimic a longer running function.

# Define a function normally
def my_function(x):
    # mimic a slow function with sleep for 1 seconds
    return x*2

Now we will call that function three times.

Normally this would take three seconds as each function must complete before the next one can start. But by using the decorator ‘delayed’ we mark this as a function call that may be run in parallel with others.

Note the syntax amendment. We would normally call this function with my_function(x), but we amend the syntax to delayed(my_function)(x).

We then calculate the sum of the three returned numbers from our function. But when using dask this does not actually give us our answer. If we print the type of this object we see that it is a ‘delayed’ object. To get the actual result we must then use the .compute() method as shown below.

Then we see how long these three 1 second function calls take. If you have a processor with at least 2 CPUS and 4 threads you should see it takes close to one second rather than three!

# Record time at start of run
start = time()

# Run function in parallel when calling three times
# Syntax of my_function(x) is replaced with delayed(my_function)(x)
a = delayed(my_function)(1)
b = delayed(my_function)(2)
c = delayed(my_function)(3)

# Total will sum results. But at this point we generate a 'delayed' object   
total = a + b + c

# Show object type of total
print ('Object type of total', type(total))

# To get the result we must use 'compute':
final_result = total.compute()

# Calculate time taken and print results
time_taken = time()-start
print ('Process took %0.2f seconds' %time_taken)
print('Final result ',final_result)
Object type of total <class 'dask.delayed.Delayed'>
Process took 1.01 seconds
Final result  12

Using dask ‘delayed’ in a loop

We can also use dask delayed to parallel process data in a loop (so long as an iteration of the loop does not depend on previous results). Here we will call our function 10 times in a loop. Note the use of .compute again to get the actual result. This would take 10 seconds without dask. On a 4-cour/8-thread CPU it takes two seconds!

start = time()

# Example loop will add results to a list and calculate total
results = []
for i in range(10):
    # Call normal function with dask delayed decorator
    x = delayed(my_function)(i)

total = sum(results)
final_result = total.compute()

# Calculate time taken and print results
time_taken = time()-start
print ('Process took %0.2f seconds' %time_taken)
print('Final result ',final_result)
Process took 2.01 seconds
Final result  90


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s