128. Parallel processing in Python

Sometimes we have functions, or complete models, that may be run in parallel across CPU cores. This may save significant time when we have access to computers with multiple cores. Here we use the joblib library to show how we can run functions in parallel (pip install joblib if not already installed).

Import libraries

import numpy as np
import time
from joblib import Parallel, delayed

Mimic something to be run in parallel

Here we will create a function that takes 1 second to run, to mimic a model or a complex function.

def my_slow_function(x):
    """A very slow function, which takes 1 second to double a number"""
    
    # A 1 second delay
    time.sleep(1)
    
    return x * 2

Running our function sequentially in a ‘for’ loop

# Get start time
start = time.time()
# Run functions 8 times with different input (using a list comprehension)
trial_output = [my_slow_function(i) for i in range(8)]
print(trial_output)
# Get time taken
time_taken = time.time() - start
# Print time taken
print(f'Time taken: {time_taken:.1f} seconds')

OUT:

[0, 2, 4, 6, 8, 10, 12, 14]
Time taken: 8.0 seconds

Running our function in parallel using joblib

n_jobs is the maximum number of jobs to run at the same time, which in practice sets how many CPU cores are used. If set to -1, all available cores will be used.

# Get start time
start = time.time()
# Run functions 8 times with different input using joblib
trial_output = \
    Parallel(n_jobs=-1)(delayed(my_slow_function)(i) for i in range(8))
print(trial_output)
# Get time taken
time_taken = time.time() - start
# Print time taken
print(f'Time taken: {time_taken:.1f} seconds')

OUT:

[0, 2, 4, 6, 8, 10, 12, 14]
Time taken: 1.3 seconds

That’s a good improvement in speed!
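
As a further sketch of the same pattern, n_jobs can also be set to a fixed number of workers, and delayed passes any positional or keyword arguments on to the function. The function scale_and_shift and its arguments are hypothetical names used only for this illustration; they are not part of joblib.

```python
from joblib import Parallel, delayed


def scale_and_shift(x, scale, shift=0):
    """Hypothetical example function: multiply x by scale, then add shift"""
    return x * scale + shift


# Use at most 2 workers rather than all available cores
results = Parallel(n_jobs=2)(
    delayed(scale_and_shift)(i, 10, shift=1) for i in range(5))
# Results come back in the same order as the inputs
print(results)
```

Limiting n_jobs in this way can be useful on a shared machine, where taking every core may slow down other work.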

Checking pseudo-random number generation

Pseudo-random number generators, if not provided with a seed, are seeded automatically (for example from the operating system or the computer clock). With some methods of parallel processing the worker processes can end up with the same generator state, and so return sets of random numbers that are the same. By default joblib uses the loky backend, which prevents this occurring, but let’s check.

def numpy_random():
    """Generate a random number using NumPy"""
    return np.random.random()

Parallel(n_jobs=-1)(delayed(numpy_random)() for i in range(5))

OUT:

[0.5268839074941227,
 0.12883536669358964,
 0.14084785209998263,
 0.4166795926896423,
 0.19189235808368665]
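
The five numbers differ, as hoped. If the numbers also need to be the same from run to run, one option (not part of the check above) is to give each task its own seed. This is a minimal sketch, assuming NumPy 1.17 or later for np.random.default_rng; the names seeded_random and run_trials, and the base seed of 42, are illustrative choices.

```python
import numpy as np
from joblib import Parallel, delayed


def seeded_random(task_id, base_seed):
    """Generate one random number from a generator seeded for this task"""
    # Combining the task number with a base seed gives each task
    # its own repeatable stream of random numbers
    rng = np.random.default_rng([task_id, base_seed])
    return rng.random()


def run_trials(base_seed, n_trials):
    """Run n_trials tasks in parallel, each with its own seed"""
    return Parallel(n_jobs=-1)(
        delayed(seeded_random)(i, base_seed) for i in range(n_trials))


first_run = run_trials(base_seed=42, n_trials=5)
second_run = run_trials(base_seed=42, n_trials=5)
# The two runs should give identical lists
print(first_run == second_run)
```

Because each generator is created inside the task from an explicit seed, the results do not depend on which worker runs which task.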

