Parallelism

You will notice the parallel commands by their characteristic par- naming. Each corresponds to a non-parallel version, so you can write code in a serial style first and then convert serial scripts into parallel scripts with a few extra characters.

The most common parallel command is par-each, a companion to the each command.

Let’s say you wanted to count the number of files in each sub-directory of the current directory. Using each, you could write this as:
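A minimal sketch of such a pipeline, assuming the record fields are called name and len (the len field name is an assumption; the text only mentions a name field):

```nu
# List the current directory, keep only the directories,
# and build one record per directory.
ls | where type == dir | each { |row|
    # name: the directory's name; len: how many entries it contains
    { name: $row.name, len: (ls $row.name | length) }
}
```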

We create a record for each entry, and fill it with the name of the directory and the count of entries in that sub-directory.

Now, since this operation can be run in parallel, let’s convert the above to parallel by changing each to par-each:
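A sketch of the parallel form, under the same assumption that the record fields are called name and len. Only the command name changes; the block passed to it stays the same:

```nu
# Same pipeline as the serial version, but each directory is
# processed in parallel by par-each.
ls | where type == dir | par-each { |row|
    { name: $row.name, len: (ls $row.name | length) }
}
```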

On this machine, it now runs in 6ms. That’s quite a difference!
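To take a measurement like this on your own machine, one option is the timeit command, which runs a block and reports how long it took. The sketch below assumes the same pipeline and the same hypothetical len field; the timings you see will depend on your hardware and on the contents of the directory:

```nu
# Time the serial version.
timeit { ls | where type == dir | each { |row|
    { name: $row.name, len: (ls $row.name | length) }
} }

# Time the parallel version.
timeit { ls | where type == dir | par-each { |row|
    { name: $row.name, len: (ls $row.name | length) }
} }
```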

If you look at the results, you’ll notice that they may come back in a different order on each run (depending on the number of hardware threads on your system). The values themselves are correct, but because each result is emitted as its task finishes, we may need to add additional steps if we want our results in a particular order. For example, for the above, we may want to sort the results by the “name” field. This allows both the each and par-each versions of our script to give the same result.
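One way to do this is to append sort-by to the pipeline. As before, this is a sketch that assumes the record fields are called name and len:

```nu
# Process the directories in parallel, then sort the records by
# the name field so the output order is the same on every run.
ls | where type == dir | par-each { |row|
    { name: $row.name, len: (ls $row.name | length) }
} | sort-by name
```

Adding the same sort-by name step to the each version gives both scripts identical output.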