Sunday, 11 September 2016

Running commands in parallel with GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel on one or more computers. A job can be a single command or a small script that is run once for each line of the input. Inside a PBS (Portable Batch System) job, GNU Parallel can dynamically distribute the commands across the nodes and cores that the job requested.

Using parallel to run commands on a single node:

Put the commands in a file, one command per line. In this example the file is commands.txt:

[root@devbox ~]# cat commands.txt
uname -a
date
df -h
uptime

Run parallel, setting the number of simultaneous jobs with --jobs and redirecting the file of commands to its standard input:

[root@devbox ~]# parallel --jobs 4 < commands.txt
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:
  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.
This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
To silence the citation notice: run 'parallel --bibtex'.

Linux devbox 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Fri Sep  9 22:00:26 PDT 2016
Filesystem             Size  Used Avail Use% Mounted on
/dev/sda3               18G  6.4G   12G  36% /
devtmpfs               482M     0  482M   0% /dev
tmpfs                  490M   84K  490M   1% /dev/shm
tmpfs                  490M  7.1M  483M   2% /run
tmpfs                  490M     0  490M   0% /sys/fs/cgroup
/dev/mapper/repo-epel  5.0G  121M  4.9G   3% /epel
/dev/sda1              297M  106M  192M  36% /boot
 22:00:26 up 19:05,  4 users,  load average: 0.00, 0.01, 0.05
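
With four jobs running at once, output is printed as each job finishes, so it will not necessarily appear in the order of commands.txt. The sketch below is one way to keep the output in input order using the --keep-order option. The temporary file name and the check for GNU Parallel are assumptions added for illustration, not part of the original run.

```shell
# Build the command file from the example (cmdfile is a hypothetical temp path).
cmdfile=$(mktemp)
printf '%s\n' 'uname -a' 'date' 'df -h' 'uptime' > "$cmdfile"

# Only run if the installed "parallel" is GNU Parallel; some systems ship a
# different tool under the same name.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --keep-order prints each job's output in the order of the input lines,
    # even when a later job finishes first.
    parallel --jobs 4 --keep-order < "$cmdfile"
else
    echo "GNU Parallel not found; skipping run" >&2
fi
```

The jobs still run concurrently; only the printing is held back until the earlier jobs have reported.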


Using parallel to run commands on multiple nodes:

In this case, set --jobs to 1, which means one job at a time per node, list the node names in a file (here unique-nodelist.txt), and run parallel as follows:

[root@devbox ~]# cat unique-nodelist.txt
192.168.44.137
192.168.44.135
[root@devbox ~]#
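
The node list above was written by hand. Inside a PBS job, the scheduler usually provides the allocated hosts in the file named by $PBS_NODEFILE, often with a host repeated once per allocated core, so a de-duplicated list could be derived from it. This is only a sketch: make_nodelist is a hypothetical helper, and node01 and node02 are placeholder host names.

```shell
# make_nodelist SRC DEST: write each distinct host in SRC to DEST exactly once.
# Hypothetical helper for illustration; not part of GNU Parallel or PBS.
make_nodelist() {
    sort -u "$1" > "$2"
}

# Demo with a made-up node file in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' node01 node01 node02 node02 > "$demo_dir/nodefile"
make_nodelist "$demo_dir/nodefile" "$demo_dir/unique-nodelist.txt"
cat "$demo_dir/unique-nodelist.txt"

# Inside a real PBS job the call would look like:
# make_nodelist "$PBS_NODEFILE" unique-nodelist.txt
```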

[root@devbox ~]# cat commands.txt
uname -a
uname -a
[root@devbox ~]#

[root@devbox ~]# parallel --jobs 1 --sshloginfile unique-nodelist.txt --workdir $PWD < commands.txt
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:
  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.
This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
To silence the citation notice: run 'parallel --bibtex'.

root@192.168.44.135's password: Linux devbox 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Linux rheldb 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@devbox ~]#
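
The password prompt in the output above suggests that key-based SSH login was not set up for 192.168.44.135. GNU Parallel reaches remote nodes over ssh, so it is usually more convenient to confirm non-interactive login first. A minimal sketch, assuming OpenSSH's ssh client; check_nodes is a hypothetical helper name, not a GNU Parallel feature.

```shell
# check_nodes FILE: report which hosts listed in FILE accept non-interactive SSH.
# Returns non-zero if any host would require a password.
check_nodes() {
    nodelist=$1
    failed=0
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # -n keeps ssh from consuming the rest of the node list on stdin.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "ok: $host"
        else
            echo "needs key-based login: $host"
            failed=1
        fi
    done < "$nodelist"
    return "$failed"
}

# Usage (commented out so the sketch contacts no host when it is sourced):
# check_nodes unique-nodelist.txt &&
#     parallel --jobs 1 --sshloginfile unique-nodelist.txt --workdir "$PWD" < commands.txt
```

Hosts reported as needing key-based login can typically be fixed by installing a public key on them, for example with ssh-copy-id.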
