The CRAN binary distributions for macOS are NOT OpenMP enabled. macOS users who want parallel processing will have to compile the package from source themselves.
The steps to create an OpenMP-enabled package are as follows:
> /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
There are several ways to control the number of CPU cores that the
package accesses during OpenMP parallel execution. First, you will need
to determine the number of cores on your local machine. Do this by
starting an R session and issuing the command detectCores(). You will
require the parallel package for this.
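As a minimal sketch of the core-count check described above, the following loads the parallel package and queries the machine. The variable names are illustrative only, and detectCores() can return NA on platforms where the count cannot be determined, so the result is checked before use.

```r
# Load the parallel package, which ships with base R.
library(parallel)

# Number of logical cores (includes hyper-threads where present).
n_logical <- detectCores()

# Number of physical cores; this may be NA on some platforms.
n_physical <- detectCores(logical = FALSE)

# Fall back to a single core if detection failed.
n_cores <- if (is.na(n_logical)) 1L else n_logical
n_cores
```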
Then you can do the following:
At the start of every R session, you can set the number of cores
accessed during OpenMP parallel execution by issuing the command
options(rf.cores = x), where x is the number of cores. If x is a
negative number, the package will access the maximum number of cores on
your machine. The options command can also be placed in the user's
.Rprofile file for convenience. Alternatively, you can initialize the
environment variable RF_CORES in your shell environment.
If left unspecified, the default value for rf.cores is -1 (-1L), which
uses all available cores, with a minimum of two.
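The following sketch illustrates setting and inspecting the rf.cores option from within an R session. The value 2L is an arbitrary choice for illustration; the calls only set and read session options and an environment variable, and do not themselves start any parallel work.

```r
# Restrict OpenMP execution to two cores for this session (arbitrary example value).
options(rf.cores = 2L)
getOption("rf.cores")

# A negative value requests the maximum number of cores on the machine.
options(rf.cores = -1L)
getOption("rf.cores")

# Inspect whether RF_CORES was set in the shell that launched R.
# Returns NA when the variable is not defined.
Sys.getenv("RF_CORES", unset = NA)
```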
The package also implements R-side parallel processing via the parallel
package contained in the base R distribution. However, the parallel
package must be explicitly loaded to take advantage of this
functionality. When this is the case, the R function lapply() is
replaced with the parallel version mclapply(). You can set the number of
cores accessed by mclapply() by issuing the command
options(mc.cores = x), where x is the number of cores. The options
command can also be placed in the user's .Rprofile file for convenience.
Alternatively, you can initialize the environment variable MC_CORES in
your shell environment. See the help files in parallel for more
information.
The default value for mclapply() on non-Windows systems is two (2L)
cores. On Windows systems, the default value is one (1L) core.
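To make the mc.cores setting concrete, here is a small sketch that is independent of the package itself. The worker count and the toy squaring function are assumptions chosen for illustration; the Windows guard is included because mclapply() does not support more than one core there.

```r
library(parallel)

# Use one core on Windows, two elsewhere (illustrative values).
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L
options(mc.cores = n_workers)

# mclapply() picks up the core count from the mc.cores option by default.
squares <- mclapply(1:4, function(i) i^2)
unlist(squares)
```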
As an example, issuing the following options command uses all available cores for both OpenMP and R-side processing:
options(rf.cores=detectCores(), mc.cores=detectCores())
As stated above, this options command can be placed in the user's .Rprofile file.
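One possible form for such an .Rprofile entry is sketched below. It is an assumption about how a user might write it, not a prescribed configuration: the call is namespace-qualified as parallel::detectCores() because the parallel package is not attached when .Rprofile runs, and mc.cores is held at one on Windows.

```r
# Hypothetical fragment for ~/.Rprofile.
local({
  # Qualify the call, since parallel is not attached at startup.
  n <- parallel::detectCores()
  if (is.na(n)) n <- 1L

  # mclapply() cannot use more than one core on Windows.
  mc <- if (.Platform$OS.type == "windows") 1L else n

  options(rf.cores = n, mc.cores = mc)
})
```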
Cautionary Note on Parallel Execution
1. Once the package has been compiled with OpenMP enabled, trees will be
grown in parallel using the rf.cores option. Independently of this, we
also utilize mclapply() to parallelize loops in R-side pre-processing
and post-processing of the forest. This is always available and
independent of whether the user chooses to compile the package with the
OpenMP option enabled.
2. It is important NOT to write programs that fork R processes
containing OpenMP threads. That is, one should not use mclapply() around
the functions rfsrc(), predict.rfsrc(), vimp.rfsrc(),
var.select.rfsrc(), find.interaction.rfsrc() and partial.rfsrc(). In
such a scenario, program execution is not guaranteed.
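As a sketch of a pattern that avoids the forking problem in the second point, the outer loop below is a plain sequential lapply(), so any parallelism comes only from OpenMP inside each rfsrc() call. The mtcars data, the ntree values, and the core count are arbitrary choices for illustration, and the block only runs the fits if randomForestSRC is installed.

```r
# Illustrative grid of forest sizes (arbitrary values).
ntree_grid <- c(50L, 100L)

if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  # Let OpenMP use two cores inside each call (arbitrary example value).
  options(rf.cores = 2L)

  # Sequential outer loop: no R processes are forked around rfsrc().
  fits <- lapply(ntree_grid, function(nt) {
    randomForestSRC::rfsrc(mpg ~ ., data = mtcars, ntree = nt)
  })
}
```

Replacing the outer lapply() with mclapply() here would be exactly the forked-process situation the note warns against.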
Cite this vignette as
H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: parallel processing vignette.” http://randomforestsrc.org/articles/parallel.html.
@misc{HemantParallel,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: parallel processing vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/parallel.html},
howpublished = "\url{http://randomforestsrc.org/articles/parallel.html}",
note = "[accessed date]"
}