R Data Science Library Package
For information about the Greenplum Database PL/R Language, see .
Libraries provided in the R Data Science package include:
abind
adabag
arm
assertthat
BH
bitops
car
caret
caTools
coda
colorspace
compHclust
curl
data.table
DBI
dichromat
digest
dplyr
e1071
flashClust
forecast
foreign
ggplot2
glmnet
gtable
gtools
hms
hybridHclust
igraph
labeling
lattice
lazyeval
lme4
lmtest
magrittr
MASS
Matrix
MCMCpack
minqa
MTS
munsell
neuralnet
nloptr
nnet
pbkrtest
plyr
quantreg
R2jags
R6
RColorBrewer
Rcpp
RcppEigen
reshape2
rjags
RobustRankAggreg
ROCR
rpart
RPostgreSQL
sandwich
scales
SparseM
stringi
stringr
survival
tibble
tseries
zoo
Before you install the R Data Science Library package, make sure that your Greenplum Database is running, that you have sourced greenplum_path.sh, and that the $MASTER_DATA_DIRECTORY and $GPHOME environment variables are set.
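The variable part of these prerequisites can be checked with a short shell function. This is a minimal sketch: check_gp_env is a hypothetical helper name, not a Greenplum utility, and it only confirms that the two variables are non-empty. It does not verify that the database is running or that greenplum_path.sh was sourced.

```shell
# Hypothetical helper (not part of Greenplum): report which of the
# required variables are unset or empty in the current shell.
check_gp_env() {
    missing=""
    for name in MASTER_DATA_DIRECTORY GPHOME; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Calling check_gp_env before installing prints ok when both variables are set, or a line naming the missing ones.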
1. Locate the R Data Science library package that you built or downloaded. The file name format of the package is DataScienceR-<version>-relhel<N>-x86_64.gppkg.
2. Copy the package to the Greenplum Database master host.
3. Verify the integrity of the Greenplum Procedural Languages R Data Science Package software.
4. Use the gppkg command to install the package. gppkg installs the R Data Science libraries on all nodes in your Greenplum Database cluster. The command also sets the R_LIBS_USER environment variable and updates the PATH and LD_LIBRARY_PATH environment variables in your greenplum_path.sh file.
5. Restart Greenplum Database. You must re-source greenplum_path.sh before restarting your Greenplum cluster:
$ source /usr/local/greenplum-db/greenplum_path.sh
$ gpstop -r
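The install and restart steps above can be sketched as a dry run that prints the commands instead of executing them. This is an illustration under stated assumptions: print_install_steps is a hypothetical helper name, and the sketch assumes a gppkg release that accepts the -i option followed by a package file. Check the gppkg reference for your Greenplum version before running the printed commands.

```shell
# Hypothetical dry-run helper (not part of Greenplum): print the
# install and restart commands for a given package file.
print_install_steps() {
    pkg="$1"
    case "$pkg" in
        *.gppkg) ;;  # looks like a Greenplum package file
        *)
            echo "error: expected a .gppkg file, got: $pkg" >&2
            return 1
            ;;
    esac
    # Assumption: this gppkg release installs a package with -i <file>.
    echo "gppkg -i $pkg"
    # Re-source the environment file before restarting the cluster.
    echo "source /usr/local/greenplum-db/greenplum_path.sh"
    echo "gpstop -r"
}
```

Pass the name of the package file you copied to the master host, review the printed commands, and then run them as the Greenplum administrative user.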
The Greenplum Database R Data Science Modules are installed in the following directory:
Note: rjags libraries are installed in the $GPHOME/ext/DataScienceR/extlib/lib directory. If you want to use rjags and your $GPHOME is not /usr/local/greenplum-db, you must perform additional configuration steps to create a symbolic link at /usr/local/greenplum-db that points to $GPHOME on each node in your Greenplum Database cluster. For example:
$ gpssh -f all_hosts -e 'ln -s $GPHOME /usr/local/greenplum-db'
$ gpssh -f all_hosts -e 'chown -h gpadmin /usr/local/greenplum-db'
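Before creating the link, it can help to confirm that it is actually needed on a host. The following is a minimal sketch: needs_rjags_link is a hypothetical helper name, and the extra existence check is an added precaution that the note above does not describe. If /usr/local/greenplum-db already exists as a directory, ln -s would create the link inside that directory rather than at that path.

```shell
# Hypothetical helper (not part of Greenplum): succeed only when GPHOME
# differs from the expected path and nothing exists at that path yet.
needs_rjags_link() {
    gphome="${1%/}"                            # drop one trailing slash, if any
    link_path="${2:-/usr/local/greenplum-db}"  # path that rjags expects
    [ "$gphome" != "$link_path" ] && [ ! -e "$link_path" ]
}
```

For example, needs_rjags_link "$GPHOME" && echo "link needed" reports when the ln -s step applies on the current host.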
Use the gppkg utility to uninstall the R Data Science Library package. You must include the version number in the package name you provide to gppkg.
To determine your R Data Science Library package version number and remove this package:
The command removes the R Data Science libraries from your Greenplum Database cluster. It also removes the R_LIBS_USER environment variable and restores the PATH and LD_LIBRARY_PATH environment variables in your greenplum_path.sh file to their pre-installation values.
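Finding the version and removing the package can be sketched as two small shell functions. This is an illustration under stated assumptions: both function names are hypothetical, the sketch assumes a gppkg release that supports -q --all to list installed packages and -r to remove one, and it assumes the listing prints one package per line in the form DataScienceR-<version>.

```shell
# Hypothetical helper: read a package listing on stdin and print the
# first line that names the R Data Science package with its version.
find_datasciencer_pkg() {
    grep -E '^DataScienceR-' | head -n 1
}

# Hypothetical dry-run helper: print the removal command for a package
# name that includes the version number.
print_uninstall_step() {
    # Assumption: this gppkg release removes a package with -r <name-version>.
    echo "gppkg -r $1"
}
```

A possible use is pkg=$(gppkg -q --all | find_datasciencer_pkg) followed by print_uninstall_step "$pkg", then running the printed command after checking it.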
Re-source greenplum_path.sh and restart Greenplum Database after you remove the R Data Science Library package:
$ source /usr/local/greenplum-db/greenplum_path.sh
$ gpstop -r