Preparing and Adding Nodes

    To prepare new system nodes for expansion, install the Greenplum Database software binaries, exchange the required SSH keys, and run performance tests.

    Run performance tests first on the new hosts and then on all hosts. Run the tests on all hosts with the system offline so that user activity does not distort the results.

    Generally, you should run performance tests when an administrator modifies host networking or other special conditions in the system. For example, if you will run the expanded system on two network clusters, run tests on each cluster.

    Note: Preparing host systems for use by a Greenplum Database system assumes that the new hosts’ operating system has been properly configured to match the existing hosts.

    New hosts must exchange SSH keys with the existing hosts to enable Greenplum administrative utilities to connect to all segments without a password prompt. Perform the key exchange process twice with the gpssh-exkeys utility.

    First perform the process as root, for administration convenience, and then as the user gpadmin, for management utilities. Perform the following tasks in order:

    1. To exchange SSH keys as root
    2. To exchange SSH keys as the gpadmin user
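
    Both tasks rely on two host list files, one naming the hosts already in the system and one naming the new hosts, with one hostname per line. The following is a minimal sketch of how such files might be prepared and cross-checked. The hostnames (mdw, sdw1 to sdw4) and the helper function names are hypothetical placeholders, not values from an actual system.

```shell
#!/bin/sh
# Sketch only: hostnames and function names are hypothetical placeholders.

# write_host_file FILE HOST...
# Writes one hostname per line, the layout gpssh-exkeys reads with -e and -x.
write_host_file() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# hosts_in_both FILE_A FILE_B
# Prints every hostname in FILE_B that also appears in FILE_A.
# A new host should not already be listed as an existing host.
hosts_in_both() {
    grep -Fxf "$1" "$2"
}

workdir=$(mktemp -d)
write_host_file "$workdir/existing_hosts_file" mdw sdw1 sdw2
write_host_file "$workdir/new_hosts_file" sdw3 sdw4

# Prints nothing here because the two lists do not overlap.
hosts_in_both "$workdir/existing_hosts_file" "$workdir/new_hosts_file"
```

    If hosts_in_both prints any names, correct the files before running the key exchange.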

    To exchange SSH keys as root

    1. Log in as root on the master host, and source the greenplum_path.sh file from your Greenplum installation.

      $ su -
      # source /usr/local/greenplum-db/greenplum_path.sh

    2. Run the gpssh-exkeys utility, referencing the host list files. For example:

      # gpssh-exkeys -e /home/gpadmin/existing_hosts_file -x /home/gpadmin/new_hosts_file

    3. gpssh-exkeys checks the remote hosts and performs the key exchange between all hosts. Enter the root user password when prompted. For example:

      ***Enter password for root@hostname: <root_password>

    To exchange SSH keys as the gpadmin user

    1. Use gpssh to create the gpadmin user on all the new segment hosts (if it does not already exist). Use the list of new hosts you created for the key exchange.

    2. Set a password for the new gpadmin user. On Linux, you can do this on all segment hosts simultaneously using gpssh. For example:

      # gpssh -f new_hosts_file 'echo gpadmin_password | passwd gpadmin --stdin'

    3. As the gpadmin user, run the gpssh-exkeys utility, referencing the host list files. For example:

      $ gpssh-exkeys -e /home/gpadmin/existing_hosts_file -x /home/gpadmin/new_hosts_file

    4. gpssh-exkeys checks the remote hosts and performs the key exchange between all hosts. Enter the gpadmin user password when prompted. For example:

      ***Enter password for gpadmin@hostname: <gpadmin_password>
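
    The gpadmin procedure can be summarized as a short script that only prints the commands, so they can be reviewed and then run by hand as the appropriate user. This is a sketch under stated assumptions: the useradd path and its options (home directory /home/gpadmin, shell /bin/bash), the guard that skips hosts where gpadmin already exists, and the function names are illustrative choices, and gpadmin_password remains a placeholder.

```shell
#!/bin/sh
# Sketch only: prints commands for review; it does not run them.
# The useradd options and file locations are assumptions.

NEW_HOSTS=${NEW_HOSTS:-/home/gpadmin/new_hosts_file}
EXISTING_HOSTS=${EXISTING_HOSTS:-/home/gpadmin/existing_hosts_file}

# As root: create the gpadmin account on each new host unless it exists.
create_user_cmd() {
    printf "gpssh -f %s 'id gpadmin >/dev/null 2>&1 || /usr/sbin/useradd gpadmin -d /home/gpadmin -s /bin/bash'\n" "$NEW_HOSTS"
}

# As root: set the password on all new hosts (passwd --stdin is Linux-specific).
set_password_cmd() {
    printf "gpssh -f %s 'echo gpadmin_password | passwd gpadmin --stdin'\n" "$NEW_HOSTS"
}

# As gpadmin: exchange keys between the existing and the new hosts.
exchange_keys_cmd() {
    printf 'gpssh-exkeys -e %s -x %s\n' "$EXISTING_HOSTS" "$NEW_HOSTS"
}

create_user_cmd
set_password_cmd
exchange_keys_cmd
```

    Printing rather than executing keeps the sketch safe to run anywhere. Set NEW_HOSTS and EXISTING_HOSTS in the environment to point at different host list files.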

    Use the gpcheck utility to verify all new hosts in your array have the correct OS settings to run Greenplum Database software.

    1. Log in on the master host as the user who will run your Greenplum Database system (for example, gpadmin).

      $ su - gpadmin

    2. Run the gpcheck utility using your host file for new hosts. For example:

      $ gpcheck -f new_hosts_file

    Use the gpcheckperf utility to test disk I/O and memory bandwidth.

    1. Run the gpcheckperf utility using the host file for new hosts. Use the -d option to specify the file systems you want to test on each host. You must have write access to these directories. For example:

      $ gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v

    2. The utility may take a long time to perform the tests because it is copying very large files between the hosts. When it is finished, you will see the summary results for the Disk Write, Disk Read, and Stream tests.
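
    Because gpcheckperf takes one -d option per file system and needs write access to each directory, it can help to build the command from a directory list and to check writability first. The sketch below only prints the commands. The function names are hypothetical, and the writability probe through gpssh is an assumed convenience, not a step required by the utility.

```shell
#!/bin/sh
# Sketch only: prints commands for review; function names are hypothetical.

# perf_cmd HOSTFILE DIR...
# Prints a gpcheckperf invocation with one -d option per directory.
perf_cmd() {
    hostfile=$1
    shift
    cmd="gpcheckperf -f $hostfile"
    for dir in "$@"; do
        cmd="$cmd -d $dir"
    done
    printf '%s -v\n' "$cmd"
}

# write_check_cmd HOSTFILE DIR
# Prints a gpssh command that reports whether DIR is writable on each host.
write_check_cmd() {
    printf "gpssh -f %s 'test -w %s && echo writable || echo NOT writable'\n" "$1" "$2"
}

write_check_cmd new_hosts_file /data1
write_check_cmd new_hosts_file /data2
perf_cmd new_hosts_file /data1 /data2
# last line printed: gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v
```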

    Before initializing the system with the new segments, shut down the system with gpstop to prevent user activity from skewing performance test results. Then, repeat the performance tests using host files that include all nodes, existing and new:
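
    A combined host file can be derived from the two lists used for the key exchange. The sketch below assumes that each file lists one name per host; all_hosts_file and merge_host_files are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: all_hosts_file and merge_host_files are hypothetical names.

# merge_host_files OUT FILE...
# Concatenates the given host files, drops blank lines and duplicate
# names, and writes the sorted result to OUT.
merge_host_files() {
    out=$1
    shift
    cat "$@" | grep -v '^[[:space:]]*$' | sort -u > "$out"
}

# Possible sequence once the input files are in place (run as gpadmin):
#   merge_host_files all_hosts_file existing_hosts_file new_hosts_file
#   gpstop
#   gpcheckperf -f all_hosts_file -d /data1 -d /data2 -v
```

    Removing duplicates matters if a host appears in both lists, since the performance tests should address each host once.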