Specify gphdfs Protocol in an External Table Definition (Deprecated)
In a Hadoop HA cluster, the LOCATION clause references the logical nameservices id (the dfs.nameservices property in the hdfs-site.xml configuration file). The hdfs-site.xml file with the nameservices configuration must be installed on the Greenplum master and on each segment host.

For example, if dfs.nameservices is set to mycluster, the LOCATION clause takes this format:
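Using the mycluster nameservice from the example above, a LOCATION clause of this shape would be expected (the path /data/file1.txt is a placeholder):

```sql
LOCATION ('gphdfs://mycluster/data/file1.txt')
```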
A cluster without HA specifies the hostname and port of the name node in the LOCATION clause:
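For a non-HA cluster, the clause names the name node host and port directly; hdfs_host and port 8020 below are placeholders for your cluster's values:

```sql
LOCATION ('gphdfs://hdfs_host:8020/data/file1.txt')
```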
If you are using MapR clusters, you specify a specific cluster and the file:
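A sketch of the clause for a specific MapR cluster, assuming MapR's global namespace convention where each cluster is addressed under /mapr/<cluster-name> (cluster_name and the file path are placeholders):

```sql
LOCATION ('gphdfs:///mapr/cluster_name/data/file1.txt')
```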
To specify the default cluster (the first entry in the MapR configuration file /opt/mapr/conf/mapr-clusters.conf), specify the location of your table with this syntax:
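A sketch for the default cluster, assuming the cluster name can be omitted from the /mapr path when the default cluster is intended (the file path is a placeholder):

```sql
LOCATION ('gphdfs:///mapr/data/file1.txt')
```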
For information about MapR clusters, see the MapR documentation.
The restrictions for HDFS files are as follows.
You can specify one path for a readable external table with gphdfs. Wildcard characters are allowed. If you specify a directory, the default is all files in the directory.

The URI of the LOCATION clause cannot contain any of these four characters: \, ', <, >. CREATE EXTERNAL TABLE returns an error if the URI contains any of these characters.
Compression options for Hadoop Writable External Tables take the form of a URI query and begin with a question mark (?). Specify multiple compression options with an ampersand (&). Place compression options in the query portion of the URI.
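As a sketch, compression options such as compress, compression_type, and codec (option names assumed here; the host, port, and path are placeholders) would appear in the query portion of the URI like this:

```sql
LOCATION ('gphdfs://hdfs_host:8020/data/file1.txt?compress=true&compression_type=BLOCK&codec=org.apache.hadoop.io.compress.DefaultCodec')
```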