Using PXF with Unmanaged Data

    PXF includes built-in connectors for accessing data inside HDFS files, Hive tables, and HBase tables. PXF also integrates with HCatalog to query Hive tables directly.

    PXF allows users to create custom connectors to access other parallel data stores or processing engines. To create these connectors using Java plug-ins, see the PXF API topic listed below.

    • Installing PXF Plug-ins

      This topic describes how to install the built-in PXF service plug-ins that are required to connect PXF to HDFS, Hive, and HBase. Install the appropriate RPMs on each node in your cluster.

    • This topic describes how to configure the PXF service.

    • Accessing HDFS File Data

    • This topic describes how to access HBase data using PXF.

    • Accessing JSON Data

      This topic describes how to access JSON data using PXF.

    • Using Profiles to Read and Write Data

      PXF profiles are collections of common metadata attributes that simplify reading and writing data. You can use any of the built-in profiles that come with PXF, or you can create your own.

    • You can use the PXF API to create your own connectors to access any other type of parallel data store or processing engine.

    • Troubleshooting PXF
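
    As a rough illustration of how a profile is referenced when reading data through PXF, the sketch below assembles a PXF location URI and the matching external table statement as plain strings. The helper names, host name, port, file path, table name, and columns are hypothetical and used only for illustration; HdfsTextSimple is assumed to be one of the built-in profiles in your installation. Check the generated statement against the topics listed above before running it.

```python
from urllib.parse import urlencode


def pxf_location(host, port, path, profile, **options):
    """Build a pxf:// location URI that selects a profile (hypothetical helper)."""
    # PROFILE comes first; any extra custom options follow as query parameters.
    params = {"PROFILE": profile, **options}
    return f"pxf://{host}:{port}/{path.lstrip('/')}?{urlencode(params)}"


def external_table_ddl(table, columns, location, fmt="TEXT", delimiter=","):
    """Render a CREATE EXTERNAL TABLE statement as text (hypothetical helper)."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT '{fmt}' (delimiter=E'{delimiter}');"
    )


# Assumed values: adjust the host, port, and path to match your cluster.
location = pxf_location("namenode", 51200, "/data/sales.csv", "HdfsTextSimple")
ddl = external_table_ddl(
    "sales_ext",
    [("id", "int"), ("amount", "float8")],
    location,
)
print(ddl)
```

    The script only produces text and does not connect to a cluster. Keeping the URI builder separate from the statement builder means the same location string can be reused for a different profile or format.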