Pherf is a standalone tool that can perform performance and functional testing through Phoenix. Pherf can be used both to generate highly customized data sets and to measure the performance of SQL against that data.

mvn clean package -DskipTests

Running

  • Edit the config/env.sh to include the required property values.
  • bin/pherf-standalone.py -h
  • To use libraries included with HBase deployment on a cluster: bin/pherf-cluster.py -h
  • Example: bin/pherf-cluster.py -drop all -l -q -z [zookeeper] -schemaFile .user_defined_schema.sql -scenarioFile .user_defined_scenario.xml
  • The HBASE_CONF_DIR and HBASE_DIR environment variables must be set to run against a cluster deployment.

$./pherf-standalone.py -listFiles

Pherf arguments:

  • -h Help
  • -l Apply schema and load data
  • -q Executes multi-threaded query sets and writes results
  • -m Enable monitor for statistics
  • -monitorFrequency [frequency in Ms] Frequency at which the monitor will snapshot stats to the log file.
  • -drop [pattern] Drops all tables in the PHERF schema whose names match the regex. Example drop Event tables: -drop .(EVENT). Drop all: -drop .* or -drop all
  • -scenarioFile Regex or file name of a specific scenario file to run.
  • -schemaFile Regex or file name of a specific schema file to run.
  • -export Exports query results to CSV files in CSV_EXPORT directory
  • -diff Compares results with previously exported results
  • -hint Executes all queries with specified hint. Example SMALL
  • -rowCountOverride [number of rows] Specify number of rows to be upserted rather than using row count specified in schema

Review test_scenario.xml for syntax examples.

  • Rules are applied in the order they appear in the file.
  • Rules of the same type override the values of a prior rule of the same type. If the user defined flag is set to true, a rule only applies its override when both its type and its name match the column in Phoenix.
  • A tag set at the column level can define a constant string that is prepended to CHAR and VARCHAR data type values.
  • Required field: the supported Phoenix types are VARCHAR, CHAR, DATE, DECIMAL, INTEGER
    • denoted by the tag
  • Setting user defined to true changes rule matching to use both the name and type fields to determine equivalence.
    • The default is false if not specified, in which case equivalence is determined by type only. Note that you can still override rules without the user defined flag, but the override then applies globally and not just to a specified column.
  • Required field: length defines the boundary for random values of CHAR and VARCHAR types.
    • denoted by the tag
  • Column level min/max values define the boundaries for numerical values. For DATE columns, these values supply a range between which values are generated. At the column level the granularity is a year. At a specific data value level, the granularity is down to the millisecond.
    • denoted by the tag
    • denoted by the tag
  • Null chance is the probability, from 0 to 100, of generating a null value. The higher the number, the more likely the value will be null.
    • denoted by
  • Name can either be any text or the actual column name in the Phoenix table.
    • denoted by the
  • Value List is used in conjunction with LIST data sequences. Each entry is a DataValue with a specified value to be used when generating data.
    • Denoted by the tags
    • If the distribution attribute on the datavalue is set, values will be created according to that probability.
    • When distribution is used, the distribution values must add up to 100%.
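
The null chance and value list rules above can be illustrated with a small Python sketch. This is not Pherf's implementation: pick_value and its (value, distribution) pair format are hypothetical names chosen for illustration, and the sketch only assumes the semantics stated above (null chance is a percentage from 0 to 100, and distributions must add up to 100).

```python
import random


def pick_value(value_list, null_chance=0, rng=random):
    """Pick one value from a list of (value, distribution_percent) pairs.

    Hypothetical helper for illustration only; not part of Pherf.
    """
    # The distributions must add up to 100, as the rules require.
    total = sum(weight for _, weight in value_list)
    if total != 100:
        raise ValueError(f"distributions must add up to 100, got {total}")

    # null_chance is a percentage: 0 never yields null, 100 always does.
    if rng.random() * 100 < null_chance:
        return None

    values = [value for value, _ in value_list]
    weights = [weight for _, weight in value_list]
    # Weighted random choice according to the distribution percentages.
    return rng.choices(values, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Roughly 80% "A" and 20% "B", with a 10% chance of a null value.
    print([pick_value([("A", 80), ("B", 20)], null_chance=10) for _ in range(10)])
```

Passing a seeded random.Random instance as rng makes the generated sequence repeatable, which is convenient when comparing runs.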

Defining Scenario

A scenario can have multiple querySets. For example, a concurrency of 1-4 means that each query is executed starting at a concurrency level of 1 and ramping up to a maximum concurrency of 4. Per thread, a query is executed up to 10 times or for 10 seconds, whichever comes first. A querySet is executed serially by default, but you can change executionType to PARALLEL so that its queries are executed concurrently. Scenarios are defined in XML files stored in the resource directory.
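
The execution rules described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Pherf's code: parse_concurrency and run_query_thread are hypothetical names, and the sketch assumes that a range such as 1-4 steps through every level from 1 to 4 and that a thread stops as soon as either the execution count or the time limit is reached.

```python
import time


def parse_concurrency(spec):
    """Turn a range such as '1-4' into [1, 2, 3, 4]; '3' gives [3].

    Hypothetical helper; assumes every level in the range is exercised.
    """
    low, _, high = spec.partition("-")
    start = int(low)
    end = int(high) if high else start
    return list(range(start, end + 1))


def run_query_thread(run_query, max_executions=10, max_seconds=10.0,
                     clock=time.monotonic):
    """Run one thread's workload until either limit is hit.

    Returns the number of executions that were actually performed.
    """
    deadline = clock() + max_seconds
    executions = 0
    # Stop at whichever limit comes first: execution count or elapsed time.
    while executions < max_executions and clock() < deadline:
        run_query()
        executions += 1
    return executions


if __name__ == "__main__":
    for level in parse_concurrency("1-4"):
        # A real run would start `level` threads; here a single no-op
        # stands in for the query to keep the sketch self-contained.
        done = run_query_thread(lambda: None)
        print(f"concurrency {level}: {done} executions per thread")
```

The clock parameter exists only so the time limit can be exercised with a fake clock instead of waiting for real seconds to pass.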

Testing

The default quorum is localhost. To override it, set the ZK_QUORUM system variable.

  • Run unit tests: mvn test -DZK_QUORUM=localhost
  • Run a specific method: mvn -Dtest=ClassName#methodName test