Graph Algorithms
VertexInDegree
Annotate vertices of a directed graph with the in-degree.
Optional configuration:
- setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with an in-degree of zero.
- setParallelism: override the operator parallelism.
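The effect of setIncludeZeroDegreeVertices can be illustrated without the library. The following is a minimal plain-Java sketch, not the library's implementation: the class name InDegreeSketch, the method inDegree, and the representation of edges as {source, target} arrays are all assumptions made for illustration. It counts in-degrees from the edge list alone and, when the flag is set, makes a second pass over the vertex set (standing in for the join) so that vertices with no incoming edges appear with a count of zero.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the library.
final class InDegreeSketch {

    /** Counts incoming edges per vertex; each edge is a {source, target} pair. */
    static Map<Long, Long> inDegree(List<long[]> edges, Set<Long> vertices,
                                    boolean includeZeroDegreeVertices) {
        Map<Long, Long> degrees = new TreeMap<>();
        // Pass over the edge set only: every target ID gains one incoming edge.
        for (long[] edge : edges) {
            degrees.merge(edge[1], 1L, Long::sum);
        }
        // Optional pass over the vertex set (the "additional join"):
        // vertices that never appeared as a target get an explicit zero.
        if (includeZeroDegreeVertices) {
            for (Long vertex : vertices) {
                degrees.putIfAbsent(vertex, 0L);
            }
        }
        return degrees;
    }

    public static void main(String[] args) {
        List<long[]> edges = List.of(
                new long[] {1, 2}, new long[] {3, 2}, new long[] {2, 3});
        Set<Long> vertices = Set.of(1L, 2L, 3L, 4L);
        System.out.println(inDegree(edges, vertices, false)); // {2=2, 3=1}
        System.out.println(inDegree(edges, vertices, true));  // {1=0, 2=2, 3=1, 4=0}
    }
}
```

Vertex 1 has outgoing edges but no incoming ones, so it is reported only when the flag is set; the same holds for the isolated vertex 4.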
VertexOutDegree
Annotate vertices of a directed graph with the out-degree.
DataSet<Vertex<K, LongValue>> outDegree = graph
.run(new VertexOutDegree().setIncludeZeroDegreeVertices(true));
Optional configuration:
- setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with an out-degree of zero.
- setParallelism: override the operator parallelism.
VertexDegrees
Annotate vertices of a directed graph with the degree, out-degree, and in-degree.
graph
.run(new VertexDegrees().setIncludeZeroDegreeVertices(true));
Optional configuration:
- setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with out- and in-degree of zero.
- setParallelism: override the operator parallelism.
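As a rough illustration of the three values attached to each vertex, here is a plain-Java sketch that does not use the library. The class name VertexDegreesSketch and the {degree, out-degree, in-degree} array layout are hypothetical. The text above does not define "degree" for a directed graph; this sketch assumes it means the number of distinct neighbors, so a pair of vertices connected in both directions contributes one to each vertex's degree.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the library.
final class VertexDegreesSketch {

    /**
     * Returns, per vertex, an array {degree, outDegree, inDegree}.
     * Assumption: degree = number of distinct neighbors.
     * Each edge is a {source, target} pair.
     */
    static Map<Long, long[]> vertexDegrees(List<long[]> edges) {
        Map<Long, Long> out = new TreeMap<>();
        Map<Long, Long> in = new TreeMap<>();
        Map<Long, Set<Long>> neighbors = new TreeMap<>();
        for (long[] e : edges) {
            out.merge(e[0], 1L, Long::sum);  // source gains an outgoing edge
            in.merge(e[1], 1L, Long::sum);   // target gains an incoming edge
            neighbors.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            neighbors.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        Map<Long, long[]> result = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> entry : neighbors.entrySet()) {
            Long v = entry.getKey();
            result.put(v, new long[] {
                entry.getValue().size(),
                out.getOrDefault(v, 0L),
                in.getOrDefault(v, 0L)});
        }
        return result;
    }

    public static void main(String[] args) {
        // Edges 1->2, 2->1, 1->3: vertex 1 has two distinct neighbors.
        Map<Long, long[]> d = vertexDegrees(List.of(
                new long[] {1, 2}, new long[] {2, 1}, new long[] {1, 3}));
        d.forEach((v, t) -> System.out.println(
                v + ": degree=" + t[0] + " out=" + t[1] + " in=" + t[2]));
    }
}
```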
EdgeSourceDegrees
Annotate edges of a directed graph with the degree, out-degree, and in-degree of the source ID.
DataSet<Edge<K, Tuple2<EV, Degrees>>> sourceDegrees = graph
.run(new EdgeSourceDegrees());
Optional configuration:
- setParallelism: override the operator parallelism.
EdgeTargetDegrees
Annotate edges of a directed graph with the degree, out-degree, and in-degree of the target ID.
Optional configuration:
- setParallelism: override the operator parallelism.
EdgeDegreesPair
Annotate edges of a directed graph with the degree, out-degree, and in-degree of both the source and target vertices.
VertexDegree
Annotate vertices of an undirected graph with the degree.
DataSet<Vertex<K, LongValue>> degree = graph
.run(new VertexDegree()
.setIncludeZeroDegreeVertices(true)
.setReduceOnTargetId(true));
Optional configuration:
- setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with a degree of zero.
- setParallelism: override the operator parallelism.
- setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
EdgeSourceDegree
Annotate edges of an undirected graph with degree of the source ID.
DataSet<Edge<K, Tuple2<EV, LongValue>>> sourceDegree = graph
.run(new EdgeSourceDegree()
.setReduceOnTargetId(true));
Optional configuration:
- setParallelism: override the operator parallelism.
- setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
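Why the degree can be counted from either side is easiest to see on a small example. The plain-Java sketch below is not the library's implementation; the class and method names are hypothetical. It assumes that an undirected graph is stored symmetrically, with every edge present as both {u, v} and {v, u}. Under that assumption each vertex appears as a source exactly as often as it appears as a target, so both counts agree and the choice only affects how the work is grouped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the library.
final class EdgeSourceDegreeSketch {

    /** Counts degree from source IDs (index 0) or target IDs (index 1). */
    static Map<Long, Long> degree(List<long[]> edges, boolean reduceOnTargetId) {
        int side = reduceOnTargetId ? 1 : 0;
        Map<Long, Long> degrees = new HashMap<>();
        for (long[] e : edges) {
            degrees.merge(e[side], 1L, Long::sum);
        }
        return degrees;
    }

    /** Returns {source, target, degree(source)} for every edge. */
    static List<long[]> annotateWithSourceDegree(List<long[]> edges,
                                                 boolean reduceOnTargetId) {
        Map<Long, Long> degrees = degree(edges, reduceOnTargetId);
        List<long[]> annotated = new ArrayList<>();
        for (long[] e : edges) {
            annotated.add(new long[] {e[0], e[1], degrees.getOrDefault(e[0], 0L)});
        }
        return annotated;
    }

    public static void main(String[] args) {
        // Undirected edges 1-2 and 1-3, stored in both directions (assumption).
        List<long[]> edges = List.of(
                new long[] {1, 2}, new long[] {2, 1},
                new long[] {1, 3}, new long[] {3, 1});
        System.out.println(degree(edges, false).equals(degree(edges, true)));
        for (long[] e : annotateWithSourceDegree(edges, true)) {
            System.out.println(e[0] + " -> " + e[1] + " (source degree " + e[2] + ")");
        }
    }
}
```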
EdgeTargetDegree
Annotate edges of an undirected graph with degree of the target ID.
DataSet<Edge<K, Tuple2<EV, LongValue>>> targetDegree = graph
.run(new EdgeTargetDegree()
.setReduceOnSourceId(true));
Optional configuration:
- setParallelism: override the operator parallelism.
- setReduceOnSourceId: the degree can be counted from either the edge source or target IDs. By default the target IDs are counted. Reducing on source IDs may optimize the algorithm if the input edge list is sorted by source ID.
EdgeDegreePair
Annotate edges of an undirected graph with the degree of both the source and target vertices.
Optional configuration:
- setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
MaximumDegree
Filter an undirected graph by maximum degree.
graph
.run(new MaximumDegree(5000)
.setBroadcastHighDegreeVertices(true)
.setReduceOnTargetId(true));
Optional configuration:
- setBroadcastHighDegreeVertices: join high-degree vertices using a broadcast-hash join to reduce data shuffling when removing a relatively small number of high-degree vertices.
- setParallelism: override the operator parallelism.
- setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
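The filter can be sketched in plain Java as follows. This is not the library's implementation, and the names are hypothetical. It rests on two assumptions not stated above: that vertices whose degree exceeds the maximum are dropped together with every edge touching them, and that the undirected graph is stored with each edge in both directions, so counting source IDs yields the degree.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the library.
final class MaximumDegreeSketch {

    /**
     * Keeps only edges whose endpoints both have degree <= maximumDegree.
     * Each edge is a {source, target} pair; symmetric storage is assumed.
     */
    static List<long[]> filter(List<long[]> edges, long maximumDegree) {
        // Degree per vertex, counted from source IDs.
        Map<Long, Long> degrees = new HashMap<>();
        for (long[] e : edges) {
            degrees.merge(e[0], 1L, Long::sum);
        }
        List<long[]> kept = new ArrayList<>();
        for (long[] e : edges) {
            boolean sourceOk = degrees.getOrDefault(e[0], 0L) <= maximumDegree;
            boolean targetOk = degrees.getOrDefault(e[1], 0L) <= maximumDegree;
            if (sourceOk && targetOk) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Vertex 0 is connected to 1, 2 and 3; vertices 1 and 2 share an edge.
        List<long[]> edges = List.of(
                new long[] {0, 1}, new long[] {1, 0},
                new long[] {0, 2}, new long[] {2, 0},
                new long[] {0, 3}, new long[] {3, 0},
                new long[] {1, 2}, new long[] {2, 1});
        // With a maximum of 2, vertex 0 (degree 3) and its edges are removed.
        for (long[] e : filter(edges, 2)) {
            System.out.println(e[0] + " - " + e[1]);
        }
    }
}
```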
Simplify
Remove self-loops and duplicate edges from a directed graph.
graph.run(new Simplify());
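What simplification means for a directed edge list can be shown with a small plain-Java sketch. It is an illustration only, with hypothetical names, and says nothing about how the library performs the operation. Edges are modeled as two-element lists so that equal edges compare equal; because the graph is directed, 1 -> 2 and 2 -> 1 are distinct and both survive.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the library.
final class SimplifySketch {

    /** Drops self-loops and duplicate edges; each edge is [source, target]. */
    static List<List<Long>> simplify(List<List<Long>> edges) {
        // LinkedHashSet removes duplicates while keeping first-seen order.
        Set<List<Long>> unique = new LinkedHashSet<>();
        for (List<Long> e : edges) {
            boolean selfLoop = e.get(0).equals(e.get(1));
            if (!selfLoop) {
                unique.add(e);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<List<Long>> edges = List.of(
                List.of(1L, 2L), List.of(1L, 2L), List.of(2L, 2L), List.of(2L, 1L));
        System.out.println(simplify(edges)); // [[1, 2], [2, 1]]
    }
}
```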
TranslateGraphIds
Translate vertex and edge IDs using the given TranslateFunction.
graph.run(new TranslateGraphIds(new LongValueToStringValue()));
Required configuration:
- translator: implements type or value conversion
Optional configuration:
- setParallelism: override the operator parallelism
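The idea of ID translation is that one function is applied consistently to every vertex ID and to both endpoints of every edge. The sketch below shows this in plain Java using java.util.function.Function as a stand-in for the translator; the class and method names are hypothetical and the library's TranslateFunction interface is not used here. The example converts Long IDs to String IDs, loosely mirroring the LongValueToStringValue translator in the call above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration class; not part of the library.
final class TranslateIdsSketch {

    /** Applies the translator to every vertex ID. */
    static <K, N> List<N> translateVertexIds(List<K> vertexIds,
                                             Function<K, N> translator) {
        List<N> translated = new ArrayList<>();
        for (K id : vertexIds) {
            translated.add(translator.apply(id));
        }
        return translated;
    }

    /** Applies the same translator to the source and target of every edge. */
    static <K, N> List<List<N>> translateEdgeIds(List<List<K>> edges,
                                                 Function<K, N> translator) {
        List<List<N>> translated = new ArrayList<>();
        for (List<K> e : edges) {
            translated.add(List.of(translator.apply(e.get(0)),
                                   translator.apply(e.get(1))));
        }
        return translated;
    }

    public static void main(String[] args) {
        Function<Long, String> toStringId = id -> Long.toString(id);
        System.out.println(translateVertexIds(List.of(1L, 2L), toStringId));
        System.out.println(translateEdgeIds(List.of(List.of(1L, 2L)), toStringId));
    }
}
```

Using a single translator for both vertices and edges keeps the edge endpoints consistent with the translated vertex IDs.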
TranslateVertexValues
Translate vertex values using the given TranslateFunction.
Required configuration:
- translator: implements type or value conversion
Optional configuration:
- setParallelism: override the operator parallelism
TranslateEdgeValues
Translate edge values using the given TranslateFunction.
Required configuration:
- translator: implements type or value conversion