Production Readiness Checklist
The max parallelism, set at per-job and per-operator granularity, determines the maximum parallelism to which a stateful operator can scale. There is currently no way to change the maximum parallelism of an operator after a job has started without discarding that operator’s state. Maximum parallelism exists, rather than stateful operators being infinitely scalable, because it affects your application’s performance and state size. To be able to rescale state, Flink has to maintain specific metadata, and that metadata grows linearly with max parallelism. In general, choose a max parallelism high enough to cover your future scalability needs, while keeping it low enough to maintain reasonable performance.
Note: Maximum parallelism must fulfill the following conditions:
- : for all parallelism > 128.
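The default choice of max parallelism can be sketched in plain Java. This is a minimal sketch, assuming the rule as described in the Flink documentation: when no max parallelism is set explicitly, Flink picks 128 for any parallelism up to 128, and otherwise the next power of two of parallelism + parallelism / 2, capped at 2^15. The class and method names are hypothetical, and Flink's own implementation may differ in detail.

```java
public class DefaultMaxParallelism {

    // Assumed bounds: 128 as the default floor, 2^15 as the hard upper limit.
    static final int LOWER_BOUND = 1 << 7;
    static final int UPPER_BOUND = 1 << 15;

    /** Rounds a positive int up to the nearest power of two. */
    static int roundUpToPowerOfTwo(int x) {
        int highest = Integer.highestOneBit(x);
        return highest == x ? x : highest << 1;
    }

    /** Hypothetical helper mirroring the documented default rule. */
    static int compute(int parallelism) {
        if (parallelism <= 0 || parallelism > UPPER_BOUND) {
            throw new IllegalArgumentException(
                    "parallelism must be in (0, " + UPPER_BOUND + "]");
        }
        if (parallelism <= LOWER_BOUND) {
            return LOWER_BOUND;
        }
        int candidate = roundUpToPowerOfTwo(parallelism + parallelism / 2);
        return Math.min(candidate, UPPER_BOUND);
    }

    public static void main(String[] args) {
        for (int p : new int[] {1, 128, 129, 200, 30000}) {
            System.out.println(p + " -> " + compute(p));
        }
    }
}
```

Under this rule, a job first started with parallelism 200 would get a default max parallelism of 512, which leaves limited headroom. That is one reason to set the value explicitly before going to production.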
As mentioned in the documentation for savepoints, users should set uids for each operator in their DataStream. Uids are necessary for Flink’s mapping of operator states to operators, which in turn is essential for savepoints. By default, operator uids are generated by traversing the JobGraph and hashing specific operator properties. While this is convenient from a user perspective, it is also very fragile, because changes to the JobGraph (e.g., exchanging an operator) result in new uids. To establish a stable mapping, we need stable operator uids provided by the user through setUid(String uid).
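Why generated uids are fragile can be illustrated with a toy model. The sketch below is not Flink's hashing scheme and uses no Flink API. generatedId is a hypothetical stand-in that, like the real mechanism, derives an ID from the shape of the graph, here by hashing the names of an operator and everything upstream of it. effectiveId, also hypothetical, lets an explicit uid take precedence.

```java
import java.util.Arrays;
import java.util.List;

public class OperatorIdSketch {

    /**
     * Toy stand-in for an auto-generated operator ID: a hash over the names
     * of the operator at `index` and all operators before it. It only mimics
     * the property that the ID depends on the structure of the graph.
     */
    static String generatedId(List<String> graph, int index) {
        int hash = 17;
        for (int i = 0; i <= index; i++) {
            hash = 31 * hash + graph.get(i).hashCode();
        }
        return "auto-" + Integer.toHexString(hash);
    }

    /** An explicit uid, when present, wins over the generated ID. */
    static String effectiveId(List<String> graph, int index, String explicitUid) {
        return explicitUid != null ? explicitUid : generatedId(graph, index);
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("source", "map", "sink");
        // Same job after exchanging the middle operator.
        List<String> after = Arrays.asList("source", "filter", "sink");

        // The sink itself is unchanged, yet its generated ID differs,
        // so state stored under the old ID can no longer be matched.
        System.out.println(generatedId(before, 2).equals(generatedId(after, 2))); // false

        // With an explicit uid the mapping survives the change.
        System.out.println(
                effectiveId(before, 2, "sink-v1").equals(effectiveId(after, 2, "sink-v1"))); // true
    }
}
```

In this model, state saved under the sink's old generated ID has no matching operator after the change, while state saved under the explicit uid still does.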
Currently, Flink’s savepoint binary format is state-backend specific. A savepoint taken with one state backend cannot be restored using another, so carefully consider which backend to use before going to production.
The JobManager serves as the central coordinator for each Flink deployment and is responsible for both scheduling and resource management of the cluster. It is a single point of failure within the cluster: if it crashes, no new jobs can be submitted, and running applications will fail.
Configuring High Availability, in conjunction with Apache ZooKeeper, allows for a swift recovery and is highly recommended for production setups.