Standalone
The standalone mode is the most barebones way of deploying Flink: the Flink services described in the deployment overview are simply launched as processes on the operating system. Unlike deploying Flink with a resource provider such as Kubernetes or YARN, you have to take care of restarting failed processes and of allocating and de-allocating resources during operation yourself.
The subpages of the standalone mode resource provider describe further deployment methods that build on the standalone mode: in Docker containers and on Kubernetes.
Preparation
Flink runs on all UNIX-like environments, e.g. Linux, Mac OS X, and Cygwin (for Windows). Before you start to set up the system, make sure it fulfils the following requirements:
- Java 1.8.x or higher is installed.
- A recent Flink distribution has been downloaded from the [download page](http://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/resource-providers/standalone/overview/) and unpacked.
Starting a Standalone Cluster (Session Mode)
These steps show how to launch a Flink standalone cluster, and submit an example job:
# (1) Start Cluster
$ ./bin/start-cluster.sh
# (2) You can now access the Flink Web Interface on http://localhost:8081
# (3) Submit example job
$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
# (4) Stop the cluster again
$ ./bin/stop-cluster.sh
In step (1), we’ve started 2 processes: a JVM for the JobManager, and a JVM for the TaskManager. The JobManager is serving the web interface accessible at localhost:8081. In step (3), we are starting a Flink Client (a short-lived JVM process) that submits an application to the JobManager.
To start a Flink JobManager with an embedded application, we use the bin/standalone-job.sh script. We demonstrate this mode by locally starting the TopSpeedWindowing.jar example, running on a single TaskManager.
The application jar file needs to be available in the classpath. The easiest way to achieve that is to put the jar into the lib/ folder:
$ cp ./examples/streaming/TopSpeedWindowing.jar lib/
Then, we can launch the JobManager:
$ ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
The web interface is now available at localhost:8081. However, the application won’t be able to start, because there are no TaskManagers running yet:
$ ./bin/taskmanager.sh start
Note: You can start multiple TaskManagers, if your application needs more resources.
Stopping the services is also supported via the scripts. Call them multiple times if you want to stop multiple instances, or use stop-all:
$ ./bin/taskmanager.sh stop
$ ./bin/standalone-job.sh stop
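Since starting several TaskManagers just means calling the script repeatedly, the repetition can be wrapped in a small helper. This is a minimal sketch: the function name start_taskmanagers and the FLINK_HOME variable are hypothetical and not part of the Flink distribution.

```shell
# Hypothetical helper: start N TaskManagers by calling the script N times.
# FLINK_HOME is an assumed variable pointing at the unpacked distribution;
# it defaults to the current directory.
start_taskmanagers() {
  local count="$1"
  local flink_home="${FLINK_HOME:-.}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    "${flink_home}/bin/taskmanager.sh" start || return 1
    i=$((i + 1))
  done
}

# Usage: start_taskmanagers 3
# Afterwards, all of them can be stopped with: ./bin/taskmanager.sh stop-all
```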
Session Mode
For high-level intuition behind the session mode, please refer to the deployment overview.
Configuration
All available configuration options are listed on the configuration page; in particular, the Basic Setup section contains good advice on configuring ports, memory, parallelism, etc.
If Flink is behaving unexpectedly, we recommend looking at Flink’s log files as a starting point for further investigations.
The log files are located in the logs/ directory. There’s a .log file for each Flink service running on this machine. In the default configuration, log files are rotated on each start of a Flink service: older runs of a service have a number suffixed to the log file.
Alternatively, logs are available from the Flink web frontend (both for the JobManager and each TaskManager).
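When investigating from the command line, it is often enough to look at the most recently written log file. The helper below is a sketch under that assumption; the name latest_log is hypothetical, and it relies only on the files ending in .log as described above.

```shell
# Hypothetical helper: print the most recently modified .log file in a directory.
# The directory defaults to "logs", relative to the current directory.
latest_log() {
  local log_dir="${1:-logs}"
  # ls -t sorts by modification time, newest first; errors (e.g. no logs yet) are hidden.
  ls -t "$log_dir"/*.log 2>/dev/null | head -n 1
}

# Usage: tail -n 100 "$(latest_log ./logs)"
```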
By default, Flink logs at the “INFO” level, which provides basic information for all obvious issues. For cases where Flink seems to behave wrongly, switching the log level to “DEBUG” is advised. The logging level is controlled via the conf/log4j.properties file. Setting rootLogger.level = DEBUG will bootstrap Flink on the DEBUG log level.
There’s a dedicated page on logging in Flink.
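Changing the level is a one-line edit, so it can be scripted. The sketch below assumes the properties file contains a line of the form "rootLogger.level = INFO"; the function name set_root_log_level is hypothetical. It keeps a .bak copy of the file so the change is easy to undo.

```shell
# Hypothetical helper: switch the root logger level in a log4j properties file.
# Assumes the file contains a line of the form "rootLogger.level = <LEVEL>".
set_root_log_level() {
  local file="$1"
  local level="$2"
  # Keep a backup and write the edited copy back to the original path.
  cp "$file" "${file}.bak"
  sed -E "s/^(rootLogger\.level[[:space:]]*=[[:space:]]*).*/\1${level}/" "${file}.bak" > "$file"
}

# Usage: set_root_log_level conf/log4j.properties DEBUG
```

The new level only takes effect for Flink services started after the change.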
Component Management Scripts
Starting and Stopping a cluster
bin/start-cluster.sh and bin/stop-cluster.sh rely on conf/masters and conf/workers to determine the number of cluster component instances.
If password-less SSH access to the listed machines is configured, and they share the same directory structure, the scripts also support starting and stopping instances remotely.
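Before relying on the remote start, it can help to confirm that every listed machine really accepts password-less logins. The following is a sketch, not part of Flink: the function name check_passwordless_ssh is hypothetical, and it assumes a file with one plain hostname per line, such as conf/workers.

```shell
# Hypothetical helper: verify password-less SSH access to every host in a file.
# Assumes one plain hostname per line (as in conf/workers).
check_passwordless_ssh() {
  local hosts_file="$1"
  local host
  local failed=0
  while IFS= read -r host || [ -n "$host" ]; do
    [ -z "$host" ] && continue
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # -n keeps ssh from consuming the rest of the hosts file on stdin.
    if ! ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      echo "no password-less SSH access to ${host}" >&2
      failed=1
    fi
  done < "$hosts_file"
  return "$failed"
}

# Usage: check_passwordless_ssh conf/workers && ./bin/start-cluster.sh
```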
Example 1: Start a cluster with 2 TaskManagers locally
conf/masters contents:
localhost
conf/workers contents:
localhost
localhost
Example 2: Start a distributed cluster
This assumes a cluster with 4 machines (master1, worker1, worker2, worker3), which can all reach each other over the network.
conf/masters contents:
master1
conf/workers contents:
worker1
worker2
worker3
Note that the configuration key jobmanager.rpc.address needs to be set to master1 for this to work.
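The layout from Example 2 can also be generated by a script, which helps when the host list changes often. This is a minimal sketch; the function name write_cluster_conf is hypothetical. It writes the two host lists and only warns about jobmanager.rpc.address rather than editing flink-conf.yaml itself.

```shell
# Hypothetical helper: write conf/masters and conf/workers for one JobManager
# host and any number of TaskManager hosts.
write_cluster_conf() {
  local conf_dir="$1"
  local master="$2"
  shift 2
  mkdir -p "$conf_dir"
  printf '%s\n' "$master" > "${conf_dir}/masters"
  # Remaining arguments are the worker hosts, one per line.
  printf '%s\n' "$@" > "${conf_dir}/workers"
  # Warn (without modifying the file) if the RPC address does not point at the master.
  if ! grep -Eq "^jobmanager\.rpc\.address:[[:space:]]*${master}[[:space:]]*$" \
      "${conf_dir}/flink-conf.yaml" 2>/dev/null; then
    echo "warning: set jobmanager.rpc.address: ${master} in ${conf_dir}/flink-conf.yaml" >&2
  fi
}

# Usage: write_cluster_conf ./conf master1 worker1 worker2 worker3
```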
We show a third example with a standby JobManager in the Setting up High-Availability section.
Starting and Stopping Flink Components
The scripts can be called multiple times, for example if multiple TaskManagers are needed. The instances are tracked by the scripts, and can be stopped one-by-one (using stop) or all together (using stop-all).
Windows Cygwin Users
If you are installing Flink from the git repository and you are using the Windows git shell, Cygwin can produce a failure similar to this one:
c:/flink/bin/start-cluster.sh: line 30: $'\r': command not found
This error occurs because git is automatically transforming UNIX line endings to Windows style line endings when running on Windows. The problem is that Cygwin can only deal with UNIX style line endings. The solution is to adjust the Cygwin settings to deal with the correct line endings by following these three steps:
Start a Cygwin shell.
Determine your home directory by entering
cd; pwd
This will return a path under the Cygwin root path.
Using NotePad, WordPad or a different text editor, open the file .bash_profile in the home directory and append the following (if the file does not exist you will have to create it):
Save the file and open a new bash shell.
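The lines to append are not reproduced in the text above. A setting commonly used for this purpose is the igncr shell option of Cygwin's bash, which makes the shell ignore carriage returns in scripts. It is shown here as an assumption rather than as the original snippet, and the uname guard is an addition so that the fragment does nothing in other environments.

```shell
# Assumed .bash_profile addition, only meaningful in Cygwin's bash.
case "$(uname -s)" in
  CYGWIN*)
    # Export SHELLOPTS so child shells inherit the option, then ignore "\r" in scripts.
    export SHELLOPTS
    set -o igncr
    ;;
esac
```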
Setting up High-Availability
In order to enable HA for a standalone cluster, you have to use the ZooKeeper HA services.
Additionally, you have to configure your cluster to start multiple JobManagers.
In order to start an HA-cluster, configure the masters file in conf/masters:
- masters file: The masters file contains all hosts, on which JobManagers are started, and the ports to which the web user interface binds.
[...]
masterX:webUIPortX
By default, the JobManager will pick a random port for inter-process communication. You can change this via the high-availability.jobmanager.port key. This key accepts single ports (e.g. 50010), ranges (50000-50025), or a combination of both (50010,50011,50020-50025,50050-50075).
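To see which ports a given value covers, for example when opening firewall rules, the specification can be expanded into a plain list. This is an illustrative sketch only: the function name expand_ports is hypothetical, and Flink parses the key itself without any such helper.

```shell
# Hypothetical helper: expand a port specification such as
# "50010,50011,50020-50025" into one port per line.
expand_ports() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r part; do
    case "$part" in
      # A range "A-B" expands to every port from A to B inclusive.
      *-*) seq "${part%-*}" "${part#*-}" ;;
      # A single port is printed as-is.
      *) printf '%s\n' "$part" ;;
    esac
  done
}

# Usage: expand_ports "50010,50011,50020-50025"
```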
Example: Standalone HA Cluster with 2 JobManagers
- Configure high availability mode and ZooKeeper quorum in conf/flink-conf.yaml:
high-availability: zookeeper
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /cluster_one # important: customize per cluster
high-availability.storageDir: hdfs:///flink/recovery
- Configure masters in conf/masters:
localhost:8081
localhost:8082
- Configure ZooKeeper server in conf/zoo.cfg (currently it’s only possible to run a single ZooKeeper server per machine):
server.0=localhost:2888:3888
- Start ZooKeeper quorum:
$ ./bin/start-zookeeper-quorum.sh
Starting zookeeper daemon on host localhost.
- Start an HA-cluster:
$ ./bin/start-cluster.sh
- Stop ZooKeeper quorum and cluster:
$ ./bin/stop-cluster.sh
Stopping taskexecutor daemon (pid: 7647) on localhost.
Stopping standalonesession daemon (pid: 7495) on host localhost.
Stopping standalonesession daemon (pid: 7349) on host localhost.
$ ./bin/stop-zookeeper-quorum.sh
Stopping zookeeper daemon (pid: 7101) on host localhost.
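A quick pre-flight check can catch a missing HA key before the cluster is started. The sketch below looks for three of the keys used in the example above; the function name check_ha_conf is hypothetical, and it assumes the simple "key: value" line format shown for conf/flink-conf.yaml.

```shell
# Hypothetical helper: check that the HA keys from the example are present.
# Assumes one "key: value" entry per line, starting at the beginning of the line.
check_ha_conf() {
  local conf_file="$1"
  local key
  local pattern
  local missing=0
  for key in high-availability high-availability.zookeeper.quorum high-availability.storageDir; do
    # Escape dots so they match literally, and require the colon right after the key.
    pattern="^$(printf '%s' "$key" | sed 's/\./\\./g'):"
    if ! grep -Eq "$pattern" "$conf_file"; then
      echo "missing key: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_ha_conf conf/flink-conf.yaml && ./bin/start-cluster.sh
```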
In Standalone mode, the following jars are recognized as user jars and included in the user classpath:
- Session Mode: The JAR file specified in the startup command.
Please refer to the Debugging Classloading Docs for details.