Preface

Directory Structure

Currently, all configuration files are under the [conf] directory. The following simplified DolphinScheduler installation directory layout shows where the [conf] directory sits and which configuration files it contains. This document describes only the DolphinScheduler configurations; other modules are not covered.

[Note: DolphinScheduler is hereinafter referred to as 'DS'.]

Configurations in Details

dolphinscheduler-daemon.sh is responsible for DS startup and shutdown. Essentially, the cluster is started and stopped via dolphinscheduler-daemon.sh. Currently, DS ships only a basic JVM configuration; tune the JVM options further according to the resources actually available.

2.datasource.properties [datasource config properties]

DS uses Druid to manage database connections. The default simplified configurations are:

| Parameters | Default value | Description |
|---|---|---|
| spring.datasource.driver-class-name | | datasource driver |
| spring.datasource.url | | datasource connection url |
| spring.datasource.username | | datasource username |
| spring.datasource.password | | datasource password |
| spring.datasource.initialSize | 5 | initial connection pool size |
| spring.datasource.minIdle | 5 | minimum connection pool size |
| spring.datasource.maxActive | 5 | maximum connection pool size |
| spring.datasource.maxWait | 60000 | maximum wait time in milliseconds |
| spring.datasource.timeBetweenEvictionRunsMillis | 60000 | idle connection check interval |
| spring.datasource.timeBetweenConnectErrorMillis | 60000 | retry interval |
| spring.datasource.minEvictableIdleTimeMillis | 300000 | connections idle for longer than minEvictableIdleTimeMillis are collected during the idle check |
| spring.datasource.validationQuery | SELECT 1 | SQL run to validate a connection |
| spring.datasource.validationQueryTimeout | 3 | connection validation timeout [seconds] |
| spring.datasource.testWhileIdle | true | whether the pool validates the allocated connection when a new connection request comes |
| spring.datasource.testOnBorrow | true | validity check when the program requests a new connection |
| spring.datasource.testOnReturn | false | validity check when the program returns a connection |
| spring.datasource.defaultAutoCommit | true | whether to auto commit |
| spring.datasource.keepAlive | true | runs the validationQuery SQL so that the pool does not close a connection that idles longer than minEvictableIdleTimeMillis |
| spring.datasource.poolPreparedStatements | true | enable PSCache |
| spring.datasource.maxPoolPreparedStatementPerConnectionSize | 20 | size of PSCache on each connection |
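
For illustration, a datasource.properties fragment built from the defaults listed above might look like the sketch below. The driver class, URL, username, and password have no listed defaults, so the PostgreSQL driver, host, port, database name, and credentials shown here are assumptions to be replaced with your own values.

```properties
# Assumed values: a PostgreSQL database on localhost; replace with your own
spring.datasource.driver-class-name=org.postgresql.Driver
spring.datasource.url=jdbc:postgresql://localhost:5432/dolphinscheduler
spring.datasource.username=your_db_user
spring.datasource.password=your_db_password

# Pool sizing (default values from the list above)
spring.datasource.initialSize=5
spring.datasource.minIdle=5
spring.datasource.maxActive=5
spring.datasource.maxWait=60000

# Connection validation (default values from the list above)
spring.datasource.validationQuery=SELECT 1
spring.datasource.validationQueryTimeout=3
spring.datasource.testWhileIdle=true
spring.datasource.testOnBorrow=true
spring.datasource.testOnReturn=false
```

Comments in a .properties file must sit on their own line; a trailing `# ...` after a value would be read as part of the value.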

3.zookeeper.properties [zookeeper config properties]

| Parameters | Default value | Description |
|---|---|---|
| zookeeper.quorum | localhost:2181 | zookeeper cluster connection info |
| zookeeper.dolphinscheduler.root | /dolphinscheduler | root directory in zookeeper under which DS stores its data |
| zookeeper.session.timeout | 60000 | session timeout |
| zookeeper.connection.timeout | 30000 | connection timeout |
| zookeeper.retry.base.sleep | 100 | time to wait between subsequent retries |
| zookeeper.retry.max.sleep | 30000 | maximum time to wait between subsequent retries |
| zookeeper.retry.maxtime | 10 | maximum number of retries |
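
As a sketch of how these settings could be adapted to a multi-node ensemble, the fragment below points DS at three zookeeper servers. The hostnames zk1, zk2, and zk3 are hypothetical; the comma-separated host:port list is the usual zookeeper connection string format. All other values are the defaults listed above.

```properties
# Hypothetical three-node ensemble; replace zk1..zk3 with your own hosts
zookeeper.quorum=zk1:2181,zk2:2181,zk3:2181
zookeeper.dolphinscheduler.root=/dolphinscheduler

# Timeouts and retry policy (default values from the list above)
zookeeper.session.timeout=60000
zookeeper.connection.timeout=30000
zookeeper.retry.base.sleep=100
zookeeper.retry.max.sleep=30000
zookeeper.retry.maxtime=10
```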

4.common.properties [hadoop, s3, yarn config properties]

Currently, common.properties mainly holds the hadoop/s3a related configurations.

| Parameters | Default value | Description |
|---|---|---|
| server.port | 12345 | api service communication port |
| server.servlet.session.timeout | 7200 | session timeout |
| server.servlet.context-path | /dolphinscheduler | request path |
| spring.servlet.multipart.max-file-size | 1024MB | maximum file size |
| spring.servlet.multipart.max-request-size | 1024MB | maximum request size |
| server.jetty.max-http-post-size | 5000000 | jetty maximum post size |
| spring.messages.encoding | UTF-8 | message encoding |
| spring.jackson.time-zone | GMT+8 | time zone |
| spring.messages.basename | i18n/messages | i18n config |
| security.authentication.type | PASSWORD | authentication type |

6.master.properties [master-service log config]

| Parameters | Default value | Description |
|---|---|---|
| master.listen.port | 5678 | master listen port |
| master.exec.threads | 100 | number of master execution threads, which limits the number of process instances run in parallel |
| master.exec.task.num | 20 | number of tasks the master executes in parallel per process instance |
| master.dispatch.task.num | 3 | number of tasks the master dispatches per batch |
| master.host.selector | LowerWeight | master host selector used to pick a suitable worker. Optional values: Random, RoundRobin, LowerWeight |
| master.heartbeat.interval | 10 | master heartbeat interval, in seconds |
| master.task.commit.retryTimes | 5 | number of retries when the master commits a task |
| master.task.commit.interval | 1000 | master task commit interval, in milliseconds |
| master.max.cpuload.avg | -1 | maximum CPU load average for the master; the master server can schedule only while this value is higher than the system CPU load average. The default -1 means the number of CPU cores * 2 |
| master.reserved.memory | 0.3 | memory reserved for the master; the master server can schedule only while this value is lower than the system available memory. The unit is G |
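
The two scheduling thresholds are easiest to read with concrete numbers. The fragment below is a sketch only: the switch to RoundRobin and the 8-core host in the comments are assumptions chosen for illustration, not recommendations.

```properties
# Pick workers in rotation instead of the default LowerWeight
master.host.selector=RoundRobin

# -1 resolves to (number of CPU cores * 2). On an assumed 8-core host the
# threshold is 16, so the master schedules only while the system CPU load
# average stays below 16.
master.max.cpuload.avg=-1

# The master schedules only while the system available memory is above 0.3 G
master.reserved.memory=0.3
```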

7.worker.properties [worker-service log config]

8.alert.properties [alert-service log config]

| Parameters | Default value | Description |
|---|---|---|
| alert.type | EMAIL | alert type |
| mail.protocol | SMTP | mail server protocol |
| mail.server.host | | mail server host |
| mail.server.port | 25 | mail server port |
| mail.sender | xxx@xxx.com | mail sender email |
| mail.user | | mail sender email name |
| mail.passwd | 111111 | mail sender email password |
| mail.smtp.starttls.enable | true | whether to enable TLS for mail |
| mail.smtp.ssl.enable | false | whether to enable SSL for mail |
| mail.smtp.ssl.trust | xxx.xxx.com | mail SSL trust list |
| xls.file.path | /tmp/xls | temporary storage directory for mail attachments |
| The following configure WeCom [optional] | | |
| enterprise.wechat.enable | false | whether to enable WeCom |
| | xxxxxxx | WeCom corp id |
| enterprise.wechat.secret | xxxxxxx | WeCom secret |
| enterprise.wechat.agent.id | xxxxxxx | WeCom agent id |
| enterprise.wechat.users | xxxxxxx | WeCom users |
| enterprise.wechat.token.url | ?corpid=corpId&corpsecret=secret | WeCom token url |
| enterprise.wechat.push.url | https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token | WeCom push url |
| enterprise.wechat.user.send.msg | | send message format |
| enterprise.wechat.team.send.msg | | group message format |
| plugin.dir | /Users/xx/your/path/to/plugin/dir | plugin directory |

| Parameters | Default value | Description |
|---|---|---|
| org.quartz.jobStore.driverDelegateClass | org.quartz.impl.jdbcjobstore.StdJDBCDelegate | |
| org.quartz.jobStore.driverDelegateClass | org.quartz.impl.jdbcjobstore.PostgreSQLDelegate | |
| org.quartz.scheduler.instanceName | DolphinScheduler | |
| org.quartz.scheduler.instanceId | AUTO | |
| org.quartz.scheduler.makeSchedulerThreadDaemon | true | |
| org.quartz.jobStore.useProperties | false | |
| org.quartz.threadPool.class | org.quartz.simpl.SimpleThreadPool | |
| org.quartz.threadPool.makeThreadsDaemons | true | |
| org.quartz.threadPool.threadCount | 25 | |
| org.quartz.threadPool.threadPriority | 5 | |
| org.quartz.jobStore.class | org.quartz.impl.jdbcjobstore.JobStoreTX | |
| org.quartz.jobStore.tablePrefix | QRTZ_ | |
| org.quartz.jobStore.isClustered | true | |
| org.quartz.jobStore.misfireThreshold | 60000 | |
| org.quartz.jobStore.clusterCheckinInterval | 5000 | |
| org.quartz.jobStore.acquireTriggersWithinLock | true | |
| org.quartz.jobStore.dataSource | myDs | |
| org.quartz.dataSource.myDs.connectionProvider.class | org.apache.dolphinscheduler.service.quartz.DruidConnectionProvider | |
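
The parameter org.quartz.jobStore.driverDelegateClass appears twice above with different values. A plausible reading, sketched below, is that the two values are alternatives and only one should be active, depending on the database behind the job store: in Quartz, StdJDBCDelegate is the generic JDBC delegate and PostgreSQLDelegate is the PostgreSQL-specific one. The choice of PostgreSQL as the active line is an assumption made for illustration.

```properties
# Generic JDBC delegate (for example with MySQL); commented out in this sketch
#org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate

# PostgreSQL-specific delegate; assumed active here
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

# Clustered JDBC job store (default values from the list above)
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.dataSource=myDs
```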

10.install_config.conf [DS environment variables configuration script[install/start DS]]

install_config.conf is a bit complicated and is mainly used in the following two places.

  • 2.Startup/shutdown DS cluster

11.dolphinscheduler_env.sh [load environment variables configs]

When tasks are committed through the shell, DS loads the environment variables defined in dolphinscheduler_env.sh into the host. The task types involved include Shell, Python, Spark, Flink, and DataX tasks, among others.

12. Services logback configs