Google Cloud PubSub

    Copied to clipboard!

    Note that the streaming connectors are currently not part of the binary distribution. See here for information about how to package the program with the libraries for cluster execution.

    The connector provides a connectors for receiving and sending messages from and to Google PubSub. Google PubSub has an guarantee and as such the connector delivers the same guarantees.

    The class PubSubSource has a builder to create PubSubsources:

    There are several optional methods to alter how the PubSubSource is created, the bare minimum is to provide a Google project, Pubsub subscription and a way to deserialize the PubSubMessages.

    Example:

    Currently the source functions messages from PubSub, push endpoints are not supported.

    The class PubSubSink has a builder to create PubSubSinks.

    This builder works in a similar way to the PubSubSource.

    Example:

    Google uses to authenticate and authorize applications so that they can use Google Cloud Platform resources (such as PubSub).

    Both builders allow you to provide these credentials but by default the connectors will look for an environment variable: GOOGLE_APPLICATION_CREDENTIALS which should point to a file containing the credentials.

    If you want to provide Credentials manually, for instance if you read the Credentials yourself from an external system, you can use PubSubSource.newBuilder(...).withCredentials(...).

    The following example shows how you would create a source to read messages from the emulator and send them back:

    SourceFunction

    There are several reasons why a message might be send multiple times, such as failure scenarios on Google PubSub’s side.

    Another reason is when the acknowledgement deadline has passed. This is the time between receiving the message and acknowledging the message. The PubSubSource will only acknowledge a message on successful checkpoints to guarantee at-least-once. This does mean if the time between successful checkpoints is larger than the acknowledgment deadline of your subscription messages will most likely be processed multiple times.

    For this reason it’s recommended to have a (much) lower checkpoint interval than acknowledgement deadline.

    See PubSub for details on how to increase the acknowledgment deadline of your subscription.

    Note: The metric shows how many messages are waiting for the next checkpoint before they will be acknowledged.

    SinkFunction

    The sink function buffers messages that are to be send to PubSub for a short amount of time for performance reasons. Before each checkpoint this buffer is flushed and the checkpoint will not succeed unless the messages have been delivered to PubSub.