JDBC

JDBC data sink

Properties

The properties supported by this sink are listed below (* indicates a required field). Illustrative batch and streaming write sketches follow the table.
| Property | Description |
| --- | --- |
| Name * | Name of the data sink. |
| Description | Description of the data sink. |
| Processing Mode | Select for batch, deselect for streaming. When 'Batch' is selected the switch is set to `true`; when 'Streaming' is selected it is set to `false`. Default: `true`. |
| Connection * | Pre-defined JDBC connection. |
| Table * | Database table to write into. |
| Select Fields / Columns | Comma-separated list of fields/columns to select from the inputs to the sink. Example: `id, name`. Default: `*`. |
| Output Mode | In batch mode, one of `Append`, `Overwrite`, `ErrorIfExists`, or `Ignore`. In streaming mode, one of `append`, `complete`, or `update`. Default: `ErrorIfExists`. |
| Number of Partitions | The maximum number of partitions that can be used for parallelism when writing the table. This also determines the maximum number of concurrent JDBC connections. Example: `3`. Default: `0`. |
| Query Timeout | The number of seconds the driver will wait for a `Statement` object to execute. Zero means there is no limit. This option depends on how the JDBC driver implements the `setQueryTimeout` API; for example, the H2 JDBC driver checks the timeout of each query rather than of an entire JDBC batch. Example: `6000`. Default: `0`. |
| Batch Size | The JDBC batch size, which determines how many rows are inserted per round trip. Larger batches can improve performance on some JDBC drivers. This option applies only to writing. Example: `200`. Default: `1000`. |
| Isolation Level | The transaction isolation level applied to the current connection: one of `NONE`, `READ_COMMITTED`, `READ_UNCOMMITTED`, `REPEATABLE_READ`, or `SERIALIZABLE`, corresponding to the standard transaction isolation levels defined by JDBC's `Connection` object. Please refer to the documentation of `java.sql.Connection`. Example: `READ_COMMITTED`. Default: `READ_UNCOMMITTED`. |
| Truncate | When the Output Mode is set to `Overwrite`, this option causes Spark to truncate the existing table instead of dropping and recreating it. This can be more efficient and prevents the table metadata (e.g., indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. Default: `false`. |
| Cascade Truncate | If enabled and supported by the JDBC database (PostgreSQL and Oracle at the moment), this option allows execution of a `TRUNCATE TABLE t CASCADE` (in the case of PostgreSQL, a `TRUNCATE TABLE ONLY t CASCADE` is executed to prevent inadvertently truncating descendant tables). This affects other tables and should therefore be used with care. It defaults to the cascading truncate behaviour of the JDBC database in question, as specified by `isCascadeTruncate` in each `JDBCDialect`. Default: `false`. |
| Create Table Options | If specified, this option allows setting database-specific table and partition options when creating the table. Example: `CREATE TABLE t (name string) ENGINE=InnoDB`. |
| Create Table Column Types | The database column data types to use instead of the defaults when creating the table. Data type information should be specified in the same format as `CREATE TABLE` columns syntax, and the specified types should be valid Spark SQL data types. Example: `name CHAR(64), comments VARCHAR(1024)`. |
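To make the mapping concrete, below is a minimal sketch of a batch write using Spark's standard JDBC `DataFrameWriter` options, which is what these sink properties correspond to when the pipeline runs on Spark. The connection URL, credentials, table name, and input dataset are hypothetical placeholders, not part of this documentation.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("jdbc-sink-sketch").getOrCreate()

// Hypothetical upstream dataset standing in for the sink's input.
val df = spark.read.parquet("/data/events")

df.select("id", "name")                                  // Select Fields / Columns
  .write
  .format("jdbc")
  .option("url", "jdbc:postgresql://db-host:5432/appdb") // from the pre-defined Connection
  .option("user", "writer")                              // hypothetical credentials
  .option("password", sys.env("DB_PASSWORD"))
  .option("dbtable", "public.events")                    // Table
  .option("numPartitions", "3")                          // Number of Partitions
  .option("queryTimeout", "6000")                        // Query Timeout (seconds)
  .option("batchsize", "200")                            // Batch Size (rows per round trip)
  .option("isolationLevel", "READ_COMMITTED")            // Isolation Level
  .option("truncate", "true")                            // Truncate instead of drop/recreate on Overwrite
  .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)")
  .mode(SaveMode.Overwrite)                              // Output Mode (batch)
  .save()
```

Note that `numPartitions` bounds both write parallelism and the number of concurrent JDBC connections, so it is typically sized against the database's connection limits rather than the cluster's core count.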
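For the streaming case, Spark does not ship a native JDBC sink for Structured Streaming; a common pattern is to route each micro-batch through the ordinary batch JDBC path via `foreachBatch`. The sketch below assumes that pattern; the `rate` source, checkpoint path, URL, and table are placeholders.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.streaming.OutputMode

val spark = SparkSession.builder().appName("jdbc-streaming-sink-sketch").getOrCreate()

// Placeholder streaming source; the real input is the sink's upstream stream.
val stream = spark.readStream.format("rate").load()

val query = stream.writeStream
  .outputMode(OutputMode.Append())                       // Output Mode (streaming): append / complete / update
  .option("checkpointLocation", "/tmp/checkpoints/jdbc-sink") // hypothetical path
  .foreachBatch { (batch: DataFrame, _: Long) =>
    // Each micro-batch is written with a standard batch JDBC write.
    // SaveMode.Append accumulates rows; Overwrite here would replace
    // the table's contents on every micro-batch.
    batch.write
      .format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/appdb")
      .option("user", "writer")
      .option("password", sys.env("DB_PASSWORD"))
      .option("dbtable", "public.events")
      .mode(SaveMode.Append)
      .save()
  }
  .start()

query.awaitTermination()
```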