Properties supported in this sink are shown below (* indicates a required field).
| Property | Description |
| --- | --- |
| Name* | Name of the data sink. |
| Description | Description of the data sink. |
| Format* | The value of this field is always 'org.apache.spark.sql.cassandra'. Default: org.apache.spark.sql.cassandra |
| Processing Mode | Select for batch, deselect for streaming. Selecting 'Batch' sets the switch to true; selecting 'Streaming' sets it to false. Default: true |
| Connection* | Pre-defined Cassandra connection. |
| Table* | Cassandra table name to write to. Example: table_test |
| Select Fields / Columns | Comma-separated list of fields/columns to select from the inputs to the sink. Example: id, name, city, state, zip. Default: * |
| Output Mode | In batch mode, the value must be one of Append, Overwrite, ErrorIfExists, or Ignore. In streaming mode, it must be one of append, complete, or update. Default: Append |
| Partition By | Comma-separated column names to partition by. Example: year, month, day |
| Output Batch Grouping Buffer Size | Number of batches per single Spark task to be stored in memory before sending to Cassandra. Example: 500. Default: 1000 |
| Batch Grouping Key | Determines how insert statements are grouped into batches. Default: partition |
| Output Batch Size | Maximum total size of a batch in bytes. Overridden by spark.cassandra.output.batch.size.rows. Default: 1024 |
| Output Batch Size Rows | Number of rows per single batch. The default is 'auto', meaning the connector adjusts the number of rows based on the amount of data in each row. Default: None |
| Output Concurrent Writes | Maximum number of batches executed in parallel by a single Spark task. Default: 5 |
| Enable Pushdown | Enables pushing down predicates to C* when applicable. Default: true |
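For reference, a minimal sketch of how these properties map onto a batch write with the Spark Cassandra Connector. The session setup, input path, and keyspace ('test_ks') are assumptions for illustration; in the sink itself, connection details come from the pre-defined Cassandra connection.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical session setup; the host would normally come from the
// pre-defined Cassandra connection rather than being set here.
val spark = SparkSession.builder()
  .appName("CassandraSinkBatch")
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .getOrCreate()

// Placeholder input; "Select Fields / Columns" becomes a plain select.
val df = spark.read.parquet("/path/to/input")
  .select("id", "name", "city", "state", "zip")

df.write
  .format("org.apache.spark.sql.cassandra") // Format
  .mode(SaveMode.Append)                    // Output Mode (batch)
  .option("keyspace", "test_ks")            // assumed keyspace
  .option("table", "table_test")            // Table
  // Tuning options from the table above (values shown are the defaults):
  .option("spark.cassandra.output.batch.grouping.buffer.size", "1000")
  .option("spark.cassandra.output.batch.grouping.key", "partition")
  .option("spark.cassandra.output.batch.size.bytes", "1024")
  .option("spark.cassandra.output.concurrent.writes", "5")
  .save()
```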
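The streaming case (Processing Mode deselected) is similar, assuming a connector version that supports Structured Streaming writes (2.5+) and a streaming DataFrame whose schema matches the target table; the checkpoint location and trigger interval below are hypothetical.

```scala
import org.apache.spark.sql.streaming.Trigger

// streamingDf: any streaming DataFrame whose schema matches table_test.
val query = streamingDf.writeStream
  .format("org.apache.spark.sql.cassandra")
  .outputMode("append")                                     // Output Mode (streaming)
  .option("checkpointLocation", "/tmp/cassandra-sink-ckpt") // required for streaming sinks
  .option("keyspace", "test_ks")
  .option("table", "table_test")
  .trigger(Trigger.ProcessingTime("10 seconds"))            // hypothetical trigger
  .start()

query.awaitTermination()
```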