Delta

Delta file data sink

Properties

Properties supported by this sink are listed below (* indicates a required field).
Property
Description
Name *
Name of the data sink
Description
Description of the data sink
Processing Mode
Select for batch processing; clear for streaming. When 'Batch' is selected the switch is set to true; when 'Streaming' is selected it is set to false.
Default: true
Path or Table Name *
Path where the file is located, or the name of the Delta table if saved to the data catalog
Output Mode
In batch mode, the value should be one of Append, Overwrite, ErrorIfExists, or Ignore. In streaming mode, it should be one of append, complete, or update.
Default: ErrorIfExists
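The batch and streaming mode sets above can be sketched as a small validation helper. This is an illustrative sketch, not part of the product; the function name, the case-insensitive matching, and the normalization are assumptions.

```python
# Allowed output modes, as documented for this sink.
# Matching is case-insensitive here for convenience (an assumption,
# not documented product behavior).
BATCH_MODES = {"append", "overwrite", "errorifexists", "ignore"}
STREAMING_MODES = {"append", "complete", "update"}


def validate_output_mode(mode, batch=True):
    """Return the normalized output mode, or raise if it is invalid."""
    allowed = BATCH_MODES if batch else STREAMING_MODES
    normalized = mode.lower()
    if normalized not in allowed:
        kind = "batch" if batch else "streaming"
        raise ValueError(f"{mode!r} is not a valid {kind} output mode")
    return normalized
```

For example, validate_output_mode("ErrorIfExists") is accepted for batch, while "complete" is only accepted when batch=False.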
Checkpoint Location
Path to the checkpoint directory and file.
Example: s3a://output_bucket/some_folder
SQL to execute on each partition
SQL to execute in Delta streaming mode.
Example: select job from employee where city = 'NYC'
Select Fields / Columns
Comma-separated list of fields/columns to select from the inputs to the sink.
Example: name, city, country
Default: *
Partition Overwrite Mode
Select dynamic mode to overwrite all existing data in each logical partition for which the write commits new data. Any existing logical partitions for which the write does not contain data remain unchanged. This partition overwrite is only applicable when data is being written in overwrite mode.
Default: static
Replace Where SQL
SQL predicate that selects the data to be replaced on overwrite.
Example: date >= '2017-01-01' AND date <= '2017-01-31'
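Together with the Partition Overwrite Mode property above, this maps onto Delta Lake's writer options (replaceWhere and partitionOverwriteMode). A minimal sketch of how the two properties might be assembled; the helper name and dict shape are hypothetical, not the sink's actual implementation.

```python
def overwrite_options(replace_where=None, partition_overwrite_mode="static"):
    """Build writer options for an overwrite-mode write.

    Mirrors e.g. df.write.format("delta").mode("overwrite")
                   .option("replaceWhere", ...)
                   .option("partitionOverwriteMode", ...)
    """
    if partition_overwrite_mode not in ("static", "dynamic"):
        raise ValueError("partition overwrite mode must be 'static' or 'dynamic'")
    opts = {"partitionOverwriteMode": partition_overwrite_mode}
    if replace_where:
        # SQL predicate selecting the rows to be replaced on overwrite.
        opts["replaceWhere"] = replace_where
    return opts
```

For instance, overwrite_options("date >= '2017-01-01' AND date <= '2017-01-31'") replaces only January 2017 data while leaving other partitions untouched.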
Trigger
The trigger settings of a streaming query define the timing of streaming data processing: whether the query is executed as a micro-batch query with a fixed batch interval or as a continuous-processing query.
Overwrite Schema
Select to overwrite the schema on write. Changing a column's type or name, or dropping a column, requires rewriting the table; to do this, enable the Overwrite Schema option.
Default: false
Merge Schema
Select to merge the schema into the existing one. Columns present in the DataFrame but missing from the table are automatically added as part of the write transaction. Note: spark.databricks.delta.schema.autoMerge.enabled must be true.
Default: false
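The two schema switches above correspond to the Delta Lake writer options mergeSchema and overwriteSchema. A hedged sketch of that mapping; the helper name is illustrative and not part of this sink.

```python
def schema_options(merge_schema=False, overwrite_schema=False):
    """Return writer options for the Merge Schema / Overwrite Schema switches.

    Mirrors e.g. df.write.format("delta").option("mergeSchema", "true").
    Overwrite Schema additionally requires writing in overwrite mode.
    """
    opts = {}
    if merge_schema:
        # Per the note above, the session conf
        # spark.databricks.delta.schema.autoMerge.enabled must also be true.
        opts["mergeSchema"] = "true"
    if overwrite_schema:
        opts["overwriteSchema"] = "true"
    return opts
```

With both switches off the helper returns an empty dict, matching the documented defaults of false.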
Partition By
Comma-separated column names to partition by.
Example: year, month, day
Part Files Per Partition
Number of part files to write per partition column.
Warning: setting this value may degrade performance drastically.