Redshift database data source
Properties supported in this source are shown below (* indicates required fields)
Description of the data source
Pre-defined Redshift connection
Database table that should be read OR a query that will be used to read data from the Redshift source. Example: dbtable
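The table-or-query choice above maps onto two mutually exclusive reader options. A minimal sketch, assuming the Spark-Redshift connector's `dbtable`/`query` option names; the helper and its parameters are illustrative, not part of the product API:

```python
def reader_options(connection_url, tempdir, dbtable=None, query=None):
    """Build connector read options (hypothetical helper).

    Exactly one of `dbtable` (a table name) or `query` (a SQL query)
    must be supplied, mirroring the either/or rule described above.
    """
    if (dbtable is None) == (query is None):
        raise ValueError("Specify exactly one of dbtable or query")
    opts = {"url": connection_url, "tempdir": tempdir}
    if dbtable is not None:
        opts["dbtable"] = dbtable
    else:
        opts["query"] = query
    return opts
```

The returned dict could then be passed to a Spark reader via `.options(**opts)`.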
Source schema to assist during the design of the pipeline
Comma-separated list of fields / column names to select from the source. Default: *
SQL WHERE clause for filtering records. Examples: date = '2022-01-01'; year = 22 and month = 6 and day = 2
Select rows with distinct column values. Default: false
Distribution style to be used when creating a table. When using KEY, you must also set a distribution key
The name of a column in the table to use as the distribution key when creating a table
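The two properties above are coupled: KEY distribution is only meaningful with a distribution key column. A small validation sketch of that rule (the helper and the accepted style names are assumptions based on Redshift's documented DISTSTYLE values, not product code):

```python
# Redshift's documented distribution styles (assumption: the source accepts these).
VALID_DISTSTYLES = {"EVEN", "KEY", "ALL", "AUTO"}

def validate_distribution(diststyle, distkey=None):
    """Check the diststyle/distkey pairing described above (hypothetical helper)."""
    style = diststyle.upper()
    if style not in VALID_DISTSTYLES:
        raise ValueError(f"Unknown diststyle: {diststyle}")
    if style == "KEY" and not distkey:
        raise ValueError("diststyle KEY requires a distkey column")
    if style != "KEY" and distkey:
        raise ValueError("distkey is only valid with diststyle KEY")
    return style
```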
Sort Keys supported by Redshift
A semicolon-separated list of SQL commands that are executed before data is transferred between Spark and Redshift
A semicolon-separated list of SQL commands that are executed after data is transferred between Spark and Redshift
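Both the pre- and post-action properties take one semicolon-separated string, so each must be split into individual statements before execution. A sketch of that split (the helper name is illustrative):

```python
def split_sql_actions(actions):
    """Split a semicolon-separated action string into individual SQL
    statements, dropping empty fragments such as a trailing semicolon."""
    return [stmt.strip() for stmt in actions.split(";") if stmt.strip()]
```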
A list of extra options to append to the Redshift COPY command when loading data, e.g. TRUNCATECOLUMNS or MAXERROR (see the Redshift docs for other options)
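To show where those extra options land, here is a sketch of assembling a COPY command with them appended at the end; the exact statement the connector generates (source path, credentials clause) is an assumption here, only the appending behaviour is the point:

```python
def copy_statement(table, s3_path, iam_role, extra_options=()):
    """Assemble a Redshift COPY command (hypothetical helper); any extra
    options, e.g. TRUNCATECOLUMNS or MAXERROR 5, are appended verbatim."""
    stmt = f"COPY {table} FROM '{s3_path}' IAM_ROLE '{iam_role}'"
    if extra_options:
        stmt += " " + " ".join(extra_options)
    return stmt
```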
Normalizes column names by replacing special characters ,;{}()&/\n\t= and space with the given string. Example: _
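The normalization rule above can be sketched as a per-character replacement. This reading of the character list (comma, semicolon, braces, parentheses, ampersand, slash, newline, tab, equals, space) is an assumption, as is the helper itself:

```python
# Characters assumed to be "special" per the property description above.
_SPECIAL = set(",;{}()&/=\n\t ")

def normalize_column(name, replacement="_"):
    """Replace each special character in a column name with the given string."""
    return "".join(replacement if ch in _SPECIAL else ch for ch in name)
```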