MongoDB database data source
The properties supported by this source are listed below (* indicates a required field).
Description of the data source
Pre-defined MongoDB connection
Database to connect to. Example: customerdb
Collection to fetch data from. Example: products
Source schema to assist during the design of the pipeline
Comma-separated list of fields (column names) to select from the source. Default: *
SQL WHERE clause for filtering records. Examples: date = '2022-01-01'; year = 22 and month = 6 and day = 2
Select rows with distinct column values. Default: false
The full class name of the partitioner. Example: com.mongodb.spark.sql.connector.read.partitioner.SamplePartitioner; Default: com.mongodb.spark.sql.connector.read.partitioner.SamplePartitioner
The length of time, in milliseconds, to keep a MongoClient available for sharing. Example: 100,000; Default: 5,000
The number of documents to sample from the collection when inferring the schema. Example: 1,000; Default: 1,000
Normalizes column names by replacing the special characters ,;{}()&/\n\t= and space with the given string. Example: _
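
The column-name normalization described above can be illustrated with a minimal Python sketch. It is not the data source's actual implementation. The function name normalize_column_name is hypothetical, \n and \t are assumed to mean the newline and tab characters, and each special character is assumed to be replaced individually (runs are not collapsed).

```python
import re

# Characters listed in the property description:
# , ; { } ( ) & / newline tab = and space
# Assumption: "\n" and "\t" denote the newline and tab characters.
_SPECIAL_CHARS = re.compile(r"[,;{}()&/\n\t= ]")


def normalize_column_name(name: str, replacement: str = "_") -> str:
    """Replace every special character in `name` with `replacement`.

    Hypothetical helper: each matching character is replaced on its own,
    so two adjacent special characters yield two replacement strings.
    """
    # A callable is used so the replacement is inserted literally,
    # without regex backslash-escape processing.
    return _SPECIAL_CHARS.sub(lambda _match: replacement, name)


if __name__ == "__main__":
    for raw in ["order id (usd)", "a=b;c", "plain"]:
        print(repr(raw), "->", repr(normalize_column_name(raw)))
```

With the example value _ as the replacement string, a column named "order id (usd)" would become "order_id__usd_" under these assumptions.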