Database Properties¶
Unless otherwise noted, these properties can be used when configuring nodes for any database data source or sink.
In JSON node files, these fields can be used to configure the source or sink generally or used to configure specific keys (steps) within a source or sink definition.
For the properties specific to each connector, consult the documentation for Supported Connectors.
Note
Source/sink names are auto-generated. The system generates the default names, "source" and "sink," upon creation of a DataMapper node, as well as the default folders, "data_sources" and "data_sinks." Other elements inherit those names: the file generated within the node's data_sources folder is source.json, the file in the node's data_sinks folder is sink.json, and mapping keys are mappingname_source and mappingname_sink. Any name changes must be updated in the node's notebook.json and in the source and sink files.
Access database properties¶
Database source and sink properties can be found across the different tabs of the Node Editor and in the JSON file of the File Editor.
Database source properties¶
The following properties can be used when connecting to a soure.
The CSV column below indicates the properties used to define delimited-text formatting.
These fields only apply when the value of the format field is csv.
See File Settings for SQL Connectors for more information.
Mapping/step¶
| Mapping/step field in the UI | Mapping/step field in JSON | CSV | Description and details |
|---|---|---|---|
| Description | description | — | User-defined string for the step, key, or mapping. Availability: Database sources only. Value type: string |
| SQL | sql | — | The SQL/DML/DDL sentence to be executed. Compiles at the start of data source execution, so it cannot call variables created during processing. The result of this SQL query will be mapped to the data sink. S ee Data Source SQL Fields. Inline Query: The SQL is recorded in the /data_sources/source.json and cannot be shared with any other node or data source. Drop File to Copy: The inline query option allows users to drop any text file (.sql, .txt, .py, .json) in the code editor to copy the contents. This is a copy function only; it does not upload the file. Shared Recipe File: The SQL is recorded in the selected file path within the recipe's /resources directory and can be reused in other nodes or data sources. Availability: Database sources only. Required field Value type: string Node file: Recorded in /data_sources/source.json |
| Shared Recipe File (see SQL) | sql-file | — | Specifies the resource file containing runtime variables generated in previous steps within the node. Compiles at the time of step execution, so it can refer to variables created in upstream steps. See Data Source SQL Fields. Availability: Database sources only. Required field Value type: string Node file: Recorded in /data_sources/source.json |
| Result Type | query_type | — | Selection sets the nature of the result that the system will expect from running the SQL. Result Set: For sources only, the result of the SQL query is a rowset in the format specified by the format field. This is the execute_query property in node files. Scalar Value: For sources only, the result of the SQL query is a single scalar value, like sum() or count(). This is the execute_scalar property in node files. None - Bulk Copy: For MS SQL sources only, the SQL is executed using the bulk copy utility (bcp). This is the freebcp property in node files. execute_non_query: For sources only, intended for DML/DDL statements. execute_low_isolation_non_query: For PostgreSQL/Redshift sources only, increases database transaction concurrency and throughput for catalog operations. Availability: All database sources and sinks, with limitations noted for each operation. Note: Salesforce sources support only execute_query. Required field Value type: string Node file: Recorded in /data_sources/source.json |
| N/A | ignore-errors | — | Indicates if the node should continue processing even if the SQL query that is being processed generates errors. true: SQL errors do not cause the system to stop executing the node. false: A SQL error causes the system to stop executing the node immediately. Availability: Redshift/PostgreSQL, Oracle, and Snowflake database sources only. Value type: Boolean Default value: false Node file: Recorded in /data_sources/source.json |
| Maximum Errors | maxerrors | — | Specifies the maximum amount of allowed errors on insert. Value type: integer Availability: MS SQL sources/sinks only where Result Type ( query-type) is freebcp. Node file: Recorded in /data_sources/source.json |
Runtime variables¶
| Runtime variables field in the UI | Runtime variables field in JSON | CSV | Description and details |
|---|---|---|---|
| N/A | set-runtime-vars | — | The system generates the field when a runtime variable form is completed. See Built-In Runtime Variables, File-Based Variables, or Database Variables for more information. |
Output/input¶
| Output/input field in the UI | Output/input field in JSON | CSV | Description |
|---|---|---|---|
| Insert column names as first row (checkbox) | insert-column-names | — | When checked and using delimited text format as output, the first row of output data will be column names. Availability: Database sources only where Result Type ( query-type) is set to execute-query. Value type: Boolean Node file: Recorded in /data_sources/source.json |
| File Format | format | — | The output file type that the system exports. For delimited text settings, see File Settings for SQL Connectors. binary: Exports the data in an internal binary format. Useful when moving from SQL data sources to SQL data sinks. delimited text: Exports the data as CSV. JSON output: Exports the data to JSON format. Default value: binary Value type: string Availability: Database sources where Result Type ( query-type) is set to execute-query. |
| Enclose strings in double quote (checkbox) | quote-strings | — | If checked (set to true), indicates that string values in the result written to a CSV file will be enclosed in double quotes. Default value: true Value type: Boolean Availability: Database sources only where query-type is csv. |
| True Value, False Value | bool-format | How a value stored as True or False in a SQL source should be written to a CSV file, and which strings from a CSV should be created as True or False in a SQL sink. Availability: All database sources/sinks where format is csv and in sources where Result Type (query-type) is execute_query. Not available for BigQuery sinks. Default values are True and False, but users could specify Yes or 1 and No or 0 instead. Value type: string Usage in JSON source files: "bool-format": { "True": "YES", "False": "NO"} |
|
| Column Delimiter | col-delimiter | When using CSV file format, it specifies the column delimiter. Use C-Style escaping for tab, newline, and carriage return characters. Availability: All database sources/sinks. Default value: , (comma) for CSV format. For MS SQL sources/sinks where Result Type (query-type) is freebcp, the default is /t (tab). Value type: string |
|
| Date Format | date-format | Python datetime format for dates in the delimited file. See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %Y-%m-%d, which might be written to a file as 2020-08-27. Value type: string |
|
| Datetime Format | datetime-format | Python datetime format for dates in the delimited file. Default value See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %Y-%m-%d %H:%M:%S, which might be written to a file as 2020-08-27 14:30:22. Value type: string |
|
| Decimal Format | decimal-format | Python string format for decimal numbers in the delimited file. Availability: All database sources/sinks, except BigQuery sinks. Default value: %f Value type: string |
|
| Integer Format | int-format | Python string format for integers in the delimited file. Availability: All database sources/sinks, except BigQuery sinks. Default value: %d Value type: string |
|
| Null Value | null-value | Specifies how null values from a SQL source should be written to a CSV file, and which strings from a CSV should be created as null in a SQL sink. Availability: All database sources/sinks, except BigQuery sinks. Default value: An empty string, but users could specify NULL or None instead. Value type: string |
|
| Row Delimiter | row-delimiter | When using CSV format, it specifies the row delimiter. Use C-Style escaping for tab, newline, and carriage return characters. Availability: Database sources and MS SQL sources/sinks. Default value: \r\n for sources. Default value is \n for MS SQL sources/sinks where Result Type (query-type) is freebcp Value type: string |
|
| Time Format | time-format | Python datetime format for time in the delimited file. See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %H:%M:%S, which might be written to a file as 14:30:22 Value type: string |
Database sink properties¶
The following properties can be used when connecting to a sink.
The CSV column below indicates the properties used to define delimited-text formatting.
These fields only apply when the value of the format field is csv. See
File Settings for SQL Connectors for more information.
Mapping¶
| Mapping field in the UI | Mapping field in JSON | CSV | Description |
|---|---|---|---|
| Insert Mode | query_type | — | Selection instructs the system how to insert the mapped data into the sink database. Bulk Insert: For sinks only, the data is added to the table specified in the Table Name field using a bulk insert. This is the bulk_insert property in node files. Specify Template: For sinks only, the data is added to the database using the insert template statement specified in the SQL Insert Template field. This is the execute_dml property in node files. Bulk Copy (bcp): For MS SQL sinks only, the data is added to the table specified in the Table Name field using the bulk copy utility. This is the freebcp property in node files. Required field Value type: string Availability: All database sources and sinks, with limitations noted for each operation. Note: BigQuery sinks support only bulk_insert. Node file: Recorded in /data_sinks/sink.json |
| Table Name | table-name | — | The table in the database where the data will be inserted. Availability: Database sinks with Bulk Insert and Bulk Copy modes. Required field if query-type is bulk_insert, or in MS SQL sinks where Result Type (query-type) is freebcp. Value type: string |
| SQL Insert Template | insert-template | — | A Python-based SQL template statement (DML template) used to insert the data for each row. For example, INSERT INTO table-name (column1, column2) VALUES (%s, %s). Availability: Database sinks with Specify Template mode. Required field if query-type is execute_dml |
| Create table by executing DDL | table-ddl | — | The DDL statements defined for creating the table for data insertion. Availability: Database sinks, except BigQuery sinks. |
| Batch Size | batch-size | — | The number of rows to send to the server as a single packet. Availability: MS SQL sinks only where query-type is freebcp Value type: integerDefault value: 5000 |
| Maximum Errors | maxerrors | — | Specifies the maximum amount of allowed errors on insert. Value type: integer Availability: MS SQL sources/sinks only where Result Type ( query-type) is freebcp Node file: Recorded in /data_sources/source.json |
| Write Disposition | write-disposition | — | Specifies the action that occurs if the destination table already exists. Valid values: Append ( WRITE_TRUNCATE): If the table already exists, BigQuery overwrites the table data. Truncate ( WRITE_APPEND): If the table already exists, BigQuery appends the data to the table. Empty ( WRITE_EMPTY): If the table already exists and contains data, a 'duplicate' error is returned in the job result. Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion. Availability: BigQuery sinks only. Value type: string |
Runtime variables¶
| Runtime variables field in the UI | Runtime variables field in JSON | CSV | Description |
|---|---|---|---|
| N/A | set-runtime-vars | — | The system generates the field, at the source/sink level or at the key level, when a runtime variable form is completed. See Built-In Runtime Variables, File-Based Variables or Database Variables for more information. |
Output/input¶
| Output/input field in the UI | Output/input field in JSON | CSV | Description |
|---|---|---|---|
| Skip first row as column names (checkbox) | first-row-column-names | — | When checked and using delimited text format as input, the first row of data will be ignored. Availability: Database sinks only. Value type: Boolean Node file: Recorded in /data_sinks/sink.json |
| File Format | format | — | The output file type that the system reads from to insert in the sink. For delimited text settings, see File Settings for SQL Connectors. Tip: Binary format preserves better file types so is recommended when using DataMapper nodes to move data from different databases. Valid values for database sinks: binary: Imports the data in an internal binary format. Useful when moving from SQL data sources to SQL data sinks. delimited text: Imports the data as CSV. Default value: binary Value type: string Availability: Database sinks, except MS SQL using Result Type ( query-type) set to freebcp. BigQuery data sinks support CSV and Newline-delimited JSON formats. |
| True Value, False Value | bool-format | How a value stored as True or False in a SQL source should be written to a CSV file, and which strings from a CSV should be created as True or False in a SQL sink. Availability: All database sources/sinks where format is csv and in sources where Result Type (query-type) is execute_query. Not available for BigQuery sinks. Default values are True and False, but users could specify Yes or 1 and No or 0 instead. Value type: string Usage in JSON sink files: "bool-format": { "Yes": true, "No": false} |
|
| Column Delimiter, Field Terminator | col-delimiter | When using CSV format, it specifies the column delimiter. Use C-Style escaping for tab, newline, and carriage return characters. Availability: All database sources/sinks. Default value: , (comma) for CSV format. For MS SQL sources/sinks where query-type is freebcp, the default is /t (tab). Value type: string |
|
| Date Format | date-format | Python datetime format for dates in the delimited file. See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %Y-%m-%d, which might be written to a file as 2020-08-27. Value type: string |
|
| Datetime Format | datetime-format | Python datetime format for dates in the delimited file. Default value See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %Y-%m-%d %H:%M:%S, which might be written to a file as 2020-08-27 14:30:22. Value type: string |
|
| Decimal Format | decimal-format | Python string format for decimal numbers in the delimited file. Availability: All database sources/sinks, except BigQuery sinks. Default value: %f Value type: string |
|
| Integer Format | int-format | Python string format for integers in the delimited file. Availability: All database sources/sinks, except BigQuery sinks. Default value: %d Value type: string |
|
| Null Value | null-value | Specifies how null values from a SQL source should be written to a CSV file, and which strings from a CSV should be created as null in a SQL sink. Availability: All database sources/sinks, except BigQuery sinks. Default value: An empty string, but users could specify NULL or None instead. Value type: string |
|
| Row Delimiter, Row Terminator | row-delimiter | When using CSV format, it specifies the row delimiter. Use C-Style escaping for tab, newline, and carriage return characters. Availability: Database sources and MS SQL sources/sinks. Default value: \r\n for sources. Default value is \n for MS SQL sources/sinks where Result Type (query-type) is freebcp Value type: string |
|
| Time Format | time-format | Python datetime format for time in the delimited file. See https://strftime.org/ for syntax. Availability: All database sources/sinks, except BigQuery sinks. Default value: %H:%M:%S, which might be written to a file as 14:30:22 Value type: string |