Amazon S3¶
Amazon S3 data sources and sinks are in the file-based category of I/O connectors.
Connector type values¶
The "type" value to use in the source or sink JSON files.
| Connector type | Value |
|---|---|
| Data Source | DKDataSource_S3 |
| Data Sink | DKDataSink_S3 |
Connection properties¶
The properties to use when connecting to an Amazon S3 instance from Automation.
| Field | Scope | Type | Required? | Description |
|---|---|---|---|---|
| bucket | source/sink | string | yes | Bucket name. |
| public-bucket | source/sink | Boolean | no | When the bucket is public, access-key and secret-key are not required. |
| access-key | source/sink | string | yes, if public-bucket is false | AWS access key. |
| secret-key | source/sink | string | yes, if public-bucket is false | AWS secret key. |
| aws-session-token | source/sink | string | no | AWS session token, optionally passed with the access and secret keys in order to assume an IAM (AWS Identity and Access Management) role. |
| region | source/sink | string | no | AWS region name. |
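For example, a configuration for a non-public bucket in a specific region might combine these properties as follows (the vault paths mirror the kitchen-override example below; the region value is illustrative):

```json
{
    "bucket": "#{vault://s3/bucket}",
    "public-bucket": false,
    "access-key": "#{vault://s3/access_key}",
    "secret-key": "#{vault://s3/secret_key}",
    "region": "us-east-1"
}
```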
Connections¶
See Connection Properties for more details on connection configurations.
Defined in kitchen-level variables¶
s3config in kitchen overrides
```json
{
    "s3config": {
        "secret-key": "#{vault://s3/secret_key}",
        "access-key": "#{vault://s3/access_key}",
        "bucket": "#{vault://s3/bucket}"
    }
}
```
The Connection tab in a Node Editor¶

Expanded connection syntax¶
For a data source¶
s3_datasource.json
```json
{
    "type": "DKDataSource_S3",
    "name": "s3_datasource",
    "config": {
        "access-key": "{{s3config.accesskey}}",
        "secret-key": "{{s3config.secretkey}}",
        "bucket": "{{s3config.bucketname}}"
    },
    "keys": {},
    "tests": {}
}
```
For a data sink¶
s3_datasink.json
```json
{
    "type": "DKDataSink_S3",
    "name": "s3_datasink",
    "config": {
        "access-key": "{{s3config.accesskey}}",
        "secret-key": "{{s3config.secretkey}}",
        "bucket": "{{s3config.bucketname}}"
    },
    "keys": {},
    "tests": {}
}
```
Condensed connection syntax¶
Note
Do not use quotes around condensed connection configuration variables (for example, write {{s3config}}, not "{{s3config}}").
For a data source¶
s3_datasource.json
```json
{
    "type": "DKDataSource_S3",
    "name": "s3_datasource",
    "config-ref": "s3config",
    "keys": {},
    "tests": {}
}
```
For a data sink¶
s3_datasink.json
```json
{
    "type": "DKDataSink_S3",
    "name": "s3_datasink",
    "config-ref": "s3config",
    "keys": {},
    "tests": {}
}
```
Local connections¶
S3 bucket contents can be viewed locally by configuring connections with file-transfer applications like Transmit.
Other configuration properties¶
See the related topics for common properties, wildcards, and runtime variables.
File encoding requirements¶
Files used with data sources and data sinks must be encoded in UTF-8. Non-Unicode characters can cause problems when sinking data to database tables and errors when running related tests.
For CSV and other delimited files, use Save As in your spreadsheet program and select UTF-8 encoding, or use a text editor with encoding options.
Data source examples¶
Example source 1¶
s3_datasource.json
```json
{
    "type": "DKDataSource_S3",
    "name": "s3_datasource",
    "config": {{s3config}},
    "keys": {
        "example-key": {
            "file-key": "",
            "decrypt-key": "",
            "decrypt-passphrase": ""
        }
    },
    "tests": {}
}
```
Example source 2¶
s3_datasource.json
```json
{
    "type": "DKDataSource_S3",
    "name": "s3_datasource",
    "public-bucket": true,
    "bucket": "datakitchen-public",
    "wildcard": "*.csv",
    "wildcard-key-prefix": "dk-public/",
    "set-runtime-vars": {
        "key_count": "total_csv",
        "size": "real_size"
    },
    "tests": {
        "test-key-count": {
            "test-logic": {
                "test-compare": "equal-to",
                "test-metric": 3
            },
            "action": "stop-on-error",
            "type": "test-contents-as-integer",
            "test-variable": "total_csv",
            "keep-history": false
        }
    }
}
```
Example source 3¶
To concatenate all files in a given path, create a key with the name of the path and set the value of file-key to CONCATENATE_ALL_FILES.
s3_datasource.json
```json
{
    "type": "DKDataSource_S3",
    "name": "s3_datasource",
    "public-bucket": true,
    "bucket": "datakitchen-public",
    "wildcard": "",
    "set-runtime-vars": {
        "key_count": "total_csv_concat"
    },
    "keys": {
        "dk-public/concat_file_test/": {
            "file-key": "CONCATENATE_ALL_FILES"
        }
    },
    "tests": {
        "test-key-count-concat": {
            "test-logic": {
                "test-compare": "equal-to",
                "test-metric": 1
            },
            "action": "stop-on-error",
            "type": "test-file-count",
            "test-variable": "total_csv_concat",
            "keep-history": false
        }
    }
}
```
Data sink examples¶
Example sink 1¶
Push all files (matching the * wildcard) within the vendor_name/ directory to a public S3 bucket.
s3_datasink.json
```json
{
    "type": "DKDataSink_S3",
    "name": "s3_datasink",
    "public-bucket": true,
    "bucket": "my-public-bucket",
    "wildcard": "*",
    "wildcard-key-prefix": "vendor_name/",
    "keys": {}
}
```
Example sink 2¶
s3_datasink.json