Cloud Storage Put Object

Transfer a file from a remote host onto a Google Cloud Platform Bucket.

This component can use a number of common network protocols to transfer data to a Google Cloud Platform Bucket. This component copies, not moves, the target file. In all cases, the source data is specified with a URL.

Currently supported protocols are:

  • FTP
  • HDFS
  • HTTP
  • HTTPS
  • SFTP
  • Windows Fileshare
  • S3 Bucket
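
As an illustration of how a source URL maps onto one of these protocols, the following is a minimal Python sketch that checks a URL's scheme before it is entered into the component. The scheme prefixes are taken from the sample URLs in the Example URLs section; the `protocol_for` helper and the example host names are hypothetical and are not part of Matillion ETL.

```python
from urllib.parse import urlsplit

# Scheme prefixes as they appear in the sample URLs on this page.
SUPPORTED_SCHEMES = {
    "ftp": "FTP",
    "hdfs": "HDFS",
    "http": "HTTP",
    "https": "HTTPS",
    "sftp": "SFTP",
    "smb": "Windows Fileshare",
    "s3": "S3 Bucket",
}


def protocol_for(url: str) -> str:
    """Return the protocol name for a source URL, or raise ValueError."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    return SUPPORTED_SCHEMES[scheme]
```

For instance, `protocol_for("sftp://files.example.com/data/in.csv")` returns `"SFTP"`, while a URL with any other scheme raises `ValueError`.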

Properties

Property | Setting | Description
Name | Text | The descriptive name for the component.
Input Data Type | Choice | Choose a connection protocol from the options available.
Set Home Directory as Root | Choice | Used with (S)FTP. By default, URLs are relative to the user's home directory. This option tells Matillion ETL that the given path is relative to the server root.
Input Data URL | Text | The URL, including full path and file name, that points to the file to be uploaded. The format of the URL varies considerably; however, a default 'template' is offered once you have chosen a connection protocol. Note: Special characters used in this field (e.g. in usernames and passwords) must be URL-safe. See documentation on URL Safe Characters for more information. This can be avoided by using the Username and Password properties.
Domain | Text | The domain that the host file is located on. This parameter only appears when the data type is "Windows Fileshare".
Username | Text | Your URL connection username. It is optional and will only be used if the data source requests it.
Password | Text | Your URL connection password. It is optional and will only be used if the data source requests it. Users have the option to store their password inside the component, but we highly recommend using the Password Manager option.
SFTP Key | Text | Your SFTP private key. It is optional, only relevant for SFTP, and will only be used if the data source requests it. This must be the complete private key, beginning with "-----BEGIN RSA PRIVATE KEY-----" and conforming to the same structure as an RSA private key.
Bucket | Text/Tree | The Bucket and folder to copy the file to. A public Bucket URL can be entered in the text box, although you must have write access.
Output Object Name | Text | The file name of the output object in the Google Cloud Storage Bucket.
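
The requirement that the SFTP key be complete can be checked before pasting it into the component. The following is a rough sketch only: it tests for the header quoted above and, as an assumption, for the matching standard PEM footer. It does not verify that the key itself is valid, and the `looks_like_complete_rsa_key` helper is hypothetical.

```python
RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"
# Assumption: the key ends with the standard PEM footer for RSA private keys.
RSA_FOOTER = "-----END RSA PRIVATE KEY-----"


def looks_like_complete_rsa_key(text: str) -> bool:
    """Check that the text is framed by the RSA header and footer lines."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Header, at least one body line, and footer are required.
    return len(lines) >= 3 and lines[0] == RSA_HEADER and lines[-1] == RSA_FOOTER
```

A key that was truncated while copying (for example, one missing its final line) fails this check.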

Example URLs

Each protocol, when entered for the first time, will have a sample URL associated with it, detailing the structure of the URL format for that protocol.

Protocol | Sample URL
FTP | ftp://[username[:password]@]hostname[:port][path]
HDFS | hdfs://host:port/filePath
HTTP | http://[username[:password]@]hostname[:port][absolute-path]
HTTPS | https://[username[:password]@]hostname[:port][absolute-path]
SFTP | sftp://[username[:password]@]hostname[:port][path]
Windows Fileshare | smb://[[[authdomain;]user@]host[:port][/share[/dirpath][/name]]][?context]
S3 Bucket | s3://[bucketname][/path]


Square brackets indicate that that part of the URL is optional. In particular, entering the username and password within the URL is discouraged: it can be done, but it poses a potential security risk and may not work. Entering the username and password in the Username and Password properties is preferred.
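
If credentials do have to be embedded in the URL, any special characters in them must be URL-safe. The following Python sketch shows one way to percent-encode them with the standard library, following the `[username[:password]@]hostname[:port][path]` pattern above. The `build_url` helper, host name, and credentials are hypothetical examples, and using the Username and Password properties remains the preferred approach.

```python
from urllib.parse import quote


def build_url(scheme, host, path, username=None, password=None, port=None):
    """Assemble scheme://[username[:password]@]host[:port]path."""
    userinfo = ""
    if username is not None:
        # safe="" also encodes "/", ":" and "@", which would otherwise
        # be read as URL delimiters.
        userinfo = quote(username, safe="")
        if password is not None:
            userinfo += ":" + quote(password, safe="")
        userinfo += "@"
    netloc = host + (f":{port}" if port is not None else "")
    return f"{scheme}://{userinfo}{netloc}{path}"


url = build_url("sftp", "files.example.com", "/data/in.csv",
                username="etl user", password="p@ss:w/rd", port=22)
print(url)
# sftp://etl%20user:p%40ss%3Aw%2Frd@files.example.com:22/data/in.csv
```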
 


Variable Exports

This component makes the following values available to export into variables:

Source Description
Bytes written The number of bytes read from the source and written out.
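
To make the meaning of "Bytes written" concrete, the following Python sketch copies one stream to another in chunks and counts the bytes as it goes. It is an illustration of the copy-and-count idea only, using in-memory streams as stand-ins for the source file and the bucket object; it is not how the component is implemented, and `copy_stream` is a hypothetical name.

```python
import io


def copy_stream(source, destination, chunk_size=64 * 1024):
    """Copy source to destination in chunks; return the bytes written."""
    bytes_written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        destination.write(chunk)
        bytes_written += len(chunk)
    return bytes_written


# In-memory stand-ins for the remote file and the output object.
payload = b"id,name\n1,alpha\n2,beta\n"
src = io.BytesIO(payload)
dst = io.BytesIO()
count = copy_stream(src, dst)
```

After the call, `count` equals the length of the source data, and the source still holds its contents, since the data is copied rather than moved.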

Example

In this example, we want to copy some data from an Amazon S3 bucket to a GCP bucket for transformation using BigQuery. This can be achieved quickly and easily inside the Matillion ETL client using the Cloud Storage Put Object component. The job is shown below.

Properties for the Cloud Storage Put Object component are shown below. We have configured the component to look for files on Amazon S3.

Now, the 'Input Data URL' will search Amazon S3 buckets associated with this Matillion ETL client for the input file of your choosing.

The 'Google Storage URL Location' property allows us to select a destination for our new file. The name of this file is determined by the final property: Output Object Name.

When run, this component will take the object from Amazon S3 and place a copy of it in the Google Cloud Storage bucket. We can check that the object has been transferred successfully by using a Cloud Storage Load component to check our buckets for the file.