This article explores the architecture of Matillion ETL's Git integration feature. Read on for a deep-dive exploration of available actions including commit, create branch, merge, push, fetch, and more.
Important Information and Links
This article is part of a series of technical documentation covering the Git integration feature within Matillion ETL. Additional documentation includes:
When using the Git version control feature in Matillion ETL, it will be advantageous to understand the underlying architecture and concomitant technical terms. There are six components involved:
- A Matillion Project - the project is the top-level structure, containing jobs and other collateral within Matillion. Each project is isolated, and user access can be granted or denied on a per-project basis.
- A Matillion Version - a project can contain more than one version. When used with Git, think of a version as an independent working area. Each version points to a single Git commit in the local Git repository.
- The local Git repository (repo) - the local repository is created automatically when a project is Git-enabled, and stores its files on the Matillion ETL instance's filesystem.
- The remote Git repository (repo) - a self-hosted or cloud-hosted Git repository that is external to Matillion ETL, and which was set up by the user prior to Git-enablement in Matillion. Users can push local repository commits to their remote repository; users can also fetch newer commits from the remote repository into the local repository.
- Commit - a commit is a point-in-time copy of a Matillion version, typically with collateral stored in the version, such as Orchestration and Transformation jobs.
- Branch - a branch is a collection of one or more commits in Git. Typically, a Git project will have a Master branch, from which other branches are created to develop and test code. A branch model lets users develop new code without adding untested code to the Master branch; once the code has been tested, it can be merged safely into the Master branch.
The above diagram includes a project (project_Dev), and within this project are three separate versions. Each of these versions could, as an example, belong to an individual developer in a development team.
On the right of the project is the Local Repo, which contains two branches. There is the Master branch (Branch Master), and an additional branch (Branch Feature_1). Within both of these branches are three commits, and the diagram shows via the shorter white arrows which version is pointing at which commit.
On the right of the Local Repo is the Remote Repo. In this diagram, the Remote Repo contains a backup copy of Local Repo. However, readers should note that the Remote Repo is missing Commit 3 from Branch Feature_1. This simply means that the Local Repo's changes require a push to the Remote Repo, at which point Commit 3 of Branch Feature_1, which is being developed in Version ver_z, will be backed up in the cloud-hosted remote repository.
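The state described above, where the local repository holds a commit that the remote does not, can be reproduced with the plain Git command line. The following is a minimal sketch, not a description of how Matillion ETL works internally. It assumes the `git` CLI is installed; a throwaway bare repository stands in for the Remote Repo, and the file name `job.txt` is hypothetical. Only the Feature_1 branch from the diagram is modelled.

```shell
#!/bin/sh
# Sketch only: a temporary bare repository plays the role of the Remote Repo.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/Feature_1   # name the first branch Feature_1
git remote add origin "$work/remote.git"

# Commits 1 and 2 are pushed, so the remote holds a copy of them.
for n in 1 2; do
  echo "change $n" > job.txt
  git add job.txt
  git commit -q -m "Commit $n"
done
git push -q origin Feature_1

# Commit 3 exists only in the local repository.
echo "change 3" > job.txt
git add job.txt
git commit -q -m "Commit 3"

# Count the commits that the remote copy of the branch does not have yet.
unpushed=$(git rev-list --count origin/Feature_1..Feature_1)
echo "Commits waiting to be pushed: $unpushed"
```

A non-zero count is the command-line counterpart of the missing Commit 3 in the diagram: the local branch is ahead of the remote until a push is performed.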
Please note: Matillion ETL's Git integration feature does not support multi-factor authentication (MFA) at this time.
The rest of this article clarifies what actions a user can take when using Git in Matillion ETL.
Note: Many of the sections below reference a fictitious development team's workflow in the screenshots. The workflow uses a "master" [default] branch plus one branch each for two developers, Alice and Bob. Default, Alice, and Bob each have their own Matillion version; Alice's and Bob's versions serve as independent working areas, and their work is merged once it has been tested and approved.
In the top-left of the Matillion ETL user interface, click the Project button, then navigate down and click Git.
When performing this action for the first time, users will have two options:
- Init Local Repository: select this option to initialise a local Git repository and connect a Matillion project to Git for the first time.
- Clone Remote Repository: select this option to connect a new Matillion project to an existing remote Git repository, copying the commits and branches from the remote repository into a local repository.
In this instance, we select Init Local Repository. We click OK to confirm this action and commit the current state of this Matillion ETL project to what will become our "master" branch.
After this, the Git Integration screen loads. Because this is our first interaction with this screen since we initialised the local repository, we currently only have one commit, labelled in the Git Integration screen: "Initial commit", and this first commit belongs to our "master" branch, which is currently our only branch. This interface also provides Author and Date details, along with numerous clickable action buttons, all of which are covered in this article.
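For readers more familiar with the Git command line, the Init Local Repository step resembles the sketch below. This is an analogy under stated assumptions, not Matillion ETL's actual implementation: the `git` CLI is assumed to be installed, and the file `run_me.json` is a hypothetical stand-in for the project's jobs.

```shell
#!/bin/sh
# Sketch only: initialise a repository and record the project's current state.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# Hypothetical project content standing in for Matillion jobs.
echo '{"job": "run_me"}' > run_me.json
git add -A
git commit -q -m "Initial commit"

# Show the commit with its author and date, similar to the details the screen lists.
git log --format='%h  %an  %ad  %s' --date=short
```

The Clone Remote Repository option corresponds more closely to `git clone`, which copies an existing remote repository's commits and branches instead of starting from an empty history.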
Git Action: Commit
We are going to add a commit featuring work added to this project by another team member. Accordingly, a new version is created by going to Project → Manage Versions and clicking the + button.
We assign the new version a name and unlock it. Then, we switch to this version.
To perform another commit, begin by clicking Project → Git. Then, in the Git Integration UI, click the bottom-left button (designated by the arrow in the next image). This is the commit button.
Upon pressing the commit button, the Commit window opens. From here, users can select their branch. Currently, the only option available in this example remains the "master" branch, so we type a new branch name into this field, thus creating a new branch. This forthcoming commit will sit under this new branch. Beneath the Branch Name field, users can tick and untick the checkbox next to each "change". In this instance, all three changes have been ticked to commit. Finally, a Commit Message has been left by the user making this commit. A Commit Message is required when making a commit.
To confirm, click OK.
Upon confirmation, the user is returned to the Git Integration UI, and we can see in the image below that we now have a second commit in our structure, this time on the "Alice_Branch" branch. The hollowed-out commit circle, which in this case belongs to our newest commit, highlights the currently active commit.
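In plain Git terms, committing to a newly named branch looks roughly like the sketch below. It is an illustration only, assuming the `git` CLI is installed; the three file names are hypothetical stand-ins for the three ticked changes, and the commit message is invented for the example.

```shell
#!/bin/sh
# Sketch only: create a branch while committing, as the Commit window does.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
echo '{"job": "run_me"}' > run_me.json
git add run_me.json
git commit -q -m "Initial commit"

# Typing a new name in the Branch Name field: create and switch to the branch.
git checkout -q -b Alice_Branch

# Three hypothetical changes; staging a file is like ticking its checkbox.
echo '{"job": "load_a"}' > load_a.json
echo '{"job": "load_b"}' > load_b.json
echo '{"job": "transform_c"}' > transform_c.json
git add load_a.json load_b.json transform_c.json

# A commit message is required here too.
git commit -q -m "Add Alice's jobs"

git log --oneline --decorate --all
```

The "master" branch still points at the initial commit, while "Alice_Branch" now holds two commits.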
To switch commit, click the button at which the arrow points in the image below, and confirm or cancel the commit switch. This action will switch your current Matillion ETL version to the chosen commit.
Git Action: Create Branch
To create a new branch from the Git Integration UI, click the button at which the arrow is pointing in the below image.
Name the new branch, and click OK.
The below image illustrates all three Git branches in our Git project (Note: a commit has been made on "Bob's_Branch").
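As a command-line comparison, branch creation in plain Git can be sketched as follows. This assumes the `git` CLI is installed and uses a hypothetical `run_me.json` file. Note that `git branch` creates a branch without switching to it; this sketch makes no claim about what the Matillion ETL button does beyond creating the branch.

```shell
#!/bin/sh
# Sketch only: create two developer branches from "master".
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
echo '{"job": "run_me"}' > run_me.json
git add run_me.json
git commit -q -m "Initial commit"

# Each new branch starts at the commit that is currently checked out.
git branch Alice_Branch
git branch "Bob's_Branch"

# List all local branches; the current one is marked with an asterisk.
git branch --list
```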
Git Action: Merge
Matillion ETL lets users merge branches. A merge combines the latest commit of one branch into the current branch. Performing this action creates a new commit and, by default, switches the current Matillion version to that new commit.
To begin performing a merge, click the merge symbol, pointed to in the below image.
This will load the Merge user interface. The four fields in this UI are delineated in the list below.
- Merge to - Select which branch to merge into. Users can choose to merge to a branch that is not the currently selected branch.
- Ours - this field denotes the latest commit of the branch to merge into.
- Theirs - this field denotes the latest commit of the branch to merge from.
- Commit Information - A required message for the new commit.
Users can also tick or untick the "Checkout After Merge" checkbox. When ticked, Matillion performs the "switch commit" action after the merge, pointing the current version at the new commit. By default, this box is ticked.
In the next image, the merged commit shows that the corresponding branch has been joined back to the "master" branch.
Note: Remember, a Matillion version points at a specific Git commit. The currently selected branch is determined by which commit the current version is pointing at.
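The merge described in this section can be approximated with the Git command line as shown below. This is a sketch under assumptions: the `git` CLI is installed, `bob_job.json` is a hypothetical file, and `--no-ff` is used so that a new merge commit is always created, matching the behaviour described above. Whether Matillion ETL uses the same mechanism internally is not stated in this article.

```shell
#!/bin/sh
# Sketch only: merge a developer branch into "master" with a merge commit.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
echo '{"job": "run_me"}' > run_me.json
git add run_me.json
git commit -q -m "Initial commit"

# Work done on the branch to merge from ("Theirs").
git checkout -q -b "Bob's_Branch"
echo '{"job": "bob_job"}' > bob_job.json
git add bob_job.json
git commit -q -m "Add Bob's job"

# Switch to the branch to merge into ("Ours"), then merge.
git checkout -q master
git merge -q --no-ff -m "Merge Bob's_Branch into master" "Bob's_Branch"

# The newest commit on master is the merge commit, which has two parents.
git log --oneline --graph
```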
Git Action: Configure Remote
To configure a remote repository, click the cog/gear button in the Git Integration UI, as in the image below.
In the "Remote URI" field, paste the URL of your remote repository and click OK.
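On the Git command line, the equivalent configuration step is registering a remote on the local repository. The sketch below assumes the `git` CLI is installed; the URL is a placeholder, and the remote name `origin` is the Git convention rather than something this article specifies.

```shell
#!/bin/sh
# Sketch only: point a local repository at a remote URI.
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Placeholder URL: replace with the address of your own remote repository.
remote_uri="https://example.com/your-org/your-repo.git"
git remote add origin "$remote_uri"

# Confirm which URL is used for fetch and push.
git remote -v
```

Registering the remote does not contact the server; the URL is only used once a fetch or push is performed.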
Git Action: Fetch
A fetch pulls in branches and their commits from another repository, in this case the remote repository. Remote repositories are an effective way of keeping a backup "master copy" of code.
To fetch from a remote repository, click the middle action button on the right of the Git UI, as in the image below.
Next, provide the Username and Password of your remote repository account.
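The effect of a fetch can be seen with the Git command line in the sketch below. It assumes the `git` CLI is installed. To keep the example self-contained, a bare repository on the local filesystem stands in for the remote, so no username or password is needed; the "teammate" repository and `run_me.json` are hypothetical.

```shell
#!/bin/sh
# Sketch only: fetch a teammate's commit from a stand-in remote repository.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# A teammate commits and pushes to the remote.
git init -q "$work/teammate"
git -C "$work/teammate" symbolic-ref HEAD refs/heads/master
echo '{"job": "run_me"}' > "$work/teammate/run_me.json"
git -C "$work/teammate" add run_me.json
git -C "$work/teammate" commit -q -m "Initial commit"
git -C "$work/teammate" push -q "$work/remote.git" master

# Our own local repository fetches what the remote holds.
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"
git fetch -q origin

# Fetched branches appear as remote-tracking branches such as origin/master.
git branch -r
```

In plain Git, a fetch only downloads commits and updates the remote-tracking branches; local branches and the working area stay untouched until a merge or a switch is performed.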
Git Action: Push
Use the push action to send the branches of a local repository to a remote repository.
Provide your remote repository account's Username and Password. Users can select a type of push to perform:
- Atomic Push - Guarantees that either all references will be pushed to the remote, or none of them will; this option avoids partial pushes.
- Force Push - Forces the local revision to be pushed into the remote repository. This action can cause the remote repository to lose commits, and should be used with caution.
- Thin Push - Reduces the data sent when the sender and receiver share many of the objects.
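The Git command line exposes options with the same names as the three push types, which makes their behaviour easy to try out. The sketch below assumes the `git` CLI is installed and uses a local bare repository as a stand-in for the remote, so no credentials are needed; it is not a statement about how Matillion ETL performs these pushes internally. The file `run_me.json` and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch only: atomic, thin, and force pushes against a stand-in remote.
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"

echo "v1" > run_me.json
git add run_me.json
git commit -q -m "Initial commit"
git branch Alice_Branch

# Atomic push: both branches are updated on the remote, or neither is.
git push -q --atomic origin master Alice_Branch

# Thin push: send less data when both sides already share objects.
echo "v2" > run_me.json
git commit -q -am "Update run_me"
git push -q --thin origin master

# Rewriting the last commit makes local and remote history diverge,
# so a plain push is rejected.
git commit -q --amend -m "Update run_me (reworded)"
if git push -q origin master 2>/dev/null; then
  echo "unexpected: plain push was accepted"
  exit 1
fi

# Force push: overwrite the remote branch. The old remote commit is lost.
git push -q --force origin master
echo "remote master: $(git --git-dir="$work/remote.git" rev-parse --short master)"
```

The rejected plain push is the safety net that a force push bypasses, which is why force pushing should be reserved for cases where losing the remote's commits is intended.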
Resolving a Merge Conflict
When performing a merge, Matillion will return an error message (as seen in the image below) if a "merge conflict" is found. Merge conflicts are a common aspect of using a version control system such as Git, and are often easy to resolve. In Matillion, a merge conflict arises when there is a conflict between the current local branch and the branch being merged, highlighting, ultimately, a conflict with one or more users' code.
Earlier, Bob's branch was merged into the "master" branch. However, this example's other developer, Alice, is now ready to merge her branch, too. Unfortunately, a merge conflict has been found. The following image shows the Merge window with any current conflicts.
As highlighted in the screenshot, the merge conflict relates to the example's "run_me" Orchestration job, which, following Bob's merge into the "master" branch, contains different code from Alice's version.
The solution is to click the drop-down menu in the merge conflict's Choice column, and simply select whether to keep "OURS" or "Theirs". Given that we are merging to the "master" branch, which is further along the development process, we're going to choose "OURS", and click OK. Once a choice has been made for each merge conflict, Matillion should merge the branch successfully.
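The same scenario can be reproduced with the Git command line, where choosing "OURS" for a conflicting file corresponds to `git checkout --ours`. The sketch below is an approximation under assumptions: the `git` CLI is installed, `run_me.json` is a hypothetical file standing in for the "run_me" job, and the file contents are invented for the example.

```shell
#!/bin/sh
# Sketch only: reproduce the Alice/Bob conflict and resolve it by keeping "ours".
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
echo "original" > run_me.json
git add run_me.json
git commit -q -m "Initial commit"
git branch Alice_Branch
git branch "Bob's_Branch"

# Both developers edit the same job on their own branches.
git checkout -q "Bob's_Branch"
echo "Bob's version" > run_me.json
git commit -q -am "Bob edits run_me"
git checkout -q Alice_Branch
echo "Alice's version" > run_me.json
git commit -q -am "Alice edits run_me"

# Bob's branch merges cleanly into master.
git checkout -q master
git merge -q --no-ff -m "Merge Bob's_Branch" "Bob's_Branch"

# Alice's merge now conflicts because run_me differs on both sides.
if git merge --no-ff -m "Merge Alice_Branch" Alice_Branch >/dev/null 2>&1; then
  echo "unexpected: merged without a conflict"
  exit 1
fi

# Keep the copy from the branch being merged into (master), then finish the merge.
git checkout --ours -- run_me.json
git add run_me.json
git commit -q -m "Merge Alice_Branch, keeping OURS for run_me"

cat run_me.json
```

Using `git checkout --theirs -- run_me.json` instead would keep Alice's copy of the file.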
Matillion ETL's Git source control management feature currently supports the following private key formats:
Private keys in OpenSSH format are not currently supported, and will produce an error message when used as a private key for performing a push to a remote repository.
However, you can convert an OpenSSH-format private key to a supported key format using the following command:
ssh-keygen -p -f YOUR_PRIVATE_KEY -m pem
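Before running the conversion, it can help to check whether a key is actually in OpenSSH format. OpenSSH-format private keys begin with the header line `-----BEGIN OPENSSH PRIVATE KEY-----`, so a small check is possible, sketched below. The function name `needs_conversion` is hypothetical, and the two demo files contain only a header line, not real keys.

```shell
#!/bin/sh
# Sketch only: detect OpenSSH-format private keys by their first line.
set -eu

needs_conversion() {
  # Succeeds when the file starts with the OpenSSH private key header.
  [ "$(head -n 1 "$1")" = "-----BEGIN OPENSSH PRIVATE KEY-----" ]
}

# Demo files holding only header lines (placeholders, not usable keys).
demo=$(mktemp -d)
printf '%s\n' '-----BEGIN OPENSSH PRIVATE KEY-----' > "$demo/openssh_style"
printf '%s\n' '-----BEGIN RSA PRIVATE KEY-----' > "$demo/pem_style"

for key in "$demo/openssh_style" "$demo/pem_style"; do
  if needs_conversion "$key"; then
    echo "$(basename "$key"): OpenSSH format, convert before use"
  else
    echo "$(basename "$key"): not OpenSSH format"
  fi
done
```

Because `ssh-keygen -p` rewrites the key file in place, it is worth copying the original key to a safe location before converting it.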