GitHub Auto-Deploy Reprise
Most of my time is dedicated to dev projects that make the lives of marketers easier: web sites, apps, tools, scripts, etc. Today, however, I’m writing for web devs in an attempt to make their lives easier and their solutions smarter. On top of that, I need to correct the approach I wrote about a couple of years ago for a GitHub auto-deploy solution.
If you’re wondering what Git is, you’re a bit behind, as it has taken over as the tool for code versioning for developers. If you’re not a developer, Git is still super useful as a tool that records all revisions to any file you want to track. Collaborating with some marketing colleagues and technical experts on a giant content project? Git can track everyone’s drafts with the same great version control. This article has more good information about Git. But as a developer, I’m constantly learning. And writing code. And breaking code. And fixing code. And learning. Today, I’m fixing because I have learned.
Why an auto-deploy solution?
The better question is probably, “why not?” If set up correctly, it makes a developer’s life easier. There are fewer points of failure when deploying code. It saves a considerable amount of time. And, it’s slick. You’ve finished making updates to your app in your local dev environment and are ready to stage and test. You commit and push your changes to the staging branch of your Git repo. Your auto-deploy setup grabs those latest updates and deploys them on your staging site almost instantly. You test your updates. They look solid, and you’re ready to deploy live. Back in your local dev environment, you merge your staging branch updates into the master (or production) branch. You commit and push your updates. Again, auto-deploy handles the rest. All you have to do is (clear the cache and) test.
Some argue that Git should not be used as a deployment tool because it does not properly handle permissions or track empty directories. But, if set up properly with Git commands being executed by the Apache/nginx user, it seems to be a solid solution for many applications, especially compared to other confusing and expensive deployment tools.
Marketers: you may want to stop here and just send this to your developer(s). What comes next may not make much sense.
Why am I fixing?
Since implementing the GitHub auto-deploy setup based on git pull, I have run into one main snag.
If any tracked file gets modified on the production site, whether by a user (SFTP/FTP) or the CMS, the git pull command executed by the cron job halts because it detects changes that must be addressed. This almost always requires discarding the changes on the production server using git checkout, overwriting local changes with the latest from the remote branch. This assumes you are not writing code in your production environment. You’re definitely not, right?! You should be pushing your updates from your local dev environment to the remote branch.
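One way to spot this situation before the cron job stalls is to ask Git whether any tracked file differs from the last commit. The sketch below is only an illustration: the function name has_tracked_changes and the path in the usage comment are made up for this article and are not part of the original setup.

```shell
#!/bin/sh
# has_tracked_changes DIR   (hypothetical helper name)
# Succeeds (exit status 0) when at least one tracked file in DIR is
# modified, staged, or deleted relative to the last commit.
has_tracked_changes() {
    # --porcelain gives stable, script-friendly output;
    # --untracked-files=no skips untracked files, which do not block a pull
    # unless they collide with an incoming file.
    [ -n "$(git -C "$1" status --porcelain --untracked-files=no)" ]
}

# Example call (the path is a placeholder):
# has_tracked_changes /path/to/webroot && echo "tracked files were modified here"
```

Running a check like this by hand on the production server shows at a glance whether someone, or the CMS, has touched tracked files since the last deployment.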
This snag is actually good, because I was naive in my understanding of git pull. Git pull is essentially git fetch followed by git merge. This is where the problem lies. Merges can cause conflicts and should be manually reviewed and, if necessary, fixed. With git pull, we are giving git the permission/power to automatically handle the merge. This logic has no business being in a production environment. Thankfully, git isn’t forcing the merge by default when it detects the modification of a tracked file. So when this snag occurs, while the live code base may not be kept up-to-date automatically, at least it can’t break the production code with a troublesome automatic git merge that requires attention.
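To make the two halves of git pull concrete, here is a rough sketch that runs only the fetch half and then reports how many commits a merge would bring in, without touching the working tree. The function name incoming_commits is hypothetical, and the sketch assumes the remote is called origin and the deployed branch is master.

```shell
#!/bin/sh
# incoming_commits DIR   (hypothetical helper name)
# Runs the "fetch" half of a pull, then prints the number of commits that
# are on origin/master but not yet in the current checkout.
incoming_commits() {
    # Step 1: download new objects and update origin/master.
    # Nothing in the working tree changes here.
    git -C "$1" fetch --quiet origin &&
    # Step 2: count commits reachable from origin/master but not from HEAD.
    # A merge (the second half of a pull) would try to bring these in.
    git -C "$1" rev-list --count HEAD..origin/master
}

# Example call (the path is a placeholder):
# incoming_commits /path/to/webroot
```

Because the merge is left out, this is safe to run anywhere. It only shows what a pull would attempt to merge.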
I did more research on other approaches to a git auto-deploy solution, and found this great unassuming article. Not only did it shed more light on why git pull should not be used as a deployment tool, but the “deployment rules” section was spot on for what I was trying to ultimately accomplish. I pared down the rules to what I consider a good fit for my needs:
- All files in the branch being deployed should be copied to the deployment directory.
- Files that were deleted in the git repo since the last deployment should get deleted from the deployment directory.
- Any changes to tracked files in the deployment directory after the last deployment should be ignored when following rules 1 and 2.
- Untracked files in the deploy directory should be left alone.
The new auto-deploy solution
The deploy article suggests a handful of possible solutions that fulfill the deployment rules. After some trial-and-error testing, I settled on using 2 sequential commands:
git fetch --all
This command fetches the latest data from all configured remotes without trying to merge or rebase anything. More on git fetch.
git checkout --force "origin/master"
This command, utilizing the “force” option, tells git to proceed even if the index or the working tree differs from HEAD. Essentially, this overwrites any local changes to tracked files (if any) so the working tree mirrors the remote branch, which specifically satisfies deployment rule #3. The rest of the built-in git checkout functionality satisfies the rest of our deployment rules, so all 4 are addressed. More on git checkout.
Using cron as our auto-deployment method, here is the new command:
*/1 * * * * cd /path/to/webroot && git fetch --all && git checkout --force "origin/master"
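If the one-liner grows unwieldy, the same two commands can live in a small script that cron calls instead. This is a minimal sketch rather than a drop-in replacement: the function name deploy_ref, the placeholder paths, and the default of origin/master are assumptions for illustration.

```shell
#!/bin/sh
# deploy_ref DIR [REF]   (hypothetical helper name)
# Makes the working tree in DIR match REF (default: origin/master).
deploy_ref() {
    dir=$1
    ref=${2:-origin/master}
    # Download the latest commits from every configured remote.
    # No merge or rebase happens in this step.
    git -C "$dir" fetch --all --quiet &&
    # Switch the working tree to the fetched commit. --force discards local
    # edits to tracked files; files removed upstream are removed here too,
    # and untracked files are left in place.
    git -C "$dir" checkout --force --quiet "$ref"
}

# Example call (the path is a placeholder):
# deploy_ref /path/to/webroot origin/master
```

Cron would then run the script that contains this function and the call, for example from a file such as /path/to/deploy.sh (a made-up location). Note that checking out origin/master directly leaves the deployment repo in a detached HEAD state, which is expected for a directory that only ever receives code.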
Note: the rest of the GitHub Setup Guide is still accurate. Follow it first, then replace its auto-deploy command with the one described in this article.