Version Control for Servers – Emanuele Santanche

When installing, configuring and fine-tuning the software my server runs, many modifications occur.

Applications may tamper with settings when cron jobs run them.

I may tinker with properties in an ini file to improve performance.

Because of these changes, an application on the server may later stop working.

When this happens, I need to figure out what happened and fix it.

Version control tells me what changes occurred.

It’s precious information I use to examine the problem.

What is version control?

I’m using git. It’s the version control system Linus Torvalds created and it’s widely used.

I put together many files, usually containing only text, that I need to build an application or for any other reason.

I tell git to track them.

Every now and then I commit. It means that I tell git to save the files as they currently are. This results in many versions of them.

When the application stops working, git tells me what I modified since the last working version. I will be able to find the difference that crippled it.
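
To make this cycle concrete, here is a minimal sketch in a throwaway folder. The file name app.ini and its memory_limit setting are made up for illustration; the git commands are the standard ones.

```shell
#!/bin/sh
set -eu

# Keep the demo isolated from any git variables already in the environment
unset GIT_DIR GIT_WORK_TREE

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
cd "$demo"

git init -q

# Hypothetical configuration file, only for illustration
printf 'memory_limit = 128M\n' > app.ini

# Track the file and save the version that works
git add app.ini
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "Last working version"

# Later the setting changes and the application stops working
printf 'memory_limit = 16M\n' > app.ini

# Ask git what changed since the last commit
changes=$(git --no-pager diff --no-color)
printf '%s\n' "$changes"
```

The diff marks the old line with a leading '-' and the new one with a leading '+', which is usually enough to spot the change that crippled the application.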

Why use git to control a server’s configuration?

If I control my server’s configuration using git, I’ll fix problems more easily. It’s like using git to track application code. It lets me know what changed since the last time everything was running smoothly.

When I install a package, it may overwrite one or more configuration files. The package installer may tell me what it’s about to change or it may not. I may let the package do its job and forget to check if it will have side-effects. Even if I do check, there may later be undesired effects.

I may edit system files like hosts or fstab to correct a malfunction, but there may later be problems somewhere else.

I may be fine-tuning my server’s performance and want to keep the fixes I make to later compare the results.

Users who were able to log in to the server may no longer be able to, and I may suspect that an intruder meddled with ssh-related files.

I may be configuring a website in Nginx or Apache and testing different configurations to find the best one.

I may assign multiple IP addresses to a single network card. This configuration needs to be tested with similar ones running on other servers, maybe remote ones.

I may reduce the property innodb_buffer_pool_size in the file /etc/mysql/my.cnf because MySQL still runs with less and I need the memory for another application. This affects a script that runs at night when another system administrator is in charge. That administrator will find git very useful because it shows what changed since the last time the now-broken script ran correctly.

Why I need to set a couple of environment variables to use git in this scenario

Usually git finds the files it’s supposed to track in a single folder.

For example, the files that compose my website https://leadershipcoachfortech.com/ are in the folder /vol/WORKnARCH/SwProjects/leadershipcoachfortech.

But, in this case, I’m using git to track files system-wide.

In the folder /var/spool/cron/crontabs/ there are the crontab files, in /srv/scripts/ there are the scripts cron runs, and /root/.bashrc and /root/.profile contain interesting environment variables.

Git must work from the only folder that encompasses all these files, the folder ‘/’.

But I don’t want the git repository to be in the root folder because it’s a messy arrangement.

This means that I need to have the following two variables in the environment.

GIT_DIR=/srv/gitrepo4config
GIT_WORK_TREE=/

The first one tells git where to put the repository, the place where git keeps the history of changes. The second tells git that the files to track are under the folder ‘/’.
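
A small experiment makes the effect of the two variables visible without touching the real system. This sketch assumes temporary folders standing in for /srv/gitrepo4config and ‘/’; the folder names repo and root are hypothetical.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/repo" "$demo/root/etc"

# Stand-ins for /srv/gitrepo4config and /
export GIT_DIR="$demo/repo"
export GIT_WORK_TREE="$demo/root"

git init -q

# A file to track, somewhere under the work tree
printf '127.0.0.1 localhost\n' > "$GIT_WORK_TREE/etc/hosts"

cd "$GIT_WORK_TREE"
git add etc/hosts

# The history lives in GIT_DIR, so no .git folder appears in the work tree
tracked=$(git ls-files)
echo "tracked: $tracked"
ls -A "$GIT_WORK_TREE"
```

The listing of the work tree shows only the etc folder, while the repository data (HEAD, objects and so on) sits in the separate repo folder.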

How I set these two environment variables

I run a script before starting to operate on this git repository.

I don’t add the variables to files like .bashrc or .profile because I don’t want them to affect git operations for the entire system. There may be other git repositories that wouldn’t work if these variables were set.

The script is the following one.

#!/bin/bash

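# When the script is executed instead of sourced, the exported variables
# would vanish with the subshell, so explain the right invocation and stop
if [[ "${BASH_SOURCE[0]}" == "$0" ]] ; then
   echo "Invoke this script this way: . git-environment-setup.sh"
   echo "First you type git-environment-setup.sh taking advantage of autocomplete"
   echo "Then you go to the beginning of the line and add the '.'"
   exit 1
fi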

export GIT_WORK_TREE=/
export GIT_DIR=/srv/gitrepo4config
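
An alternative that leaves the environment untouched is to pass the same two locations to git on each call through its --git-dir and --work-tree options. The sketch below is not the script above but a variation on it: the function name cfggit and the variables CFG_GIT_DIR and CFG_WORK_TREE are hypothetical, and temporary folders stand in for the real paths.

```shell
#!/bin/sh
set -eu

# Make sure nothing leaks in from the calling environment
unset GIT_DIR GIT_WORK_TREE

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/repo" "$demo/root/etc"

# On the real server these would be /srv/gitrepo4config and /
CFG_GIT_DIR="$demo/repo"
CFG_WORK_TREE="$demo/root"

# Hypothetical wrapper: same effect as exporting the two variables,
# but limited to the single git call it wraps
cfggit() {
    git --git-dir="$CFG_GIT_DIR" --work-tree="$CFG_WORK_TREE" "$@"
}

cfggit init -q
printf '127.0.0.1 localhost\n' > "$CFG_WORK_TREE/etc/hosts"
cd "$CFG_WORK_TREE"
cfggit add etc/hosts

wrapped=$(cfggit ls-files)
echo "tracked through the wrapper: $wrapped"
```

The trade-off is typing cfggit instead of git for every operation on this repository, in exchange for never having to remember whether the variables are currently set.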

What git is not supposed to track

The environment variable GIT_WORK_TREE I set above tells git that, potentially, it can track the entire file system. I limit the tracking scope with the .gitignore file, which git consults to learn what not to track.

Initially, I tell git to track nothing and then I incrementally add folders and files to track.

Here is the .gitignore file.

# Ignore everything
*
# But descend into directories
# Without this, rules like !/var/spool/cron/** won't work, they wouldn't see the subfolders
!*/

# Exclude any git repos that may be present in sub-directories
/root/.wp-cli
/root/drush-backups
/srv/sites/intranet.emanuelesantanche.com
/srv/sites/rankit.emanuelesantanche.com
/usr/lib/nodejs

# Folders to track

# Remember the double '*' otherwise you don't get sub-directories
# Entire /etc 
!/etc/**
# Crontab files
!/var/spool/cron/crontabs/**
# Scripts
!/srv/scripts/**
# Ssh folder
!/root/.ssh/**
# Rclone makes a backup to Google Nearline Storage
!/root/.config/rclone/**

# Files to track
!/root/.bashrc
!/root/.profile

# Files to ignore
# Had to ignore this file because it contains a token that changes every day
/root/.config/rclone/rclone.conf

# This section is from the etckeeper package
# It ignores many files in the /etc folder

# new and old versions of conffiles, stored by apt/rpm
*.rpmnew
*.rpmorig
*.rpmsave

# old versions of files
*.old

# mount(8) records system state here, no need to store these
blkid.tab
blkid.tab.old

# some other files in /etc that typically do not need to be tracked
nologin
ld.so.cache
prelink.cache
mtab
mtab.fuselock
.pwd.lock
*.LOCK
network/run
adjtime
lvm/cache
lvm/archive
X11/xdm/authdir/authfiles/*
ntp.conf.dhcp
.initctl
webmin/fsdump/*.status
webmin/webmin/oscache
apparmor.d/cache/*
service/*/supervise/*
service/*/log/supervise/*
sv/*/supervise/*
sv/*/log/supervise/*
*.elc
*.pyc
*.pyo
init.d/.depend.*
openvpn/openvpn-status.log
cups/subscriptions.conf
cups/subscriptions.conf.O
fake-hwclock.data
check_mk/logwatch.state

# editor temp files
*~
.*.sw?
.sw?
\#*\#
DEADJOE

# end section from etckeeper
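
Before trusting rules like these, it can help to ask git how it classifies a few sample paths. The sketch below rebuilds a reduced version of the file in a temporary work tree (the sample paths and the classify helper are made up) and uses git check-ignore -q, whose exit status is 0 for an ignored path and 1 for one that is not ignored.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/repo" "$demo/root/etc" "$demo/root/var/log"

# Temporary stand-ins for /srv/gitrepo4config and /
export GIT_DIR="$demo/repo"
export GIT_WORK_TREE="$demo/root"
git init -q
cd "$GIT_WORK_TREE"

# Reduced rules: ignore everything, descend into folders,
# track /etc, but still ignore *.old files
cat > .gitignore <<'EOF'
*
!*/
!/etc/**
*.old
EOF

# Sample files to classify
: > etc/hosts
: > etc/hosts.old
: > var/log/syslog

# Hypothetical helper: report how git classifies one path
classify() {
    if git check-ignore -q "$1"; then
        echo "ignored"
    else
        echo "not ignored"
    fi
}

hosts=$(classify etc/hosts)
hosts_old=$(classify etc/hosts.old)
syslog=$(classify var/log/syslog)

echo "etc/hosts: $hosts"
echo "etc/hosts.old: $hosts_old"
echo "var/log/syslog: $syslog"
```

The order of the rules matters because the last matching pattern wins: etc/hosts.old is re-included by the /etc rule and then excluded again by the later *.old rule.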

Let me put git to work

Now I need to tell git to create a repository and copy to it the files I want to track.

I use the following command.

root@FREEDOMANDCOURAGE:/# git init

Git creates an empty repository. I need to copy the files to it. But I don’t want to risk copying the entire file system to it if I make a mistake in the .gitignore file.

I simulate the copy.

root@FREEDOMANDCOURAGE:/# git add -A --dry-run

Git will list the files it will copy to the repository when I’m ready to do it for real.
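
The dry run can also be turned into a safety check. In the sketch below, the script counts the files git would add and refuses to continue past a limit. The limit of 5000 files and the variable names are assumptions chosen for illustration, and a temporary work tree stands in for ‘/’.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/repo" "$demo/root/etc" "$demo/root/var/log"

# Temporary stand-ins for /srv/gitrepo4config and /
export GIT_DIR="$demo/repo"
export GIT_WORK_TREE="$demo/root"
git init -q
cd "$GIT_WORK_TREE"

# Reduced rules: ignore everything, descend into folders, track /etc
printf '*\n!*/\n!/etc/**\n' > .gitignore
: > etc/hosts
: > etc/fstab
: > var/log/syslog

# Hypothetical upper bound on how many files the first add may copy
MAX_FILES=5000

# The dry run prints one line per file that "git add -A" would copy
would_add=$(git add -A --dry-run | wc -l | tr -d ' ')

if [ "$would_add" -gt "$MAX_FILES" ]; then
    echo "Refusing to add $would_add files, check .gitignore" >&2
    exit 1
fi

echo "Adding $would_add files"
git add -A
```

A mistake such as deleting the first rule of .gitignore would make the count jump well past any sensible limit, so the real copy never starts.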

When I’m satisfied with the dry run, I start the copy.

root@FREEDOMANDCOURAGE:/# git add -A

I tell git that the first version of the files is complete.

root@FREEDOMANDCOURAGE:/# git commit -m "First version"

How do I check the changes?

During the day, I will make changes to configuration files. Running processes will cause many more of them.

The git status command tells me which files were modified. It also tells me if files were created.

root@FREEDOMANDCOURAGE:/# git status

The command git diff shows the changes, line by line.

root@FREEDOMANDCOURAGE:/# git diff

I use this information to solve problems the changes may have caused.

Then I add the changes to the repository.

root@FREEDOMANDCOURAGE:/# git add -A
root@FREEDOMANDCOURAGE:/# git commit -m "Comment describing the changes"

Automating the git status check

I don’t want to do this manually every day. I write a script that automates the task.

#!/bin/sh

export GIT_WORK_TREE=/
export GIT_DIR=/srv/gitrepo4config

git status -s

Then I invoke it from crontab. It will run at 07:40 UTC every day.

40 7 *  * * /srv/scripts/check-systemwide-git-repo.sh

If there are modifications, the server will send me an email with a list of them.
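
The email is cron’s doing: cron mails whatever a job prints to the owner of the crontab, and most cron implementations let a MAILTO line at the top of the crontab redirect that mail. A minimal fragment, assuming a placeholder address and a server that is able to send mail:

```
MAILTO=admin@example.com
40 7 * * * /srv/scripts/check-systemwide-git-repo.sh
```

Because git status -s prints nothing when there are no changes, cron has nothing to mail on quiet days.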

I will log in to the server and figure out why the changes occurred. Then I will run a git add -A command and a commit like above.

Automating the commit

For now, I’m happy to commit the changes manually.

In the future, I may want to automate this activity.

I’ll use an improved version of the script above.

It will be like this.

#!/bin/bash

export GIT_WORK_TREE=/
export GIT_DIR=/srv/gitrepo4config

PATH_GIT_STATUS_OUTPUT="/tmp/$(basename "$0")-git-status.out"

git status -s >"$PATH_GIT_STATUS_OUTPUT"

# If there are changes, I cat the output file from git status
# so that the server will send it to me
# I also automatically add the changes and commit

if [[ -s "$PATH_GIT_STATUS_OUTPUT" ]] ; then
   cat "$PATH_GIT_STATUS_OUTPUT"
   git add -A
   git commit -m "Auto commit"
fi

I will still be able to edit the comment if I want it to be more expressive than the generic “Auto commit”.

The command git commit --amend will do it.

root@FREEDOMANDCOURAGE:/# export EDITOR=nano
root@FREEDOMANDCOURAGE:/# git commit --amend

I need to remember to set the environment variable EDITOR to my favourite editor, nano. Otherwise git falls back to its default editor (often vi), which may not be installed or may not be the one I want.
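
When an editor is not needed, git commit --amend also accepts the new message directly through -m. A minimal sketch in a throwaway repository (the file name and the messages are made up for illustration):

```shell
#!/bin/sh
set -eu

# Keep the demo isolated from any git variables already in the environment
unset GIT_DIR GIT_WORK_TREE

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
cd "$demo"

git init -q
# Identity used only inside this throwaway repository
git config user.name demo
git config user.email demo@example.invalid

# Hypothetical change picked up by the automatic commit
printf 'vm.swappiness = 10\n' > sysctl.conf
git add sysctl.conf
git commit -q -m "Auto commit"

# Replace the generic message without opening an editor
git commit -q --amend -m "Lower swappiness to 10"

subject=$(git log -1 --format=%s)
echo "$subject"
```

On the server, the same amend command would run after setting GIT_DIR and GIT_WORK_TREE as described earlier.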
