Automating tasks with Makefiles

Almost 20 years ago, one of the first posts on this blog (hosted elsewhere at the time) was about documentation.

Since then, I’ve written about documentation and checklists and the like sporadically. The problem is that although I know documentation and checklists are a good thing, I don’t use them enough.

It is more fun to write code.

At the same time, I have a hidden perfectionist in me (trust me, he’s there), so if I write code to perform some process, I can spend a lot of time making sure it works just right.

So (part of) the cure for my lack of documentation is to write code that performs a task and let the code be the documentation. (I’ve even used this as an excuse to practice literate programming because then I can write code and readable documentation at the same time in Emacs.)

Anyway, back to code as documentation.

With my background in setting up systems, I know all too well the pain of having to repeat something over and over. At the same time, because I’m so old, I don’t want to learn any new tool when the tools I have already work. So, while friends of mine have used Ansible and similar tools to set up complete MediaWiki systems, I’m so opinionated about how I do things that, try as I might, I couldn’t just use their system.

Which brings us to Make. GNU Make in particular. I could get into the byzantine differences between makes, but I tend to be on Linux and, hey, GNU make is available on the other systems.

For the past year or so, I’ve been working on deploying MediaWiki with Make. I just used it to stage a major upgrade at a client of mine. Today, I have a small project I need to deploy, so I decided to try and use my Makefile method. Over the next few days, I’ll document this.

Get the makefile skeleton

Obviously the first thing to do is get my makefile skeleton set up. I’ve learned that I only need a stub of a file to do this and I’ve been adapting it over the years. Here’s what I have so far:

include makeutil/baseConfig.mk
baseConfigGitRepo=https://phabricator.nichework.com/source/makefile-skeleton

.git:
    echo This is obviously not a git repository! Run 'git init .'
    exit 2

#
makeutil/baseConfig.mk: /usr/bin/git .git
    test -f $@                                                                                                                              ||      \
        git submodule add ${baseConfigGitRepo} makeutil

With that in place as my Makefile, I just run make and the magic happens:

$ make
echo This is obviously not a git repository!
This is obviously not a git repository!
exit 2
Makefile:1: makeutil/baseConfig.mk: No such file or directory
make: *** [Makefile:6: .git] Error 2

Ok, well, I run git init && make and the magic happens:

$ git init && make
Initialized empty Git repository in /home/mah/client/client/.git/
test -f makeutil/baseConfig.mk                                                                                     ||       \
    git submodule add https://phabricator.nichework.com/source/makefile-skeleton makeutil
Cloning into '/home/mah/client/client/makeutil'...
remote: Enumerating objects: 106, done.
remote: Counting objects: 100% (106/106), done.
remote: Compressing objects: 100% (106/106), done.
remote: Total 106 (delta 52), reused 0 (delta 0)
Receiving objects: 100% (106/106), 36.38 KiB | 18.19 MiB/s, done.
Resolving deltas: 100% (52/52), done.

  Usage:

    make <target> [flags...]

  Targets:

    composer   Download composer and verify binary
    help       Show this help prompt
    morehelp   Show more targets and flags

  Flags: (current value in parenthesis)

    NOSSL      Turn off SSL checks -- !!INSECURE!! ()
    VERBOSE    Print out every command ()

Better.
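
The line that does the real work in that stub is the guard: test for the file and only add the submodule when it is missing, which is what makes it safe to run make a second time. Pulled out of Make, the same idea can be sketched in plain shell. The run_once helper and the temporary directory below are made up for illustration; they are not part of the skeleton.

```shell
# run_once MARKER COMMAND...: run COMMAND only when MARKER does not exist yet.
# (Hypothetical helper, mirroring "test -f $@ || git submodule add ...".)
run_once() {
    marker=$1
    shift
    test -e "$marker" || "$@"
}

workdir=$(mktemp -d)

# First call: the directory is missing, so mkdir runs.
run_once "$workdir/makeutil" mkdir "$workdir/makeutil"

# Second call: the marker exists, so mkdir is skipped. A bare mkdir would
# fail here because the directory already exists.
run_once "$workdir/makeutil" mkdir "$workdir/makeutil"
echo "second run exit status: $?"
```

The same guard shows up in every recipe further down: test for the end state first, and only do the work when the test fails.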

Set up DNS

I want to put the client domain on its own IP with its own DNS record. I don’t have “spin up a VM” anywhere close to automated, but I have been using my BIND server and nsupdate to update my DNS records, so I’ve automated that.

# DNS server to update
dnsserver ?= 

# List of all DNS servers
allDNSServers ?=

# Keyfile to use
keyfile ?= K${domain}.private

# DNS name to update
name ?=

# IP address to use
ip ?=

# Time to live
ttl ?= 604800

# Domain being updated
domain = $(shell echo ${name} | sed 's,.*\(\.\([^.]\+\.[^.]\+\)\)\.*$$,\2,')

NSUPDATE=/usr/bin/nsupdate
DIG=/usr/bin/dig

#
verifyName:
    test -n "${name}"                                                                                                       ||      (       \
        echo Please set name!                                                                                           &&      \
        exit 1                                                                                                                          )

#
verifyIP:
    test -n "${ip}"                                                                                                         ||      (       \
        echo Please set ip!                                                                                                     &&      \
        exit 1                                                                                                                          )

#
verifyDomain:
    test -n "${domain}"                                                                                                     ||      (       \
        echo Could not determine domain. Please set domain!                                     &&      \
        exit 1                                                                                                                          )
    test "${domain}" != "${name}"                                                                            ||      (       \
        echo Problem parsing domain from name. Please set domain!                       &&      \
        exit 1                                                                                                                          )

#
verifyKeyfile:
    test -n "${keyfile}"                                                                                            ||      (       \
        echo No keyfile. Please set keyfile!                                                            &&      \
        exit 1                                                                                                                          )
    test -f "${keyfile}"                                                                                            ||      (       \
        echo "Keyfile (${keyfile}) does not exist!"                                                     &&      \
        exit 1                                                                                                                          )

# Add host with IP
addHost: verifyName verifyIP verifyDomain verifyKeyfile ${NSUPDATE}
    printf "server %s\nupdate add %s %d in A %s\nsend\n" "${dnsserver}"                     \
        "${name}" "${ttl}" "${ip}" | ${NSUPDATE} -k ${keyfile}
    ${make} checkDNSUpdate ip=${ip} name=${name}

# Check a record across all servers
checkDNSUpdate: verifyName verifyIP
    for server in ${allDNSServers}; do                                                                              \
        ${make} checkAddr ip=${ip} name=${name} dnsserver=$$server             ||      \
            exit 10                                                                                                         ;       \
    done

# Check host has IP
checkAddr: verifyName verifyIP ${DIG}
    echo -n ${indent}Checking for A record of ${name} on ${dnsserver}...
    ${DIG} ${name} @${dnsserver} A | grep -q ^${name}.*IN.*A.*${ip}         ||      (       \
        echo " FAIL!"                                                                                                           &&      \
        echo ${name} is not set to ${ip} on ${dnsserver}!                                       &&      \
        exit 1                                                                                                                          )
    echo " OK"
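
The least obvious line in that listing is the sed expression that guesses the domain from the name. Here is the same expression as a small shell function, so it can be tried on a few names. This is only a sketch: extract_domain is a made-up name, and it assumes GNU sed, since \+ is a GNU extension.

```shell
# extract_domain NAME: keep the last two labels of NAME and drop a trailing dot.
# In a shell the end-of-line anchor is a single $; inside $(shell ...) in a
# Makefile it has to be written $$.
extract_domain() {
    printf '%s\n' "$1" | sed 's,.*\(\.\([^.]\+\.[^.]\+\)\)\.*$,\2,'
}

extract_domain example.winkyfrown.com.   # prints winkyfrown.com
extract_domain winkyfrown.com            # prints winkyfrown.com (no match, passed through)
extract_domain www.example.co.uk         # prints co.uk
```

The second case is the one the verifyDomain target catches: when the name has no host part, the result equals the name. The third shows the limit of guessing from two labels, which is a reason to set domain by hand for names under a multi-part suffix.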

Now, I’ll just add the IP that I got for the virtual machine to the DNS:

$ make addHost name=example.winkyfrown.com. ip=999.999.999.999
> > Checking for A record of example.winkyfrown.com. on web.nichework.com... OK
> > Checking for A record of example.winkyfrown.com. on ns1.worldwidedns.net... OK
> > Checking for A record of example.winkyfrown.com. on ns2.worldwidedns.net... OK
> > Checking for A record of example.winkyfrown.com. on ns3.worldwidedns.net... OK
> > Checking for A record of example.winkyfrown.com. on 1.1.1.1... OK
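
To see what addHost actually sends, it helps to print the nsupdate input instead of piping it. A minimal sketch follows; nsupdate_script is a made-up name, and ns.example.org and 192.0.2.10 are placeholder values, not my real server or address.

```shell
# nsupdate_script SERVER NAME TTL IP: print the commands that the addHost
# recipe pipes into nsupdate.
nsupdate_script() {
    printf 'server %s\nupdate add %s %d in A %s\nsend\n' "$1" "$2" "$3" "$4"
}

nsupdate_script ns.example.org example.winkyfrown.com. 604800 192.0.2.10
# prints:
#   server ns.example.org
#   update add example.winkyfrown.com. 604800 in A 192.0.2.10
#   send
```

Note that update add puts another A record next to any that are already there. If the intent is to replace an address, a line such as "update delete example.winkyfrown.com. A" before the add is one way to do it.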

(Of note: this goes back to the checklist bit. When I first tested this, I found that my nsupdate wasn’t propagating to one of my secondaries. It prompted me to check who was allowed to do zone transfers from the host and fix the problem.)
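
One caveat about the check itself: the grep pattern in checkAddr is loose. The dots match any character and nothing anchors the end of the address, so 192.0.2.1 would also match a record for 192.0.2.10. A stricter comparison could be sketched with awk. The has_a_record name and the sample answer line are assumptions for illustration, and the function expects the name with its trailing dot, the way dig prints it.

```shell
# has_a_record NAME IP: read dig-style answer lines on stdin and succeed only
# if one of them is exactly "NAME <ttl> IN A IP".
has_a_record() {
    awk -v name="$1" -v ip="$2" '
        $1 == name && $3 == "IN" && $4 == "A" && $5 == ip { found = 1 }
        END { exit !found }
    '
}

# A sample answer line in the layout dig prints (placeholder address).
answer=$(printf 'example.winkyfrown.com.\t604800\tIN\tA\t192.0.2.10')

printf '%s\n' "$answer" | has_a_record example.winkyfrown.com. 192.0.2.10 && echo "match"
printf '%s\n' "$answer" | has_a_record example.winkyfrown.com. 192.0.2.1 || echo "no match"
```

In the recipe this would take the place of the grep, with the dig output piped straight in.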

Basic server setup

I believe in versioning (wherever it is easy). So the first thing we’ll do is install etckeeper.

#
verifyHost:
    test -n "${REMOTE_HOST}"                                                                                        ||      (       \
        echo Please set REMOTE_HOST!                                                                            &&      \
        exit 10                                                                                                                         )

#
verifyCmd:
    test -n "${cmd}"                                                                                                        ||      (       \
        echo Please set cmd!                                                                                            &&      \
        exit 10                                                                                                                         )

doRemote: verifyHost verifyCmd
    echo ${indent}running '"'${cmd}'"' on ${REMOTE_HOST}
    ssh ${REMOTE_HOST} "${cmd}"


# Set up etckeeper on host
initEtckeeper:
    ${make} doRemote cmd="sh -c 'test -d /etc/.git || sudo apt install -y etckeeper'"
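
When trying out targets like these, a dry-run switch is handy so the command can be inspected before it touches a server. Below is a rough shell equivalent of doRemote with such a switch. DRY_RUN, do_remote and web.example.org are made up for this sketch and do not exist in the Makefile above.

```shell
# do_remote HOST CMD: announce the command, then run it over ssh unless
# DRY_RUN is set. Mirrors verifyHost, verifyCmd and doRemote.
do_remote() {
    host=$1
    cmd=$2
    test -n "$host" || { echo "Please set host!" >&2; return 10; }
    test -n "$cmd"  || { echo "Please set cmd!" >&2; return 10; }
    echo "running \"$cmd\" on $host"
    test -n "${DRY_RUN:-}" || ssh "$host" "$cmd"
}

# With DRY_RUN set, only the announcement is printed; nothing is sent.
DRY_RUN=1
do_remote web.example.org "test -d /etc/.git || sudo apt install -y etckeeper"
```

Unsetting DRY_RUN turns the same call into the real thing.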

Initial installation of Apache+PHP on the server

Finally, let’s set up a webserver!

# Install the basic LAMP stack
initLamp:
    ${make} doRemote cmd="sh -c 'test -d /etc/apache2 || sudo apt install -y        \
        php-mysql php-curl php-gd php-intl php-mbstring php-xml php-zip                 \
        libapache2-mod-php'"
    ${make} doRemote cmd="sh -c 'test -d /var/lib/mysql || sudo apt install -y mariadb-server'"
    ${make} doRemote cmd="sh -c 'sudo systemctl enable apache2'"
    ${make} doRemote cmd="sh -c 'sudo systemctl enable mariadb'"
    ${make} doRemote cmd="sh -c 'sudo systemctl start apache2'"
    ${make} doRemote cmd="sh -c 'sudo systemctl start mariadb'"

    curl -s -I ${REMOTE_HOST} | grep -q ^.*200.OK                                           ||      (       \
        echo Did not get "'200 OK'" from ${REMOTE_HOST}                                         &&      \
        exit 1                                                                                                                          )
    touch $@

And the basic website:

setupSite: initLamp verifyRemotePath
    ${make} doRemote cmd="sh -c 'test -x /usr/bin/tee || sudo apt install -y        \
            coreutils'"
    (                                                                                                                                                       \
        echo "<VirtualHost *:80>"                                                                 &&      \
        echo "  ServerName ${REMOTE_HOST}"                                                &&      \
        echo "  DocumentRoot ${REMOTE_PATH}/html"                                 &&      \
        echo "  ErrorLog ${REMOTE_PATH}/logs/error.log"                   &&      \
        echo "  CustomLog ${REMOTE_PATH}/logs/access.log combined"                      &&      \
        echo "  <Directory ${REMOTE_PATH}/html>"                                                        &&      \
        echo "          Options FollowSymlinks Indexes"                                                 &&      \
        echo "          Require all granted"                                                                    &&      \
        echo "          AllowOverride All"                                                                              &&      \
        echo "  </Directory>"                                                                                           &&      \
        echo "</VirtualHost>"                                                                                                   \
    ) | ${make} doRemote                                                                                                            \
        cmd="sh -c 'test -f /etc/apache2/sites-available/${REMOTE_HOST}.conf || \
                sudo tee /etc/apache2/sites-available/${REMOTE_HOST}.conf'"
    ${make} doRemote                                                                                                                        \
        cmd="sh -c 'test -L /etc/apache2/sites-enabled/${REMOTE_HOST}.conf      ||      \
            sudo a2ensite ${REMOTE_HOST}'"
    ${make} doRemote                                                                                                                        \
        cmd="sh -c 'test ! -L /etc/apache2/sites-enabled/${REMOTE_HOST}.conf || \
            sudo systemctl reload apache2                                                                   ||      \
            ( sudo systemctl status apache2 && false )'"
    touch $@
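
The chain of echo commands works, but a here-document is easier to read and to check by eye. Here is a sketch of the same virtual host as a shell function. vhost_conf is a made-up name and /var/www/example is a placeholder path; the directives are the ones from the recipe above.

```shell
# vhost_conf HOST ROOT: print an Apache virtual host definition on stdout.
vhost_conf() {
    host=$1
    root=$2
    cat <<EOF
<VirtualHost *:80>
  ServerName ${host}
  DocumentRoot ${root}/html
  ErrorLog ${root}/logs/error.log
  CustomLog ${root}/logs/access.log combined
  <Directory ${root}/html>
    Options FollowSymlinks Indexes
    Require all granted
    AllowOverride All
  </Directory>
</VirtualHost>
EOF
}

vhost_conf example.winkyfrown.com /var/www/example
```

The output can be piped to the same guarded sudo tee on the remote side, just as the recipe does with its subshell.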

Finally, let’s deploy MediaWiki!

Purging whole namespaces of pages in MediaWiki

So, I was asked to purge all the pages in several categories. The smaller categories are relatively easy to do using the API sandbox.

  1. Visit the Special:ApiSandbox page on your wiki.
  2. Select the action purge.
  3. Select action=purge from the sidebar.
  4. Look for the generator option and then select allpages from the drop-down.
  5. Return to the top of the page and select generator=allpages from the sidebar.
  6. Look for the gapnamespace option and select the namespace you want to purge.
  7. Execute the request using the “Make request” button at the top of the page.
  8. When the request is complete, there may be the opportunity to repeat the request with the next batch of pages. You’ll see a button at the bottom of the JSON output that says “Continue”. Click it until the entire namespace has been purged.

The API sandbox will let you play around with different parameters. For example, in the last screenshot, I set gaplimit (under generator=allpages) to 3 but I could have set it as high as 500 if I wanted.

So for namespaces that don’t have too many pages (say, fewer than 1000), this is do-able. But for your average-sized wiki, a namespace is likely to hold tens of thousands of pages. Something more is needed.
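
For reference, the request that the sandbox steps build boils down to a handful of POST parameters. The sketch below shows them, plus one batch sent with curl. The endpoint URL is a placeholder, purge_query and purge_batch are made-up names, and whether a given account is allowed to purge depends on the wiki’s configuration.

```shell
# Placeholder endpoint; substitute your own wiki's api.php.
API=${API:-https://wiki.example.org/w/api.php}

# purge_query NAMESPACE [LIMIT]: print the form body for one purge batch.
purge_query() {
    printf 'action=purge&format=json&generator=allpages&gapnamespace=%s&gaplimit=%s\n' \
        "$1" "${2:-500}"
}

# purge_batch NAMESPACE [GAPCONTINUE]: POST one batch and print the JSON reply.
# GAPCONTINUE is the value found under "continue" in the previous reply.
purge_batch() {
    body=$(purge_query "$1")
    if test -n "${2:-}"; then
        curl -s "$API" --data "$body" \
            --data-urlencode "gapcontinue=$2" \
            --data-urlencode "continue=gapcontinue||"
    else
        curl -s "$API" --data "$body"
    fi
}

# Main namespace (0), three pages per batch, as in the sandbox walk-through.
purge_query 0 3
# prints: action=purge&format=json&generator=allpages&gapnamespace=0&gaplimit=3
```

Clicking “Continue” in the sandbox is the same as calling purge_batch again with the gapcontinue value from the previous reply, until the reply no longer contains a continue block.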

Next, purging namespaces programmatically.

MABS status report: Updating MediaWiki::API

For the past couple of weeks, I’ve had a significant amount of time to spend on Multilateral, Asynchronous, Bidirectional Synchronisation of wikis or MABS for short.

This is all built on the git remote for MediaWiki work that was started almost a decade ago by some students. Since the initial effort there have been some significant changes in the MediaWiki API and, in the meantime, the MediaWiki::API Perl module that is doing a lot of heavy lifting in this project hasn’t seen a lot of work. For example, the last commit on the GitHub repository was to fix a typo in 2015.

So, I’ve been working this past week on updating the Perl module. This has been a lot of fun since I used to be quite the Perl snob—and by that I mean I looked down on people who didn’t love Perl, not that I looked down on Perl. Times have changed for me in the past ten or eleven years, so I’ve acquired some humility and begun doing a lot of work in what I would have considered to be the bottom-of-the-barrel language: PHP. Coming back to Perl is a lot of fun.

That said, Perl has continued to grow while I’ve been gone and I need some advice. I’ve become a huge fan of linters, so one change has been adhering pretty closely to almost every criticism Perl::Critic throws at me. I’ve gone as far as adding “smx” after almost every regular expression and incompatibly changing use constant to Readonly. You might say I’m getting a little carried away.

This and fixing the tests to use a docker instance (if available) rather than just sending every tester to the testwiki, as well as fixing some bugs I found along the way, has helped me understand this vital piece of the MABS project.

Still, coming back to Perl has made me realize just how ad hoc Perl’s object system is. I’ve heard of Moose and Mus (which I’m leaning towards), but I was wondering what best practices the Perl community has for updating an existing code base.

Update 1: I asked for some feedback on the Perl object system to use and got some great feedback.

Update 2: I contacted the original author (Jools Wills) of the MediaWiki::API module and talked to him about what direction to take with it. I’ll have to do some more work on it to make it work well for my purposes, but I may end up sending him a bunch of pull requests.

Photo by Roger McLassus [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons