
Automating My Infrastructure with Ansible and Gitlab CI: Part 3 – SSH Keys and Dotfiles

This post may contain affiliate links. Please see the disclaimer for more information.

This is the third post in a series; you can find the first two installments here and here.

Having recently reinstalled on both of my client machines, I took the opportunity to rotate my SSH keys. Luckily I backed up the old keys before doing this, so I didn’t lock myself out of anything. However, it did leave me having to update the authorized_keys files on all my servers (about 15 at last count). Of course there is a better way than doing this all manually, so cue some Ansible automation!

While I was at it I decided it would be nice to deploy my dotfiles across all my machines. I’ve had them stored in a git repo for some time and manage them with GNU Stow. However, I would never get around to deploying the repo onto new machines and installing all the relevant tools. Writing the Ansible automation to do this was pretty tricky, but I got there in the end. I also added my client machines to my Ansible inventory so that they get the same setup deployed to them.

Getting Started

For those who haven’t read the previous installments in this series, all the code for this article is going in my main ansible-infrastructure repo on GitLab.

I started out by installing some base packages that I would need for the rest of the steps. This is complicated slightly by having different distributions on different machines. The servers are all running Ubuntu or Debian (usually in the form of Raspbian or Proxmox), whilst the clients are running Manjaro (i.e. Archlinux). This is easily dealt with in Ansible by way of a set of checks against ansible_distribution:

- hosts: all
  tasks:
    - name: Install common apt packages
      become: true
      when: ansible_distribution == 'Debian' or ansible_distribution == 'Ubuntu'
      apt:
        pkg:
          - vim
          - git
          - tmux
          - htop
          - dnsutils
          - ack-grep
          - stow
          - zsh
          - build-essential
          - python-dev
          - python3-dev
          - cmake
          - curl

    - name: Install common pacman packages
      become: true
      when: ansible_distribution == 'Archlinux'
      pacman:
        name:
          - vim
          - git
          - tmux
          - htop
          - bind-tools
          - ack
          - stow
          - zsh
          - cmake
          - clang
          - curl

You’ll see here that some of the packages are named differently in the different distros. We also need to use different package management modules for each.

It should be noted that I didn’t start out with this full list, but instead just added a few basics (e.g. git, vim, etc) and added more as I encountered the need for them. You’ll also notice that I take the opportunity to install any nice utilities that I like to have everywhere, such as htop and dig (provided by dnsutils/bind-tools).

Deploying SSH Keys

Deploying the SSH keys turned out to be fairly trivial, despite being the main task that I wanted to accomplish here. This is thanks to the excellent authorized_key module in Ansible:

- name: Set up authorized keys
  become: true
  authorized_key:
    user: '{{ admin_user }}'
    state: present
    key: '{{ item }}'
  with_file:
    - public_keys/aragorn.pub
    - public_keys/arathorn.pub
    - public_keys/work.pub
    - public_keys/phone.pub

Here I add a set of four keys from the repository, by way of the with_file clause in Ansible. I copied all the public keys into the playbooks/files/public_keys directory for ease of access. This also makes it easy to rotate keys as we’ll see below.

The user to add the keys to is set via a custom variable called admin_user. This variable is given a default value at the top of my inventory file and then overridden for certain hosts or groups. For example, I use the standard pi user on my Raspberry Pis, so the variable is set to pi for the rpis group. This ensures that the keys always get installed for the right user.
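
For illustration, a YAML-format inventory along these lines would give that behaviour. This is only a sketch: my real inventory is encrypted, and the default username and the hostnames shown here are made up:

all:
  vars:
    admin_user: rob              # hypothetical default; the real value lives in my encrypted inventory
  children:
    rpis:
      vars:
        admin_user: pi           # the Raspberry Pis use the stock pi user instead
      hosts:
        pi1.example.com:
        pi2.example.com: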

I also wanted to remove the old keys from my machines, which is pretty straightforward:

- name: Remove old authorized keys
  become: true
  authorized_key:
    user: '{{ admin_user }}'
    state: absent
    key: '{{ item }}'
  with_file:
    - public_keys/riker.pub.deprecated

Now if I want to rotate keys in future, I'll just add the new key to the repository, rename the old key to remind myself that it's no longer in use and update the file lists in these tasks. Done!

A Minor Detour

I haven’t mentioned so far in this article that all of this is running through the GitLab CI pipeline I built in my original post. In fact, this has mostly turned into a GitLab CI article without much actual GitLab CI content, because the previous pipeline has been working brilliantly.

However, one issue has been the speed. Making changes, committing, pushing and waiting for the pipeline to complete takes quite a long time. It was pretty frustrating given the number of iterations I needed to get this right!

I noticed that it took quite a while each time to install Ansible and Ansible Lint in the containers and that this was done twice for each pipeline. Given my recent success with custom Docker images, I built a quick image containing the tools I needed (with a 3 line Dockerfile!). I was able to quickly copy over the previous docker build pipeline and get this building via CI. All I then had to do was update the images used in my main pipeline and remove the old installation commands. Boom, much faster!
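
For reference, the change to the main pipeline boils down to swapping the image and deleting the install commands. Something like the following, although the job names and script steps here are simplified illustrations rather than my exact .gitlab-ci.yml:

# Simplified sketch of the main pipeline after switching to the custom image
check:
  image: registry.gitlab.com/robconnolly/docker-ansible
  script:
    - ansible-lint playbooks/*.yml

deploy:
  image: registry.gitlab.com/robconnolly/docker-ansible
  script:
    - ansible-playbook -i hosts playbooks/site.yml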

You can check out my Ansible image on GitLab and pull it with the command:

$ docker pull registry.gitlab.com/robconnolly/docker-ansible

I haven’t set up a periodic build for this yet, but I’m intending that this image will be automatically updated on a weekly basis.

Deploying My Dotfiles

My dotfiles are deployed from a private repository on my internal Gitea instance. So far I haven’t published them as they contain quite a few unredacted details of my network. In order to deploy them I generated a new SSH key and added it as a deploy key to the project in Gitea.

Deploy keys in Gitea are added in the project Settings->Deploy Keys

I then encrypted the private key with Ansible vault (I added the public key to the repo too, in case I need it again in future):

$ ansible-vault encrypt playbooks/files/dotfiles_deploy_key

I then copy the private key to each of the machines which need it:

- hosts: all,!cloud,!rpis
  serial: 2
  tasks:
    - name: Copy deploy key
      become: true
      become_user: "{{ admin_user }}"
      copy:
        src: dotfiles_deploy_key
        dest: "/home/{{ admin_user }}/.dotfiles_deploy_key"
        owner: "{{ admin_user }}"
        group: "{{ admin_user }}"
        mode: 0600

You’ll note that the above is in a new play, separate from the previous steps. That’s because I wanted to restrict which machines get my dotfiles. The cloud machines currently can’t access my Gitea instance, since I still need to deploy my OpenVPN setup to some of them. The Raspberry Pis have trouble with some of the later steps in the setup, so I’ve skipped them too for now. I’m also running this two hosts at a time, because of the compilation step (see below).

The next step is to simply clone the repository with the Ansible git module:

- name: Clone dotfiles repo
  become: true
  become_user: "{{ admin_user }}"
  git:
    repo: "{{ dotfiles_repo }}"
    dest: "/home/{{ admin_user }}/dotfiles"
    version: master
    accept_hostkey: true
    ssh_opts: "-i /home/{{ admin_user }}/.dotfiles_deploy_key"

The dotfiles_repo variable holds the URL to clone the repository from and is again defined in my encrypted inventory file. I use the ssh_opts clause to set the key for git to use.

You’ll note that the tasks above all use become_user to switch to the admin_user. In order to get this to work on some of my hosts I had to set allow_world_readable_tmpfiles to true. This has some security implications, so you might want to tread carefully if you have potentially untrustworthy users on your systems. It seemed to work without this set on Ubuntu based systems, but those with a pure Debian base had issues.
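
For reference, this is an Ansible config option rather than something in the playbook. A minimal sketch of the relevant ansible.cfg entry (check the docs for your Ansible version, as newer releases may handle this differently):

# ansible.cfg - lets become_user work for an unprivileged user on hosts
# where POSIX ACLs can't be set on Ansible's temporary files
[defaults]
allow_world_readable_tmpfiles = true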

Running Stow

The next step was to unstow a few of the newly deployed files. For this we can use the command module and with_items:

- name: Unstow dotfiles
  become: true
  become_user: "{{ admin_user }}"
  command:
    chdir: "/home/{{ admin_user }}/dotfiles"
    cmd: "stow {{ item.name }}"
    creates: "{{ item.file }}"
  with_items:
    - { name: ssh, file: "/home/{{ admin_user }}/.ssh/config"}
    - { name: vim, file: "/home/{{ admin_user }}/.vimrc"}
    - { name: tmux, file: "/home/{{ admin_user }}/.tmux.conf"}

In the items list here I pass a dictionary for each package, containing its name and a file that unstowing it should create. Ansible uses that file to decide whether it needs to run the command again. The values are accessed with the item.variable notation in the task. I really like the templating in Ansible!

Dealing With Vim Plugins

My Vim config contains a load of plugins, which I manage with Vundle. Installing these should be pretty simple, but it confused me for ages because the command always seemed to fail (with no output!), even when I could see it working on the command line. As it turns out, the command will exit with a return code of one even when it is successful! You can see why I was confused! In the end I came up with:

- name: Install vundle
  become: true
  become_user: "{{ admin_user }}"
  git:
    repo: https://github.com/VundleVim/Vundle.vim.git
    dest: "/home/{{ admin_user }}/.vim/bundle/Vundle.vim"
    version: master

- name: Install vim plugins
  become: true
  become_user: "{{ admin_user }}"
  shell:
    cmd: 'vim -E -s -c "source /home/{{ admin_user }}/.vimrc" -c PluginInstall -c qa || touch /home/{{ admin_user }}/.vim/plugins_installed'
    creates: "/home/{{ admin_user }}/.vim/plugins_installed"

I ended up using the shell module from Ansible to create a file when the installation completes. This file is used as the check in Ansible for whether it should run the command again. The || operator has to be used here (rather than &&) due to the weird return code. This does however have the effect of changing the overall return code to zero, which makes Ansible happy.

The final step here is compiling the YouCompleteMe plugin, which is just running another command:

- name: Build ycm_core
  become: true
  become_user: "{{ admin_user }}"
  command:
    cmd: "./install.py --clang-completer {% if ansible_distribution == 'Archlinux' %}--system-libclang{% endif %}"
    chdir: "/home/{{ admin_user }}/.vim/bundle/YouCompleteMe"
    creates: "/home/{{ admin_user }}/.vim/bundle/YouCompleteMe/third_party/ycmd/ycm_core.so"
  environment:
    YCM_CORES: 1

You’ll see above that the command is different on Arch based systems, since I use the system libclang there to work around a compile issue. I also define the YCM_CORES environment variable. This limits the number of cores to one, which seems to stop the build running out of memory on small virtual machines!

Deploying Oh-My-Zsh

The final piece to this increasingly complex puzzle is installing Oh-My-Zsh to give me a nice Zsh environment. This is again accomplished with the shell module:

- name: Install oh-my-zsh
  become: true
  become_user: "{{ admin_user }}"
  shell:
    cmd: sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
    creates: "/home/{{ admin_user }}/.oh-my-zsh"
  register: ohmyzsh

You’ll see here that I register a variable containing the status of this task. This is used in the next step to delete the default zshrc that the installer will create for us:

- name: Remove default zshrc
  become: true
  become_user: "{{ admin_user }}"
  file:
    name: "/home/{{ admin_user }}/.zshrc"
    state: absent
  when: ohmyzsh.changed

I then unstow my Zsh config as before:

- name: Unstow zsh config
  become: true
  become_user: "{{ admin_user }}"
  command:
    chdir: "/home/{{ admin_user }}/dotfiles"
    cmd: "stow zsh"
    creates: "/home/{{ admin_user }}/.zshrc"

This process is probably ripe for simplification, since I assume the installer wouldn’t overwrite an existing zshrc. If I unstowed the Zsh config earlier I could probably remove the file deletion, but I haven’t tried this to see if it works.
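
If I do try it, I imagine it would look something like the following: unstow the Zsh config first, then tell the installer to leave the existing zshrc alone. This is an untested sketch, and the KEEP_ZSHRC behaviour is an assumption based on the installer's documentation:

# Untested sketch: unstow zsh before running the installer, and ask the
# installer (via KEEP_ZSHRC) not to replace the existing .zshrc
- name: Unstow zsh config
  become: true
  become_user: "{{ admin_user }}"
  command:
    chdir: "/home/{{ admin_user }}/dotfiles"
    cmd: "stow zsh"
    creates: "/home/{{ admin_user }}/.zshrc"

- name: Install oh-my-zsh
  become: true
  become_user: "{{ admin_user }}"
  shell:
    cmd: sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" "" --unattended
    creates: "/home/{{ admin_user }}/.oh-my-zsh"
  environment:
    KEEP_ZSHRC: "yes"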

The absolute last step is to switch the default shell for admin_user over to Zsh:

- name: Change shell to zsh
  become: true
  user:
    name: "{{ admin_user }}"
    shell: /usr/bin/zsh

Done!

Conclusion

Phew! That comes out as a pretty epic playbook. I’ve opted to keep this all in my common playbook for now since it’s getting run against every machine, along with my previous roles. I may split it up later however, if it becomes useful to do so.

The playbook works really well now and it’s nice to have the same environment on every machine. I also really like the centralised SSH key management, which solves a real issue for me.

One improvement I would like to make would be around syncing changes to the dotfiles repository out to all the machines. This could be as simple as deploying a cron job to git pull that repository periodically, but I’d rather have it react to changes to the repo. I could move the repository to GitLab and run a pipeline which would deploy it, but this would mean duplicating my Ansible inventory (and keeping two copies up to date). I’m wondering if a webhook could be used to trigger the main CI pipeline?

I’m interested to know if anyone out there has solved similar problems in a different way. Please let me know in the comments!

If you liked this post and want to see more, please consider subscribing to the mailing list (below) or the RSS feed. You can also follow me on Twitter. If you want to show your appreciation, feel free to buy me a coffee.