Deploying Tailscale/Headscale for private mesh networking

Author(s): Helena Rasche
Editor(s): Nate Coraor
Overview
Questions:
  • What is Tailscale?

  • When is it useful?

  • Is it right for me?

Objectives:
  • Set up a tailnet across a few nodes

Time estimation: 60 minutes
Published: Sep 21, 2022
Last modification: Mar 17, 2023
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00027
Revision: 3

Tailscale makes secure networking easy; it really is like magic. If you’ve used WireGuard before, you know it takes a bit of work to set up, and some extra configuration if you need to do anything fancy.

Agenda
  1. What is Tailscale?
    1. Is it right for me?
  2. Setting up the infrastructure

What is Tailscale?

It’s like WireGuard, but easier, and with a lot of nice features built on top of it. It’s networking that “Just Works™”, even more than WireGuard does. If you prefer to use plain WireGuard without Headscale/Tailscale, or just want to understand the technology that Headscale/Tailscale build on, there is a tutorial for that as well.

Is it right for me?

If you have machines that need to talk to each other privately, and you don’t have a better way to do it (such as a local network team), then yes: it’s a great solution for private, secure, fast networking. If you need auditing, Tailscale provides it, so you don’t have to build it out yourself. WireGuard itself performs very well despite the encryption, and is built directly into the Linux kernel.

By using WireGuard, you can let services listen only on the WireGuard interface, so that only known and trusted machines can access those services.
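As a minimal sketch of that idea on a node that is already part of a tailnet: look up the node's tailnet address and bind a service to that address only. This assumes the tailscale CLI is installed and logged in; the function names, port 8000, and the use of python3's built-in web server as a stand-in service are illustrative choices, not part of this tutorial's setup.

```shell
#!/usr/bin/env bash
# Sketch: serve the current directory only on this machine's tailnet address.
# Assumptions: the tailscale CLI is installed and the node has joined a tailnet.
# Port 8000 and python3's http.server are placeholders for a real service.

# Print the first address reported by the given command.
# With no arguments, ask tailscale for this node's IPv4 address.
tailnet_addr() {
    if [ "$#" -eq 0 ]; then
        set -- tailscale ip -4
    fi
    "$@" | head -n 1
}

# Start the placeholder service bound to the tailnet address only,
# so it is not reachable on the machine's other interfaces.
serve_on_tailnet() {
    local addr
    addr="$(tailnet_addr)"
    if [ -z "$addr" ]; then
        echo "no tailnet address found" >&2
        return 1
    fi
    python3 -m http.server 8000 --bind "$addr"
}

# Usage (on a tailnet node):
#   serve_on_tailnet
```

Binding to the address, rather than to all interfaces, is what keeps the service invisible to machines outside the tailnet.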

Tailscale makes WireGuard setup even easier by removing the key management step, which normally requires distributing keys to every machine. Instead, that step is handled centrally and, in the case of Tailscale, can be enforced with ACLs, SSO, and 2FA policies. The networking remains meshed, however, and machines connect directly to one another.

With Tailscale you can go one step further than trusting individual machines: every device is tied to a single user, and you can ask Tailscale for the authenticated identity behind a specific TCP connection, which lets you log your users in automatically.

In the context of Galaxy, this can be useful for components like Interactive Tools, which require a web proxy between Galaxy and the cluster node where the tool runs. If the cluster is not on the local network, WireGuard can be used to securely bridge the gap, and Headscale or Tailscale can greatly simplify that process.

Setting up the infrastructure

Hands-on: Choose Your Own Tutorial

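This is a "Choose Your Own Tutorial" section with two paths: Headscale, where you host the coordination server yourself, and Tailscale, where you use the hosted service. Where the two paths differ, both variants of a step are shown; follow the one that matches your choice.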

Hands-on: Configuration files
  1. Create an ansible.cfg file (next to your playbook) to configure settings like the inventory file (and save ourselves some typing!), or the Python interpreter to use:

    --- /dev/null
    +++ b/ansible.cfg
    @@ -0,0 +1,6 @@
    +[defaults]
    +interpreter_python = /usr/bin/python3
    +inventory = hosts
    +retry_files_enabled = false
    +[ssh_connection]
    +pipelining = true
       
    
  2. Create the hosts inventory file if you have not done so yet.

    Input: Bash
    cat hosts
    
    Output: Bash

    Your hostnames are probably different.

    If you are using Headscale, pick one node to be the head and the rest to be the tail. The head will act as the coordination server; the tail nodes are the ones that should talk to each other.

    --- /dev/null
    +++ b/hosts
    @@ -0,0 +1,6 @@
    +[head]
    +1-wg.galaxy.training
    +[tail]
    +2-wg.galaxy.training
    +3-wg.galaxy.training
    +4-wg.galaxy.training
    
    

    If you are using Tailscale, place all of the nodes in a tail group instead:

    --- /dev/null
    +++ b/hosts
    @@ -0,0 +1,5 @@
    +[tail]
    +1-wg.galaxy.training
    +2-wg.galaxy.training
    +3-wg.galaxy.training
    +4-wg.galaxy.training
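Before running any playbook it can help to double-check which hosts ended up in which group. The following is a small sketch that reads a simple INI inventory like the one above; the function name group_hosts is hypothetical, and it deliberately ignores inventory features this tutorial does not use (host ranges, :children and :vars sections).

```shell
#!/usr/bin/env bash
# Sketch: list the hosts of one group in a simple INI-style inventory.
# Assumption: one host per line under a plain [group] header, as in this tutorial.

group_hosts() {
    local group="$1" file="${2:-hosts}"
    awk -v g="[$group]" '
        # A section header switches the current group on or off.
        /^\[/ { in_group = ($0 == g); next }
        # Inside the wanted group, print non-empty, non-comment lines.
        in_group && NF && $1 !~ /^[#;]/ { print $1 }
    ' "$file"
}

# Usage:
#   group_hosts head    # the coordination server (Headscale path only)
#   group_hosts tail    # the nodes that will join the tailnet
```

Ansible's own view of the inventory is available via ansible-inventory --graph, which is the authoritative check; the function above is only a quick look at the file.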
    

Tailscale runs the coordination server on much more robust infrastructure than you are likely to build yourself, and you won’t be responsible for maintaining it. If you can get your institution to pay for it, or have grant money to cover it, it’s probably worth it. They add new features often, and things like the mobile apps only work with their service. Tailscale is free for personal use (i.e. to test things out) and offers a free plan for open source projects that you may qualify for.

For a training event we want something free and quick to set up and tear down, so we’re using Headscale: it’s free, we’re going to destroy it immediately afterwards, and no one will accidentally get billed ;)

Using Headscale will also teach you everything you need to know if you do choose to use Tailscale, which is simpler and has fewer components for you to manage yourself.

Hands-on: Installing Headscale
  1. Install the role

    Input: Bash
    ansible-galaxy install -p roles ckstevenson.headscale
    
  2. Create and open head.yml which will be our playbook. Add the following:

    --- /dev/null
    +++ b/head.yml
    @@ -0,0 +1,16 @@
    +---
    +- name: Headscale
    +  hosts: head
    +  become: true
    +  vars:
    +    headscale_user: 'headscale'
    +    headscale_version: '0.15.0'
    +    headscale_namespaces:
    +    - galaxy
    +  roles:
    +    - ckstevenson.headscale
    +  post_tasks:
    +    - command: headscale --namespace galaxy preauthkeys create --reusable --expiration 1h
    +      register: authkey
    +    - debug:
    +        msg: "{{ authkey.stdout.split('\n')[-1] }}"
       
    
  3. Run the playbook:

    Input: Bash
    ansible-playbook head.yml
    
  4. This will print a code in the debug output. Save this code; you’ll need it shortly.
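The debug task in the playbook prints only the last line of the command's output, which is where the key appears. If you ever create a key by hand on the head node, the same trick can be sketched in shell. The function names and the authkey.txt file name below are hypothetical; the headscale command in the usage comment is the one from the playbook.

```shell
#!/usr/bin/env bash
# Sketch: keep only the pre-auth key from the command output.
# Mirrors the playbook's authkey.stdout.split('\n')[-1] expression.

# Print only the last line of whatever is piped in.
last_line() {
    tail -n 1
}

# Save the last line of stdin to a file that only the current user can read.
# The default file name authkey.txt is a placeholder.
save_authkey() {
    local dest="${1:-authkey.txt}"
    ( umask 077 && tail -n 1 > "$dest" )
}

# Usage on the head node:
#   headscale --namespace galaxy preauthkeys create --reusable --expiration 1h | last_line
#   headscale --namespace galaxy preauthkeys create --reusable --expiration 1h | save_authkey
```

Since the key is reusable for an hour, keeping it in a file with restrictive permissions is a little safer than leaving it in your terminal scrollback.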

Now we can set up the nodes.

Hands-on: Configure the nodes
  1. Install the role

    Input: Bash
    ansible-galaxy install -p roles artis3n.tailscale
    
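  2. Edit tail.yml and add the following. If you are using Headscale, use the first version, which points the nodes at your head node with --login-server; if you are using Tailscale, use the second version, which omits it.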

    --- /dev/null
    +++ b/tail.yml
    @@ -0,0 +1,17 @@
    +---
    +- name: Tailscale
    +  hosts: tail
    +  become: true
    +  vars:
    +    tailscale_args: "--advertise-exit-node --login-server http://{{ hostvars[groups['head'][0]].inventory_hostname }}:8080"
    +  pre_tasks:
    +    - sysctl:
    +        name: net.ipv4.ip_forward
    +        value: '1'
    +        state: present
    +    - sysctl:
    +        name: net.ipv6.conf.all.forwarding
    +        value: '1'
    +        state: present
    +  roles:
    +    - artis3n.tailscale
       
    
    --- /dev/null
    +++ b/tail.yml
    @@ -0,0 +1,17 @@
    +---
    +- name: Tailscale
    +  hosts: tail
    +  become: true
    +  vars:
    +    tailscale_args: "--advertise-exit-node"
    +  pre_tasks:
    +    - sysctl:
    +        name: net.ipv4.ip_forward
    +        value: '1'
    +        state: present
    +    - sysctl:
    +        name: net.ipv6.conf.all.forwarding
    +        value: '1'
    +        state: present
    +  roles:
    +    - artis3n.tailscale
    
  3. Run the playbook:

    Input: Bash
    ansible-playbook tail.yml -e tailscale_authkey=YOUR_CODE
    

    If you are using Headscale, YOUR_CODE is the code from the debug output of the first playbook.

    If you are using Tailscale, generate an authentication key on your Tailscale account page.

  4. Go check out your tailnet! Play around with the tailscale command and try pinging other nodes using the suffix .galaxy.example.org (Headscale) or <username>.org.github.beta.tailscale.net (Tailscale).

    If you are using Tailscale, note that you’ll need to enable MagicDNS in your account settings.
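A quick way to explore the tailnet from one of the nodes might look like the sketch below. It assumes the tailscale CLI is on the PATH; the helper names and the example node name node2 are hypothetical, so use whatever names tailscale status lists for your peers.

```shell
#!/usr/bin/env bash
# Sketch: look at the tailnet and ping a peer by its full name.
# Assumption: the node name passed in matches what `tailscale status` shows.

# Join a node name and a tailnet suffix, tolerating a leading dot on the suffix.
peer_fqdn() {
    local host="$1" suffix="${2#.}"
    printf '%s.%s\n' "$host" "$suffix"
}

# Show the peers this node knows about, then ping one of them over the tailnet.
check_peer() {
    local name
    name="$(peer_fqdn "$1" "$2")"
    tailscale status && tailscale ping "$name"
}

# Usage (Headscale path, with a hypothetical node called node2):
#   check_peer node2 .galaxy.example.org
```

tailscale ping goes through the tailnet itself, so a successful reply tells you the two nodes can really reach each other, not just that DNS resolves.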

We’ve configured --advertise-exit-node, which means you can direct ALL of your traffic through one of your Tailscale endpoints as an exit node. Just run tailscale up --exit-node=...

Note that:

  • If you’re using Headscale, you need to enable that route manually: check the node list via headscale nodes list, then enable the specific route via headscale nodes routes enable -i ... In Tailscale this is automatic.
  • If you run tailscale up --exit-node=... on a remote machine, it will immediately become unresponsive. Only do this on your local machine, e.g. a laptop connected to your tailnet.
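To make the second warning harder to trip over, one option is a small wrapper that refuses to switch to an exit node when it looks like you are logged in over SSH. This is only a sketch: the function name and the TAILSCALE_CMD override are hypothetical, and the check relies on the SSH_CONNECTION variable that OpenSSH normally sets for remote sessions, which may be missing under sudo or other wrappers.

```shell
#!/usr/bin/env bash
# Sketch: only route all traffic through an exit node from a local session.
# Assumption: remote sessions have SSH_CONNECTION set (as OpenSSH does by default).
# TAILSCALE_CMD is a hypothetical override, useful for dry runs.

use_exit_node() {
    local node="$1"
    if [ -n "${SSH_CONNECTION:-}" ]; then
        echo "refusing: this looks like a remote SSH session" >&2
        return 1
    fi
    "${TAILSCALE_CMD:-tailscale}" up --exit-node="$node"
}

# Usage on your laptop:
#   use_exit_node <address-or-name-of-exit-node>
# Dry run that only prints the arguments:
#   TAILSCALE_CMD=echo use_exit_node <address-or-name-of-exit-node>
```

If tailscale up complains that other settings (such as the login server) must be repeated, it should print the full command to run instead; to stop using the exit node, run the same command with an empty --exit-node= value.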