
Kubernetes

Tips

StatefulSets

Using a StatefulSet is very helpful when you need predictable DNS names to access pods within your cluster. For example, say you have a service called webserver in the namespace samswebfarm.

Using a StatefulSet you would have a DNS name of webserver.samswebfarm.svc.cluster.local, which would be your load balancer address. This gives you something stable to point other services at if/when needed.

If you need to target individual pods, they're named servicename-0, servicename-1, etc. So with a replica count of 2, your pods would be webserver-0 and webserver-1, reachable through the headless service as webserver-0.webserver.samswebfarm.svc.cluster.local and webserver-1.webserver.samswebfarm.svc.cluster.local. With 3 replicas, the third would be webserver-2.webserver.samswebfarm.svc.cluster.local, and so on.
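The setup above can be sketched as a headless Service plus a StatefulSet. This is a minimal illustration, not from the original post; the nginx image and port are placeholders:

```yaml
# A headless Service (clusterIP: None) gives each StatefulSet pod
# its own DNS record: webserver-0.webserver.samswebfarm.svc.cluster.local, etc.
apiVersion: v1
kind: Service
metadata:
  name: webserver
  namespace: samswebfarm
spec:
  clusterIP: None   # headless: required for per-pod DNS records
  selector:
    app: webserver
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: webserver
  namespace: samswebfarm
spec:
  serviceName: webserver   # the governing headless service above
  replicas: 2
  selector:
    matchLabels:
      app: webserver
  template:
    metadata:
      labels:
        app: webserver
    spec:
      containers:
        - name: web
          image: nginx:stable   # placeholder image
          ports:
            - containerPort: 80
```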

Intel Quick Sync

Check whether your nodes expose the resources needed for Intel Quick Sync:

kubectl get nodes -o=jsonpath="{range .items[*]}{.metadata.name}{'\n'}{' i915: '}{.status.allocatable.gpu\.intel\.com/i915}{'\n'}{end}"

Troubleshooting

CoreDNS Keeps Crashing

This happens when the system you're running on doesn't have a resolv.conf, or it contains something different from what CoreDNS expects. To fix this, edit the coredns ConfigMap:

kubectl -n kube-system edit configmaps coredns -o yaml

Then replace the line forward . /etc/resolv.conf with something like forward . 1.1.1.1.
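After the edit, the ConfigMap data would look roughly like this; the plugins other than forward are illustrative defaults, and your Corefile may differ:

```yaml
# Excerpt of the coredns ConfigMap data after the edit.
Corefile: |
  .:53 {
      errors
      health
      forward . 1.1.1.1
      cache 30
  }
```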

Now delete the CoreDNS pods (or run kubectl -n kube-system rollout restart deployment coredns), or wait until the pods pick up the update.

Bare Metal; the Hard Way

I've often said the best way to learn something is to teach it. The best way to teach it is to understand it. To understand it, you gotta build it. At least that's how I do most things in IT. I can't say I fully understand everything about Kubernetes, but I'm damn close to understanding the basics. I understand that this isn't going to be for most people, and that most people are completely happy never building this on bare metal. After all, you have to be a little crazy to do this. HOWEVER, if you actually do it, you'll learn a lot, or at least I did.

Using K0S to Get Closer to Raw K8S

So I'm going to take a shortcut and cheat a little. While yes, I could totally build out k8s using upstream packages, I've found that for my purposes k0s is damn near identical and a hell of a lot faster to get started with. While some people may jump into something like k3s, microk8s, etc., those abstract a lot of stuff away from you. For example, do you know how flannel works? Have you ever deployed it? When it breaks, do you understand it enough to fix it? If you've never built it, I'm guessing the answer is no. So let's fix that!

First we need k0sctl, which will let us deploy the config to our nodes pretty quickly. I also recommend grabbing k9s while you're at it:

go install github.com/k0sproject/k0sctl@latest
go install github.com/derailed/k9s@latest

Deploying k0s via k0sctl

Once you have that, you'll need a config file; you can generate one with k0sctl init or just use this one:

---
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
    - ssh:
        address: 2601:123:456:7890::11
        user: root
        port: 22
        keyPath: ~/.ssh/id_ed25519
      role: controller+worker
      noTaints: true
    - ssh:
        address: 2601:123:456:7890::12
        user: root
        port: 22
        keyPath: ~/.ssh/id_ed25519
      role: worker
    - ssh:
        address: 2601:123:456:7890::13
        user: root
        port: 22
        keyPath: ~/.ssh/id_ed25519
      role: worker
  k0s:
    version: null
    versionChannel: stable
    dynamicConfig: false
    config:
      apiVersion: k0s.k0sproject.io/v1beta1
      kind: ClusterConfig
      metadata:
        creationTimestamp: null
        name: k0s
      spec:
        network:
          kubeProxy:
            mode: ipvs
            ipvs:
              strictARP: true

Since we'll be deploying MetalLB, I've included strictARP for kube-proxy (required for MetalLB's layer 2 mode when kube-proxy runs in IPVS mode). This config will download the latest stable version of k0s onto the nodes and get them set up.

You can deploy it and update your kube config by doing:

mkdir -p ~/.kube
k0sctl apply --config k0sctl.yaml
k0sctl kubeconfig --config k0sctl.yaml | tee ~/.kube/config

I recommend you make an Ansible inventory file to help with deploying and fixing things.

---
all:
  hosts:
    controller:
      ansible_host: 2601:123:456:7890::11
      ansible_user: root
    worker1:
      ansible_host: 2601:123:456:7890::12
      ansible_user: root
    worker2:
      ansible_host: 2601:123:456:7890::13
      ansible_user: root
  children:
    k0s:
      hosts:
        controller:
        worker1:
        worker2:

Fixing Your Messes

If for some reason you need to unfuck yourself when (not if) you blow it up, don't worry, it's easy to fix: go to the nodes, run k0s reset, make sure the /etc/k0s and /var/lib/k0s dirs are removed, then give it a reboot.

You can use this playbook to help automate things:

---
- name: Reset & Remove K0S
  hosts: k0s
  tags: [k0s]
  vars:
    k0s_services:
      - k0scontroller
      - k0sworker
    k0s_paths:
      - /var/lib/k0s
      - /etc/k0s
  handlers:
    - name: Reboot
      ansible.builtin.reboot:
  tasks:
    - name: Reset & Remove K0S
      notify: Reboot
      ignore_errors: true
      block:
        - name: Run Stop Command
          ansible.builtin.command: "k0s stop"
        - name: Stop K0S Services if still running
          ansible.builtin.systemd:
            service: "{{ item }}"
            state: stopped
          loop: "{{ k0s_services }}"
          register: k0s_stop_results
          until: k0s_stop_results is success
          retries: 1
          delay: 5
        - name: Run Reset Command
          ansible.builtin.command: "k0s reset"
        - name: Remove Configs
          ansible.builtin.file:
            path: "{{ item }}"
            state: absent
            force: true
          loop: "{{ k0s_paths }}"

At this point you have a full cluster ready to go. You can deploy whatever you want to it, use it, abuse it, destroy it, etc.

Helm Charts

However, I'm guessing we'll want to install some helm charts. In this case I'm going to be using traefik and metallb to handle my load balancing & cluster IPs. I'll also be using openebs and NFS for data storage.

Just because we're doing this the hard way doesn't mean we have to do everything by hand. You can run everything manually, but I'm going to use Ansible to add my repos and deploy my charts.

---
- name: Helm Repos, Plugins, & Charts
  hosts: k0s
  tags: [k0s]
  vars:
    local_user_name: samshamshop
    nfs_server: nfs.server=nfs.samsfantastichams.org
    nfs_path: nfs.path=/mnt/nfs/datavol
  tasks:
    - name: Add Helm Repos & Plugins
      become: true
      become_user: "{{ local_user_name }}"
      delegate_to: localhost
      run_once: true
      block:
        - name: Install Helm env plugin
          kubernetes.core.helm_plugin:
            plugin_path: https://github.com/adamreese/helm-env
            state: present
        - name: Install Helm diff plugin
          kubernetes.core.helm_plugin:
            plugin_path: https://github.com/databus23/helm-diff
            state: present
        - name: Add nfs-subdir-external-provisioner repository
          kubernetes.core.helm_repository:
            name: nfs-subdir-external-provisioner
            repo_url: https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
        - name: Add traefik repository
          kubernetes.core.helm_repository:
            name: traefik
            repo_url: https://traefik.github.io/charts
        - name: Add metallb repository
          # Needed below for the metallb/metallb chart_ref
          kubernetes.core.helm_repository:
            name: metallb
            repo_url: https://metallb.github.io/metallb
        - name: Add cert-manager repository
          kubernetes.core.helm_repository:
            name: jetstack
            repo_url: https://charts.jetstack.io
        - name: Add openebs repository
          kubernetes.core.helm_repository:
            name: openebs
            repo_url: https://openebs.github.io/charts
        - name: Add longhorn repository
          kubernetes.core.helm_repository:
            name: longhorn
            repo_url: https://charts.longhorn.io
    - name: Deploy Helm Charts
      become: true
      become_user: "{{ local_user_name }}"
      delegate_to: localhost
      run_once: true
      block:
        - name: Deploy cert-manager
          kubernetes.core.helm:
            name: cert-manager
            chart_ref: jetstack/cert-manager
            release_namespace: cert-manager
            create_namespace: true
            set_values:
              - value: installCRDs=true
        - name: Deploy MetalLB
          kubernetes.core.helm:
            name: metallb
            chart_ref: metallb/metallb
            namespace: metallb-system
            create_namespace: true
        - name: Deploy traefik
          kubernetes.core.helm:
            name: traefik
            chart_ref: traefik/traefik
            namespace: traefik
            create_namespace: true
        - name: Deploy OpenEBS cStor
          kubernetes.core.helm:
            name: openebs
            chart_ref: openebs/openebs
            namespace: openebs
            create_namespace: true
            set_values:
              - value: cstor.enabled=true
        - name: Deploy nfs-subdir-external-provisioner
          # This fails if run a second time, hence the ignore
          # https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
          kubernetes.core.helm:
            name: nfs-subdir-external-provisioner
            chart_ref: nfs-subdir-external-provisioner/nfs-subdir-external-provisioner
            namespace: kube-system
            create_namespace: true
            set_values:
              - value: "{{ nfs_server }}"
              - value: "{{ nfs_path }}"
          ignore_errors: true

Give that some time to work through everything and come up; it's going to take a while. You can use k9s to monitor things in the meantime. Once everything is mostly up and running, we can move on.

OpenEBS Configmap Edits

For my purposes, openebs does not support LVM out of the box. If you're using LVM or another device-mapper setup, you'll need to tweak the NDM ConfigMap to make cStor / NDM work for you.

EDITOR=vim kubectl edit configmap -n openebs openebs-ndm-config -o yaml

You'll want to remove /dev/dm- from the path-filter excludes, so the section ends up looking like this:

- key: path-filter
  name: path filter
  state: true
  include: ""
  exclude: "/dev/loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/md,/dev/rbd,/dev/zd"

This will enable you to use LVM volumes. Though if you have raw disks attached, you really shouldn't need this.
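Once NDM can see your devices and you've set up a cStor pool and StorageClass, claiming storage would look something like this. The StorageClass name cstor-csi here is an assumption for illustration; use whatever you created:

```yaml
# A sketch of a PVC against a cStor StorageClass.
# "cstor-csi" is a placeholder StorageClass name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  storageClassName: cstor-csi
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```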

Traefik Dashboard & MetalLB Pools

Now that we have openebs sorted, we need to give ourselves some way to access the cluster. Since we're on bare metal, let's use metallb and traefik!

---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: metal-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.10.10.20-10.10.10.40
    - 2601:123:456:7890::1000:1/64
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: metal-l2
  namespace: metallb-system
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: dashboard
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService  

We can apply this with kubectl by doing:

kubectl apply -f metallb-pool.yaml
kubectl apply -f traefik-dashboard.yaml
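With the pool in place, any Service of type LoadBalancer gets an address from MetalLB. A minimal sketch, with illustrative names not taken from the original:

```yaml
# MetalLB will assign this Service an external address
# from the metal-pool range defined above.
apiVersion: v1
kind: Service
metadata:
  name: webserver
spec:
  type: LoadBalancer
  selector:
    app: webserver
  ports:
    - port: 80
      targetPort: 80
```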

Or we can stick with the theme of Ansible, because once we have it done in Ansible, we can add more without adding more work.

---
- name: Let's deploy some stuff!
  hosts: k0s
  tags: [k0s]
  vars:
    local_user_name: samshamshop
    k8s_dir: "{{ playbook_dir }}/../k8s"
  tasks:
    - name: Run k8s configs
      become: true
      become_user: "{{ local_user_name }}"
      delegate_to: localhost
      run_once: true
      block:
        - name: Install metallb pools
          kubernetes.core.k8s:
            src: "{{ k8s_dir }}/metallb-pool.yaml"
            state: present
            namespace: metallb-system
        - name: Install traefik dashboard
          kubernetes.core.k8s:
            src: "{{ k8s_dir }}/traefik-dashboard.yaml"
            state: present
            namespace: traefik

At this point we have a basic cluster setup and you can start doing whatever you want!

Using K3S, All the Shiny Things out of the Box