Upgrade guide

Upgrading to 0.8

Note

Version 0.8 introduces many breaking changes. We did our best to make the upgrade smooth, but it will require a lot of manual intervention. Please reach out to us if you need help!

With the upgrade to version 0.8, OpenAppStack is renamed to its final name: Stackspin. This comes with many changes, some of which need to be applied manually.

We have written a script that automates many of the preparations for the upgrade. Afterwards, however, you might still need to get your hands dirty to get all your applications working again. Read this whole upgrade guide carefully before you get started!

# Log in to your Stackspin server
ssh <server>
# Download our upgrade script
wget https://open.greenhost.net/stackspin/stackspin/-/raw/main/upgrade-scripts/to-0.8.0/rename-to-stackspin.sh
chmod +x rename-to-stackspin.sh

First of all, if you have any -override configmaps or secrets, move them from the oas namespace to the stackspin namespace, and from oas-apps to stackspin-apps (you need to create these namespaces first). You also need to rename them from oas-X to stackspin-X. You can use a command like this to rename a configmap and move it to the right namespace:

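# The new name is passed in with --arg, because the shell does not expand $APP inside the single-quoted jq filter
kubectl get cm -n oas-apps "oas-$APP-override" -o json | jq --arg name "stackspin-$APP-override" 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp) | .metadata.name = $name | .metadata.namespace = "stackspin-apps"' | kubectl apply -f -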
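
If you have overrides for several applications, the same rename can be wrapped in a small helper. The following is only a sketch: the function names stackspin_name and move_override are made up for this example, and it assumes that kubectl and jq are installed and that the target namespace already exists.

```shell
# Hypothetical helper: map an old resource name (oas-X) to the new one (stackspin-X).
stackspin_name() {
    printf 'stackspin-%s\n' "${1#oas-}"
}

# Hypothetical helper: copy one ConfigMap from an old namespace to a new one
# under its new name. Assumes kubectl and jq are available.
move_override() {
    old_ns=$1
    new_ns=$2
    old_name=$3
    new_name=$(stackspin_name "$old_name")
    kubectl get cm -n "$old_ns" "$old_name" -o json \
        | jq --arg name "$new_name" --arg ns "$new_ns" \
            'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp)
             | .metadata.name = $name | .metadata.namespace = $ns' \
        | kubectl apply -f -
}

# Example (not run automatically; the app names are placeholders):
#   kubectl create namespace stackspin-apps
#   for app in nextcloud wordpress; do
#       move_override oas-apps stackspin-apps "oas-$app-override"
#   done
```

The name mapping is kept in its own function so it can be checked without touching the cluster.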

The rename-to-stackspin.sh script will cause serious downtime, and it will not do everything for you. Rather, it prepares your cluster for the upgrade.

The script does the following:

  1. Install jq

  2. Shut down the cluster, make a back-up of the data, and bring the cluster back up

  3. Copy all relevant oas-* secrets to stackspin-*

  4. Move all PersistentVolumeClaims to the stackspin and stackspin-apps namespaces and set the ReclaimPolicy of the PersistentVolumes to “Retain” so your data is not accidentally deleted.

  5. Delete all OAS flux kustomizations

  6. Delete the oas and oas-apps namespaces

  7. Create the new stackspin source and kustomization

Because there are not many Stackspin users yet, the script may need some manual adjustments. It was written for clusters on which all applications are installed. If you have not installed some of the applications, please remove them from the script manually.

# Execute the upgrade preparation script
./rename-to-stackspin.sh

After this, you need to update secrets and Flux in the cluster by running install/install-stackspin.sh. Then re-install applications by running install/install-app.sh <app> from the Stackspin repository. See the application-specific upgrade guides below.

After all your applications work again, you can clean up the old secrets and reset the PersistentVolume ReclaimPolicy to Delete:

wget https://open.greenhost.net/stackspin/stackspin/-/raw/main/upgrade-scripts/to-0.8.0/cleanup.sh
chmod +x cleanup.sh
./cleanup.sh
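
To inspect or change the reclaim policy of a volume by hand, for example to confirm what the scripts did, something along these lines should work. Patching persistentVolumeReclaimPolicy with kubectl patch pv is standard Kubernetes; the helper names reclaim_patch and set_reclaim_policy are made up for this sketch.

```shell
# Hypothetical helper: build the JSON patch for a given reclaim policy.
reclaim_patch() {
    printf '{"spec":{"persistentVolumeReclaimPolicy":"%s"}}\n' "$1"
}

# Hypothetical helper: set the reclaim policy (Retain or Delete) of one PersistentVolume.
set_reclaim_policy() {
    kubectl patch pv "$1" -p "$(reclaim_patch "$2")"
}

# List every PersistentVolume with its current policy and claim (not run automatically):
#   kubectl get pv -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name
# Example with a placeholder volume name:
#   set_reclaim_policy pvc-xxx Retain
```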

Nextcloud

Your SSO users will have new usernames, because the OIDC provider has been renamed from oas to stackspin and because the new SSO system uses UUIDs to uniquely identify users.

You can choose from these options:

  1. Manually re-upload and re-share your files after logging in as your new user for the first time.

  2. Transfer the files from your previous user to the new user. To do so, first find your new username: once you have logged out and logged back in to Nextcloud with the new SSO, it is shown in Settings -> Sharing after “Your Federated Cloud ID” (the part before the @).

    # Exec into the Nextcloud container
    kubectl exec -n stackspin-apps nc-nextcloud-xxx-xxx -it -- /bin/bash
    # Change to the www-data user
    su www-data -s /bin/bash
    # Repeat this command for each username
    php occ files:transfer-ownership oas-<old username> <new user ID>
    # Note: the files are transferred to a subfolder in the new user's
    # directory
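
With many users, it can help to generate the transfer commands from a list and review them before running anything. The sketch below assumes a made-up input format (one "old-username new-user-ID" pair per line) and a made-up helper name, transfer_commands; it only prints the commands and does not execute them.

```shell
# Hypothetical helper: read "old-username new-user-id" pairs from stdin and
# print one transfer command per pair, for review before running them.
transfer_commands() {
    while read -r old new; do
        # Skip blank lines and comment lines.
        case "$old" in ''|'#'*) continue ;; esac
        printf 'php occ files:transfer-ownership oas-%s %s\n' "$old" "$new"
    done
}

# Example: users.txt is a made-up file with one pair per line.
#   transfer_commands < users.txt
```

The printed commands are meant to be run as the www-data user inside the Nextcloud container, as in the steps above.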
    

Depending on when you first installed Nextcloud, the setup-apps job may fail during the upgrade. If that happens, execute these commands in order to update the failing apps to their newest version, and to remove old files that can cause problems.

kubectl exec -n stackspin-apps deployment/nc-nextcloud -- rm -r /var/www/html/custom_apps/onlyoffice
kubectl exec -n stackspin-apps deployment/nc-nextcloud -- rm -r /var/www/html/custom_apps/sociallogin
flux suspend hr -n stackspin-apps nextcloud && flux resume hr -n stackspin-apps nextcloud

Rocket.Chat

We replaced Rocket.Chat with Zulip in this release. If you want to migrate your Rocket.Chat data to your new Zulip installation, please refer to Import from Rocket.Chat.

Monitoring

The monitoring stack will work after the upgrade, but monitoring data from the previous version will not be available.

Wekan

In our testing we didn’t need to change anything for Wekan to work.

WordPress

In our testing we didn’t need to change anything for WordPress to work.

Upgrading to 0.7.0

Because of problems with Helm and secret management, we had to move away from using a Helm chart for application secrets. We now use scripts that run during installation to manage secrets. Because we have removed the oas-secrets Helm chart, Flux will remove the secrets that it generated. It is important that you back up these secrets before switching from v0.6 to v0.7!

Note

Before you start, make sure that you have the right yq tool installed, because you will need it later. There are two very different tools called yq. The one you need is the Go-based yq by Mike Farah. It installs a binary with the same name as python-yq, but the two have different command sets. You can install the right one by running sudo snap install yq or brew install yq, or by using one of the other methods from the yq installation instructions.

If you’re unsure which yq you have installed, look at the output of yq --help and make sure eval shows up under Available Commands:.
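
The check described above can also be scripted. This is a minimal sketch: the helper name help_lists_eval is made up, and it relies only on the rule stated in this guide, namely that the right yq lists an eval command in its help output.

```shell
# Hypothetical helper: succeed if the help text on stdin lists an "eval" command,
# which is how this guide tells the Go-based yq apart from python-yq.
help_lists_eval() {
    grep -Eq '^[[:space:]]+eval([[:space:]]|$)'
}

# Example (not run automatically):
#   if yq --help 2>&1 | help_lists_eval; then
#       echo "Go-based yq found"
#   else
#       echo "This looks like a different yq; install the one by Mike Farah" >&2
#   fi
```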

To back up your secrets, run the following script:

#!/usr/bin/env bash

mkdir secrets-backup

kubectl get secret -o yaml -n flux-system  oas-cluster-variables > secrets-backup/oas-cluster-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-wordpress-variables > secrets-backup/oas-wordpress-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-wekan-variables > secrets-backup/oas-wekan-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-single-sign-on-variables > secrets-backup/oas-single-sign-on-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-rocketchat-variables > secrets-backup/oas-rocketchat-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-kube-prometheus-stack-variables > secrets-backup/oas-kube-prometheus-stack-variables.yaml
kubectl get secret -o yaml -n oas          oas-prometheus-basic-auth > secrets-backup/oas-prometheus-basic-auth.yaml
kubectl get secret -o yaml -n oas          oas-alertmanager-basic-auth > secrets-backup/oas-alertmanager-basic-auth.yaml
kubectl get secret -o yaml -n flux-system  oas-oauth-variables > secrets-backup/oas-oauth-variables.yaml
kubectl get secret -o yaml -n flux-system  oas-nextcloud-variables > secrets-backup/oas-nextcloud-variables.yaml

This script assumes you have all applications enabled. You might get an error like:

Error from server (NotFound): secrets "oas-wekan-variables" not found

This is not a problem, but it does mean you need to add an oauth secret for Wekan to the file secrets-backup/oas-oauth-variables.yaml. Copy one of the lines under “data:”, rename the field to wekan_oauth_client_secret and enter a different random password. Make sure to base64 encode it without a trailing newline (echo -n "<your random password>" | base64).

This script creates a directory called secrets-backup and places the secrets that have been generated by Helm in it as yaml files.
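
If you need to add a missing oauth secret by hand, as described above for Wekan, one way to generate and encode a value is sketched below. The helper names encode_secret and random_password are made up, and the password length and character set are arbitrary choices rather than requirements of OpenAppStack.

```shell
# Hypothetical helper: base64-encode a value without adding a trailing newline.
encode_secret() {
    printf '%s' "$1" | base64
}

# Hypothetical helper: generate a random 32-character alphanumeric password.
random_password() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32
}

# Example: print an encoded value to paste into the "data:" section.
#   encode_secret "$(random_password)"
```

Using printf instead of a plain echo keeps the newline out of the encoded value.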

Now you can upgrade your cluster by running kubectl -n flux-system patch gitrepository openappstack --type merge -p '{"spec":{"ref":{"branch":"v0.7"}}}' or by editing the gitrepository object manually with kubectl -n flux-system edit gitrepository openappstack and setting spec.ref.branch to v0.7.

Flux will now start updating your cluster to version 0.7. This process will fail, because it will remove the secrets that you just backed up. Make sure that the oas-secrets helmrelease has been removed by running flux get hr -A. You might also see that some helmreleases start failing to be installed because important secrets do not exist anymore.

As soon as the oas-secrets helmrelease does not exist anymore, you can run the following code:

#!/usr/bin/env bash

# Again: make sure you use https://github.com/mikefarah/yq -- install with `snap install yq`
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-wordpress-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-wekan-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-single-sign-on-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-rocketchat-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-kube-prometheus-stack-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-prometheus-basic-auth.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-alertmanager-basic-auth.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-oauth-variables.yaml | kubectl apply -f -
yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-nextcloud-variables.yaml | kubectl apply -f -

Again, this script assumes you have all applications installed. If you get the following error, you can ignore it:

error: error validating "STDIN": error validating data: [apiVersion not set, kind not set]; if you choose to ignore these errors, turn validation off with --validate=false
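
The error appears when a backup file is empty, which happens when the secret did not exist at backup time: the shell still creates the output file even though kubectl get fails. A variant that skips such files could look like the sketch below. The helper names has_backup and restore_if_present are made up, and the sketch assumes the Go-based yq and kubectl are installed.

```shell
# Hypothetical helper: succeed only if the backup file exists and is not empty.
has_backup() {
    [ -s "$1" ]
}

# Hypothetical helper: restore one secret from its backup file, or report that
# it was skipped. Uses the same yq expression as the script above.
restore_if_present() {
    if has_backup "$1"; then
        yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' "$1" \
            | kubectl apply -f -
    else
        echo "Skipping $1: no backup found" >&2
    fi
}

# Example (not run automatically), using the same nine files as the script above:
#   for name in wordpress-variables wekan-variables single-sign-on-variables \
#       rocketchat-variables kube-prometheus-stack-variables prometheus-basic-auth \
#       alertmanager-basic-auth oauth-variables nextcloud-variables; do
#       restore_if_present "secrets-backup/oas-$name.yaml"
#   done
```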

Now Flux should be able to finish the update. Some helmreleases or kustomizations might have already failed because the secrets did not exist. If so, you can retrigger their reconciliation with flux reconcile kustomization ... or flux reconcile helmrelease .... This can take quite a while (sometimes over an hour), because Flux waits for some long timeouts before giving up and restarting a reconciliation.
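
If several helmreleases have failed, retriggering them one by one gets tedious. The sketch below picks the failed ones out of the flux get helmrelease -A output. It assumes the column order that appears in the sample output in this guide (namespace, name, ready, message, ...); other Flux versions may order the columns differently, so check your own output first. The helper name failed_releases is made up.

```shell
# Hypothetical helper: read `flux get helmrelease -A` output on stdin and print
# "namespace name" for every release whose third column (READY) is False.
failed_releases() {
    awk '$3 == "False" { print $1, $2 }'
}

# Example (not run automatically):
#   flux get helmrelease -A | failed_releases | while read -r ns name; do
#       flux reconcile helmrelease "$name" -n "$ns" < /dev/null
#   done
```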

Potential upgrade issues

Some errors we’ve seen during our own upgrade process, and how to solve them:

SSO helm upgrade failed

oas          single-sign-on          False Helm upgrade failed: template: single-sign-on/templates/secret-oauth2-clients.yaml:9:55: executing "single-sign-on/templates/secret-oauth2-clients.yaml" at <b64enc>: invalid value; expected string  0.2.2     False

This means that the single-sign-on helmrelease was created with empty oauth secrets. The secrets will get a value once the core kustomization is reconciled: flux reconcile ks core should solve the problem.

If that does not solve the problem, you should check if the secret contains a value for all the apps:

# kubectl get secret -n flux-system oas-oauth-variables -o yaml
apiVersion: v1
data:
  grafana_oauth_client_secret: <redacted>
  nextcloud_oauth_client_secret: <redacted>
  rocketchat_oauth_client_secret: <redacted>
  userpanel_oauth_client_secret: <redacted>
  wekan_oauth_client_secret: <redacted>
  wordpress_oauth_client_secret: <redacted>
...

If your secret lacks one of these variables, use kubectl edit to add it. You can use any password generator to generate a value. Make sure to base64 encode the value before you enter it in the secret.
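
Checking the key list by eye is error-prone, so a small filter can report what is missing. This is a sketch under the assumption that the six keys in the example above are the complete set your cluster needs; the helper name missing_oauth_keys is made up.

```shell
# Hypothetical helper: read the secret's YAML on stdin and print every expected
# key that is missing. The key list is taken from the example above.
missing_oauth_keys() {
    yaml=$(cat)
    for app in grafana nextcloud rocketchat userpanel wekan wordpress; do
        key="${app}_oauth_client_secret"
        printf '%s\n' "$yaml" | grep -q "^  ${key}:" || printf '%s\n' "$key"
    done
}

# Example (not run automatically):
#   kubectl get secret -n flux-system oas-oauth-variables -o yaml | missing_oauth_keys
```

No output means that all six keys are present.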

Loki upgrade retries exhausted

While running flux get helmrelease -A, you’ll see:

oas          loki                    False   upgrade retries exhausted         2.5.2     False

This happens sometimes because Loki takes a long time to upgrade. Usually it is solved by running flux reconcile hr loki -n oas again.

Upgrading to 0.6.0

A few things are important when upgrading to 0.6.0:

  • We now use Flux 2 and the installation procedure has been overhauled. For this reason we advise you to set up a completely new cluster.

  • Copy your configuration details from settings.yaml to a new .flux.env. See install/.flux.env.example and the Installation overview instructions for more information.

Please reach out to us if you are using, or plan to use, OAS in production.

Upgrading from 0.4.0 to 0.5.0

Unfortunately, we can’t ensure a smooth upgrade for this version either. Please read the section below on how to upgrade by installing the new OAS version from scratch after backing up your data.

Upgrading from 0.3.0 to 0.4.0

There is no easy upgrade path from version 0.3.0 to version 0.4.0. As far as we know, nobody was running OpenAppStack apart from the developers, so we assume this is not a problem.

If you do need to upgrade, this is how you can migrate your data: back up all the data under /var/lib/OpenAppStack/local-storage, create a new cluster using the installation instructions, and put the data back. This migration procedure might not work perfectly.

Use kubectl get pvc -A on your old cluster to get a mapping of all the PVC uuids (and thus their folder names in /var/lib/OpenAppStack/local-storage) to the pods they are bound to.
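
One possible way to take the backup is a plain tar archive, stored together with the PVC listing. This is only a sketch: the helper name backup_dir and the output file names are made up, and any other backup method works just as well.

```shell
# Hypothetical helper: archive a directory into a gzip-compressed tarball,
# keeping only the directory's own name as the path prefix.
backup_dir() {
    src=$1
    dest=$2
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}

# Example (not run automatically), on the old cluster:
#   kubectl get pvc -A > pvc-mapping.txt
#   backup_dir /var/lib/OpenAppStack/local-storage local-storage-backup.tar.gz
```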

Then, delete your old OpenAppStack, and install a new one with version number 0.4.0 or higher. You can upload your backed up data into /var/lib/OpenAppStack/local-storage. All PVCs will have new unique IDs (and thus different folder names). You have to manually match the folders from your backup with the new folders.

Additionally, if you want to re-use your old settings.yaml file, this data needs to be added to it:

backup:
  s3:
    # Disabled by default. To enable, change to `true` and configure the
    # settings below. You'll also want to add "velero" to the enabled
    # applications a bit further in this file.
    # Finally, you'll also need to provide access credentials as
    # secrets; see the documentation:
    # https://docs.openappstack.net/en/latest/installation_instructions.html#step-2-optional-cluster-backups-using-velero
    enabled: false
    # URL of S3 service. Please use the principal domain name here, without the
    # bucket name.
    url: "https://store.greenhost.net"
    # Region of S3 service that's used for backups.
    # For some on-premise providers this may be irrelevant, but the S3
    # apparently requires it at some point.
    region: "ceph"
    # Name of the S3 bucket that backups will be stored in.
    # This has to exist already: Velero will not create it for you.
    bucket: "openappstack-backup"
    # Prefix that's added to backup filenames.
    prefix: "test-instance"

# A whitelist of applications that will be enabled.
enabled_applications:
  # System components, necessary for the system to function.
  - 'cert-manager'
  - 'letsencrypt-production'
  - 'letsencrypt-staging'
  - 'ingress'
  - 'local-path-provisioner'
  - 'single-sign-on'
  # The backup system Velero is disabled by default, see settings under `backup` above.
  # - 'velero'
  # Applications.
  - 'grafana'
  - 'loki'
  - 'promtail'
  - 'nextcloud'
  - 'prometheus'
  - 'rocketchat'
  - 'wordpress'

Upgrading to 0.3.0

Upgrading from versions earlier than 0.3.0 requires manual intervention.

  • Move your local settings.yml file to a different location:

    cd CLUSTER_DIR
    mkdir -p ./group_vars/all/
    mv settings.yml ./group_vars/all/
    
  • Flux is now used to install and update applications. For that reason, we need you to remove all helm charts (WARNING: You will lose your data!):

    helm delete --purge oas-test-cert-manager oas-test-local-storage \
        oas-test-prometheus oas-test-proxy oas-test-files
    
    • After removing all helm charts, you probably also want to remove all the PVCs that are left behind. Flux will not re-use the database PVCs created for these applications. Find all the PVCs by running kubectl get pvc --namespace oas-apps and kubectl get pvc --namespace oas.