Logging
=======

Logs from pods and containers can be read in different ways:

- On the cluster node's filesystem at ``/var/log/pods/`` or
  ``/var/log/containers/``.
- Using `kubectl logs`_.
- Querying aggregated logs with Grafana, see below.

Central log aggregation
-----------------------

We use `Promtail`_, `Loki`_ and `Grafana`_ for easy access to aggregated logs.
The `Loki documentation`_ is a good starting point for understanding how this
setup works.

There are two ways of viewing aggregated logs:

* Via the Grafana web interface
* Using the ``logcli`` command line tool

Viewing logs in Grafana
~~~~~~~~~~~~~~~~~~~~~~~

The `Using Loki in Grafana`_ guide gets you started with querying your cluster
logs with Grafana. You will find the Loki Grafana integration on your cluster
at https://grafana.stackspin.example.org/explore together with some generic
query examples.

See :ref:`system_administration/logging:LogQL query examples` for more LogQL
query examples.

Query logs with logcli
~~~~~~~~~~~~~~~~~~~~~~~

Please refer to `logcli`_ for installing ``logcli`` on your laptop.
Then create a port-forward to your cluster using the ``kubectl`` tool:

.. code:: console

   $ kubectl -n stackspin port-forward pod/loki-0 3100

In another terminal you can now use ``logcli`` to query Loki like this:

.. code:: console

   $ logcli query '{app=~".+"}'

See :ref:`system_administration/logging:LogQL query examples` for more LogQL
query examples.

Search older messages (in this case the last week, with the output limited to
2000 lines):

.. code:: console

   $ logcli query --since=168h --limit=2000 --forward '{app="helm-controller"}'

LogQL query examples
~~~~~~~~~~~~~~~~~~~~

Please also refer to the `LogQL documentation`_ and the
`log queries documentation`_.

The queries below are written as you would enter them in Grafana. When passing
a query to ``logcli``, wrap the whole query in single quotes so that the shell
does not interpret it.

Query all aggregated logs (unfortunately we haven’t found a better way of
doing this, since LogQL always expects at least one stream label in the
selector):

.. code:: PromQL

   {app=~".+"}

Query all logs for a keyword:

.. code:: PromQL

   {app=~".+"} |= "error"

Query all k8s apps for errors using a case-insensitive regular expression
(the ``(?i)`` prefix turns off case sensitivity):

.. code:: PromQL

   {app=~".+"} |~ `(?i)(error|fail|exception|fatal)`

Flux
^^^^

`Flux`_ is responsible for installing applications. It uses four controllers:

- ``source-controller`` that tracks Helm and Git repositories like
  https://open.greenhost.net/stackspin/stackspin for updates.
- ``kustomize-controller`` to deploy ``kustomizations`` that often install
  ``helmreleases``.
- ``helm-controller`` to deploy the ``helmreleases``.
- ``notification-controller`` that is responsible for inbound and outbound
  Flux messages.

Query all messages from the ``source-controller``:

.. code:: PromQL

   {app="source-controller"}

Query all messages from the ``source-controller`` and the ``helm-controller``:

.. code:: PromQL

   {app=~"(source-controller|helm-controller)"}

``helm-controller`` messages containing ``wordpress``:

.. code:: PromQL

   {app="helm-controller"} |= "wordpress"

``helm-controller`` messages containing ``wordpress`` without ``unchanged``
events (to only show the installation messages):

.. code:: PromQL

   {app="helm-controller"} |= "wordpress" != "unchanged"

Filter out redundant ``helm-controller`` messages:

.. code:: PromQL

   {app="helm-controller"} !~ `(unchanged|event=refreshed|method=Sync|component=checkpoint)`

Cert-manager
^^^^^^^^^^^^

Cert-manager is responsible for requesting Let’s Encrypt TLS certificates.

Query ``cert-manager`` messages containing ``chat``:

.. code:: PromQL

   {app="cert-manager"} |= "chat"

Hydra
^^^^^

Hydra is the single sign-on system.

Show only warnings and errors from ``hydra``:

.. code:: PromQL

   {container_name="hydra"} != "level=info"

Debug OAuth2 single sign-on with Zulip:

.. code:: PromQL

   {container_name=~"(hydra|zulip)"}

Etc
^^^

Query Kubernetes events processed by the ``eventrouter`` app containing
``warning``:

.. code:: PromQL

   {app="eventrouter"} |~ "warning"

.. _kubectl logs: https://kubernetes.io/docs/concepts/cluster-administration/logging
.. _Promtail: https://grafana.com/docs/loki/latest/clients/promtail/
.. _Loki: https://grafana.com/oss/loki/
.. _Grafana: https://grafana.com/
.. _Loki documentation: https://grafana.com/docs/loki/latest/
.. _Using Loki in Grafana: https://grafana.com/docs/grafana/latest/datasources/loki
.. _logcli: https://grafana.com/docs/loki/latest/getting-started/logcli/
.. _LogQL documentation: https://grafana.com/docs/loki/latest/logql
.. _log queries documentation: https://grafana.com/docs/loki/latest/logql/log_queries/
.. _Flux: https://fluxcd.io/
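
Most of the query examples in this document follow the same pattern: a stream
selector on the ``app`` label, optionally followed by a ``|=`` (contains) and a
``!=`` (does not contain) line filter. As a minimal sketch, a small shell
helper could assemble such queries for use with ``logcli``. The function name
``logql_query`` and its argument order are hypothetical; it is not part of
``logcli`` or Stackspin.

.. code:: shell

   #!/bin/sh
   # Hypothetical helper (an illustration only, not shipped with any tool).
   # Usage: logql_query APP [INCLUDE] [EXCLUDE]
   # Prints a LogQL query that selects the stream app="APP", keeps lines
   # containing INCLUDE (if given) and drops lines containing EXCLUDE (if given).
   logql_query() {
       app=$1
       include=${2:-}
       exclude=${3:-}

       # Stream selector: LogQL needs at least one label matcher.
       query="{app=\"$app\"}"

       # Optional "line contains" filter.
       if [ -n "$include" ]; then
           query="$query |= \"$include\""
       fi

       # Optional "line does not contain" filter.
       if [ -n "$exclude" ]; then
           query="$query != \"$exclude\""
       fi

       printf '%s\n' "$query"
   }

With the port-forward from the ``logcli`` section running, the helper could be
combined with the options shown there, for example
``logcli query --since=168h --limit=2000 --forward "$(logql_query helm-controller wordpress unchanged)"``.
The values passed in are not escaped, so this sketch assumes they contain no
double quotes or backslashes.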