I wrote an earlier blog about the Kubernetes solution and how you can configure it. In this blog we will talk about an upgrade to that solution.
If you are already logged in to Oracle Cloud, you can navigate to https://cloud.oracle.com/loganalytics/oke_monitoring and choose the Kubernetes cluster you want to monitor.
Once you have chosen the cluster and clicked Next, you will get these options.
If it's your first time, keep the policies option checked. If the policies and dynamic groups have already been created, you can uncheck it.
The Metrics Server is needed to collect the metrics. If you have already deployed the Metrics Server in your cluster, you can uncheck this option; otherwise leave it checked.
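If you are not sure whether the Metrics Server is already in your cluster, a quick check from the command line can help before you decide on the checkbox. This is only a minimal sketch: the function name is made up for illustration, and it assumes the Metrics Server was installed under its usual upstream name (metrics-server) in the kube-system namespace, which may not match your cluster.

```shell
# Hypothetical helper: succeeds if a Deployment named "metrics-server"
# exists in "kube-system". Both the name and the namespace are assumptions
# based on the upstream defaults; adjust them if your install differs.
metrics_server_installed() {
  kubectl get deployment metrics-server -n kube-system >/dev/null 2>&1
}

# Example usage:
#   if metrics_server_installed; then
#     echo "Metrics Server found, the option can be unchecked"
#   else
#     echo "Metrics Server not found, leave the option checked"
#   fi
```

If the check fails, it only means nothing was found under that name and namespace, so leaving the option checked is the safer choice.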
For the solution deployment options, I will pick the manual deploy for private clusters. If your cluster is public, you can proceed with Enable cluster automatically.
Once you click on Configure log collection, it will auto-create a log group, and a Resource Manager stack will start creating the other resources based on your selection.
Since I have chosen manual deploy, the stack only creates the management agent key resource, and the Helm deployment is done manually. Once the workflow is completed, you will get a page to download the script (oci-kubernetes-monitoring-manual-deployment-script.sh) containing the helm command. It will look like this:
#!/bin/bash
helm repo add oci-onm https://oracle-quickstart.github.io/oci-kubernetes-monitoring
helm repo update
helm install oci-kubernetes-monitoring oci-onm/oci-onm ………….
Once these commands have been executed, the required k8s resources such as the fluentd daemonset, the management agent statefulset, the cronjob, etc. will be created, and the related pods should be in the Running state.
daemonset.apps/oci-onm-logan
statefulset.apps/oci-onm-mgmt-agent
cronjob.batch/oci-onm-discovery
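To confirm that the resources listed above were created and that their pods are running, you can query the cluster with kubectl. The sketch below is an assumption-laden example: the function name is hypothetical, and the default namespace oci-onm is a guess that you should replace with whatever namespace your helm install actually used.

```shell
# Hypothetical helper: list the monitoring workloads and their pods.
# The default namespace "oci-onm" is an assumption; pass your own
# namespace as the first argument if the chart was installed elsewhere.
show_onm_resources() {
  local ns="${1:-oci-onm}"
  kubectl get daemonset,statefulset,cronjob -n "$ns"
  kubectl get pods -n "$ns"
}

# Example usage:
#   show_onm_resources            # uses the assumed default namespace
#   show_onm_resources my-ns      # uses the namespace you pass in
```

The pods of the daemonset and the statefulset should show a Running status; pods created by the cronjob appear only when a scheduled run has happened.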
You will see a hyperlink with the cluster name and some metrics. If the hyperlink does not appear, there could be an issue with the policies, or the Metrics Server may not be installed.
Clicking on the cluster hyperlink will take you to a dashboard with some good insights. If there are any warning events, the k8s resource and its dependent resources will be shown in red.
You can scroll through the widgets on the right side. You will find a Kubernetes system widget and an OS system widget. These help in finding issues related to the overall Kubernetes cluster and the operating system.
If you expand this widget, you will get details like those in the screenshot below to help find issues.
You can view logs by right-clicking on the respective icon, which will take you to the Issues dashboard, comparing the logs to the last 60 minutes with the respective filters for the source applied.
Metrics:
You can apply a filter to see the metrics of a specific pod or node.
Status of the pod
You can look at the events for this pod as well.
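If you also want to cross-check the events for a pod from the command line, kubectl can filter events by the object they refer to. This is a rough sketch rather than part of the solution itself; the function name and the default namespace are placeholders chosen for illustration.

```shell
# Hypothetical helper: show events that refer to one pod, oldest first.
# First argument: pod name. Second argument: namespace (defaults to
# "default", which is only a placeholder assumption).
pod_events() {
  local pod="$1"
  local ns="${2:-default}"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}

# Example usage:
#   pod_events my-pod my-namespace
```

Keep in mind that the cluster only retains events for a limited time, so older events may show up in the dashboard but no longer in kubectl.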
You can also use the Log Explorer and other capabilities of Logging Analytics to analyse the container logs, OS logs, etc.
If you have any feedback, let me know in the comments.
Reference: Kubernetes Solution
NOTE: To make this even easier, there is a new feature in limited availability that enables observability for OKE during OKE creation or afterwards. Please refer to the blog for more information.