If you have not tried it yet, a ready-made solution is already available in the Oracle GitHub repo.
A marketplace stack is also available to deploy this solution in OCI (Oracle Cloud Infrastructure). As of now, this stack works only for public clusters.
In this blog I am not using the stack for the deployment; instead we will use the helm chart to deploy the solution. So what's the advantage of using the helm chart?
- You have more control, and it's applicable to both private and public clusters.
- You can automate it with GitOps tools like ArgoCD, FluxCD, etc.
- You can customize the solution as per your requirements.
I have created an OKE cluster using the Terraform module.
I used OCI Cloud Shell/Code Editor to connect to the cluster and run the helm commands. You can run the helm commands from your local system or from any other system that has connectivity to the Kubernetes API endpoint.
Create the dynamic group and policies
Dynamic group → ALL {resource.type='managementagent', resource.compartment.id='<OCI Management Agent Compartment OCID>'}
Policy → Allow dynamic-group <OCI Management Agent Dynamic Group> to use metrics in compartment <Compartment Name> WHERE target.metrics.namespace = 'mgmtagent_kubernetes_metrics'
The above dynamic group and policy are needed for the management agent to send Kubernetes-related metrics to the mgmtagent_kubernetes_metrics namespace in the specified compartment.
Dynamic group → All {instance.compartment.id = '<oke compartmentid>'}
Policy → Allow dynamic-group <oke dynamicgroup name> to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment <compartmentname>
The above dynamic group and policy are needed for fluentd to send logs from the OKE nodes to log groups in Logging Analytics.
If you have other compute instances in the same compartment, you can apply defined tags to the OKE instances and reference the tag in the dynamic group statement to control which instances are members of the dynamic group.
Example: All {instance.compartment.id = '<compartment_ocid>', tag.<tagnamespace>.<tagkey>.value='<tagvalue>'}
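The rules and policy statements above are easy to mistype, especially the quotes. As a rough sketch, the small shell helpers below render them from variables so they can be pasted into the console. The function names are hypothetical and only illustrate the statements quoted above; the OCIDs and names are whatever you pass in.

```shell
# Sketch only: hypothetical helpers that print the statements described above.

# Matching rule for the management agent dynamic group.
mgmt_agent_rule() {
  local compartment_ocid="$1"
  printf "ALL {resource.type='managementagent', resource.compartment.id='%s'}\n" "$compartment_ocid"
}

# Policy that lets the agent group publish to mgmtagent_kubernetes_metrics.
mgmt_agent_policy() {
  local group_name="$1" compartment_name="$2"
  printf "Allow dynamic-group %s to use metrics in compartment %s WHERE target.metrics.namespace = 'mgmtagent_kubernetes_metrics'\n" \
    "$group_name" "$compartment_name"
}

# Matching rule for the OKE worker nodes. The tag namespace, key and value
# are optional; pass them to narrow the group to tagged instances.
oke_nodes_rule() {
  local compartment_ocid="$1" tag_ns="${2:-}" tag_key="${3:-}" tag_value="${4:-}"
  if [ -n "$tag_ns" ]; then
    printf "All {instance.compartment.id = '%s', tag.%s.%s.value='%s'}\n" \
      "$compartment_ocid" "$tag_ns" "$tag_key" "$tag_value"
  else
    printf "All {instance.compartment.id = '%s'}\n" "$compartment_ocid"
  fi
}

# Policy that lets the OKE nodes upload logs to Logging Analytics.
oke_logs_policy() {
  local group_name="$1" compartment_name="$2"
  printf "Allow dynamic-group %s to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment %s\n" \
    "$group_name" "$compartment_name"
}
```

For example, `oke_nodes_rule "<compartment_ocid>"` prints the untagged rule, and adding three more arguments prints the tagged variant. The output can also be passed to the OCI CLI (`oci iam dynamic-group create --matching-rule ...` and `oci iam policy create --statements ...`) if you prefer to script this step.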
Create an overridevalues.yaml file with the below content:
global:
  # -- OCID for OKE cluster or a unique ID for other Kubernetes clusters.
  kubernetesClusterID:
  # -- Provide a unique name for the cluster. This would help in uniquely identifying the logs and metrics data at OCI Logging Analytics and OCI Monitoring respectively.
  kubernetesClusterName:
oci-onm-logan:
  # Go to OCI Logging Analytics Administration, click Service Details, and note the namespace value.
  ociLANamespace:
  # OCI Logging Analytics Log Group OCID
  ociLALogGroupID:
oci-onm-mgmt-agent:
  deployMetricServer: true
  mgmtagent:
    # Provide the base64 encoded content of the Management Agent Install Key file
    installKeyFileContent:
1. global.kubernetesClusterID → You can find this value by navigating to the Kubernetes cluster page in the OCI console and looking for Cluster Id.
2. global.kubernetesClusterName → This can be the name of the OKE cluster or any other identifier for the cluster.
3. oci-onm-logan.ociLANamespace → You can run the command "oci os ns get" in Cloud Shell to get the namespace value, or find it in the OCI console under Logging Analytics administration page → Service Details → Service Namespace.
4. oci-onm-logan.ociLALogGroupID → In the OCI console navigate to Logging Analytics administration → Log Groups and create one if you have not done so before. In the below image I have created a log group named testoke and copied its OCID to use for this parameter.
5. oci-onm-mgmt-agent.deployMetricServer → Set this to true (the default) if you have not installed the metric server in your Kubernetes cluster, and to false if you have already installed it in the kube-system namespace. The metric server is required to collect metrics.
6. oci-onm-mgmt-agent.mgmtagent.installKeyFileContent → Navigate to Management Agents → Downloads and Keys in the OCI console and create a new key if one has not been created already or the existing one has expired.
Once it is created, click the three dots next to the respective key and select Download key to file. This downloads a file containing the install key and other parameters.
You can remove all the commented lines and the AgentDisplayName entry, and keep just the ManagementAgentInstallKey key and its value.
The file content needs to be base64 encoded. Below is a sample command to get the base64 encoded content on one line (the value after -w is the digit zero):
cat <filename> | base64 -w 0
Use this base64 encoded value in overridevalues.yaml for installKeyFileContent.
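The key trimming, encoding and values file steps can be sketched together as below. This is only an illustration: both function names are hypothetical, the key file is assumed to hold a line starting with ManagementAgentInstallKey as described above, and `base64 | tr -d '\n'` is used instead of `-w 0` so it also works where `-w` is not supported.

```shell
# Sketch only: hypothetical helpers for preparing overridevalues.yaml.

# Keep only the ManagementAgentInstallKey line of the downloaded key file
# and print it base64 encoded on a single line.
encode_install_key() {
  local key_file="$1"
  grep -E '^[[:space:]]*ManagementAgentInstallKey' "$key_file" | base64 | tr -d '\n'
}

# Write the override values file.
# Arguments: cluster OCID, cluster name, Logging Analytics namespace,
#            log group OCID, encoded install key, output path.
write_override_values() {
  cat > "$6" <<EOF
global:
  kubernetesClusterID: $1
  kubernetesClusterName: "$2"
oci-onm-logan:
  ociLANamespace: $3
  ociLALogGroupID: $4
oci-onm-mgmt-agent:
  deployMetricServer: true
  mgmtagent:
    installKeyFileContent: $5
EOF
}

# Example usage (placeholders are yours to fill in):
#   key="$(encode_install_key <filename>)"
#   write_override_values "<cluster ocid>" "<cluster name>" "<namespace>" \
#     "<log group ocid>" "$key" overridevalues.yaml
```

Set deployMetricServer to false in the generated file if the metric server is already installed in your cluster.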
Let's install the helm chart by downloading the latest helm chart:
tar -zxvf helm-chart.tgz;cd charts
helm install <releasename> oci-onm -f <path to overridevalues.yaml>
Ex: helm install okemon oci-onm -f ../overridevalues.yaml
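Before running the install, it can help to confirm that none of the required values were left empty. The helper below is a rough text check, not a YAML parser, and its name is hypothetical; it only looks for the five keys from the overridevalues.yaml shown earlier.

```shell
# Sketch only: print each required key that has no value in the values file.
# A value counts as present if something other than whitespace or a
# comment follows the colon.
missing_values() {
  local values_file="$1" key
  for key in kubernetesClusterID kubernetesClusterName ociLANamespace \
             ociLALogGroupID installKeyFileContent; do
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]#]" "$values_file"; then
      echo "$key"
    fi
  done
}

# Example usage:
#   if [ -z "$(missing_values ../overridevalues.yaml)" ]; then
#     helm install okemon oci-onm -f ../overridevalues.yaml
#   fi
```

An empty output means all five keys have a value; it does not validate that the values themselves are correct.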
Once the helm chart is installed successfully, run the below command to check that all the Kubernetes resources are running fine.
kubectl get all -n oci-onm
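If the namespace has many pods, a small filter can make problems stand out. The sketch below assumes the standard column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the function name is hypothetical.

```shell
# Sketch only: read `kubectl get pods --no-headers` output on stdin and
# print the name and status of every pod that is neither Running nor Completed.
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Example usage:
#   kubectl get pods -n oci-onm --no-headers | pods_not_running
```

No output means every pod in the namespace reports Running or Completed.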
If everything went well, you should see the logs in the Logging Analytics log explorer, in the compartment where the log group is present.
The management agents should be running and collecting pod and container metrics. In the metrics explorer you will see the mgmtagent_kubernetes_metrics namespace listed as well.
Troubleshooting:
If the logs are not flowing in, connect to the OKE worker nodes and check the log file:
/var/log/oci-logging-analytics.log
If there is an issue with the policies, you will see errors like the one below:
ERROR -- : oci upload exception : Error while uploading the payload. Authorization failed for given oci_la_log_group_id against given Tenancy Namespace.
If everything is fine, you should see an INFO message like the one below in the logs:
INFO -- : The payload has been successfully uploaded to logAnalytics
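To get a quick picture without reading the whole file, you can count the two kinds of messages shown above. This is a minimal sketch with a hypothetical function name; it matches only the message text quoted in this post.

```shell
# Sketch only: count successful uploads and upload errors in the log file.
summarize_la_log() {
  local log_file="$1" ok err
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  ok="$(grep -c 'successfully uploaded to logAnalytics' "$log_file" || true)"
  err="$(grep -c 'oci upload exception' "$log_file" || true)"
  echo "uploads_ok=$ok upload_errors=$err"
}

# Example usage on a worker node:
#   summarize_la_log /var/log/oci-logging-analytics.log
```

A non-zero upload_errors count together with the authorization message usually points back to the dynamic group and policy setup.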
Most of the steps, such as policy creation, log group creation, and management agent key creation, can be automated via Terraform, the OCI CLI, etc.
The repo has Terraform code as well, which you can reuse if needed. Please let me know in the comments if you face any issues or have suggestions.
Steps to Enable OKE control plane logs:
OKE control plane logs are available now. Navigate to OCI Logging and enable service logs for Container Engine for Kubernetes.
You can choose a specific category or All log sources as per your needs.
Once the logs are available in OCI Logging, you can move them to Logging Analytics using a service connector. Please refer to this doc on how to create a service connector with Logging as the source and Logging Analytics as the target.
Once configured, you will see these logs in Logging Analytics under the Log Source named OKE Control Plane Logs.