1 - Create and use monitoring dashboards

You can use Cloud Monitoring to create custom dashboards and charts. Kf comes with a default template that you can use to create dashboards that monitor the performance of your applications.

Application performance dashboard

Run the following commands to deploy a dashboard to your Cloud Monitoring workspace that monitors the performance of your apps. The dashboard shows application performance metrics such as requests per second, round-trip latency, and HTTP error codes.

git clone https://github.com/google/kf
cd ./kf/dashboards
./create-dashboard.py my-dashboard my-cluster my-space

System resources and performance dashboard

You can view system resources and performance metrics, such as the list of nodes, pods, and containers, using a built-in dashboard. Use the link below to access the system dashboard.

System dashboard

More details about this dashboard can be found here.

Create SLOs and alerts

You can create SLOs and alerts on the available metrics to monitor the performance and availability of both the system and your applications. For example, you can use the metric istio.io/service/server/response_latencies to set up an alert on application round-trip latency.
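
As an illustration of the latency alert described above, the following minimal sketch builds an alert policy document in the JSON shape used by the Cloud Monitoring AlertPolicy resource. The function name, the display names, the 99th-percentile aligner, the 5-minute duration, and the k8s_container resource type in the filter are all assumptions chosen for illustration; only the metric name comes from this document. Check the threshold unit against the metric's descriptor before relying on it.

```python
import json


def latency_alert_policy(display_name, threshold, duration_seconds=300):
    """Return an alert policy dict for application round-trip latency.

    Hypothetical helper: the field values below are illustrative
    assumptions, not settings shipped with Kf.
    """
    metric_filter = (
        'metric.type="istio.io/service/server/response_latencies" '
        'AND resource.type="k8s_container"'  # assumed resource type
    )
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "High application round-trip latency",
                "conditionThreshold": {
                    "filter": metric_filter,
                    # Align the latency distribution to its 99th percentile
                    # over one-minute windows (an assumed choice).
                    "aggregations": [
                        {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_PERCENTILE_99",
                        }
                    ],
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    "duration": f"{duration_seconds}s",
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before it is applied.
    print(json.dumps(latency_alert_policy("Kf app latency", 500), indent=2))
```

The printed JSON is meant to be reviewed and adjusted before it is supplied to whichever tool you use to create alert policies.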

Configure dashboard access control

Follow these instructions to give developers and other team members access to dashboards. The role roles/monitoring.dashboardViewer provides read-only access to dashboards.

2 - Logging and monitoring

Kf can use GKE’s Google Cloud integrations to send a log of events to your Cloud Monitoring and Cloud Logging project for observability. For more information, see Overview of GKE operations.

Kf deploys two server-side components:

  1. Controller
  2. Webhook

To view the logs for these components, use the following Cloud Logging query:

resource.type="k8s_container"
resource.labels.project_id=<PROJECT ID>
resource.labels.location=<GCP ZONE>
resource.labels.cluster_name=<CLUSTER NAME>
resource.labels.namespace_name="kf"
labels.k8s-pod/app=<controller OR webhook>
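
If you run this query often, it can help to generate it from a few parameters. The sketch below assembles the same filter as a string; the function name and the example argument values are hypothetical, and the clauses simply mirror the query above with the placeholders filled in and quoted.

```python
def kf_component_log_filter(project_id, location, cluster_name, component):
    """Build the Cloud Logging filter for a Kf server-side component.

    Hypothetical helper: `component` must be "controller" or "webhook",
    the two components named in this document.
    """
    if component not in ("controller", "webhook"):
        raise ValueError("component must be 'controller' or 'webhook'")
    clauses = [
        'resource.type="k8s_container"',
        f'resource.labels.project_id="{project_id}"',
        f'resource.labels.location="{location}"',
        f'resource.labels.cluster_name="{cluster_name}"',
        'resource.labels.namespace_name="kf"',
        f'labels.k8s-pod/app="{component}"',
    ]
    # One comparison per line, matching the layout of the query above.
    return "\n".join(clauses)


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(kf_component_log_filter(
        "my-project", "us-central1-a", "my-cluster", "controller"))
```

Paste the printed filter into the Cloud Logging query editor to view the logs for the chosen component.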

3 - Logging and monitoring overview

By default, Kf includes native integration with Cloud Monitoring and Cloud Logging. When you create a cluster, both Cloud Monitoring and Cloud Logging are enabled by default. This integration lets you monitor your running clusters and helps you analyze system and application performance using advanced profiling and tracing capabilities.

Application-level performance metrics are provided by the Istio sidecar, which is injected alongside applications deployed via Kf. You can also create SLOs and alerts using this default integration to monitor the performance and availability of both the system and your applications.

Ensure the following are set up on your cluster:

4 - View logs

Kf provides you with several types of logs. This document describes these logs and how to access them.

Application logs

All logs written to standard output (stdout) and standard error (stderr) are uploaded to Cloud Logging and stored under the log name user-container.

Open Cloud Logging and run the following query:

resource.type="k8s_container"
log_name="projects/YOUR_PROJECT_ID/logs/user-container"
resource.labels.project_id=YOUR_PROJECT_ID
resource.labels.location=GCP_COMPUTE_ZONE (e.g. us-central1-a)
resource.labels.cluster_name=YOUR_CLUSTER_NAME
resource.labels.namespace_name=YOUR_KF_SPACE_NAME
resource.labels.pod_name:YOUR_KF_APP_NAME

You should see all the application logs written to standard output (stdout) and standard error (stderr).

Access logs for your applications

Kf provides access logs using Istio sidecar injection. Access logs are stored under the log name server-accesslog-stackdriver.

Open Cloud Logging and run the following query:

resource.type="k8s_container"
log_name="projects/YOUR_PROJECT_ID/logs/server-accesslog-stackdriver"
resource.labels.project_id=YOUR_PROJECT_ID
resource.labels.location=GCP_COMPUTE_ZONE (e.g. us-central1-a)
resource.labels.cluster_name=YOUR_CLUSTER_NAME
resource.labels.namespace_name=YOUR_KF_SPACE_NAME
resource.labels.pod_name:YOUR_KF_APP_NAME

You should see access logs for your application. Sample access log:

{
  "insertId": "166tsrsg273q5mf",
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "http://test-app-38n6dgwh9kx7h-c72edc13nkcm.***. ***.nip.io/",
    "requestSize": "738",
    "status": 200,
    "responseSize": "3353",
    "remoteIp": "10.128.0.54:0",
    "serverIp": "10.48.0.18:8080",
    "latency": "0.000723777s",
    "protocol": "http"
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "container_name": "user-container",
      "project_id": ***,
      "namespace_name": ***,
      "pod_name": "test-app-85888b9796-bqg7b",
      "location": "us-central1-a",
      "cluster_name": ***
    }
  },
  "timestamp": "2020-11-19T20:09:21.721815Z",
  "severity": "INFO",
  "labels": {
    "source_canonical_service": "istio-ingressgateway",
    "source_principal": "spiffe://***.svc.id.goog/ns/istio-system/sa/istio-ingressgateway-service-account",
    "request_id": "0e3bac08-ab68-408f-9b14-0aec671845bf",
    "source_app": "istio-ingressgateway",
    "response_flag": "-",
    "route_name": "default",
    "upstream_cluster": "inbound|80|http-user-port|test-app.***.svc.cluster.local",
    "destination_name": "test-app-85888b9796-bqg7b",
    "destination_canonical_revision": "latest",
    "destination_principal": "spiffe://***.svc.id.goog/ns/***/sa/sa-test-app",
    "connection_id": "82261",
    "destination_workload": "test-app",
    "destination_namespace": ***,
    "destination_canonical_service": "test-app",
    "upstream_host": "127.0.0.1:8080",
    "log_sampled": "false",
    "mesh_uid": "proj-228179605852",
    "source_namespace": "istio-system",
    "requested_server_name": "outbound_.80_._.test-app.***.svc.cluster.local",
    "source_canonical_revision": "asm-173-6",
    "x-envoy-original-dst-host": "",
    "destination_service_host": "test-app.***.svc.cluster.local",
    "source_name": "istio-ingressgateway-5469f77856-4n2pw",
    "source_workload": "istio-ingressgateway",
    "x-envoy-original-path": "",
    "service_authentication_policy": "MUTUAL_TLS",
    "protocol": "http"
  },
  "logName": "projects/*/logs/server-accesslog-stackdriver",
  "receiveTimestamp": "2020-11-19T20:09:24.627065813Z"
}
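
To show how an entry like this might be consumed, the following minimal sketch extracts a few fields from an exported access log entry. The function name and the choice of fields are assumptions for illustration; the embedded entry is a trimmed copy of the sample above, and the latency is converted from the "<seconds>s" string format seen in that sample.

```python
import json


def summarize_access_log(entry):
    """Pull the request method, status, latency, and workload from an entry.

    Hypothetical helper: expects the httpRequest and labels fields shown
    in the sample access log.
    """
    request = entry["httpRequest"]
    labels = entry.get("labels", {})
    # Latency is a string such as "0.000723777s"; strip the unit suffix.
    latency_seconds = float(request["latency"].rstrip("s"))
    return {
        "method": request["requestMethod"],
        "status": request["status"],
        "latency_ms": latency_seconds * 1000.0,
        "workload": labels.get("destination_workload"),
        "is_server_error": request["status"] >= 500,
    }


# Trimmed copy of the sample access log entry from this document.
SAMPLE_ENTRY = json.loads("""
{
  "httpRequest": {
    "requestMethod": "GET",
    "status": 200,
    "latency": "0.000723777s",
    "protocol": "http"
  },
  "severity": "INFO",
  "labels": {
    "destination_workload": "test-app",
    "response_flag": "-"
  }
}
""")

if __name__ == "__main__":
    print(summarize_access_log(SAMPLE_ENTRY))
```

A summary like this can be a starting point for ad hoc analysis of exported entries, such as counting server errors per workload.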

Audit logs

Audit logs provide a chronological record of calls made to the Kubernetes API server. Kubernetes audit log entries are useful for investigating suspicious API requests, collecting statistics, and creating monitoring alerts for unwanted API calls.

Open Cloud Logging and run the following query:

resource.type="k8s_container"
log_name="projects/YOUR_PROJECT_ID/logs/cloudaudit.googleapis.com%2Factivity"
resource.labels.project_id=YOUR_PROJECT_ID
resource.labels.location=GCP_COMPUTE_ZONE (e.g. us-central1-a)
resource.labels.cluster_name=YOUR_CLUSTER_NAME
protoPayload.request.metadata.name=YOUR_APP_NAME
protoPayload.methodName:"deployments."

You should see a trace of calls being made to the Kubernetes API server.

Configure logging access control

Follow these instructions to give developers and other team members access to logs. The role roles/logging.viewer provides read-only access to logs.

Use Logs Router

You can also use Logs Router to route the logs to supported destinations.