PCF - Logging, Scaling and High Availability

- December 09, 2017

How do you access application logs?
cf logs APP_NAME

cf start APP_NAME

To see logs from particular PCF subsystems only:

cf logs APP_NAME | grep "API\|CELL"

To exclude logs from particular subsystems:

cf logs APP_NAME | grep -v "API\|CELL"
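
The two filters above work because each log line carries an origin tag such as [API/0] or [CELL/0]. Here is a minimal sketch that can be run without a Cloud Foundry target. The sample lines and the sample_logs helper are made up for illustration and only modeled on the usual cf logs layout.

```shell
# Sample lines modeled on `cf logs` output; the contents are illustrative, not real logs.
sample_logs() {
  cat <<'EOF'
2017-12-09T10:00:01.00+0000 [API/0] OUT Updated app with guid 1234
2017-12-09T10:00:02.00+0000 [CELL/0] OUT Creating container
2017-12-09T10:00:03.00+0000 [APP/PROC/WEB/0] OUT Server started on port 8080
2017-12-09T10:00:04.00+0000 [RTR/1] OUT example.com - [GET / HTTP/1.1] 200
EOF
}

# Keep only Cloud Controller (API) and Diego cell (CELL) lines.
sample_logs | grep "API\|CELL"

# Drop those lines, leaving application and router output.
sample_logs | grep -v "API\|CELL"
```

Against a real app, replace sample_logs with cf logs APP_NAME.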

To see application events such as start, stop, and crash:

cf events APP_NAME

To display all the lines in the Loggregator buffer

cf logs APP_NAME --recent

 What are the components of the Loggregator system?

Loggregator is the next generation system for aggregating and streaming logs and metrics from all of the user apps and system components in a Cloud Foundry deployment.

Primary uses:
1. Tail or dump logs using the CLI.
2. Stream logs to a third-party log archive and analysis service.
3. Operators and admins can access the Loggregator Firehose, the combined stream of logs from all the apps and metrics data.
4. Operators can deploy nozzles to the Firehose.

A nozzle is a component that monitors the Firehose for specified events and metrics, and streams this data to external services.

The Loggregator system uses gRPC for communication between the Metron Agent and the Doppler, and between the Doppler and the Traffic Controller. This improves the stability and the performance of the Loggregator system, but it may require operators to scale their Dopplers.

Source

Sources are logging agents that run on the Cloud Foundry components.

Metron

Metron agents are co-located with sources. They collect logs and forward them to the Doppler servers.

Doppler

Dopplers gather logs from the Metron agents, store them in temporary buffers, and forward them to the Traffic Controller or to third-party syslog drains.

Traffic Controller

The Traffic Controller handles client requests for logs. It gathers and collates messages from all Doppler servers, and provides external API and message translation as needed for legacy APIs. The Traffic Controller also exposes the Firehose.

Firehose

The Firehose is a WebSocket endpoint that streams all the event data coming from a Cloud Foundry deployment. The data stream includes logs, HTTP events, and container metrics from all applications, and metrics from all Cloud Foundry system components. Logs from system components such as the Cloud Controller are not included in the Firehose and are typically accessed through rsyslog configuration.

Because the data coming from the Firehose may contain sensitive information, such as customer information in the application logs, only users with the correct permissions can access the Firehose.

 How do you scale an application manually?

cf scale APP_NAME -i INSTANCES

 What are the four levels of high-availability provided by PCF?

1. Availability zones

PCF supports deploying application instances across multiple AZs. This level of high availability requires that you define AZs in your IaaS. PCF balances the applications you deploy across the AZs you defined. If an AZ goes down, you still have application instances running in another.
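
To picture what balancing across AZs means, here is a minimal sketch that spreads instances over AZs round-robin. The spread_instances helper and the AZ names are hypothetical, and round-robin is only a stand-in; the real placement logic in PCF is more involved.

```shell
# Illustrative only: spread N app instances across the given AZ names
# round-robin. This is a simplification, not PCF's actual placement algorithm.
spread_instances() {
  instances=$1
  shift
  # awk sees the AZ names as fields; instance i goes to field (i mod NF) + 1.
  echo "$@" | awk -v n="$instances" '{
    for (i = 0; i < n; i++) print "instance " i " -> " $(i % NF + 1)
  }'
}

spread_instances 5 az1 az2 az3
```

With five instances and three AZs, no AZ holds more than two instances, so losing any single AZ still leaves at least three running.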

2. Health management of app instances

If you lose application instances for any reason, such as a bug in the app or an AZ going down, PCF starts new instances to maintain capacity. Under the Diego architecture, the nsync, BBS, and Cell Rep components track the number of instances of each application that are running across all of the Diego cells. When these components detect a discrepancy between the actual state of the app instances in the cloud and the desired state as known by the Cloud Controller, they advise the Cloud Controller of the difference and the Cloud Controller initiates the deployment of new application instances.
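
The comparison of desired state and actual state can be pictured with a small sketch. The instance_delta and reconcile helpers are hypothetical and deliberately simplified; this is not how nsync, BBS, or the Cell Rep are implemented.

```shell
# Hypothetical helper: difference between desired and actual instance counts.
# Positive means instances are missing, negative means there are too many.
instance_delta() {
  desired=$1
  actual=$2
  echo $(( desired - actual ))
}

# Turn the difference into the action needed to converge on the desired state.
reconcile() {
  delta=$(instance_delta "$1" "$2")
  if [ "$delta" -gt 0 ]; then
    echo "start $delta instance(s)"
  elif [ "$delta" -lt 0 ]; then
    echo "stop $(( 0 - delta )) instance(s)"
  else
    echo "in sync"
  fi
}

# Example: three instances desired, but only one survived an AZ outage.
reconcile 3 1
```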

3. Process monitoring

PCF uses a BOSH agent, monit, to monitor the processes on the component VMs that work together to keep your applications running, such as nsync, BBS, and Cell Rep. If monit detects a failure, it restarts the process and notifies the BOSH agent on the VM. The BOSH agent notifies the BOSH Health Monitor, which triggers responders through plugins such as email notifications or paging.

4. Resurrection of VMs

BOSH detects if a VM is present by listening for heartbeat messages that are sent from the BOSH agent every 60 seconds. The BOSH Health Monitor listens for those heartbeats. When the Health Monitor finds that a VM is not responding, it passes an alert to the Resurrector component. If the Resurrector is enabled, it sends the IaaS a request to create a new VM instance to replace the one that failed.
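
Heartbeat-based detection boils down to checking how long ago each VM last reported in. Here is a minimal sketch, assuming input lines of the form "VM_NAME LAST_HEARTBEAT_EPOCH". The find_unresponsive helper, the VM names, and the 120-second timeout are assumptions made for illustration, not BOSH defaults.

```shell
# Illustrative only: print the names of VMs whose last heartbeat is older than
# TIMEOUT seconds. Reads "VM_NAME LAST_HEARTBEAT_EPOCH" lines from stdin.
# The timeout is an assumption for this sketch, not a BOSH setting.
find_unresponsive() {
  now=$1
  timeout=$2
  awk -v now="$now" -v timeout="$timeout" '(now - $2) > timeout { print $1 }'
}

# Example: at time 1010, diego_cell/1 last reported 130 seconds ago.
printf '%s\n' "diego_cell/0 1000" "diego_cell/1 880" "router/0 990" |
  find_unresponsive 1010 120
```

In this picture, each name printed is a VM that would be handed to the Resurrector for replacement.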

 What is the difference between scaling up and scaling out?

To scale vertically/up (increasing memory per instance):
cf scale APP_NAME -m 1G

Note: Changing the memory limit restarts the app, so expect downtime.

To scale horizontally/out (increasing the number of instances):
cf scale APP_NAME -i 3

Note: No downtime; the app will run with a total of 3 instances.
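
One way to see the difference is to compare the total memory the app reserves in each case. This is a small sketch; the total_memory_mb helper is hypothetical and the 512M baseline is an assumed starting point, not something cf reports.

```shell
# Hypothetical helper: total memory (MB) reserved across all instances of an app.
total_memory_mb() {
  instances=$1
  memory_mb=$2
  echo $(( instances * memory_mb ))
}

total_memory_mb 1 512    # assumed baseline: 1 instance at 512M
total_memory_mb 1 1024   # after `cf scale APP_NAME -m 1G` (scale up)
total_memory_mb 3 512    # after `cf scale APP_NAME -i 3` (scale out)
```

Scaling up makes each instance bigger, while scaling out adds more instances of the same size.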

Raj kumar Bhakthavachalam
