Elastic Beanstalk - how to set up CloudWatch Logs

CloudWatch Logs is an AWS service for collecting and monitoring system and application logs. At a high level, the setup looks like this:

  • install the CloudWatch Logs agent to collect log data and send it to the CloudWatch Logs service
  • define log metric filters to extract useful data, such as the total number of errors or information about specific events
  • create alarms for the metrics to get notifications about log events
  • make sure that the instance role has permissions to push logs to CloudWatch (see comments for details about this issue)
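
As a rough illustration of the last point, here is a minimal sketch that builds an IAM policy document for the instance role. The list of actions is my assumption of what the agent typically needs (create log groups and streams, describe streams, put events), and the wildcard resource ARN is only a placeholder, so narrow both to your own setup.

```python
import json

# Assumption: actions the CloudWatch Logs agent typically needs; verify for your setup.
CWL_AGENT_ACTIONS = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
    "logs:DescribeLogStreams",
]


def build_logs_policy(resource_arn="arn:aws:logs:*:*:*"):
    """Return an IAM policy document (as a dict) that lets the instance push logs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(CWL_AGENT_ACTIONS),
                "Resource": [resource_arn],  # placeholder: restrict to your log groups
            }
        ],
    }


policy_json = json.dumps(build_logs_policy(), indent=2)
print(policy_json)
```

The printed JSON can be attached to the instance role as an inline policy.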

All of this configuration can be done using Elastic Beanstalk config files. In that case the CloudWatch Logs setup is applied automatically whenever a new environment is launched or an existing environment is updated.

There is an example of the configuration in the Elastic Beanstalk docs: Using Elastic Beanstalk with Amazon CloudWatch Logs. The Setting Up CloudWatch Logs Integration with Configuration Files section briefly describes how to use the config examples for different environments, but there is no detailed information about the config files themselves, and they are complex. It is easy to use the examples as is, but not so easy to adapt them to your own logs. I will start with a review of the Apache access log example and then show how to change it to collect data from the error log as well.

The example for PHP and Python is an archive containing:

cloudwatchlogs-apache/
  cwl-setup.config
  eb-logs.config
  cwl-webrequest-metrics.config

The files need to be placed under the .ebextensions folder. I recommend extracting the archive into .ebextensions/cloudwatchlogs-apache. You can also put the archive file itself there, but then it is not convenient to view or edit the configs.

The first two files (cwl-setup.config and eb-logs.config) are generic and can be used as is. These files set up the CloudWatch Logs agent on the instance and configure Elastic Beanstalk log publication to S3.

The last one (cwl-webrequest-metrics.config) is an example of a CloudWatch Logs setup for Apache's access log.

The config file consists of several sections: 'Mappings', 'Outputs' and 'Resources'. There are cross-references between these sections; for example, filters defined in the 'Mappings' section are used later in the 'Resources' section to configure metric filters.

Unfortunately, there is no complete description of this config format in the Elastic Beanstalk documentation. As far as I understand, this is actually a CloudFormation template, but written in YAML (the Elastic Beanstalk config format) instead of JSON (the regular CloudFormation template format).

General template structure and information about its sections can be found here: CloudFormation - Template Anatomy.

Let's examine the cwl-webrequest-metrics.config file. The first section is 'Mappings':

# Apache access log pattern:
## "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
## remote_host  remote_logname  remote_user  time_received "request" status response_size "referrer" "user-agent"
Mappings:
  CWLogs:
    WebRequestLogGroup:
      LogFile: "/var/log/httpd/access_log"
      TimestampFormat: "%d/%b/%Y:%H:%M:%S %z"
    FilterPatterns:
      Http4xxMetricFilter: "[..., status=4*, size, referer, agent]"
      HttpNon4xxMetricFilter: "[..., status!=4*, size, referer, agent]"
      Http5xxMetricFilter: "[..., status=5*, size, referer, agent]"
      HttpNon5xxMetricFilter: "[..., status!=5*, size, referer, agent]"

Here we see some definitions: the log file path (LogFile), the log timestamp format (TimestampFormat) and the filter patterns (Http4xxMetricFilter, HttpNon4xxMetricFilter, ...). These definitions work like constants defined at the top of the template and are referenced from other sections of the file.

The filter patterns are used to set up metric filters for the access log. Here we have patterns that match all requests with a 4XX response code, all requests with a non-4XX code, all 5XX responses and all non-5XX responses.
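
To make the intent of these patterns concrete, here is a small local approximation in Python. It is not the CloudWatch matching engine: it just splits an access log line into fields, takes the status as the fourth field from the end (the way "..." skips the leading fields), and applies a glob like 4*. The sample log lines and the helper names status_of and matches are made up for illustration.

```python
import fnmatch
import shlex


def status_of(access_log_line):
    """Return the status field, counted from the end of the line."""
    # shlex keeps "quoted strings" together, so the last four fields are
    # always: status, size, referer, agent
    fields = shlex.split(access_log_line)
    return fields[-4]


def matches(access_log_line, status_glob):
    """Approximate a pattern like [..., status=4*, size, referer, agent]."""
    return fnmatch.fnmatchcase(status_of(access_log_line), status_glob)


# Made-up sample lines in the Apache combined log format
SAMPLE_404 = '10.0.0.1 - - [14/May/2015:16:26:54 +0000] "GET /missing HTTP/1.1" 404 209 "-" "curl/7.40.0"'
SAMPLE_200 = '10.0.0.1 - - [14/May/2015:16:26:55 +0000] "GET / HTTP/1.1" 200 1024 "-" "curl/7.40.0"'

print(matches(SAMPLE_404, "4*"))      # Http4xxMetricFilter would count this line
print(not matches(SAMPLE_200, "4*"))  # HttpNon4xxMetricFilter would count this line
```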

The TimestampFormat setting is used by the CloudWatch Logs agent to extract timestamps from log records, so it is important to verify that the format is set correctly. The timestamp format is the same as the one used by Python's strptime function. More information about the format can be found in the CloudWatch Logs agent setup file: look around the middle of the file for the description of the datetime_format parameter, where there is a table with placeholders and examples.

Since the timestamp format is the same as the one used by Python, it is easy to test with the Python interpreter. Start Python from the command line and use code like this:

  >>> import time
  >>> time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')

In some cases it can be hard to set the timestamp format because there may be no placeholders that express the real format used in the log. In this case, try to match at least part of the timestamp.

For example, the Apache error log has a timestamp like Sun May 17 21:59:15.837463 2015. The problem is that there is a fractional part of the second in the middle, before the year, and the official Python docs don't have a placeholder for this case (edit: there is %f for microseconds, but let's pretend we don't know this).

The pattern I used for the Apache error log is %a %b %d %H:%M:%S (short weekday, short month name, day, hour:minute:second). It matches only the start of the timestamp and does not include the year, but it works well, and I guess that the CloudWatch agent takes the current year as the default.
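
For completeness, here is a small sketch showing that plain Python can parse the full Apache error log timestamp once %f is used. This only demonstrates Python's own strptime; I have not verified that the CloudWatch agent accepts %f in datetime_format, so treat that part as an assumption to test. The constant and function names are mine.

```python
from datetime import datetime

# Full format including microseconds (%f) and the trailing year
APACHE_ERROR_FORMAT = "%a %b %d %H:%M:%S.%f %Y"


def parse_apache_error_timestamp(text):
    """Parse a timestamp as it appears in the Apache 2.4 error log."""
    # %a and %b depend on the locale; this assumes English names
    return datetime.strptime(text, APACHE_ERROR_FORMAT)


parsed = parse_apache_error_timestamp("Sun May 17 21:59:15.837463 2015")
print(parsed.isoformat())
```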

Next section in the config file is Outputs:

Outputs:
  WebRequestCWLogGroup:
    Description: "The name of the Cloudwatch Logs Log Group created for this environments web server access logs. You can specify this by setting the value for the environment variable: WebRequestCWLogGroup. Please note: if you update this value, then you will need to go and clear out the old cloudwatch logs group and delete it through Cloudwatch Logs."
    Value: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup"}

Here we describe the CloudWatch Logs log group. A log group is a top-level entity, like "production-apache-errors", "staging-syslog", etc. Inside the group we have streams, and each stream contains log records from a single EC2 instance.

Next section is Resources where we define AWS resources used in our setup:

Resources :
  AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup:    ## Must have prefix:  AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0
    Type: "AWS::Logs::LogGroup"
    DependsOn: AWSEBBeanstalkMetadata
    DeletionPolicy: Retain     ## this is required
    Properties:
      LogGroupName:
        "Fn::GetOptionSetting":
          Namespace: "aws:elasticbeanstalk:application:environment"
          OptionName: WebRequestCWLogGroup
          DefaultValue: {"Fn::Join":["-", [{ "Ref":"AWSEBEnvironmentName" }, "webrequests"]]}
      RetentionInDays: 14

Above is the definition of the log group resource. Notice the "Fn::FunctionName" constructs: the config file also has a mini-language for referring to other sections of the config or joining strings. For example, {"Fn::Join":["-", [{ "Ref":"AWSEBEnvironmentName" }, "webrequests"]]} takes the Elastic Beanstalk environment name and appends '-webrequests' to it, so the default log group name will be "environment-webrequests".

A function call like {"Fn::FindInMap":["CWLogs", "WebRequestLogGroup", "TimestampFormat"]} (it is used below) looks up the TimestampFormat value in the Mappings config section.
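
To show what these functions do, here is a toy resolver in Python. It is only my mental model of Ref, Fn::Join and Fn::FindInMap, not how CloudFormation is implemented, and the environment name "my-env" is a made-up value.

```python
def resolve(node, mappings, refs):
    """Evaluate a tiny subset of template functions on plain Python data."""
    if isinstance(node, dict) and len(node) == 1:
        name, arg = next(iter(node.items()))
        if name == "Ref":
            # Look up a named value, e.g. the environment name
            return refs[arg]
        if name == "Fn::Join":
            delimiter, parts = arg
            return delimiter.join(resolve(part, mappings, refs) for part in parts)
        if name == "Fn::FindInMap":
            map_name, top_key, second_key = (resolve(a, mappings, refs) for a in arg)
            return mappings[map_name][top_key][second_key]
    # Plain strings and anything unrecognized are returned unchanged
    return node


MAPPINGS = {
    "CWLogs": {
        "WebRequestLogGroup": {
            "LogFile": "/var/log/httpd/access_log",
            "TimestampFormat": "%d/%b/%Y:%H:%M:%S %z",
        }
    }
}
REFS = {"AWSEBEnvironmentName": "my-env"}  # hypothetical environment name

group_name = resolve(
    {"Fn::Join": ["-", [{"Ref": "AWSEBEnvironmentName"}, "webrequests"]]},
    MAPPINGS,
    REFS,
)
timestamp_format = resolve(
    {"Fn::FindInMap": ["CWLogs", "WebRequestLogGroup", "TimestampFormat"]},
    MAPPINGS,
    REFS,
)
print(group_name)
print(timestamp_format)
```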

You can find more information on functions here:

Next resource in the Resources section is an autoscaling group:

  ## Register the files/log groups for monitoring
  AWSEBAutoScalingGroup:
    Metadata:
      "AWS::CloudFormation::Init":
        CWLogsAgentConfigSetup:
          files:
            ## any .conf file put into /tmp/cwlogs/conf.d will be added to the cwlogs config (see cwl-agent.config)
            "/tmp/cwlogs/conf.d/apache-access.conf":
              content : |
                [apache-access_log]
                file = `{"Fn::FindInMap":["CWLogs", "WebRequestLogGroup", "LogFile"]}`
                log_group_name = `{ "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }`
                log_stream_name = {instance_id}
                datetime_format = `{"Fn::FindInMap":["CWLogs", "WebRequestLogGroup", "TimestampFormat"]}`
              mode  : "000400"
              owner : root
              group : root

Here we describe the autoscaling group and say that during initialization of a new EC2 instance (AWS::CloudFormation::Init) the /tmp/cwlogs/conf.d/apache-access.conf file should be created. This is a CloudWatch agent configuration file that instructs the agent to collect data from the Apache access log. We also describe the content of this file:

content : |
  [apache-access_log]
  ## We take file name from the Mappings - CWLogs - WebRequestLogGroup - LogFile
  file = `{"Fn::FindInMap":["CWLogs", "WebRequestLogGroup", "LogFile"]}`
  ## Log group is a reference to the log group resource we defined above
  log_group_name = `{ "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }`
  ## Log stream name is an instance id
  log_stream_name = {instance_id}
  ## datetime_format for the CloudWatch agent is also defined in the Mappings section above
  datetime_format = `{"Fn::FindInMap":["CWLogs", "WebRequestLogGroup", "TimestampFormat"]}`

You can find more information about the CloudWatch agent configuration here.
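
Once the template functions are resolved, the generated file is a plain INI-style config. Below is a sketch of how such a file could be sanity-checked locally with Python's configparser. The rendered values (for example the log group name my-env-webrequests) are made up, and reading the file with configparser is my assumption about its format, not a statement about how the agent parses it.

```python
import configparser

# Hypothetical rendered content of /tmp/cwlogs/conf.d/apache-access.conf
RENDERED_CONF = """\
[apache-access_log]
file = /var/log/httpd/access_log
log_group_name = my-env-webrequests
log_stream_name = {instance_id}
datetime_format = %d/%b/%Y:%H:%M:%S %z
"""


def load_agent_conf(text):
    """Parse the INI-style text into {section: {option: value}}."""
    # interpolation=None keeps '%d', '%b', ... literal instead of
    # treating '%' as configparser's interpolation marker
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


conf = load_agent_conf(RENDERED_CONF)
for option, value in conf["apache-access_log"].items():
    print(option, "=", value)
```

Note the interpolation=None argument: with the default settings configparser would raise an error when reading a value such as %d/%b/%Y.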

Note: it is not always convenient to have one stream per instance, see the follow-up note on how to set up one stream for all instances.

The next four resources are metric filters. These filters extract and count messages with specific status codes from the Apache access log.

  #######################################
  ## Cloudwatch Logs Metric Filters

  AWSEBCWLHttp4xxMetricFilter :
    Type : "AWS::Logs::MetricFilter"
    Properties :
      LogGroupName: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }
      FilterPattern : {"Fn::FindInMap":["CWLogs", "FilterPatterns", "Http4xxMetricFilter"]}
      MetricTransformations :
        - MetricValue : 1
          MetricNamespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
          MetricName : CWLHttp4xx

  AWSEBCWLHttpNon4xxMetricFilter :
    Type : "AWS::Logs::MetricFilter"
    DependsOn : AWSEBCWLHttp4xxMetricFilter
    Properties :
      LogGroupName: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }
      FilterPattern : {"Fn::FindInMap":["CWLogs", "FilterPatterns", "HttpNon4xxMetricFilter"]}
      MetricTransformations :
        - MetricValue : 0
          MetricNamespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
          MetricName : CWLHttp4xx

  AWSEBCWLHttp5xxMetricFilter :
    Type : "AWS::Logs::MetricFilter"
    Properties :
      LogGroupName: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }
      FilterPattern : {"Fn::FindInMap":["CWLogs", "FilterPatterns", "Http5xxMetricFilter"]}
      MetricTransformations :
        - MetricValue : 1
          MetricNamespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
          MetricName : CWLHttp5xx

  AWSEBCWLHttpNon5xxMetricFilter :
    Type : "AWS::Logs::MetricFilter"
    DependsOn : AWSEBCWLHttp5xxMetricFilter
    Properties :
      LogGroupName: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebRequestLogGroup" }
      FilterPattern : {"Fn::FindInMap":["CWLogs", "FilterPatterns", "HttpNon5xxMetricFilter"]}
      MetricTransformations :
        - MetricValue : 0
          MetricNamespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
          MetricName : CWLHttp5xx

Metric filters use the filter patterns defined in the Mappings section. Pattern syntax is described here and it is convenient to test patterns in the AWS console:

  • Open the CloudWatch service in the AWS console
  • Click Logs on the left, select the log group on the right
  • Click the Create Metric Filter button

Now you can select existing log data if you already have some, or just paste some text from a local log file, then enter a filter pattern and test it against the log data.

For example, by entering a filter like [timestamp, type = *, ...] you can see what is taken as the timestamp. If we have log data like this:

[Thu May 14 16:26:54.396347 2015] [suexec:notice] [pid 2631] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Thu May 14 16:26:54.405887 2015] [auth_digest:notice] [pid 2631] AH01757: generating secret for digest authentication ...
[Thu May 14 16:26:54.406397 2015] [lbmethod_heartbeat:notice] [pid 2631] AH02282: No slotmem from mod_heartmonitor
[Thu May 14 16:26:54.407930 2015] [mpm_prefork:notice] [pid 2631] AH00163: Apache/2.4.10 (Amazon) mod_wsgi/3.5 Python/2.7.5 configured -- resuming normal operations

Then the test result for the pattern [timestamp, type = *, ...] will be this:

Line Number $timestamp                       $type               $3       ...
1           Thu May 14 16:26:54.396347 2015  suexec:notice       pid 2631
2           Thu May 14 16:26:54.405887 2015  auth_digest:notice  pid 2631
3           Thu May 14 16:26:54.406397 2015  ...

Here the data is treated as space-delimited, so each word becomes a data column. Spaces can be "escaped" with square brackets: for example, [Thu May 07 07:03:48.655204 2015] will be considered one field.
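
The splitting rule can be approximated locally with a short Python sketch. This is my own simplified tokenizer, not the CloudWatch implementation: it treats text in square brackets as one field and, as an additional assumption based on how the quoted referer and agent fields behave in the access log patterns, text in double quotes as one field too.

```python
import re

# A field is [bracketed text], "quoted text", or a run of non-space characters
FIELD_RE = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')


def split_fields(line):
    """Split a log line into fields, keeping bracketed or quoted text together."""
    # Exactly one of the three groups takes part in each match;
    # lastindex tells us which one
    return [match.group(match.lastindex) for match in FIELD_RE.finditer(line)]


LINE = (
    "[Thu May 14 16:26:54.396347 2015] [suexec:notice] [pid 2631] "
    "AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)"
)

fields = split_fields(LINE)
print(fields[0])  # the whole timestamp as one field
print(fields[1])  # the "type" column
print(fields[2])  # the third column
```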

For this reason, in custom application logs it is better to use square brackets to delimit log record fields. For example, it is better to use [2015-05-14 19:00:03] [WARNING] -- the warning message instead of 2015-05-14 19:00:03 WARNING -- the warning message.
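
If the application is written in Python, a bracketed format like this can be produced with the standard logging module. This is just a sketch; the logger name "app" and the file name app.py are placeholders.

```python
import logging

# Square brackets around the timestamp and level keep each one a single field
BRACKETED_FORMAT = "[%(asctime)s] [%(levelname)s] -- %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_formatter():
    """Return a formatter producing lines like '[2015-05-14 19:00:03] [WARNING] -- text'."""
    return logging.Formatter(fmt=BRACKETED_FORMAT, datefmt=DATE_FORMAT)


# Build a record by hand so the example does not depend on logger configuration
record = logging.LogRecord(
    name="app",
    level=logging.WARNING,
    pathname="app.py",
    lineno=0,
    msg="the warning message",
    args=(),
    exc_info=None,
)
line = make_formatter().format(record)
print(line)
```

In a real application the formatter would be attached to a file handler with handler.setFormatter(make_formatter()).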

Note: in the Mappings section we also specify TimestampFormat, but it has no effect on filters and is only used by the CloudWatch agent.

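And finally we define alarms that watch the metrics above and generate an SNS notification if we have too many 5XX responses (here we count responses) or if the percentage of 4XX responses becomes too high (here we calculate the percentage of 4XX responses). The percentage works because the Http4xx filter emits 1 and the HttpNon4xx filter emits 0 into the same CWLHttp4xx metric, so the Average statistic equals the fraction of 4XX responses.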

  ######################################################
  ## Alarms

  AWSEBCWLHttp5xxCountAlarm :
    Type : "AWS::CloudWatch::Alarm"
    DependsOn : AWSEBCWLHttpNon5xxMetricFilter
    Properties :
      AlarmDescription: "Application is returning too many 5xx responses (count too high)."
      MetricName: CWLHttp5xx
      Namespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 10
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - "Fn::If":
            - SNSTopicExists
            - "Fn::FindInMap":
                - AWSEBOptions
                - options
                - EBSNSTopicArn
            - { "Ref" : "AWS::NoValue" }

  AWSEBCWLHttp4xxPercentAlarm :
    Type : "AWS::CloudWatch::Alarm"
    DependsOn : AWSEBCWLHttpNon4xxMetricFilter
    Properties :
      AlarmDescription: "Application is returning too many 4xx responses (percentage too high)."
      MetricName: CWLHttp4xx
      Namespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
      Statistic: Average
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0.10
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - "Fn::If":
            - SNSTopicExists
            - "Fn::FindInMap":
                - AWSEBOptions
                - options
                - EBSNSTopicArn
            - { "Ref" : "AWS::NoValue" }
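
The arithmetic behind these two alarms can be sketched in a few lines of Python. This is a simplified model of one 60-second period, assuming every response produces exactly one data point per metric (1 for a match, 0 for a non-match); real CloudWatch alarms also have states such as insufficient data, which are ignored here. The function names are mine.

```python
def average(values):
    """Mean of the data points, or 0.0 when there are none (a simplification)."""
    return sum(values) / len(values) if values else 0.0


def evaluate_alarms(statuses, count_threshold=10, percent_threshold=0.10):
    """Evaluate both alarms for the status codes seen in one period."""
    # Matching filter emits 1, the "Non" filter emits 0, into the same metric
    http_5xx = [1 if 500 <= status <= 599 else 0 for status in statuses]
    http_4xx = [1 if 400 <= status <= 499 else 0 for status in statuses]
    return {
        # Statistic: Sum, ComparisonOperator: GreaterThanThreshold
        "5xx_count_alarm": sum(http_5xx) > count_threshold,
        # Statistic: Average, so the value is the fraction of 4XX responses
        "4xx_percent_alarm": average(http_4xx) > percent_threshold,
    }


print(evaluate_alarms([200] * 80 + [404] * 20))  # 20% of responses are 4XX
print(evaluate_alarms([200] * 89 + [500] * 11))  # 11 responses are 5XX
```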

More information about Resources section:

Apache error log setup

To create a CloudWatch Logs configuration for another log file, do the following:

  • Copy cwl-webrequest-metrics.config and save it under a new name in the same folder
  • In the Mappings section, change the log file path, timestamp format and filter patterns
  • In the AWSEBAutoScalingGroup resource, change the config file name from apache-access.conf to my-log-name.conf and change the [apache-access_log] section name in the content
  • Search for webrequest and replace all occurrences with an appropriate name; do a case-insensitive search to catch webrequest, WebRequest, etc.
  • Review the filter patterns, metrics and alarms, since these will be different for each log file
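
The search-and-replace step can be scripted. Here is a minimal sketch of a case-preserving rename in Python; the function name and the weberror / WebError replacements are my choices for the error log example, and the result still needs a manual review.

```python
import re


def rename_preserving_case(text, old="webrequest", new_lower="weberror", new_camel="WebError"):
    """Replace every spelling of `old`, keeping lowercase hits lowercase."""

    def replace(match):
        found = match.group(0)
        # 'webrequest' -> 'weberror'; 'WebRequest' (or any mixed case) -> 'WebError'
        return new_lower if found == found.lower() else new_camel

    return re.sub(re.escape(old), replace, text, flags=re.IGNORECASE)


SNIPPET = (
    "OptionName: WebRequestCWLogGroup\n"
    'DefaultValue: {"Fn::Join":["-", [{ "Ref":"AWSEBEnvironmentName" }, "webrequests"]]}\n'
)

renamed = rename_preserving_case(SNIPPET)
print(renamed)
```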

For example, a config for the Apache error log can look like this (file cwl-weberror-metrics.config):

Mappings:
  CWLogs:
    WebErrorLogGroup:
      LogFile: "/var/log/httpd/error_log"
      TimestampFormat: "%a %b %d %H:%M:%S"
    FilterPatterns:
      AllErrorsFilter: "[timestamp, type = *error*, ...]"


Outputs:
  WebErrorCWLogGroup:
    Description: "Apache error log - WebErrorCWLogGroup"
    Value: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebErrorLogGroup"}


Resources :
  AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebErrorLogGroup:
    Type: "AWS::Logs::LogGroup"
    DependsOn: AWSEBBeanstalkMetadata
    DeletionPolicy: Retain     ## this is required
    Properties:
      LogGroupName:
        "Fn::GetOptionSetting":
          Namespace: "aws:elasticbeanstalk:application:environment"
          OptionName: WebErrorCWLogGroup
          DefaultValue: {"Fn::Join":["-", [{ "Ref":"AWSEBEnvironmentName" }, "weberrors"]]}
      RetentionInDays: 14


  AWSEBAutoScalingGroup:
    Metadata:
      "AWS::CloudFormation::Init":
        CWLogsAgentConfigSetup:
          files:
            "/tmp/cwlogs/conf.d/apache-error.conf":
              content : |
                [apache-error_log]
                file = `{"Fn::FindInMap":["CWLogs", "WebErrorLogGroup", "LogFile"]}`
                log_group_name = `{ "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebErrorLogGroup" }`
                log_stream_name = {instance_id}
                datetime_format = `{"Fn::FindInMap":["CWLogs", "WebErrorLogGroup", "TimestampFormat"]}`
              mode  : "000400"
              owner : root
              group : root


  AWSEBCWLAllErrorsFilter :
    Type : "AWS::Logs::MetricFilter"
    Properties :
      LogGroupName: { "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebErrorLogGroup" }
      FilterPattern : {"Fn::FindInMap":["CWLogs", "FilterPatterns", "AllErrorsFilter"]}
      MetricTransformations :
        - MetricValue : 1
          MetricNamespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
          MetricName : CWLAllErrorRecords


  AWSEBCWLAllErrorsCountAlarm :
    Type : "AWS::CloudWatch::Alarm"
    DependsOn : AWSEBCWLAllErrorsFilter
    Properties :
      AlarmDescription: "Application generated an error in error_log"
      MetricName: CWLAllErrorRecords
      Namespace: {"Fn::Join":["/", ["ElasticBeanstalk", {"Ref":"AWSEBEnvironmentName"}]]}
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 1
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - "Fn::If":
            - SNSTopicExists
            - "Fn::FindInMap":
                - AWSEBOptions
                - options
                - EBSNSTopicArn
            - { "Ref" : "AWS::NoValue" }

Note that metrics and alarms are optional: log data will be sent to CloudWatch Logs even if there are no metrics or alarms. In that case, though, you will need to review the logs manually and set up metrics and alarms later if needed.

Links:

  • Using Elastic Beanstalk with Amazon CloudWatch Logs (+ config examples)
  • Amazon CloudWatch Monitoring Scripts for Linux (sample Perl scripts that demonstrate how to produce and consume Amazon CloudWatch custom metrics)
  • AWS Blog: Store and Monitor OS & Application Log Files with Amazon CloudWatch (manual setup, note about Elastic Beanstalk, information about CloudFormation and OpsWorks)
  • Logs Monitoring Using AWS CloudWatch (awslogs agent setup via script + metrics / alarms from the AWS console)
  • Stackoverflow: What's a good way to collect logs from Amazon EC2 instances?
  • How to use CloudWatch to generate alerts from logs?