CloudWatch Logs - how to log data from multiple instances to a single stream

After using CloudWatch Logs for some time, I found it very inconvenient to have one stream per instance. The Logs UI is cumbersome to use: I need to remember instance names, open the right log group, and then go through each instance's logs one by one to check them.

A more convenient alternative is to use one stream, such as error_log, for all instances.

Update: logging to the same stream from multiple sources is not recommended and may cause duplicate records (although in my case this is fine). Check the comments for more information, and thanks to Sergei for pointing this out.

To use a single stream, we can set a static string as the stream name instead of {instance_id}, like this (see log_stream_name = error_log):

  AWSEBAutoScalingGroup:
    Metadata:
      "AWS::CloudFormation::Init":
        CWLogsAgentConfigSetup:
          files:
            "/tmp/cwlogs/conf.d/apache-error.conf":
              content : |
                [apache-error_log]
                file = `{"Fn::FindInMap":["CWLogs", "WebErrorLogGroup", "LogFile"]}`
                log_group_name = `{ "Ref" : "AWSEBCloudWatchLogs8832c8d3f1a54c238a40e36f31ef55a0WebErrorLogGroup" }`
                log_stream_name = error_log
                datetime_format = `{"Fn::FindInMap":["CWLogs", "WebErrorLogGroup", "TimestampFormat"]}`
              mode  : "000400"
              owner : root
              group : root

Now we need to make sure our log records contain the instance id or IP address, so we can tell which instance generated each log record.

Fortunately, Python application logs in Apache's error_log already include an IP address, so there is nothing to do here. The log looks like this:

[Wed Jun 17 16:02:46.944612 2015] [:error] [pid 20405] [remote 172.31.11.92:0] mod_wsgi (pid=20405): Exception occurred processing WSGI script '/opt/python/current/app/application.py'.
[Wed Jun 17 16:02:46.944686 2015] [:error] [pid 20405] [remote 172.31.11.92:0] Traceback (most recent call last):
[Wed Jun 17 16:02:46.944711 2015] [:error] [pid 20405] [remote 172.31.11.92:0] File "/opt/python/run/venv/lib/python2.7/site-packages/flask/app.py", line 1701, in __call__
...

Here [remote 172.31.11.92:0] is a private instance IP.
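
Once all instances write to one stream, it can help to split the records back out by address when reading them. The following is a minimal sketch, assuming the `[remote ip:port]` field always has the shape shown in the sample above; the names `REMOTE_RE` and `group_by_remote` are made up for this illustration.

```python
import re
from collections import defaultdict

# Matches the "[remote <ip>:<port>]" field seen in the error_log sample.
REMOTE_RE = re.compile(r"\[remote (?P<ip>\d{1,3}(?:\.\d{1,3}){3}):\d+\]")


def group_by_remote(lines):
    """Group log lines by the IP address found in their [remote ...] field."""
    groups = defaultdict(list)
    for line in lines:
        match = REMOTE_RE.search(line)
        # Lines without a [remote ...] field are collected under "unknown".
        key = match.group("ip") if match else "unknown"
        groups[key].append(line)
    return dict(groups)


sample = [
    "[Wed Jun 17 16:02:46.944612 2015] [:error] [pid 20405] "
    "[remote 172.31.11.92:0] mod_wsgi (pid=20405): Exception occurred "
    "processing WSGI script '/opt/python/current/app/application.py'.",
    "[Wed Jun 17 16:02:46.944686 2015] [:error] [pid 20405] "
    "[remote 172.31.11.92:0] Traceback (most recent call last):",
]
grouped = group_by_remote(sample)
```

Here `grouped` has a single key, `172.31.11.92`, holding both sample lines.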

For custom logs generated by the application, it is better to use Python's logging module, where we can set a custom format and include the instance id.

Here is an example of how to get the EC2 instance id, IP, or any other metadata using boto:

>>> from boto.utils import get_instance_metadata
>>> print(get_instance_metadata())
{
  'ami-manifest-path': '(unknown)',
  'instance-type': 't2.small',
  'instance-id': 'i-977a265c',
  'iam': {...},
  'local-hostname': 'ip-172-32-22-111.ap-southeast-1.compute.internal',
  'network': {... },
  'hostname': 'ip-172-32-22-111.ap-southeast-1.compute.internal',
  'ami-id': 'ami-44d4e414',
  'instance-action': 'none',
  'profile': 'default-hvm',
  'reservation-id': 'r-fde77b77',
  'security-groups': ['ci-servers', 'awseb-e-3uijdzdiad-stack-AWSEBSecurityGroup-1R39VECLNK1YK'],
  'metrics': {'vhostmd': '<?xml version="1.0" encoding="UTF-8"?>'},
  'mac': '06:2d:22:2b:22:22',
  'public-ipv4': '52.77.99.111',
  'services': {'domain': 'amazonaws.com'},
  'local-ipv4': '172.32.22.111',
  'placement': {'availability-zone': 'ap-southeast-1b'},
  'ami-launch-index': '0',
  'public-hostname': 'ec2-52-77-99-111.ap-southeast-1.compute.amazonaws.com',
  'public-keys': {...},
  'block-device-mapping': {'ami': '/dev/xvda', 'root': '/dev/xvda'}
}
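
Since the metadata comes from the instance metadata service over HTTP, it is probably worth looking the instance id up once and reusing it. Below is a rough sketch of a cached lookup with a fallback. The helper `make_instance_id_getter` and the stand-in `fake_fetch` are hypothetical names for this example; in a real application you would pass `get_instance_metadata` (or a lambda wrapping it) as the `fetch_metadata` argument.

```python
def make_instance_id_getter(fetch_metadata, default="localhost"):
    """Return a function that looks up the instance id once and caches it.

    `fetch_metadata` is any zero-argument callable returning a dict-like
    object with an 'instance-id' key (for example boto's
    get_instance_metadata).
    """
    cache = {}

    def get_instance_id():
        if "value" not in cache:
            try:
                cache["value"] = fetch_metadata()["instance-id"]
            except Exception:
                # Metadata unavailable or key missing: use the default.
                cache["value"] = default
        return cache["value"]

    return get_instance_id


# Stand-in for get_instance_metadata so the sketch runs off EC2 as well.
calls = []


def fake_fetch():
    calls.append(1)
    return {"instance-id": "i-977a265c"}


get_instance_id = make_instance_id_getter(fake_fetch)
first = get_instance_id()
second = get_instance_id()  # served from the cache, no second fetch
```

With this approach the metadata lookup happens at most once per process, and a machine without the metadata service simply reports the default name.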

And here is how you can set up a logger so that log records include the instance id:

import logging
import logging.config
import os

from boto.utils import get_instance_metadata

# Logging configuration
# Note: the `with_time` formatter uses a non-standard %(hostname)s field,
# which LogHostnameFilter (defined below) adds to each record
LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'with_time': {
            'format': '[%(asctime)s] [%(levelname)s] [%(hostname)s] -- %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S'
        }
    },
    'handlers': {
        'sys_info': {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'log/sysinfo.log',
            'formatter': 'with_time'
        },
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'with_time'
        }
    },
    'loggers': {
        'sys_info': {
            'level': 'INFO',
            'handlers': ['sys_info']
        }
    }
}
# Use the `APP_ENV` environment variable, or `development` by default.
# We assume that `development` is local and all other environments are
# on Amazon EC2 instances.
APP_ENV = os.environ.get('APP_ENV', 'development')

# A log filter class to add custom `hostname` property to log records
class LogHostnameFilter(logging.Filter):
    def __init__(self):
        logging.Filter.__init__(self)
        # Look up the instance id once, not on every log record
        if APP_ENV != 'development':
            self.hostname = get_instance_metadata()['instance-id']
        else:
            self.hostname = 'localhost'

    def filter(self, record):
        record.hostname = self.hostname
        return True

# Create and configure logger
logging.config.dictConfig(LOGGING_CONFIG)
sys_info_logger = logging.getLogger('sys_info')
sys_info_logger.addFilter(LogHostnameFilter())

# Usage example (`access_token` is assumed to be defined by the application):
sys_info_logger.info('Access token denied: %s', access_token)
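
One thing to keep in mind: a filter attached to a logger only sees records logged directly on that logger, not records propagated from its child loggers. If you log through child loggers, attaching the filter to the handler may be more robust. Here is a small self-contained sketch of that variant. It writes to an in-memory buffer instead of a file, and `StaticHostnameFilter`, the logger name `sys_info_demo` and the token value are made up for the demonstration.

```python
import io
import logging


class StaticHostnameFilter(logging.Filter):
    """Adds a fixed `hostname` attribute to every record (demo helper)."""

    def __init__(self, hostname):
        super().__init__()
        self.hostname = hostname

    def filter(self, record):
        record.hostname = self.hostname
        return True


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(
    logging.Formatter("[%(levelname)s] [%(hostname)s] -- %(message)s")
)
# Filter on the handler, so records from child loggers get `hostname` too.
handler.addFilter(StaticHostnameFilter("i-977a265c"))

demo_logger = logging.getLogger("sys_info_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep demo output out of the root logger
demo_logger.addHandler(handler)

# Logged through a child logger; the record still reaches the handler above.
logging.getLogger("sys_info_demo.auth").info("Access token denied: %s", "abc123")
output = buffer.getvalue()
```

If the filter were attached to `demo_logger` instead, the record from `sys_info_demo.auth` would reach the formatter without a `hostname` attribute and formatting would fail.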
