April 29, 2020
In this article I’m going to show you three quick and easy ways to enrich your AWS log data in Sumo Logic using fields: enabling the default EC2 instance fields on installed collectors, mapping AWS tags to logs with an AWS Metadata source, and setting fields via the X-Sumo-Fields header on HTTP sources.
Fields is a feature Sumo Logic shipped in 2019 as part of our Kubernetes monitoring solution; it is how the fluentd pipeline adds Kubernetes metadata like service and pod to each log event.
It’s also really handy for enriching each log event from AWS with fields such as availability zone, instance ID, instance type, region, and your own AWS tags. Enriching your logs with these extra fields at ingest time makes them easier to search, filter, and aggregate, and that translates into better business outcomes for you: faster searches, quicker troubleshooting, and more observability.
Did you know that in less than five minutes you can configure your Sumo Logic collectors to report instanceId as a field for all log events?
When you install Sumo Logic installed collectors on EC2 instances, the collector name (_collector) and source host (_sourcehost) default to the local host name. Unlike on-premises environments, where hosts typically have meaningful names, AWS instances often carry meaningless, ephemeral host names. That makes it hard to correlate these names with key attributes like instance ID, auto scaling group, or service name when troubleshooting issues.
You can write a script that custom-configures the collector name property in user.properties using instance metadata during instance start. A minimal sketch of that approach, assuming the default install path of /opt/SumoCollector and using the instance ID as the new name, might look like this:
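#!/bin/bash
# Hypothetical sketch: name the collector after its EC2 instance ID by
# querying the instance metadata service before the collector starts.
instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
sed -i "s/^name=.*/name=${instance_id}/" /opt/SumoCollector/config/user.properties

But there is an easier way …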
Since mid-2019, collectors installed on AWS EC2 instances report four fields to Sumo Logic by default: availabilityZone, instanceId, instanceType, and region. The fields just don’t appear unless you enable them.
If you have installed collectors on EC2 instances, take a couple of minutes to enable the fields under Manage Data / Settings / Fields in the Sumo Logic UI: open the Fields settings, select ‘Dropped Fields’, and enable each of the four fields. It’s that simple!
New logs ingested from EC2 instances will have these four fields assigned, and you can then search and group by them as shown below:
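For example, a hypothetical query (assuming a source category of aws/ec2/logs, as used later in this post) could group events by instance attributes:
_sourcecategory=aws/ec2/logs | count by availabilityZone, instanceId, instanceType, region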
If you are building in AWS, it makes sense to invest in creating a best-practice tagging strategy. As your AWS footprint grows, these tags on EC2 instances will deliver increasing value, enabling you to categorize resources by cost center, purpose, owner, environment, or other criteria. How wonderful would it be to search logs in Sumo Logic using the AWS tags you worked so hard to create?
A common approach is to encode values like application names, environments, or tag values into built-in Sumo metadata fields such as the collector name in user.properties, or to configure the source category string in local JSON files for each source.
In 2019 Sumo Logic released a new source type that provides a better approach for EC2, where you can map tags directly to ingested logs. Configure an AWS Metadata source and the EC2 instance AWS tags are mapped as fields to each log event at ingestion time.
The steps to set up the AWS Metadata source can be found in the docs page, but here is a quick overview of the process: create an IAM role that grants Sumo Logic permission to read your EC2 instance tags, add an AWS Metadata source to a hosted collector, and configure the regions, namespaces, and tag names you want to collect.
Shortly after setting up the metadata source you will see tags appear as fields in newly ingested logs from EC2 instances in that AWS account. These fields will appear in the field browser on the left-hand side of the Sumo search UI.
You can then use these fields in searches, for example:
_sourcecategory=aws/ec2/logs owner=myteam
You can also use these fields for aggregation and grouping:
_sourcecategory=aws/ec2/logs owner=myteam | sum(_size) by owner, cost_center
Scaling To Many Accounts
If you have tens or hundreds of AWS accounts, you can create these AWS Metadata sources from a code pipeline via the source management API. Here is example JSON for an AWS Metadata source:
{ "api.version":"v1", "source":{ "name":"aws_metadata_046921848075", "description":"Poll metadata for acme-xxx-test 0123456789", "automaticDateParsing":false, "multilineProcessingEnabled":false, "useAutolineMatching":false, "contentType":"AwsMetadata", "forceTimeZone":false, "filters":[], "cutoffTimestamp":0, "encoding":"UTF-8", "fields":{ }, "thirdPartyRef":{ "resources":[{ "serviceType":"AwsMetadata", "path":{ "type":"AwsMetadataPath", "limitToRegions":["us-east-1","us-west-2"], "limitToNamespaces":["AWS/EC2"], "tagFilters":["owner","costcenter","cluster","Name"] }, "authentication":{ "type":"AWSRoleBasedAuthentication", "roleARN":"arn:aws:iam::0123456789:role/sumologic-metadata-tags" } }] }, "scanInterval":60000, "paused":false, "sourceType":"Polling" }}
Logs posted to an HTTPS source can carry custom headers that modify metadata, including the X-Sumo-Fields header. X-Sumo-Fields is a comma-separated list of key-value pairs, and we can leverage this feature to add tags or other fields to CloudWatch logs.
Let’s do a really quick demo of using fields in Sumo, but feel free to skip this if you want to go straight to the CloudWatch Lambda section!
Check that field names exist and create them if required
Each field you want to send must be defined in the Fields settings screen. Go to Manage Data / Settings / Fields and create the field names, for example: env, team, owner, and version.
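If you prefer to script this step, the same field names can be created via Sumo Logic’s field management API. A hypothetical call, assuming your own access keys and your deployment’s API endpoint:

curl -u "<accessId>:<accessKey>" \
  -H "Content-Type: application/json" \
  -X POST -d '{"fieldName":"team"}' \
  "https://api.sumologic.com/api/v1/fields"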
Create an HTTP source for testing POST
Create a hosted collector by going to Manage Data / Collection.
Then create a new HTTPS streaming source.
Take note of the URL for posting data to this new source. You will need it in the next step.
POST log events data with metadata fields
Configure the URL from the previous step as the endpoint in this script, save it as post_test.sh, and execute it to post data. For example:
export endpoint="https://my_url"
./post_test.sh
#!/bin/bash
# MAKE SURE TO SET A VALID ENDPOINT VARIABLE!
endpoint="${endpoint:-https://endpoint1.collection.us2.sumologic.com/receiver/v1/http/uri}"

# some random test data to make it interesting
epoch=`date +%s`
dt=`date`
h=`hostname`
deets=`uname -a`

# defaults if we don't set with env vars
category="${category:-test/app/test123}"
payload="${payload:-{\"timestamp\":$epoch,\"time_string\":\"$dt\",\"from_host\":\"$h\",\"deets\":\"$deets\",\"random_id\":\"$((1 + RANDOM % 100000))\"}}"

# log metadata (beta) X-Sumo-Fields header
fields="${fields:-env=test,team=myteam,owner=someone@acme.com,version=0.0.1}"

# this sends the payload with Category (_sourcecategory) and Fields set in headers
curl -H "X-Sumo-Category:$category" -H "X-Sumo-Fields:$fields" -d "$payload" -X POST $endpoint
If you have a Live Tail session open, you will see the events arrive immediately:
After a few minutes you can find your logs in a new search:
_sourcecategory=test/app/test123
In the UI you should see your new fields associated with your logs in the field browser on the left side of the screen.
We can now use our fields in any search scope or aggregation:
_sourcecategory=test/app/test123 team=* version=0.0.1 | count by team,owner,version
Now that you’ve seen how easy it is to post custom field data to Sumo, consider how you could enhance your posting process for CloudWatch logs via Lambda or other custom log sources.
The Sumo Logic CloudWatch Logs Lambda function supports setting X-Sumo-Fields and other headers via an environment variable. This search shows the relevant code in the GitHub repo.
Set the SOURCE_FIELDS_OVERRIDE environment variable so that when the Lambda executes, it includes your own custom field values.
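For example, a hypothetical AWS CLI call that sets the variable on an already-deployed function (the function name and field values are placeholders, and note that this call replaces any existing environment variables on the function):

aws lambda update-function-configuration \
  --function-name SumoCWLogsLambda \
  --environment '{"Variables":{"SOURCE_FIELDS_OVERRIDE":"env=prod,team=myteam"}}'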
You can easily extend your Lambda code to post custom fields to Sumo Logic for other use cases, for example, code that reads tags from AWS objects and includes them as extra fields in each log event. It’s possible to auto-subscribe Lambda log groups and send your key tags as fields along with each event; a rough sketch follows.
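As a hedged sketch of that idea, assuming the AWS CLI is installed and you know the instance ID, you could read an instance’s tags and fold them into an X-Sumo-Fields header:

#!/bin/bash
# Hypothetical sketch: read an instance's AWS tags and build an
# X-Sumo-Fields header value of the form key1=value1,key2=value2,...
instance_id="i-0123456789abcdef0"   # placeholder instance ID
fields=$(aws ec2 describe-tags \
  --filters "Name=resource-id,Values=${instance_id}" \
  --query "Tags[].[Key,Value]" --output text |
  awk -F'\t' '{print $1"="$2}' | paste -sd, -)
# post a test event carrying the tags as fields ($endpoint as in post_test.sh)
curl -H "X-Sumo-Fields:${fields}" -d '{"msg":"tagged event"}' -X POST "$endpoint"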
I hope you enjoyed this brief tour of enriching your AWS logs for three use cases in Sumo Logic using fields. If you take a little time to enrich your logs, you will find you can search faster and deliver more value and observability with Sumo Logic.
Before you go, remember you can keep leveraging your new fields to build even more value.
If you need a little more guidance, or have questions or feedback, please reach out to customer-success@sumologic.com.