Fun with Datadog
Recently we started evaluating Datadog as a monitoring tool, and a big part of that is the CloudWatch metrics, log aggregation, and tracing capabilities, so you can have a “single pane of glass”. I’m going to talk about what I had to do to get log collection and APM set up, since we run in an Elastic Beanstalk environment.
Log Aggregation
This required installing the agent so it could forward logs to Datadog. The agent also collects host metrics, so that’s a double win. What I had to do here was add an ebextensions file for the Datadog install and configuration. I install Datadog on every deploy to pick up updates whenever they come out, but I had to check that the config lines I needed were not already there so I don’t end up with a thousand copies of logs_enabled: true. This was easy enough to accomplish with an if as the command:
if ! grep -q "logs_enabled: true" /etc/datadog-agent/datadog.yaml; then echo "logs_enabled: true" >> /etc/datadog-agent/datadog.yaml; fi
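In the ebextensions config, this sits inside a commands block, roughly like this (a minimal sketch; the command name is just a placeholder I picked):

commands:
  01_enable_datadog_logs:
    command: |
      # Append logs_enabled only if it isn't already there, so repeated deploys stay idempotent
      if ! grep -q "logs_enabled: true" /etc/datadog-agent/datadog.yaml; then
        echo "logs_enabled: true" >> /etc/datadog-agent/datadog.yaml
      fi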
The equivalent check was even easier on Windows because there are no comments in the way, so I can just check for what I don’t want and replace it with what I do want:
powershell.exe -Command \"((Get-Content -Path c:\\ProgramData\\Datadog\\datadog.yaml -Raw) -replace 'logs_enabled: false','logs_enabled: true' | Set-Content c:\\ProgramData\\Datadog\\datadog.yaml)\"
I’ve created a new repo for these Datadog ebextensions files so they can live in one place and be pulled in by all the different components. This also allows us to modify that config without requiring a new release. I use a files option to create the log config file, and then each build replaces placeholder values for things that are different, like the service name. There is also a restart in my ebextensions commands to pick up the new config changes, as sketched below.
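As an illustration, here is a rough sketch of that files block plus the restart; the paths and SERVICE_NAME are stand-ins for whatever each build swaps in, the conf.d location is the usual spot for custom log configs on Linux, and the restart command may need adjusting for your platform:

files:
  # Log config the agent will pick up; each build replaces SERVICE_NAME before deploy
  "/etc/datadog-agent/conf.d/SERVICE_NAME.d/conf.yaml":
    mode: "000644"
    owner: root
    group: root
    content: |
      logs:
        - type: file
          path: /var/log/SERVICE_NAME/*.log
          service: SERVICE_NAME
          source: SERVICE_NAME

commands:
  # Restart so the agent picks up the new log config
  99_restart_datadog_agent:
    command: systemctl restart datadog-agent || service datadog-agent restart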
At this point the agent sees the logs and starts forwarding them, and you are ready to do any parsing or other processing on the logs that you need to do.
Tracing (APM)
Since we use Beanstalk with single-container deployments, we could not use the Datadog agent sidecar. Since we already had the agent installed, I thought we could just set up tracing to send spans to the agent on the host. The issue was how to set the IP when the same ebextensions file deploys all 40 hosts. That’s when I found an article talking about using the docker0 network. Their docker0 IP, 172.17.0.1, matched what I saw on the host I was experimenting with. So for now I have hardcoded that as an environment variable in my ebextensions file:
option_settings:
  - option_name: DD_AGENT_HOST
    value: 172.17.0.1  # $(ip -4 addr show docker0 | grep -Po 'inet \K[\d.]+')
The next hurdle was that the agent listens on 127.0.0.1 by default, so I had to change the bind_host key to 0.0.0.0. This is easy to explain but took a while to figure out.
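The change itself follows the same grep-and-append pattern as the logs_enabled setting; a sketch (again, the command name is arbitrary):

commands:
  02_apm_bind_host:
    command: |
      # Only append bind_host if it isn't already set, so repeated deploys stay idempotent
      if ! grep -q "bind_host: 0.0.0.0" /etc/datadog-agent/datadog.yaml; then
        echo "bind_host: 0.0.0.0" >> /etc/datadog-agent/datadog.yaml
      fi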
Log Correlation with Traces
Now I have logs flowing into Datadog and traces flowing into Datadog. The next step was to correlate the two so you can see the logs for a specific span. This is supposed to be as easy as setting the DD_LOGS_INJECTION environment variable to true. But I did not get the dd.trace_id=