How I would automate monitoring DNS queries in basic Prometheus

zdw | 73 points

Interesting. Our contribution to monitoring DNS is to make sure our HTTP uptime checker runs over both IPv4 and IPv6 as separate checks. That way, if we mess up our AAAA record but not our A record (or vice versa), we still get a failure. I've looked at some commercial SaaS uptime checkers and none of them seems to offer this as a feature. Have I missed one that does?

The way to do this in Prometheus is to use the following settings on the blackbox_exporter module:

    [ preferred_ip_protocol: <string> | default = "ip6" ]
    [ ip_protocol_fallback: <boolean> | default = true ]

https://github.com/prometheus/blackbox_exporter/blob/master/...

So you need two modules, one for each IP version, each with ip_protocol_fallback set to false so that a check cannot quietly pass over the other protocol. As for automating the setup, we deploy our Prometheus server with Salt, so we can use Jinja templating in all our Prometheus config files. That really cuts down on repeated boilerplate.
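As a rough sketch of what those two modules might look like in the blackbox_exporter config (the module names `http_2xx_ipv4` and `http_2xx_ipv6` are made up for illustration; only the two settings quoted above come from the docs):

```yaml
# blackbox.yml fragment: one HTTP module per IP version.
# Module names are hypothetical; pick whatever fits your setup.
modules:
  http_2xx_ipv4:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
      # Without this, a broken A record could be masked by a working AAAA record.
      ip_protocol_fallback: false
  http_2xx_ipv6:
    prober: http
    http:
      preferred_ip_protocol: "ip6"
      # Same reasoning in the other direction.
      ip_protocol_fallback: false
```

Each target then gets probed twice, once per module, so a failure on either address family shows up as its own failing probe.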

This is also interesting for other reasons: in host downtime situations you sometimes see hosts drop one type of traffic and not the other.

jarofgreen | a month ago

I thought this was about monitoring all of the outgoing DNS queries.

If you're interested in that, CoreDNS has a Prometheus exporter that lets you see how many requests you've made through it. You can therefore use CoreDNS as a "proxy" DNS to log all of your DNS queries.
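For reference, scraping stock CoreDNS might look something like this, assuming the `prometheus` plugin is enabled in the Corefile and listening on its default port 9153 (the job name and target host here are placeholders):

```yaml
# prometheus.yml fragment: scrape CoreDNS's built-in metrics endpoint.
# Assumes CoreDNS runs on the same host with the `prometheus` plugin enabled.
scrape_configs:
  - job_name: coredns          # hypothetical job name
    static_configs:
      - targets: ["localhost:9153"]
```

Out of the box this gives you request counts rather than the individual queries.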

In my case, I modified the CoreDNS code to also include the request in the exported metrics - with this you can see all of your DNS requests (over time) and see if there was something odd.

denysvitali | a month ago

I find the standard blackbox_exporter far too limiting and static so I wrote an exporter which queries DNS zones from the Google API and creates targets dynamically from that.

It also has a feature which will query internal databases to find expected targets (kinda like service discovery). This covers more specific checks than what the DNS-based targets will provide.

These together mean that essentially no endpoint in our infrastructure is missing from being monitored in some fashion.

The exporter performs SSL checks (remaining certificate lifetime, etc.) as well as providing HTTP/TCP latency metrics.

AeroNotix | a month ago

Telegraf can expose Prometheus-scrapable metrics and has a DNS plugin that can monitor several targets with a single config stanza, which was a problem with blackbox_exporter according to TFA. Maybe it could be an alternative? https://github.com/influxdata/telegraf/blob/release-1.29/plu...

loloquwowndueo | a month ago

There is some good stuff on the utoronto blog. But they also have posts that reinvent the wheel in a worse manner, and this is one of them.

TL;DR: Correct answer: install dnsdist in front of your DNS servers.

Then you get stats and query monitoring galore, with DNS fault-tolerant load-balancing to boot.

traceroute66 | a month ago