BIND Best Practices - Authoritative Systems


#1

What advice would you give to someone setting up or taking over DNS Administration responsibilities?

We need a new BIND Best Practices document for people who work on DNS but don’t specialize in it. Short, to the point and useful. Checklist-y.

Let’s build the list here, discuss what is most important and why.


#2
  1. Keep your software up to date.
    Make sure you are running a supported, non-EOL version of BIND. If you are using a packaged version, make sure that version is still supported by ISC; some packagers ship really old software. Subscribe to bind-announce@lists.isc.org or the ISC Advance Security Notification service so you are informed of any new BIND vulnerabilities ASAP. These two steps will ensure you are not unnecessarily vulnerable to a known attack against BIND or a known critical bug.

#3

The thought of keeping BIND up-to-date needs to be kept in lock-step with keeping your platform software up-to-date. While BIND has its fair share (?) of vulnerability notifications, we have yet to have (knock on wood) a vulnerability that has allowed remote access (via spawning a shell, or via remote code execution). This cannot be said of most mainstream operating systems.

Be sure to keep your host platforms (and any hypervisors if you run in virtualized environments) patched up. It would be a shame if your extremely visible (and reachable) name servers were compromised by underlying operating system issues.


#4

Software load

While thinking about host systems, beware the “default install”. There are way too many system distributions that make assumptions as to your needs and the “role” of the server being installed. This may also appear in corporate environments where there are “common builds” of system images.

Rule of thumb: If you don’t need it, don’t have it installed on your server.

I’ve seen servers sitting in data centers, with no console attached, that have full installs of X11. There is no reason for a DNS server to have an “office desktop” software bundle installed.

The more software you have installed, the larger your vulnerability surface.


#5

OK, what is the next thing after making sure that both BIND and the OS are up to date, and that the OS is the minimum install for the purpose? What is #2?


#6

Diversity is good

There are a number of things that should be done to make sure that your infrastructure remains viable in the face of “bad guys”.

First off, a good “biological diversity” in the platforms on which your name servers are running is important. If you are completely tied to one platform and a “zero day” vulnerability is discovered for that platform, you are 100% vulnerable across all of your servers. Using, for example, a combination of Debian and Solaris servers would create a UNIX-based diversity that would be able to weather a successful attack against either of the operating systems.

The difficulty here is that you are now creating a number of differing servers to maintain, but the security and reliability created by diversity are well worth the effort.

An additional avenue for extended survivability would be to add remote DNS infrastructure outside of your existing corporate network. There are a number of organizations that will provide a “secondary DNS service” to augment your existing servers. Configuration of these servers may be as easy as adding them to your zone’s NS records and allowing the remote system to zone transfer from one or more of your servers.
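
As a rough sketch of the primary-side configuration for such a service: the addresses below are placeholders from the documentation ranges, and the ACL name and zone file name are made up for illustration. A real provider will tell you which addresses its transfer servers use.

```
// Hypothetical transfer addresses supplied by the secondary DNS provider
acl "external-secondaries" { 198.51.100.53; 203.0.113.53; };

zone "example.com" {
    type master;
    file "example.com.zone";    // assumed zone file name
    // A zone-level allow-transfer replaces the global setting, so any
    // in-house secondaries need to be listed here as well.
    allow-transfer { "external-secondaries"; };
};
```

Once the provider's servers are added to the zone's NS records, they receive NOTIFY messages by default whenever the zone changes.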

Another very easy to maintain alternative is to rent one or more Virtual Private Servers (VPS) and use them as off-prem DNS servers. Be sure to consider all of the security implications before investing time in deploying a VPS.


#8

Stealth secondary servers – it’s a thing!

Be sure, as you are deploying your DNS infrastructure, to think about the way your network actually works. There are certain parts of DNS that should be close together and certain parts that need to be well separated.

Your clients and the servers that provide them answers should be as close together as possible. If you are a business that has multiple remote sites, consider running a BIND instance at each one with its own copy of your internal zone. In this way, even if there are network interruptions that cause connectivity problems, your remote offices will still be able to communicate.

One thing to remember when setting up remote offices is that you probably don’t want these servers to be known to the outside world. To prevent this, you will need to make sure that they do NOT appear in the list of nameservers that are authoritative for your zones. You will want to have the remote office servers listed in an “also-notify” clause, but not in the definitional NS records for the zone…

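options { ... };

// Named list of the stealth servers at the remote offices
masters "remote-offices" { 192.168.12.2; 192.168.13.2; };

zone "example.com" in {
   type master;
   // NOTIFY these servers in addition to those in the zone's NS records
   also-notify { "remote-offices"; };
   ...
};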

Any time the zone is updated, this configuration will send NOTIFY messages to the remote servers at 192.168.12.2 and 192.168.13.2 in addition to the servers listed in the zone’s NS records. There is no need for those servers to appear in the NS records contained in the zone, which keeps them “invisible” to the outside world as name servers.

The complexity added here is obvious. Any time you change IP addresses at the remote offices, you will need to remember to update the “remote-offices” masters list.

The remote office servers will need to have bi-directional visibility back to the master server over UDP and TCP port 53 to allow the notifications and zone transfers to take place.

To make use of these remote servers, each of them would be configured as the recursive server for the office in which they reside. Being recursive for “outside” zones and authoritative for “internal” zones works well in this scenario.
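
A minimal sketch of what one remote office server might look like, assuming the master is reachable at 192.168.1.2 (a made-up address) and that the office's clients sit on networks directly attached to the server. The file name is also an assumption.

```
options {
    recursion yes;
    // Only answer recursive queries from directly attached networks
    allow-recursion { localhost; localnets; };
};

zone "example.com" {
    type slave;
    masters { 192.168.1.2; };           // hypothetical master address
    file "slaves/example.com.zone";     // assumed local copy of the zone
};
```

Recent BIND releases also accept "secondary" and "primaries" in place of "slave" and "masters"; the older keywords are used here to match the example above.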


#9

Anycast. It’s cool. It’s useful. It’s not normally required.

The concept of anycast is easy to grasp: A netblock is announced on your network in such a way that it appears at more than one location - someone trying to reach an address in that netblock is routed to the service at the closest (network topology wise) location to them.

This is very useful in complex networks where there may be tens, hundreds or even thousands of networks, each with their own name server - put your name servers into the anycast network, configure your network correctly, provide your clients with the single anycast nameserver address and magically all of your name servers become a single network address and all of your clients use the closest one!

That’s awesome. Except when it isn’t.

The initial configuration of the anycast DNS instances must take into account a number of additional issues. These issues include the ability to quickly withdraw the anycast route from your network in the case of a DNS server malfunction, the ability to correctly transfer zone data between anycast server instances, and the ability of your support team to debug issues that stem from different clients using different servers.

If a DNS server malfunctions - hung, crashed, unable to provide correct data - the route advertisement to that specific DNS server must be quickly (and automatically) removed. Clients attempting to resolve DNS names using the affected server should be routed to other instances.

Debugging a client DNS issue in an anycast network is much more complex and involves many more hands and eyes than does debugging an issue on a traditional network. The most vital concern: When debugging in an anycast environment, be absolutely sure that you and the client with the issue are looking at the same server.

tl;dr: If you don’t already deploy anycast for other purposes and don’t have huge numbers of networks with huge numbers of clients, pass on deploying anycast for now.


#10

A couple of simple configuration settings to harden your authoritative servers

Most DNS servers have differing configurations - that said, there are a few settings that in general, apply to all authoritative servers.

  • Don’t allow recursion on externally visible servers. This is a recommendation that you don’t have to even think about, as the default “allow-recursion” changed from “any” to a much more restrictive “localhost; localnets;” a number of years ago. All versions of BIND that allow global recursion are well beyond EOL and should not be in use.

Global option:

recursion no;

  • Employ Response Rate Limiting (BIND 9.10 and above). RRL is a mechanism that limits the number of identical responses to a query. There are a number of tuning parameters that need a bit of thought when deploying RRL, but for the most part, the defaults are good.

    RRL works by dropping responses into different buckets. Each bucket is the IP address (or a collection of addresses) to which the response is being sent. When a given number of identical responses are seen within a certain length of time in a single bucket, the responses to hosts in that bucket are limited. The tunable parameters include the number of identical responses before limiting is triggered, the length of time a response stays in the bucket, and the size of the network that each bucket contains.

    It is impractical to create one bucket per IP address. The default bucket size is a /24 network (256 IP addresses) for IPv4 and a /56 network (256 networks of 18,446,744,073,709,551,616 addresses each) for IPv6. These bucket sizes represent common subnet sizes for each of the address families.

    There are some circumstances under which these bucket sizes may be too small, most revolving around the use of NAT on IPv4 networks. If you discover that you are rate limiting hosts that are innocent because they live with a large number of other hosts behind a single NAT’d IP address, you can either change the bucket size or “white-list” the network(s) by adding them to the exempt-clients list…

rate-limit {
  slip 2;                   // Every other response truncated
  window 15;                // Seconds to bucket
  responses-per-second 5;   // # of good responses per prefix-length/sec
  referrals-per-second 5;   // referral responses
  nodata-per-second 5;      // nodata responses
  nxdomains-per-second 5;   // nxdomain responses
  errors-per-second 5;      // error responses
  all-per-second 20;        // When we drop all
  log-only no;              // Debugging mode
  qps-scale 250;            // above 250 qps:
                            // 250 / current qps * per-second = new drop limit
  exempt-clients { 127.0.0.1; 192.153.154.0/24; 192.160.238.0/24; };
  ipv4-prefix-length 24;    // Define the IPv4 block size
  ipv6-prefix-length 56;    // Define the IPv6 block size
  max-table-size 20000;     // 40 bytes * this number = max memory
  min-table-size 500;       // pre-allocate to speed startup
};

The example above provides all of the tunable parameters, but as noted, the most useful for initial tuning are the “responses-per-second”, “ipv4-prefix-length” and “exempt-clients”.

The “log-only” option can be set to “yes” to test a configuration without actually dropping or truncating any responses.
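
For a first deployment, something much smaller than the full example is usually enough to start with. The sketch below is only a suggested starting point, not a tuned configuration: it sets one limit, leaves everything else at the defaults, and runs in log-only mode so nothing is actually limited while you watch the logs.

```
options {
    rate-limit {
        responses-per-second 5;
        log-only yes;    // report what would be limited, but limit nothing
    };
};
```

Once the logs show that only abusive traffic would be affected, change log-only to no (or remove the line).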


#11

Separate your authoritative servers from your recursive servers

Back in the 1990s, the cost of computer hardware was relatively high. In those days, it was quite common to run both your internal recursive and external authoritative BIND instances on the same server - and it was often the same “instance”, configured to provide both sets of data.

It’s no longer the 1990s (for good or bad), hardware is cheap(er) and it’s no longer good to run authoritative and recursive servers on the same system. BIND has a number of features that will allow you to safely run both services on the same server, but it’s simpler (and yes, safer) to separate the services.

In another part of this series, you may find comments about how to configure remote offices with authoritative data (your own zones) and also have the server configured as recursive. This configuration is a bit different, you will notice, as these are servers that are not visible to the outside world.


#12

Protect those servers

Controversial? Sure, let’s do this. Should your authoritative DNS servers be deployed behind firewalls?

While many services (e-mail, web, etc) may need a firewall between themselves and the outside world, DNS rarely benefits from the protection provided by firewalls, and is often negatively impacted.

Most firewalls are not intelligent enough to understand the underlying information in a DNS packet beyond “this is legal” and “this is not”. Discarding illegal (or malformed) DNS packets is handled well by BIND and logging produced by the named executable assists in debugging and correcting problems that may arise from these packets.
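
To get the most out of that logging, it helps to send the relevant categories to their own file. The following is a sketch only: the channel name, file path and rotation settings are assumptions, and the three categories shown are just a reasonable place to start.

```
logging {
    channel dns_problems {
        // Assumed path; the directory must exist and be writable by named
        file "/var/log/named/problems.log" versions 5 size 10m;
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    category client    { dns_problems; };   // problems processing client requests
    category security  { dns_problems; };   // approved and denied requests
    category unmatched { dns_problems; };   // messages named could not classify
};
```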

It is more often the case that a firewall will damage DNS in some way than that it will provide protection. See other posts regarding hardening your operating system and reducing traffic using RRL.

Similarly, without good reason, installing a load balancer in front of your authoritative DNS servers may cause more problems than it solves. The DNS protocol is very robust in determining the “best” server to use, and with the constraints of bandwidth and CPU utilization lifted, using a load balancer for authoritative data is not recommended.


#13

Internal vs. External Data

Originally, DNS was designed to provide the same data to all clients from all servers. Then, the concept of a split namespace was introduced and included in BIND as “views”. Views are often used to separate internal devices from external devices. lab01printer.example.com might be visible from inside, and possibly via a VPN connection, but is probably not something that you would want to have visible from the outside.

Problems usually occur in two places with views:

  1. an accidental “leak” of data that is internal-only
  2. external clients, expecting internal views, get external views

Number 1 is usually caused by an incorrect access control list (ACL) that allows the internal zone to be transferred to one or more external-facing servers. This failure may not be obvious without specific testing: for example, create a canary entry that only resolves in one configuration, then test for it from locations that should NOT be able to resolve it.

Number 2 is caused when a VPN fails to connect or when connected, is still seen by the DNS server as “outside” the list of internal networks. Both of these boil down to network infrastructure issues.
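
For readers who have not seen views before, here is a minimal sketch of the kind of configuration being discussed. The ACL name, address ranges and file names are invented for illustration; the point is that which view a client gets depends entirely on the match-clients lists, which is where both of the problems above tend to originate.

```
// Hypothetical internal address ranges, including any VPN pools
acl "internal-nets" { 10.0.0.0/8; 192.168.0.0/16; };

// Views are tried in order; the first one whose match-clients
// matches the client's source address is used.
view "internal" {
    match-clients { "internal-nets"; };
    zone "example.com" {
        type master;
        file "internal/example.com.zone";
    };
};

view "external" {
    match-clients { any; };
    zone "example.com" {
        type master;
        file "external/example.com.zone";
    };
};
```

Note that once any view is defined, every zone statement has to live inside a view.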

While internal vs. external DNS names using views are nearly ubiquitous these days, splitting your DNS into internal and external zones solves the problem in a very obvious and safe way. Internal names would, for example, live only in int.example.com and sub-zones of int, and external name servers would be configured without any knowledge of the int zone.
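
A sketch of the internal side of that arrangement, assuming internal clients live in 10.0.0.0/8 and 192.168.0.0/16 (placeholder ranges) and an invented zone file name:

```
// On internal servers only
zone "int.example.com" {
    type master;
    file "int.example.com.zone";
    // Hypothetical internal ranges; adjust to your own networks
    allow-query    { 10.0.0.0/8; 192.168.0.0/16; };
    allow-transfer { 10.0.0.0/8; 192.168.0.0/16; };
};
```

The external servers simply have no zone statement for int.example.com, so there is nothing for a misconfigured ACL to leak.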


#14

DNSSEC

I initially thought this was a good idea, but I’m not so sure. Perhaps just a pointer to a how-to… there’s not a simple “best practice” here that doesn’t immediately go into the weeds of how to choose an algorithm, key generation, etc.


#15

What do you recommend for load balancing for a smaller-scale deployment? If the alternative is a load balancer, perhaps you could discuss the pros and cons of that option (single point of failure being the most obvious one).

At what scale do you recommend anycast? Several dozen DNS servers? Or would it be a question of # of authoritative servers at a given location?


#16

I know it is complicated, but can you give a few examples of how to tell if a firewall is impeding your DNS traffic? Firewalls are pretty useful, so people may need some clear use cases where they cause more harm than good in order to consider putting their DNS system outside the firewall.

Can you also say why it is safe to put the DNS server outside the firewall? Maybe a simple diagram would be helpful here, showing a hidden master, firewall and external master?


#17

A table of Pros and Cons for multi-dns server deployments would be an efficient way to present the tradeoffs here.

It would also be helpful to link to a presentation or two about managing multi-dns systems. I believe you will find Anand B from RIPE has given one, and I think there was also one from CIRA recently. I think there is a draft presented at the last IETF about multi-dns and DNSSEC from Shumon Huque that discusses some of the problems with that scenario. If you are signing, multi-dns is a much bigger hassle.

Is there any reason NOT to have a secondary publisher? I would think that could be an even stronger recommendation…