We built turing with the ability to visualise DNS data in many different ways: applying filters, ordering packets, colouring by different metrics, and so on. The reason is that humans are very good at spotting patterns (whether or not those patterns are significant is a whole different question).
One morning we noticed that the previous day's traffic to one of our nameservers was a little odd-looking…
Here we see 24 hours' worth of traffic to one of our servers, broken down by qtype:
The interesting feature here is not so much the drop in traffic at around 6:30am as the fact that the A requests have a lot of noise (i.e. short-term variation in volume), which suddenly goes away at about 9:40pm.
The way we investigate this sort of phenomenon is to apply filters one by one: if a filter pulls the signal out from the noise, we keep it (although in this case the signal is the noise).
This process led to a set of filters: qtype = A, EDNS = true, and CD bit clear. With these set, the timeseries now looks like this:
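As a rough sketch of what a filter pass like this amounts to (the record layout and field names here are assumptions for illustration, not turing's actual schema):

```python
# Hypothetical query records; the field names (qtype, edns, cd) are
# assumed for illustration and are not turing's real schema.
packets = [
    {"qtype": "A",    "edns": True,  "cd": False, "qname": "example.co.uk."},
    {"qtype": "AAAA", "edns": True,  "cd": False, "qname": "example.co.uk."},
    {"qtype": "A",    "edns": False, "cd": False, "qname": "foo.uk."},
    {"qtype": "A",    "edns": True,  "cd": True,  "qname": "bar.uk."},
]

def keep(pkt):
    # qtype = A, EDNS present, and the CD (checking disabled) bit clear
    return pkt["qtype"] == "A" and pkt["edns"] and not pkt["cd"]

filtered = [p for p in packets if keep(p)]
# Only the first record survives all three filters.
```

Each filter on its own removes some traffic; the interesting question is whether what remains looks more like a single coherent signal.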
Feeling quite happy that this was the best set of filters we could invoke for now, we brought up an overview to see how those filters play out over a month's worth of traffic. (This will tell us whether we see the same behaviour daily, only on Thursdays, whether it was a one-off, etc.) Sometimes patterns are hard to spot and require the "eye of faith"; sometimes…
Well, we didn't see that one coming. To explain what we are seeing here, I'll go through what this plot is telling us.
Each dot represents an hour's worth of traffic (each row is a day); the size of the dot is proportional to the volume of traffic (with all of the filters you can see applied), and the colour of the dot is determined, in this case, by the ratio of NOERROR responses to all other responses (including NXDOMAIN). Green dots are where NOERROR responses dominate and red dots are where NOERROR responses are in a minority, with a scale of colour in between.
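To make the colouring concrete, here is a minimal sketch of the ratio-to-colour mapping (the linear green/red ramp is an assumption for illustration; turing's actual palette may differ):

```python
def noerror_ratio(rcodes):
    """Fraction of an hour bucket's responses that are NOERROR."""
    if not rcodes:
        return 0.0
    return sum(1 for r in rcodes if r == "NOERROR") / len(rcodes)

def dot_colour(ratio):
    # Linear ramp: ratio 1.0 -> pure green, 0.0 -> pure red,
    # with orange/olive shades in between.
    return (1.0 - ratio, ratio, 0.0)

# An hour dominated by NXDOMAIN comes out mostly red:
colour = dot_colour(noerror_ratio(["NXDOMAIN"] * 3 + ["NOERROR"]))
```

An hour where the two response types are roughly balanced lands in the middle of the ramp, which is exactly where the orange dots in the overview come from.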
Let’s take a slice for a particular day, Friday 3 June, and visualise the rcodes returned as a timeseries:
So, with this particular set of filters we have a signal where a sudden jump in NXDOMAIN responses is followed some time later by a jump in NOERROR responses; where the two peaks overlap, the ratio becomes more even and so the dots are orange. This explains the pretty overview we see, but the question of what exactly is being done, and by whom, still needs investigation.
Starting with the who, we can plot the top IP addresses making up the signal. In fact, let's concentrate on the NXDOMAIN responses, because it is still possible that these are actually two different events which happen to overlap.
Here we see that the top 100 IPs mostly fall into a small number of subnets and just two ASNs; taking the larger subnets and looking at the timeseries, we get the following graphs.
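Bucketing source addresses into subnets like this is straightforward; a sketch using Python's standard ipaddress module (the /24 prefix length and the addresses, drawn from the documentation ranges, are illustrative):

```python
import ipaddress
from collections import Counter

def top_subnets(ips, prefix=24):
    """Bucket source addresses into /prefix subnets, busiest first."""
    counts = Counter(
        ipaddress.ip_network(f"{ip}/{prefix}", strict=False) for ip in ips
    )
    return counts.most_common()

# Illustrative source addresses (RFC 5737 documentation ranges):
sources = ["192.0.2.1", "192.0.2.7", "192.0.2.200", "198.51.100.5"]
busiest = top_subnets(sources)
```

Mapping each subnet on to its origin ASN is a separate lookup (against BGP routing data), which is why the subnet and ASN views can be cross-checked against each other.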
On the left we have two subnets from one ASN and on the right we have two from the other…
This shows that the behaviour is distributed between the ASNs and over a number of subnets (perhaps in order to avoid our response rate limits). We can pick a particular subnet and look at the individual packets coming from it.
This view shows each query/response pair as a dot, which can be coloured in many ways; the arrangement of the dots is (in this case) a raster scan ordered by the time we received the queries (so time increases left to right and top to bottom). We can see that the upper half of the plot is dominated by purple dots (authoritative NXDOMAIN responses) while the lower half is dominated by red dots (delegations). So what we are seeing is an alternative view of the same timeseries (note that the number of packets shown is limited, so the time window here finishes at 12:10 – just where the NXDOMAIN signal falls to near zero). We can change the way the dots are coloured so that they reflect the many different dimensions that we collect; for example, we can convert the qname into a colour such that lexicographically similar names have similar colours.
More rainbows! But here the meaning is quite different; the inference is that some process is going through a list of domains (sorted alphabetically), querying for each one in turn. In a few cases above the signal is very clear; e.g. one IP goes through the sequence "yua.uk", "yuaa.uk", "yuab.uk", "yuac.uk", …, "yuaz.uk", "yub.uk", … The same IP was earlier going through a list like "silktiesformen.co.uk", "silktiesoflondon.co.uk", "silktieswholesale.co.uk", "silktiger.co.uk", …
We now know who is doing what, but we don't know why. Obviously there is value in knowing which names are registered within .uk (and which are not), but this information is available for free to our members. Surely the cost of doing this sort of scan, which will need to be repeated to keep it up to date, is higher than the cost of joining Nominet and downloading the data directly?