

I did a session at SharkFest’18 Europe in Vienna with the title of “Crash Course: IPv6 and Network Protocols“. Since the presentation slides + audio were recorded you can listen to the talk, too. Here are some notes about the motivation for this session as well as feedback from the attendees.
This was the official agenda of SharkFest’18 EU for my talk: “This presentation is split into 2 sections. Both are network protocol “crash courses” explained by a pcap walk-through.
My motivation for these two parts of the talk was the assumption that IT engineers are seeing many “unknown” packets in their own trace files due to the fact that neither IPv6 nor network protocols in general are known to them. Hence the idea to give an overview of those two areas.
However, although the feedback from the attendees was positive in general, they were not that happy about a single session covering two different topics. Many had not read the title completely and simply thought it was an IPv6-only session (with a focus on “IPv6 network protocols”), so they were disappointed to hear some well-known general network protocol stuff in the second part. Ok, sorry guys, I fully agree that it was not the best idea to have both topics in one session, and the title was misleading as well. Next time I’ll submit an IPv6-only session with even more details.
Anyway, here is the YouTube video from my session. If you’re interested in IPv6, listen to the first part. Interested in network protocols in general? Start listening at 43 minutes:
The PDF of my slides is available as well:
Featured image “Sharks” by Zach Dill is licensed under CC BY-NC 2.0.
I got an email in which someone asked whether I knew how to change the link-local IPv6 address on a FortiGate, as you can on other network/firewall devices. He could not find anything about this in the Fortinet documentation or on Google.
Well, I could not find anything either. What’s up? It’s not new to me that you cannot really configure IPv6 on the FortiGate GUI, but even on the CLI I couldn’t find anything about changing this link-local IPv6 address from the default EUI-64 based one to a manually assigned one. Hence I opened a ticket at Fortinet. It turned out that you cannot *change* this address at all, but that you must *add* another LL address which will be used for the router advertisements (RA) after a reboot (!) of the firewall. Stupid design!
Again and again and again I am not happy at all with the IPv6 implementation on the FortiGates. Too many bugs and features missing, while everything is too complicated to configure. (Have a look at my Fortinet feature requests.) For the following tests I used a FortiGate FG-90D with firmware v5.6.5 build1600 (GA).
Before I touched the config the state of IPv6 was the following. Have a look at the “fg-trust” interface with its link-local address in line 12:
    fg # diagnose ipv6 address list
    dev=31 devname=vsys_fgfm flag=P scope=254 prefix=128 addr=::1
    dev=29 devname=vsys_ha flag=P scope=254 prefix=128 addr=::1
    dev=28 devname=fg-server flag=P scope=0 prefix=64 addr=2003:de:2016:220::1
    dev=27 devname=fg-trust2 flag=P scope=0 prefix=64 addr=2003:de:2016:211::1
    dev=26 devname=fg-trust flag=P scope=0 prefix=64 addr=2003:de:2016:210::1
    dev=24 devname=root flag=P scope=254 prefix=128 addr=::1
    dev=5 devname=wan1 flag=P scope=0 prefix=64 addr=2003:de:2016::2
    dev=6 devname=wan2 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:8360
    dev=28 devname=fg-server flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=27 devname=fg-trust2 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=26 devname=fg-trust flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=5 devname=wan1 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835f
The configuration at this point was:
    config system interface
        edit "fg-trust"
            set vdom "root"
            set ip 192.168.210.1 255.255.255.0
            set allowaccess ping https ssh
            set role lan
            set snmp-index 5
            config ipv6
                set ip6-address 2003:de:2016:210::1/64
                set ip6-allowaccess ping https ssh
                set ip6-send-adv enable
                config ip6-prefix-list
                    edit 2003:de:2016:210::/64
                        set autonomous-flag enable
                        set onlink-flag enable
                    next
                end
            end
            set interface "internal1"
            set vlanid 210
        next
    end
And a Linux machine got the following routing table, in which the default route had a gateway of fe80::a5b:eff:fea1:835e:
    weberjoh@jw-vm05-Ubuntu-Test-3:~$ ip -6 r s
    2003:de:2016:210::/64 dev ens32 proto kernel metric 256 expires 2591699sec pref medium
    fe80::/64 dev ens32 proto kernel metric 256 pref medium
    default via fe80::a5b:eff:fea1:835e dev ens32 proto ra metric 1024 expires 1499sec pref medium
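To see where the default EUI-64 based link-local address comes from, here is a minimal Python sketch of the modified EUI-64 derivation. The MAC address 08:5b:0e:a1:83:5e is an assumption on my side: it is simply what you get when you reverse the derivation for fe80::a5b:eff:fea1:835e, it was not read from the FortiGate itself. The function name is made up for this example.

```python
import ipaddress


def eui64_link_local(mac: str) -> str:
    """Derive the modified EUI-64 link-local IPv6 address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC such as 08:5b:0e:a1:83:5e")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the OUI and the device part of the MAC.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix: fe80 followed by 48 zero bits.
    prefix = bytes.fromhex("fe80") + bytes(6)
    return str(ipaddress.IPv6Address(prefix + interface_id))


# Assumed MAC address (reverse-engineered from the address in the listing).
print(eui64_link_local("08:5b:0e:a1:83:5e"))
```

With the assumed MAC address this prints fe80::a5b:eff:fea1:835e, the address the Linux machine uses as its default gateway above.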
To add a link-local address you need the “config ip6-extra-addr” submenu. I added the quite simple fe80::1/64 address to that interface, that is:
    config system interface
        edit fg-trust
            config ipv6
                config ip6-extra-addr
                    edit fe80::1/64
                    next
                end
            end
    end
Now, in order to have the router advertisements sent from this newly created link-local address, you have to reboot the firewall! Come on Fortinet, you need a complete reboot for this?!? (Note that the support ticket told me to disable the “ip6-send-adv” before adding the LL address, and enabling it again after that. But this was not successful. At this point the RAs were still sent from the old EUI-64 based LL address.) Hence a reboot:
execute reboot
After these changes and the reboot the added link-local IPv6 address was present (line 6):
    fg # diagnose ipv6 address list
    dev=31 devname=vsys_fgfm flag=P scope=254 prefix=128 addr=::1
    dev=29 devname=vsys_ha flag=P scope=254 prefix=128 addr=::1
    dev=28 devname=fg-server flag=P scope=0 prefix=64 addr=2003:de:2016:220::1
    dev=27 devname=fg-trust2 flag=P scope=0 prefix=64 addr=2003:de:2016:211::1
    dev=26 devname=fg-trust flag=SP scope=253 prefix=64 addr=fe80::1
    dev=26 devname=fg-trust flag=P scope=0 prefix=64 addr=2003:de:2016:210::1
    dev=24 devname=root flag=P scope=254 prefix=128 addr=::1
    dev=5 devname=wan1 flag=P scope=0 prefix=64 addr=2003:de:2016::2
    dev=6 devname=wan2 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:8360
    dev=28 devname=fg-server flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=27 devname=fg-trust2 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=26 devname=fg-trust flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835e
    dev=5 devname=wan1 flag=P scope=253 prefix=10 addr=fe80::a5b:eff:fea1:835f
The complete configuration section for this interface looked like this:
    config system interface
        edit "fg-trust"
            set vdom "root"
            set ip 192.168.210.1 255.255.255.0
            set allowaccess ping https ssh
            set role lan
            set snmp-index 5
            config ipv6
                set ip6-address 2003:de:2016:210::1/64
                set ip6-allowaccess ping https ssh
                config ip6-extra-addr
                    edit fe80::1/64
                    next
                end
                set ip6-send-adv enable
                config ip6-prefix-list
                    edit 2003:de:2016:210::/64
                        set autonomous-flag enable
                        set onlink-flag enable
                    next
                end
            end
            set interface "internal1"
            set vlanid 210
        next
    end
And the Linux machine (after a reboot as well) got the correct next hop for its default route:
    weberjoh@jw-vm05-Ubuntu-Test-3:~$ ip -6 r s
    2003:de:2016:210::/64 dev ens32 proto kernel metric 256 expires 2591853sec pref medium
    fe80::/64 dev ens32 proto kernel metric 256 pref medium
    default via fe80::1 dev ens32 proto ra metric 1024 expires 1653sec pref medium
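If you want to check the next hop on several clients, a small script can pull it out of the routing table output. The following Python sketch only parses text that looks like the "ip -6 r s" output shown here. The function name and the SAMPLE constant are made up for this example, and it does not call the ip command itself.

```python
from typing import Optional


def default_gateway(route_output: str) -> Optional[str]:
    """Return the next hop of the first 'default via ...' line, if any."""
    for line in route_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


# Routing table as shown after the FortiGate reboot.
SAMPLE = """\
2003:de:2016:210::/64 dev ens32 proto kernel metric 256 expires 2591853sec pref medium
fe80::/64 dev ens32 proto kernel metric 256 pref medium
default via fe80::1 dev ens32 proto ra metric 1024 expires 1653sec pref medium
"""

print(default_gateway(SAMPLE))
```

On a real machine you could feed it the output of the ip command, for example via subprocess, and compare the result against the expected fe80::1.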
Accordingly I could verify that the router advertisements were sent from my added link-local address fe80::1:
That’s it. I am not happy with this approach from Fortinet in “changing” the link-local address. On other firewalls such as the Palo Alto Networks firewall you can clearly change the behaviour of the interface ID portion, and it even works without rebooting the firewall:
Cheers.
Featured image “Buy Local” by Mariano Mantel is licensed under CC BY-NC 2.0.
I was interested in how a recursive DNS server resolves DNS queries in detail. That is, not only the mere AAAA or A record, but also DNSSEC keys and signatures, the authority and additional section when testing with dig, and so on. For this I made two simple DNS queries to my recursive DNS server which resulted in more than 100 DNS packets in total. Wow.
In the following I am publishing a downloadable pcap so that you can analyse it by yourself. Furthermore I am showing some listings and screenshots to get an idea of the DNS resolution process.
Of course such tests heavily depend on the queried names. I chose the following two:
For both queries I used dig to ask my recursive DNS server BIND (with a cleared cache!) for the A record. Since this server has DNSSEC validation enabled, it looked for DNSKEY/DS records as well. All DNS sessions are either sent via IPv6 or legacy IP, and over UDP or TCP. (I cut off the TCP overheads completely to only have the DNS related packets. Note the different colors in Wireshark or look at the udp/tcp.stream columns.)
Feel free to use this capture file (zipped, 10 KB) and open it with Wireshark:
The first query for the A record of atlas.ripe.net generated 14 DNS packets. Besides the query for the A record and the corresponding CNAME record (both with RRSIGs included), BIND also queried the DNSKEY (from the authoritative name server) and DS records (from the parent zone) in order to completely validate the answers via DNSSEC. The query looked like this. Note the “ad” flag since the reply is DNSSEC validated:
    weberjoh@nb15-lx:~$ dig atlas.ripe.net
    ; <<>> DiG 9.10.3-P4-Ubuntu <<>> atlas.ripe.net
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41063
    ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 7, ADDITIONAL: 13
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;atlas.ripe.net. IN A
    ;; ANSWER SECTION:
    atlas.ripe.net. 21600 IN CNAME atlas-ui.ripe.net.
    atlas-ui.ripe.net. 21600 IN A 193.0.6.158
    ;; AUTHORITY SECTION:
    ripe.net. 3600 IN NS a2.verisigndns.com.
    ripe.net. 3600 IN NS a3.verisigndns.com.
    ripe.net. 3600 IN NS manus.authdns.ripe.net.
    ripe.net. 3600 IN NS sns-pb.isc.org.
    ripe.net. 3600 IN NS a1.verisigndns.com.
    ripe.net. 3600 IN NS ns4.apnic.net.
    ripe.net. 3600 IN NS tinnie.arin.net.
    ;; ADDITIONAL SECTION:
    a1.verisigndns.com. 156744 IN A 209.112.113.33
    a1.verisigndns.com. 156744 IN AAAA 2001:500:7967::2:33
    a2.verisigndns.com. 156744 IN A 209.112.114.33
    a2.verisigndns.com. 156744 IN AAAA 2620:74:19::33
    a3.verisigndns.com. 156744 IN A 69.36.145.33
    a3.verisigndns.com. 156744 IN AAAA 2001:502:cbe4::33
    ns4.apnic.net. 156733 IN A 202.12.31.53
    ns4.apnic.net. 156733 IN AAAA 2001:dd8:12::53
    manus.authdns.ripe.net. 60142 IN A 193.0.9.7
    manus.authdns.ripe.net. 60142 IN AAAA 2001:67c:e0::7
    tinnie.arin.net. 156733 IN A 199.212.0.53
    tinnie.arin.net. 156733 IN AAAA 2001:500:13::c7d4:35
    ;; Query time: 370 msec
    ;; SERVER: 2003:de:2016:120::a08:53#53(2003:de:2016:120::a08:53)
    ;; WHEN: Tue Aug 21 09:32:49 CEST 2018
    ;; MSG SIZE rcvd: 518
    weberjoh@nb15-lx:~$
For Wireshark I used a couple of custom columns to display the TCP and UDP stream indices as well as the DNS query and DNS type. The first and last packet shown in the screenshot is the query from my Linux machine to the recursive DNS server, while all other packets are generated by this server itself (plus the answers):
As you can see from the background color of each line, some sessions used UDP while others used TCP. Answers with RRSIGs that did not fit into a single UDP packet were fetched via TCP:
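How does a resolver know that it has to retry over TCP? Commonly the UDP answer comes back with the TC (truncated) bit set in the DNS header (RFC 1035). The following Python sketch decodes the flag bits of a raw DNS message, including the “ad” bit that dig showed above. The function names and the hand-built sample headers are made up for this example and are not taken from the pcap.

```python
import struct

# Bit masks of the flags in the second 16-bit word of the DNS header.
FLAG_BITS = {
    "qr": 0x8000,  # response
    "aa": 0x0400,  # authoritative answer
    "tc": 0x0200,  # truncated, retry over TCP
    "rd": 0x0100,  # recursion desired
    "ra": 0x0080,  # recursion available
    "ad": 0x0020,  # authenticated data (DNSSEC validated)
    "cd": 0x0010,  # checking disabled
}


def dns_flags(message: bytes) -> set:
    """Return the names of all flags set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 octets long")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return {name for name, bit in FLAG_BITS.items() if flags & bit}


def needs_tcp_retry(message: bytes) -> bool:
    """True if the answer was truncated and should be re-queried over TCP."""
    return "tc" in dns_flags(message)


# Hand-built headers: ID, flags, and the four section counters.
validated = struct.pack("!HHHHHH", 41063, 0x81A0, 1, 2, 7, 13)
truncated = struct.pack("!HHHHHH", 1, 0x8380, 1, 0, 0, 0)
print(sorted(dns_flags(validated)), needs_tcp_retry(truncated))
```

The first header mirrors the “qr rd ra ad” flags from the dig output, the second one is a truncated answer that would trigger the TCP retry.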
The single query for www.netflix.com produced about 110 DNS packets! (Not counting the TCP overhead here, only DNS. With TCP it’s even more.) This was the dig request:
    weberjoh@nb15-lx:~$ dig www.netflix.com
    ; <<>> DiG 9.10.3-P4-Ubuntu <<>> www.netflix.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64210
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 4, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;www.netflix.com. IN A
    ;; ANSWER SECTION:
    www.netflix.com. 1800 IN CNAME www.geo.netflix.com.
    www.geo.netflix.com. 1800 IN CNAME www.eu-west-1.prodaa.netflix.com.
    www.eu-west-1.prodaa.netflix.com. 60 IN A 34.253.36.84
    www.eu-west-1.prodaa.netflix.com. 60 IN A 52.208.174.58
    www.eu-west-1.prodaa.netflix.com. 60 IN A 54.229.85.178
    www.eu-west-1.prodaa.netflix.com. 60 IN A 54.194.103.216
    www.eu-west-1.prodaa.netflix.com. 60 IN A 54.154.12.178
    www.eu-west-1.prodaa.netflix.com. 60 IN A 54.194.216.197
    www.eu-west-1.prodaa.netflix.com. 60 IN A 54.77.199.154
    www.eu-west-1.prodaa.netflix.com. 60 IN A 34.252.114.84
    ;; AUTHORITY SECTION:
    prodaa.netflix.com. 86400 IN NS ns-1489.awsdns-58.org.
    prodaa.netflix.com. 86400 IN NS ns-1606.awsdns-08.co.uk.
    prodaa.netflix.com. 86400 IN NS ns-749.awsdns-29.net.
    prodaa.netflix.com. 86400 IN NS ns-375.awsdns-46.com.
    ;; Query time: 287 msec
    ;; SERVER: 2003:de:2016:120::a08:53#53(2003:de:2016:120::a08:53)
    ;; WHEN: Tue Aug 21 09:33:18 CEST 2018
    ;; MSG SIZE rcvd: 366
    weberjoh@nb15-lx:~$
While in Wireshark it looks like this. Many of those packets are related to the authority and additional section of dig/BIND that even asked for the A/AAAA records for name servers:
For this single query Wireshark lists 14 TCP and 42 UDP conversations, along with 12 legacy IP and 10 IPv6 conversations. Not that bad, is it? ;)
Since Netflix does not sign their zone via DNSSEC, the answer for the DS record is signed with NSEC3 – kind of NXDOMAIN for DNSSEC:
And since Netflix (among other big players) uses some kind of distributed DNS servers and geo-based load-balancing the overall picture looks quite confusing, for example when looking at the DNSViz graph for www.netflix.com.
Have you ever thought of “one DNS query – one DNS answer”? Well … no. A single query can cause hundreds of packets. That’s the reason why we are using recursive and caching DNS servers. For example, the DNSSEC related resource records for the root zone and the TLDs have quite long TTLs. Hence caching servers really have an advantage here.
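To illustrate why the TTLs matter so much, here is a toy TTL cache in Python. It is only a sketch of the caching idea and has nothing to do with how BIND or Unbound implement their caches. The class name and the injectable clock are made up for this example, while the TTL values are the ones from the two dig outputs above.

```python
import time


class TtlCache:
    """Toy cache that forgets a record once its TTL has expired."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable for testing
        self._entries = {}           # (name, rtype) -> (value, expiry)

    def put(self, name: str, rtype: str, value: str, ttl: int) -> None:
        self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name: str, rtype: str):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None              # never cached: full resolution needed
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]
            return None              # expired: full resolution needed again
        return value


# Fake clock so that the example does not have to wait for real time.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("atlas-ui.ripe.net.", "A", "193.0.6.158", ttl=21600)
cache.put("www.eu-west-1.prodaa.netflix.com.", "A", "34.253.36.84", ttl=60)

now[0] = 61.0  # one minute later
print(cache.get("atlas-ui.ripe.net.", "A"))
print(cache.get("www.eu-west-1.prodaa.netflix.com.", "A"))
```

After one minute the RIPE record (TTL 21600) is still answered from the cache, while the Netflix record (TTL 60) has expired and would cause new queries.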
But keep in mind that you should NOT use public DNS resolvers such as 8.8.8.8 if you’re interested in your privacy. Have a look at this paper. It’s not that hard to run Unbound or BIND at your own home/company.
I was mostly interested in how the DNS server validates DNSSEC. This can be seen in both queries. While the first one used the DNSKEYs and DS records to validate the signature, the second one simply verified that DNSSEC is not used for this zone (a signed NSEC3 answer from the name servers of the parent zone, proving that the DS record does not exist).
For further reading have a look at some articles from Geoff Huston such as Measuring DNSSEC Performance or The Cost of DNSSEC.
Featured image “Schutz vor … / Protection against …” by Frank Lindecke west is licensed under CC BY-ND 2.0.
This post is not about software but hardware tools for network admins. Which network gadgets am I using during my daily business? At least three, namely the Airconsole, the Pockethernet and the ProfiShark, which help me in connecting to serial ports, testing basic network connectivity, and capturing packets in a highly professional way. Come in and have a look at how I’m working.
One typical task during my work as a network security consultant is to install/pre-configure new firewalls. This involves at least the following two steps: Using the serial console port to set the management IP address/netmask/default gateway as well as testing basic network connectivity after the configuration is done. Of course you can do both steps with your laptop, but I tend to leave it on my working table while using some gadgets in the data center.
Another task is to capture network packets inline (that is: not with Wireshark installed on a computer, but with a network TAP that captures *all* packets on the wire) in order to solve high-level problems. Again, you could use your notebook with its Ethernet port plugged into a port mirror, but then you’re losing all your connections. Hence a special device…
The first step is always setting some basic parameters on any network device, such as IP addresses or a username/password. Or, if an interface gets its IP address through DHCP (IPv4) or RA/SLAAC (IPv6), to figure out which one it has. One way to get rid of the laptop is to use the Airconsole, a small terminal server with bluetooth and WiFi connectivity to a smartphone app called “Get Console”. Of course you won’t configure many complicated CLI commands through your phone, but for getting very limited information it fits.
Looking at the price we are not talking about a high-end professional device but about an affordable gadget. Personally, I won’t trust it as an always-on terminal server that is possibly able to connect through the Internet via its LAN port, but as long as I am using only bluetooth for a limited time, I think it’s secure enough. In the following photo you can see the Airconsole hanging on a Palo Alto Networks firewall in the data center. I got some strange failures on the firewall that day, I think the data plane crashed and it was stuck in the management plane. Reboot solved it. ;)
After I applied all network and Internet settings on the firewall (via my laptop and some central management stations for the firewalls such as Palo Alto Panorama or Juniper NSM or FortiGate FortiManager), I always want to test at least the DHCP, DNS and routing functionality. Rather than using my laptop again (which would terminate all current SSH/whatever sessions), I am using the Pockethernet, again a small device plugged in via Ethernet and connected to an app called “Pockethernet” via bluetooth. Using the Link, DHCP, and Ping features, it tests the layer 2 link such as 1000 Mbit MDI/MDIX, DHCP for IPv4 as well as DNS and ping. Great. And easy! It’s just too bad that it does not work with IPv6 at all. ;( But according to the author this is on the roadmap.
Note that you can use lots of other tests with the Pockethernet, such as cable wiremap, PoE & cable length measurements, CDP/LLDP information gathering, and external IP detection. In the photo you can see it as I plugged it into a Cisco 3750G switch. The screenshots show the Link/DHCP/Ping results at the overview page, as well as details about DHCP and CDP in the drop-down sections.
This one comes into place if I am facing really complicated network problems: The ProfiShark from Profitap, a network TAP connected via USB 3.0 to my computer. That is: capturing network packets inline without the need of additional switches (port mirrors) nor network cards. You can simply capture 1 Gbit full duplex directly into your disk or Wireshark. Great. And that small! Please have a look at this blogpost in which I explained the ProfiShark in detail. Also note some other blogposts from myself tagged with “ProfiShark” in which I used this TAP to solve some problems.
Compared to the other two mentioned tools, the ProfiShark really is a professional network capture device. Hence the price, as you’re advised to not rely on cheap capture devices when you really want to solve network problems. In the following photo I captured all connections from a home receiver.
These are the three hardware tools I am carrying with me all the time. They are quite small and easily fit into my bag (though it’s getting heavier over time). And, besides their actual functionality, I like to play around with them. Just as with any other gadgets out there. ;) Cheers!
Featured image “tools of the trade” by liz west is licensed under CC BY 2.0.
Working with Infoblox can be challenging when it comes to their naming of features, licenses, marketing slides, and GUI options. So let’s bring some clarity into this chaos. :D I have listed the most common DNS security features and their corresponding Infoblox names. I hope you folks can use it as well.
I am focussing on the DNS security features here only. Not on core NS1 Grid, NetMRI, and so on.
Feature | Marketing | License | GUI |
---|---|---|---|
response policy zone RPZ trigger & action | DNS Firewall DFW | see left | DNS -> Response Policy Zones |
RPZ feed for malware blocking (recursive DNS server) | ActiveTrust AT Threat Intelligence TIDE/DOSSIER | see left (includes DFW for RPZ) standard/plus/advanced | see above |
DNS exfiltration/tunneling blocking (recursive DNS server) | Threat Insight TI | Threat Analytics TA (requires DFW or AT for RPZ) | Grid -> Threat Analytics Data Management -> Threat Analytics |
DDoS & exploit defense (authoritative DNS server) | Advanced DNS Protection (v)ADP | Threat Protection & Threat Protection Update | Grid -> Threat Protection Data Management -> Security |
Please note that at least the “TI” acronym is used twice, because it can be either “Threat Intelligence” or “Threat Insight”. To my mind it’s better to omit those acronyms altogether and to use the full two or three words when talking about these features.
Features I have not listed here:
Merry Christmas everyone! Christ is born. That’s what it’s all about!
Featured image “Buntstifte” by Dennis Skley is licensed under CC BY-ND 2.0.
… since we all can use pool.ntp.org? Easy answer: Many modern (security) techniques rely on accurate time. Certificate validation, two-factor authentication, backup auto-deletion, log generation, and many more. Meanwhile we rely on an unauthenticated protocol (via stateless UDP) from unauthenticated sources (NTP pool)! Really?
If you are using a couple of different NTP sources it might not be that easy for an attacker to spoof your time, though it is not infeasible at all. And think about small routers with VPN endpoints and DNSSEC resolving enabled, or IoT devices such as cameras or door openers: they don’t even have a battery-backed real-time clock inside. They fully rely on NTP.
This is what this blogpost series is all about. Let’s dig into it. ;)
Why am I starting a long series about NTP? Because I have many customers and colleagues that are way too lazy when it comes to NTP. “It’s working with some servers on the Internet, so why should I care?”
I basically want to cite this mail from the NANOG (North American Network Operators Group) mailing list:
“How about you set the time on your server ahead by 5 years. Got any idea what would happen?
Until recently, setting your iPhone to 1 Jan 1970 would brick it. I’m sure there are many more examples, but likely you can no longer log in, via SSH or HTTPS, and your iPhone is dead. I think any of those would qualify as more than an annoyance.”
Think about it.
Want some more resources? Have a look at this paper: Bypassing HTTP Strict Transport Security [PDF] which almost only leverages a man-in-the-middle tool for NTP. Uh.
Or this one: “The most common cause for secure connection errors turns out to be user systems having the wrong time“, Panos Astithas at Feeling safer online with Firefox.
And something from Geoff Huston: “What can we say about time and the Internet? The safest assumption is that most systems will be in sync with a UTC reference clock source as long as the definition of “in sync” is a window of 24 hours! If the ‘correct’ behaviour of an Internet application relies on a tighter level of time convergence with UTC time than this rather large window, then it’s likely that a set of clients will fall outside of the application’s view of what’s acceptable“, at APNIC Labs – What’s the Time?.
And what about the Internet of Things? IoT devices rely solely on NTP since they don’t have a built-in battery-powered real-time clock. That is: After booting they don’t have a time at all until they get one over NTP. Spoofing the NTP servers or packets would immediately mess up their time.
And one more example out of a networker’s daily business: Taking multipoint packet captures without synchronized times is a nightmare! You have to sync the time on your capture devices to have a chance to find your relevant packets all over your traces.
You should use three on-site NTP servers as well as NTP authentication. That is:
For this blogpost series I primarily used a couple of Raspberry Pis. Simply because they are cheap, have a small power consumption, and you can easily use the IO ports to connect other devices such as GPS or DCF77 receiver modules. Works fine. However: You should NOT rely on those Pis in your production environment. Why? Because they don’t have a real-time clock that keeps their own time accurate. Hence they rely solely on their stratum 0 input. If this signal is lost, they will drift in time. Furthermore, they are not stable at all, hard to update (as you will see during my series) and everything needs to be configured by hand. Hence: Only use dedicated NTP appliances that are made to serve precise timestamps all over your network. For my posts I am additionally using a Meinberg LANTIME M-200 device which I am comparing to the Pis at some points.
This blogpost series is split into three parts:
All posts are related to my three IPv6-only stratum 1 NTP servers:
ntp1.weberlab.de, a Raspberry Pi with a DCF-77 receiver,
ntp2.weberlab.de, a Raspberry Pi with a GPS receiver and
ntp3.weberlab.de, a Meinberg LANTIME M200 NTP Server with DCF77 and phase modulated PRNG.
Feel free to use them via ntp.weberlab.de only. This FQDN will list only the IP addresses of those servers that are available. Of course you should only use them for test purposes, NOT for your enterprise, as this article is all about running your own servers. ;)
Throughout this series I am only covering NTP but not PTP (Precision Time Protocol). Furthermore, I am not attacking NTP via MITM attacks (at least not now) but only show how to prevent them with NTP authentication, nor do I attack the external time sources DCF77 or GPS. And I am not diving into DDoS attacks that use NTP servers (reflection & amplification attack), since I assume that you’ll primarily use your servers only within your own network and not on the public Internet.
As always you can read even more. ;) If you’re interested in the deep details about the network time protocol itself you should have a look at the Internet Protocol Journal volume 15, number 4 [PDF] and its article “Protocol Basics: The Network Time Protocol” from Geoff Huston. Anyway, here are some more links:
As well as security related stuff:
I am not against the NTP pool project in general. It is a great community driven project that I am using client-side (for many small devices such as routers, Raspberry Pis, IoT, …) as well as server-side since I am contributing with at least one online NTP server. However, my focus with this blogpost series is the enterprise network. And you should not use the NTP pool within your enterprise at all, but your own independent stratum 1 NTP appliances along with NTP authentication.
Featured image “Breitling Super Avenger” by W.A.Smith is licensed under CC BY-NC-ND 2.0.
What’s the first step in a networker’s life if he wants to work with an unknown protocol: he captures and wiresharks it. ;) Following is a downloadable pcap in which I am showing the most common NTP packets such as basic client-server messages, as well as control and authenticated packets. I am also showing how to analyze the delta time with Wireshark, that is: how long an NTP server needs to respond to a request.
As always in my “packet capture” blogposts you are invited to download the following pcap (zipped, 16 KB) and to open it with Wireshark to have a look at it by yourself:
This file consists of many different NTP packet types. Hence I am using display filters within Wireshark to have a look at specific scenarios. The standard UDP destination port for NTP is 123, while the source port *might* be 123 as well.
Have a look at the current NTPv4 RFC 5905 “Network Time Protocol Version 4: Protocol and Algorithms Specification” in order to understand the packets and protocol details. Looking on the wire you should understand the packet header (section 7.3 in the RFC). Note that I am NOT explaining the NTP algorithm at all, but only the packets and their fields that are present on the network. The most important fields are:
These variables are seen on the wire for NTP packets. Note that on any NTP server or client you have a couple of columns that are listed in much of the documentation and are NOT part of the packets but of calculations by the NTP algorithms. Those are when, poll, reach, delay, offset, and jitter. Have a look at the blogpost from Aaron Toponce “Real Life NTP” in which he describes these columns of ntpq (among other things). Or, of course, at the official ntpq documentation.
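If you want to look at these on-the-wire fields without Wireshark, the fixed 48-octet header from RFC 5905 section 7.3 can be unpacked with a few lines of Python. This is a minimal sketch: the function name and the dictionary keys are my own choice, and the sample packet is built by hand rather than taken from the pcap.

```python
import struct

# LI/VN/Mode, stratum, poll, precision, root delay, root dispersion,
# reference ID, then reference/origin/receive/transmit timestamps.
NTP_HEADER = struct.Struct("!BBbbII4sQQQQ")


def parse_ntp_header(packet: bytes) -> dict:
    """Decode the fixed NTP header; extension fields and MAC are ignored."""
    if len(packet) < NTP_HEADER.size:
        raise ValueError("an NTP header is at least 48 octets long")
    (first, stratum, poll, precision, root_delay, root_dispersion,
     ref_id, ref_ts, origin_ts, receive_ts, transmit_ts) = NTP_HEADER.unpack_from(packet)
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x7,
        "mode": first & 0x7,                      # 3 = client, 4 = server
        "stratum": stratum,
        "poll_seconds": 2.0 ** poll,              # poll is log2 of seconds
        "precision_seconds": 2.0 ** precision,
        "root_delay": root_delay / 2 ** 16,       # 16.16 fixed point
        "root_dispersion": root_dispersion / 2 ** 16,
        "reference_id": ref_id,
        "reference_ts": ref_ts,                   # raw 64-bit NTP timestamps
        "origin_ts": origin_ts,
        "receive_ts": receive_ts,
        "transmit_ts": transmit_ts,
    }


# Hand-built server reply: NTPv4, mode 4, stratum 1, poll 6, refid "DCFa".
sample = NTP_HEADER.pack(0x24, 1, 6, -20, 0, 0, b"DCFa", 0, 0, 0, 0)
fields = parse_ntp_header(sample)
print(fields["version"], fields["mode"], fields["stratum"], fields["poll_seconds"])
```

Note that the poll field carries the exponent only, so a value of 6 means a polling interval of 64 seconds.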
In my pcap, udp.stream eq 21 shows a basic client to server communication. An NTP client asks a server for the time. In the answer of the server you can see its stratum (1) and reference clock (DCFa). Normally an NTP communication is ongoing over the lifetime of the running ntp service; it queries the server at the “poll” interval. You can see this behaviour in udp.stream eq 2 where my NTP server asks (as a client) another NTP server on the Internet. The polling interval in this case was 64 seconds, the stratum of the server was 2, while the reference ID shows the IPv4 address (or the first bytes of the MD5 hash of the IPv6 address) of the reference from the queried NTP server.
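The reference ID rule for servers at stratum 2 and above can be sketched in a few lines of Python: for an IPv4 source it is the address itself, for an IPv6 source it is the first four octets of the MD5 hash of the address (RFC 5905). The function name is made up, and the two addresses in the example are just sample values borrowed from the dig output in the DNS post, not the real upstream servers from the pcap.

```python
import hashlib
import ipaddress


def reference_id(source: str) -> bytes:
    """Return the 4-octet reference ID for a stratum >= 2 server's source."""
    addr = ipaddress.ip_address(source)
    if addr.version == 4:
        # IPv4: the four address octets are used directly.
        return addr.packed
    # IPv6: first four octets of the MD5 hash of the 16 address octets.
    return hashlib.md5(addr.packed).digest()[:4]


for source in ("193.0.9.7", "2001:67c:e0::7"):
    refid = reference_id(source)
    print(source, "->", ".".join(str(octet) for octet in refid))
```

This is also why a reference ID that looks like a strange IPv4 address in ntpq or Wireshark is often just the hash of an IPv6 address.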
When you’re running multiple NTP servers connected as “peers” rather than “server” (refer to the ntp.conf manpage) in order to sync their clocks against each other, you’ll see symmetric active (mode 1) packets on the wire. udp.stream eq 1 shows the peering between two of my stratum 1 NTP servers.
You can send control packets to NTP servers for setting and getting specific information. I am using queries via ntpq from my monitoring server to poll some stats from the NTP servers. (Details are covered in an upcoming blogpost.) An example is udp.stream eq 15 in which my monitoring server polled the peers from the NTP server via “ntpq -p ntp1.weberlab.de”. All active connections were sent back to this monitoring server, one by one. Hence a couple of NTP packets within a few milliseconds.
For NTP authentication there are two extension fields added to the packets: the key ID and the message authentication code MAC. (I am covering NTP authentication in a couple of other posts in detail as well.) Depending on the authentication method, MD5 or SHA-1, the length of the MAC differs. udp.stream eq 33 shows an MD5 authentication, udp.stream eq 9 a SHA-1, and udp.stream eq 0 a failure in the authentication, namely a crypto-NAK. Refer to RFC 7822 (Network Time Protocol Version 4 (NTPv4) Extension Fields): “If a MAC is used, it resides at the end of the packet. This field can be either 24 octets long, 20 octets long, or a 4-octet crypto-NAK.”
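The three lengths from the RFC quote make it easy to guess the authentication type from the size of the NTP payload alone. The following Python sketch assumes that no other extension fields are present, so that everything after the 48-octet header is the MAC: a 4-octet key ID plus a 16-octet MD5 or 20-octet SHA-1 digest. The function name and the returned strings are made up for this example.

```python
NTP_HEADER_LEN = 48  # fixed NTP header without extension fields or MAC


def classify_auth(ntp_payload_len: int) -> str:
    """Guess the authentication type from the NTP payload length."""
    trailer = ntp_payload_len - NTP_HEADER_LEN
    if trailer < 0:
        raise ValueError("shorter than an NTP header")
    if trailer == 0:
        return "no authentication"
    if trailer == 4:
        return "crypto-NAK"                      # key ID only, no digest
    if trailer == 20:
        return "MD5 (key ID + 16-octet digest)"
    if trailer == 24:
        return "SHA-1 (key ID + 20-octet digest)"
    return "unknown (extension fields or another digest)"


for length in (48, 52, 68, 72):
    print(length, classify_auth(length))
```

In Wireshark you can do the same by looking at the UDP payload length of the packets in the three mentioned streams.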
Due to my Wireshark bug report aka feature request “NTP Analysis: Delta time between Client-Server“, one of the core developers, Pascal Quantin, added the field ntp.delta_time in which Wireshark calculates the time between the client’s request and the corresponding server’s response (similar to the dns.time or http.time fields). You can see this calculated value in square brackets [as always for Wireshark-added fields]. Additionally I have added a column in my Wireshark GUI to show these values, as you can see in this screenshot for udp.stream eq 2:
Furthermore you can use the “IO Graphs” from Wireshark to display the ntp.delta_time for certain connections. In the following graph you can see the analysis of udp.stream eq 2 again, while the Y axis shows the ntp.delta_time field. Since this particular NTP client sent an NTP request every 64 seconds, you can see those ticks in the graph, as well as one spike near 1040 seconds of the trace:
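The idea behind such a delta time can be sketched in Python as well. This is NOT how Wireshark implements ntp.delta_time (I have not looked at that code), but one plausible way to do it: a server response (mode 4) copies the client’s transmit timestamp into its origin timestamp, so request and response can be paired and their capture times subtracted. The NtpPacket type and all values in the example are made up.

```python
from typing import List, NamedTuple, Tuple


class NtpPacket(NamedTuple):
    captured_at: float   # capture time in seconds since start of trace
    mode: int            # 3 = client request, 4 = server response
    origin_ts: int       # raw 64-bit origin timestamp
    transmit_ts: int     # raw 64-bit transmit timestamp


def delta_times(packets: List[NtpPacket]) -> List[Tuple[float, float]]:
    """Return (request capture time, response delay) for each matched pair."""
    pending = {}   # transmit timestamp of a request -> its capture time
    result = []
    for pkt in packets:
        if pkt.mode == 3:
            pending[pkt.transmit_ts] = pkt.captured_at
        elif pkt.mode == 4 and pkt.origin_ts in pending:
            sent_at = pending.pop(pkt.origin_ts)
            result.append((sent_at, pkt.captured_at - sent_at))
    return result


# Made-up trace: one request every 64 seconds, answered after 12 and 30 ms.
trace = [
    NtpPacket(0.000, 3, 0, 111),
    NtpPacket(0.012, 4, 111, 500),
    NtpPacket(64.000, 3, 500, 222),
    NtpPacket(64.030, 4, 222, 900),
]
for sent_at, delta in delta_times(trace):
    print(f"{sent_at:8.3f} s  delta {delta * 1000:.1f} ms")
```

Plotting the second value over the first one gives roughly the same kind of picture as the IO graph: one tick per polling interval.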
Yeah, that’s it for now. Have a look at your own network and verify the kinds of used NTP versions/servers/stratums/reference clocks/delta_times and so on. ;)
Featured image “Great White Shark” by Elias Levy is licensed under CC BY 2.0.
In this tutorial I will show how to set up a Raspberry Pi with a DCF77 receiver as an NTP server. Since the external radio clock via DCF77 is a stratum 0 source, the NTP server itself is stratum 1. I am showing how to connect the DCF77 module and I am listing all relevant commands as a step by step guide to install the NTP things. With this tutorial you will be able to operate your own stratum 1 NTP server. Nice DIY project. ;) However, keep in mind that you should only use it on a private playground and not on an enterprise network that should consist of highly reliable NTP servers rather than DIY Raspberry Pis. Anyway, let’s go:
At the time of writing (Nov 2018) I am using a Raspberry Pi 1 B (yes, the old one), kernel 4.14.71+ and Raspbian GNU/Linux 9 (stretch). I installed a few relevant packages and gave it a static IPv6 address. Legacy IP (IPv4) is not used at all, only IPv6.
First you need a DCF77 module in order to receive the radio clock signal from Germany. I used the “Conrad” module. It needs a pull-up resistor of 3-10 kOhm between ports 2 and 4 (VCC and DCF-). Have a look at the PDFs from the Conrad store that show the port assignment, as well as these two German pages that give several hints. Long story short: use three cables:
I first soldered the cables directly to the Pi, and later added a 3.5 mm jack to the housing of the Pi with a 2 meter cable to the DCF77 module:
The first step is to install NTP with support for DCF77. This is not the case if you’re simply doing a sudo apt-get install ntp. Anyway, please install the NTP server with this command first in order to have the startup scripts in place, etc. After that you need to build NTP from source (note the “--enable-RAWDCF” option), which involves the following steps:
sudo nano /etc/apt/sources.list
# uncomment the line with "deb-src" so that it reads:
deb-src http://raspbian.raspberrypi.org/raspbian/ stretch main contrib non-free rpi
sudo apt-get update
sudo apt-get -y build-dep ntp
# download the latest ntp: http://ntp.org/downloads.html
# in this example it was 4.2.8p12
wget http://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-4.2/ntp-4.2.8p12.tar.gz
# unpack the archive before changing into its directory
tar -xzf ntp-4.2.8p12.tar.gz
cd ntp-4.2.8p12/
# note: the following step takes some time:
./configure --enable-RAWDCF --prefix=/usr
# note: the following "make" takes even longer, especially on an old Pi 1 B
make
sudo service ntp stop
sudo make install
# to disable updating ntp via apt-get, echo the following:
echo "ntp hold" | sudo dpkg --set-selections
sudo service ntp start
(Fun fact: Since I am running my NTP server IPv6-only while the NTP download page is IPv4-only, I had to download the ntp package on another machine and copy it to the Raspberry Pi. Sigh. Yes, I know, DNS64/NAT64 would solve the problem.)
You have just installed the latest version of NTP, while “holding” the ntp package within dpkg so that future “apt-get upgrade”s do not override/downgrade it. You can verify this with the following (note the “h” in the very first column, which indicates that the ntp package is on hold):
pi@ntp1-dcf77:~ $ dpkg -l | grep ntp
hi  ntp  1:4.2.8p10+dfsg-3+deb9u2  armhf  Network Time Protocol daemon and utility programs
Similarly you can verify the running ntp version with:
pi@ntp1-dcf77:~ $ ntpq --version
ntpq 4.2.8p12@1.3728-o Thu Nov 8 11:51:22 UTC 2018 (1)
First you must disable the login shell on the serial port of the Pi while keeping the serial port hardware enabled. Open sudo raspi-config, navigate to “5 Interface Options” -> “P6 Serial”, disable the login shell over serial, keep the serial port hardware enabled, and reboot when exiting raspi-config.
A very basic test with “screen” should indicate some output, though completely useless:
pi@ntp1-dcf77:~ $ sudo screen /dev/ttyAMA0 9600
����
# quit screen with: Ctrl+a \ y
Add the user “ntp” (which runs the ntp daemon) to the “tty” group in order to have access to the tty console. Verify it with the second command shown below:
pi@ntp1-dcf77:~ $ sudo adduser ntp tty
Adding user `ntp' to group `tty' ...
Adding user ntp to group tty
Done.
pi@ntp1-dcf77:~ $ cat /etc/group | grep tty
tty:x:5:ntp
In order to have a symbolic link at /dev/refclock-0 (which is needed by NTP later), open the NTP init script with sudo nano /etc/init.d/ntp and add the following first three lines, while commenting out the three lines that follow them:
if [ ! -L /dev/refclock-0 ]; then
    ln -s /dev/ttyAMA0 /dev/refclock-0
fi
#if [ -e /var/lib/ntp/ntp.conf.dhcp ]; then
#    NTPD_OPTS="$NTPD_OPTS -c /var/lib/ntp/ntp.conf.dhcp"
#fi
Reload the systemd configuration and restart the NTP process:
sudo systemctl daemon-reload
sudo service ntp restart
Verify that the symbolic link is present:
pi@ntp1-dcf77:/etc/init.d $ ls /dev/ref*
/dev/refclock-0
Now you need to use driver number 8 from the NTP software in “mode 5” for the Conrad DCF77 module. That is: Open the NTP configuration file with sudo nano /etc/ntp.conf and add your “server” aka DCF77 module in the following way:
server 127.127.8.0 mode 5 prefer
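The address in this line is not a real IP address: ntpd reads reference clocks as 127.127.t.u, where t is the driver type and u the unit number. For driver type 8 the unit number also selects the device, which is why the /dev/refclock-0 link was created before. A small Python sketch of that mapping (the function name is my own, and the device naming applies to driver type 8 only):

```python
def parse_refclock(address):
    """Split a 127.127.t.u pseudo address into driver type, unit and device."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or octets[:2] != [127, 127]:
        raise ValueError("not a reference clock address: " + address)
    driver_type, unit = octets[2], octets[3]
    # Driver type 8 (the generic "parse" driver) opens /dev/refclock-<unit>.
    # Other driver types use different device names.
    device = "/dev/refclock-%d" % unit if driver_type == 8 else None
    return driver_type, unit, device

print(parse_refclock("127.127.8.0"))
```

A second receiver on the same Pi would accordingly be 127.127.8.1 with a link at /dev/refclock-1.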
I have commented out the default “pool 0.debian.pool.ntp.org iburst” lines but entered a few IPv6 enabled servers as well, though without the “prefer” option:
#http://support.ntp.org/bin/view/Servers/PublicTimeServer000388
server ntp.probe-networks.de
#http://support.ntp.org/bin/view/Servers/PublicTimeServer001352
server time.hueske-edv.de
#http://support.ntp.org/bin/view/Servers/PublicTimeServer000840
server ntp2.301-moved.de
#http://support.ntp.org/bin/view/Servers/PublicTimeServer001363
server ntp.fanlin.de
Finally, restart the NTP daemon:
sudo service ntp restart
Now, in theory, the DCF77 receiver should work, but only if you have good radio signal strength and quality and a correct angle of the antenna, which was not the case in my setup at first. Use “ntpq -p” to print a list of NTP servers that are in use. As you can see in this example, the “GENERIC(0) .DCFa.” server (first line) has not been reached at all so far (reach is 0):
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 GENERIC(0)      .DCFa.           0 l    -   64     0   0.000    0.000   0.000
 ptbtime2.ptb.de .PTB.            1 u   57   64   377   9.207    4.270   4.483
*ns2.probe-netwo 192.53.103.108   2 u    2   64   377   2.553    3.727   4.777
-2a01:4f8:201:41 131.188.3.221    2 u   57   64   377  16.104   -2.650   2.955
+2a02:a00:1009:6 40.179.132.91    2 u    9   64   377   6.683   -0.115   2.329
+2001:4ba0:ffa4: 51.254.155.97    2 u   11   64   377   5.818   -4.302   3.673
Have a look at the syslog messages. If they list something like this, you don’t have a good signal quality, while the receiver at least works in general: ;)
pi@ntp1-dcf77:~ $ tail /var/log/syslog | grep ntpd
Nov 8 16:05:59 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 3 bits
Nov 8 16:06:04 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 6 bits
Nov 8 16:06:08 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 5 bits
Nov 8 16:07:00 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 12 bits
Nov 8 16:07:04 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 3 bits
Nov 8 16:07:07 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 4 bits
Nov 8 16:07:18 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 12 bits
Nov 8 16:07:24 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 7 bits
Nov 8 16:07:55 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 32 bits
Nov 8 16:08:00 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: INCOMPLETE DATA - time code only has 6 bits
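To get a feeling for how far away you are from a usable signal, you can compare these bit counts with a complete DCF77 frame, which carries 59 bits per minute (one per second for seconds 0 to 58). The following Python sketch does that for syslog lines in the format shown above. The function names are my own, and the regular expression is only derived from this sample, not from the ntpd sources.

```python
import re

FRAME_BITS = 59  # a complete DCF77 minute: bits 0 to 58
PATTERN = re.compile(r"INCOMPLETE DATA - time code only has (\d+) bits")

def received_bits(log_lines):
    """Extract the bit count of every incomplete time code in the log."""
    counts = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts

def summarize(log_lines):
    """Return the number of broken frames and the best attempt among them."""
    counts = received_bits(log_lines)
    if not counts:
        return None
    return {"frames": len(counts), "best": max(counts), "needed": FRAME_BITS}

sample = [
    "Nov 8 16:07:24 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: "
    "INCOMPLETE DATA - time code only has 7 bits",
    "Nov 8 16:07:55 ntp1-dcf77 ntpd[500]: parse: convert_rawdcf: "
    "INCOMPLETE DATA - time code only has 32 bits",
]
print(summarize(sample))
```

If the best attempt grows while you are moving the antenna around, you are heading in the right direction.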
Note that the antenna of the DCF77 receiver must be aligned with its longer side orthogonal to the direction of Mainflingen, Germany, which is south-east of Frankfurt. (Fun fact: Since I am living just a few kilometers north of Frankfurt, my receiver must not point to Frankfurt directly, but more precisely to Mainflingen. However, for almost everyone else the receiver must simply “look” towards Frankfurt.) Furthermore, you must avoid any interfering sources such as switching power supplies or the like in the room where your receiver resides. Ideally, the receiver is placed outside the building (waterproof, lightning protection!), which is not that easy at all. In addition I came across some weird observations: my signal quality got way better when I turned the antenna just a few degrees in one direction. TL;DR: If you’re lacking signal quality, try a few different positions of your antenna. Be creative. ;)
If you have a good signal strength and quality, the ntpq -p output should look like this, in which the “GENERIC(0) .DCFa.” source has a reach of 377 while the * symbol indicates that it is used as the system peer:
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*GENERIC(0)      .DCFa.           0 l   35   64   377   0.000   -0.019   0.346
+2003:de:2016:33 .PPS.            1 s   34   64   376   6.445    0.496   5.006
 ptbtime2.ptb.de .PTB.            1 u   23   64   377  16.437    1.195   3.226
+ns2.probe-netwo 124.216.164.14   2 u    6   64   377   9.554    1.594   2.322
-2a01:4f8:201:41 192.53.103.108   2 u   30   64   377  23.610   -3.326   4.884
-2a02:a00:1009:6 192.53.103.108   2 u    9   64   377  13.279    1.295   5.938
-2001:4ba0:ffa4: 130.149.17.21    2 u   47   64   377  17.065    1.689   7.871
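A short note on the “reach” column, since it is easy to misread: it is an 8-bit shift register printed in octal, in which the lowest bit stands for the most recent poll. 377 therefore means that the last eight polls were all answered. The following Python sketch (the function name is my own) decodes such a value:

```python
def reach_history(reach_octal):
    """Decode the octal reach value into eight booleans, newest poll first."""
    value = int(reach_octal, 8)
    if not 0 <= value <= 0o377:
        raise ValueError("reach must fit into 8 bits")
    # Bit 0 is the most recent poll, bit 7 the oldest one still remembered.
    return [bool((value >> bit) & 1) for bit in range(8)]

for reach in ("377", "376", "37", "0"):
    history = reach_history(reach)
    print(reach, "".join("1" if ok else "0" for ok in history))
```

Read this way, the 376 of the second source in the output above means that only its most recent poll went unanswered, while a value such as 37 belongs to a source that has answered its last five polls and was not polled (or not reached) before.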
You need to adjust the fudgetime of the DCF77 receiver which is the compensation of the offset from the radio clock to the “real” time. That is: You need to compare the time received from your DCF77 module to some other working NTP sources. The offset should be minimal after that.
Without any “fudgetime” options in your ntp.conf file, the “RAW DCF77 CODE (Conrad DCF77 receiver module)”, as it is called by NTP, has a built-in fudgetime1 of 292.000 ms. You can see this with “ntpq -c cv”, since it “displays a list of clock variables for those associations supporting a reference clock”:
pi@ntp1-dcf77:~ $ ntpq -c cv
associd=0 status=0020 2 events, clk_unspec,
device="RAW DCF77 CODE (Conrad DCF77 receiver module)",
timecode="-####----####-#---M-S1-----4p-2---2p-2---2--41---1---81---P",
poll=33, noreply=0, badformat=2, baddata=0, fudgetime1=292.000,
stratum=0, refid=DCFa, flags=0,
refclock_time="dfa1a0fa.00000000 Thu, Nov 22 2018 21:41:14.000",
refclock_status="TIME CODE; (LEAP INDICATION; CALLBIT)",
refclock_format="RAW DCF77 Timecode",
refclock_states="*NOMINAL: 00:29:02 (88.06%); BAD FORMAT: 00:03:56 (11.93%); running time: 00:32:58"
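By the way, the “timecode” string in this output is the raw DCF77 minute, one character per second, and it can be decoded by hand. The following Python sketch does so using the standard DCF77 bit layout (minutes in bits 21-27, hours in 29-34, day in 36-41, weekday in 42-44, month in 45-49, year in 50-57, each BCD coded, with even parity bits at 28, 35 and 58). How ntpd prints the bits is an assumption that I inferred from this one sample: “-” and lowercase letters stand for 0, everything else for 1. The function names are my own.

```python
# Sample taken from the "ntpq -c cv" output above.
TIMECODE = "-####----####-#---M-S1-----4p-2---2p-2---2--41---1---81---P"

def to_bits(timecode):
    # Assumption: "-" and lowercase letters are 0 bits, all other
    # characters (digits, "#", uppercase letters) are 1 bits.
    return [0 if ch == "-" or ch.islower() else 1 for ch in timecode]

def bcd(bits, start, weights):
    # Sum the BCD weights of all set bits, beginning at position "start".
    return sum(w for i, w in enumerate(weights) if bits[start + i])

def parity_ok(bits, first, last):
    # DCF77 uses even parity; "last" is the position of the parity bit.
    return sum(bits[first:last + 1]) % 2 == 0

def decode_dcf77(timecode):
    bits = to_bits(timecode)
    if len(bits) != 59:
        raise ValueError("expected 59 bits, got %d" % len(bits))
    return {
        "minute": bcd(bits, 21, (1, 2, 4, 8, 10, 20, 40)),
        "hour": bcd(bits, 29, (1, 2, 4, 8, 10, 20)),
        "day": bcd(bits, 36, (1, 2, 4, 8, 10, 20)),
        "weekday": bcd(bits, 42, (1, 2, 4)),  # 1 = Monday ... 7 = Sunday
        "month": bcd(bits, 45, (1, 2, 4, 8, 10)),
        "year": bcd(bits, 50, (1, 2, 4, 8, 10, 20, 40, 80)),  # two digits
        "zone": "CEST" if bits[17] else "CET",
        "parity_ok": (parity_ok(bits, 21, 28) and parity_ok(bits, 29, 35)
                      and parity_ok(bits, 36, 58)),
    }

print(decode_dcf77(TIMECODE))
```

For the sample this yields 22:41 CET on Thursday the 22nd of November (20)18 with all three parity checks passing, which fits the refclock_time of 21:41:14 UTC shown in the same output.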
After NTP has been running for some time you will see either this, in which all other NTP servers have a comparable but huge offset (about 590 ms in this situation), while the DCFa receiver is not used, as indicated by the “x” in the first column:
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xGENERIC(0)      .DCFa.           0 l   52   64    37   0.000   -0.792   2.046
-2003:de:2016:33 .PPS.            1 s   48   64    36   7.255  591.525   3.128
 ptbtime2.ptb.de .PTB.            1 u   41   64    37  15.637  590.647   1.982
+ns2.probe-netwo 124.216.164.14   2 u   43   64    37   9.333  591.296   1.737
+2a01:4f8:201:41 131.188.3.221    2 u   45   64    37  24.371  586.926   3.137
*2a02:a00:1009:6 205.46.178.169   2 u   44   64    37  12.479  592.152   7.343
 2001:4ba0:ffa4: .STEP.          16 u    -   64     0   0.000    0.000   0.000
or this, in which all other NTP servers have a low offset, while the DCFa receiver has one with a much higher negative value (while still not used):
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xGENERIC(0)      .DCFa.           0 l    5   64    77   0.000  -593.24   1.561
-2003:de:2016:33 .PPS.            1 s   60   64    73   7.738   -0.359   0.778
 ptbtime2.ptb.de .PTB.            1 u   64   64    77  16.143    0.222   1.091
+ns2.probe-netwo 124.216.164.14   2 u   61   64    77   9.862    0.471   0.862
+2a01:4f8:201:41 192.53.103.108   2 u   61   64    77  23.879   -4.579   0.615
*2a02:a00:1009:6 205.46.178.169   2 u   63   64    77  12.489   -0.316   1.477
 2001:4ba0:ffa4: .STEP.          16 u    -  256     0   0.000    0.000   0.000
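Both outputs tell the same story, only seen from two different system clock states: what counts is the offset of the DCF77 receiver relative to the other servers. Here is a Python sketch that computes this from the rows of an “ntpq -p” output, using the median of the other reachable sources as the reference. The function names and the choice of the median are my own, and the parsing assumes that the tally character is still present as the first character of each row.

```python
from statistics import median

# Rows of the second output above, header lines removed.
ROWS = [
    "xGENERIC(0) .DCFa. 0 l 5 64 77 0.000 -593.24 1.561",
    "-2003:de:2016:33 .PPS. 1 s 60 64 73 7.738 -0.359 0.778",
    " ptbtime2.ptb.de .PTB. 1 u 64 64 77 16.143 0.222 1.091",
    "+ns2.probe-netwo 124.216.164.14 2 u 61 64 77 9.862 0.471 0.862",
    "+2a01:4f8:201:41 192.53.103.108 2 u 61 64 77 23.879 -4.579 0.615",
    "*2a02:a00:1009:6 205.46.178.169 2 u 63 64 77 12.489 -0.316 1.477",
    " 2001:4ba0:ffa4: .STEP. 16 u - 256 0 0.000 0.000 0.000",
]

def parse_row(line):
    # Columns: remote refid st t when poll reach delay offset jitter
    tally, fields = line[0], line[1:].split()
    return {
        "tally": tally,
        "remote": fields[0],
        "type": fields[3],             # "l" = local reference clock
        "reach": int(fields[6], 8),    # octal shift register
        "offset": float(fields[8]),    # milliseconds
    }

def refclock_offset_vs_others(lines):
    """Offset of the local refclock minus the median offset of the rest."""
    rows = [parse_row(line) for line in lines]
    rows = [row for row in rows if row["reach"] != 0]  # skip unreachable
    local = [row["offset"] for row in rows if row["type"] == "l"]
    others = [row["offset"] for row in rows if row["type"] != "l"]
    if len(local) != 1 or not others:
        raise ValueError("need exactly one refclock and one other source")
    return local[0] - median(others)

print("%.3f ms" % refclock_offset_vs_others(ROWS))
```

Fed with the rows of the first output instead, the same function ends up at roughly -592 ms as well, so it does not matter which of the two situations you happen to catch.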
Now you need to adjust the fudgetime1 to compensate for this difference. In my case, since the pre-configured fudgetime1 was 292 ms while the offset of my DCF77 receiver was about -592 ms (hence it needed an even higher compensation), I had to set a fudgetime1 of 292 + 592 = 884 ms, that is 0.884 s. Open the configuration file again with sudo nano /etc/ntp.conf and add the following below your “server 127.127.8.0 […]” line:
fudge 127.127.8.0 time1 0.884
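The arithmetic behind this line, as a tiny Python sketch: the new time1 is the currently active fudgetime1 minus the observed offset of the receiver, converted from milliseconds (as shown by ntpq) to seconds (as expected in ntp.conf). The function names are my own, and the values are the ones from my setup, so yours will differ.

```python
def new_time1_seconds(current_fudgetime1_ms, refclock_offset_ms):
    """Return the time1 value in seconds that cancels the observed offset."""
    # A negative offset means the receiver is late, so time1 must grow.
    return (current_fudgetime1_ms - refclock_offset_ms) / 1000.0

def fudge_line(address, time1_seconds):
    """Format the corresponding line for ntp.conf."""
    return "fudge %s time1 %.3f" % (address, time1_seconds)

# 292 ms built-in fudgetime1, about -592 ms observed offset:
print(fudge_line("127.127.8.0", new_time1_seconds(292.0, -592.0)))
```

If the offset of the receiver is still not close to zero after the restart, repeat the calculation with the new fudgetime1 as the starting value.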
Followed by a restart of NTP:
sudo service ntp restart
A couple of minutes later you should have very small offsets among all NTP servers, the external ones as well as your DCFa receiver:
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*GENERIC(0)      .DCFa.           0 l   63   64   377   0.000   -0.451   0.857
-2003:de:2016:33 .PPS.            1 s   47   64   175   7.557    0.539   2.272
 ptbtime2.ptb.de .PTB.            1 u   49   64   377  15.739    1.224   0.850
+ns2.probe-netwo 124.216.164.14   2 u   42   64   377   8.975    1.550   0.757
-2a01:4f8:201:41 192.53.103.108   2 u   52   64   377  24.224   -4.928   2.822
+2a02:a00:1009:6 40.33.41.76      2 u   49   64   377  12.530   -0.400   2.792
 2001:4ba0:ffa4: .INIT.          16 u    -  128     0   0.000    0.000   0.000
Very good! You’re almost done.
One hint from David Taylor about reducing the Ethernet latency on a Pi: Open /boot/cmdline.txt with sudo nano /boot/cmdline.txt and add the option smsc95xx.turbo_mode=N to the single line in it (reboot needed). This reduces the “delay” as shown in the “ntpq -p” output. I used two Raspberry Pis connected via a single switch, the first time without the option, the second time with the option set on both. The delay between those NTP servers decreased from about 0.916 ms to 0.632 ms. Nice.
# BEFORE
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 GENERIC(0)      .DCFa.           0 l    -   64     0   0.000    0.000   0.000
*2003:de:2016:33 .PPS.            1 s   28   64   377   0.916    1.125   0.759
 ptbtime2.ptb.de .PTB.            1 u   26   64   377   9.782    1.081   0.419
# AFTER
pi@ntp1-dcf77:~ $ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 GENERIC(0)      .DCFa.           0 l    -   64     0   0.000    0.000   0.000
*2003:de:2016:33 .PPS.            1 s   24   64   377   0.632    0.120   0.737
 ptbtime2.ptb.de .PTB.            1 u   35   64   377   9.444    1.592   0.554
That’s it for now. Cheers! If you have any suggestions, please write a comment.
This post draws on information from a couple of (German) articles and posts:
Featured image “dcf77 module-stapel” by ledmaster33 is licensed under CC BY-NC 2.0.