Channel: Weberblog.net

Using a FortiGate with a 6in4 Tunnel


For some reason, I am currently using a FortiGate at a location that has no native IPv6 support. Uh, I don’t want to talk about that. ;) However, at least the FortiGate firewalls are capable of 6in4 tunnels. Hence I am using the IPv6 Tunnel Broker from Hurricane Electric again. Quite easy so far.

But note, as always: Though the FortiGate supports IPv6 features such as a 6in4 tunnel or a stateful/stateless DHCPv6 server, these features are NOT stable or well designed at all. I have run into many bugs and outages over the last years. Having NAT enabled by default on every new IPv6 policy is ridiculous. Furthermore, having independent security policies for legacy IP and IPv6 is obviously really bad design. A single policy responsible for both Internet Protocols is a MUST. Anyway, let’s look at the 6in4 tunnel:

Note that this post is one of many related to IPv6. Click here for a structured list.

Configuring this IPv6-in-IPv4 tunnel is quite easy since HE itself offers the configuration:

Of course, you need an internal layer 3 interface as well. That is, a complete configuration (6in4 tunnel, default route, inside interface with RDNSS) looks like this:

config system sit-tunnel
    edit "HE"
        set destination 216.66.80.30
        set ip6 2001:470:1f0a:16b0::2/64
        set source 194.247.4.10
    next
end
config router static6
    edit 1
        set device "HE"
    next
end
config system interface
    edit "internal"
        config ipv6
            set ip6-address 2001:470:1f0b:16b0::1/64
            set ip6-allowaccess ping https ssh
            set ip6-send-adv enable
            config ip6-prefix-list
                edit 2001:470:1f0b:16b0::/64
                    set autonomous-flag enable
                    set onlink-flag enable
                    set rdnss 2620:fe::fe
                    set dnssl "weberlab.de"
                next
            end
        end
    next
end
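
As a side note on the addresses used above: HE assigns a dedicated /64 for the tunnel itself, with the broker's side ending in ::1 and your side (the "set ip6" value) in ::2, while the routed LAN /64 is a different prefix (note 1f0a vs 1f0b). A minimal Python sketch with the stdlib ipaddress module illustrates this:

```python
import ipaddress

# Tunnel /64 as assigned by HE; the broker's side ends in ::1,
# the client's side (the "set ip6" value in the config above) in ::2.
tunnel = ipaddress.IPv6Network("2001:470:1f0a:16b0::/64")
he_side = tunnel.network_address + 1
my_side = tunnel.network_address + 2

# The routed LAN /64 is a *different* prefix (1f0b instead of 1f0a):
lan = ipaddress.IPv6Network("2001:470:1f0b:16b0::/64")

print(he_side)               # 2001:470:1f0a:16b0::1
print(my_side)               # 2001:470:1f0a:16b0::2
print(lan.overlaps(tunnel))  # False
```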

Finally, you need some IPv6 policy entries to permit traffic. Again, note that you MUST NOT enable NAT, which is stupidly pre-selected by Fortinet:

Stumbling Blocks

I am using a FortiGate FG-90D with FortiOS 5.6.8 build1672 (GA).

Note that this “HE” interface, as it is named in the example configuration above, is NOT visible in the interface section in the GUI:

while it IS visible in the routing section:

Honestly: Who approves such decisions at Fortinet? This is not sound at all, is it?

Verifying

You can have a look at the routing monitor to see the default route in place:

Some CLI commands are as follows. To get information about the tunnel interface, you can use this somewhat hidden command:

fnsysctl ifconfig
such as:
fg2 # fnsysctl ifconfig HE
HE      Link encap:Unknown  HWaddr C2:F7:04:0A:00:00
        inet addr6: 2001:470:1f0a:16b0::2 prefixlen 64
        link-local6: fe80::c2f7:40a prefixlen 128
        UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1480  Metric:1
        RX packets:664858 errors:0 dropped:0 overruns:0 frame:0
        TX packets:1015185 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:0
        RX bytes:179754139 (171.4 MB)  TX bytes:114587175 (109.3 MB)

IPv6 routing table:

fg2 # get router info6 routing-table
IPv6 Routing Table
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       I - IS-IS, B - BGP
       * - candidate default

Timers: Uptime

S*      ::/0 [10/0] via ::, HE, 01w2d12h
C       ::1/128 via ::, root, 04w2d19h
C       2001:470:1f0a:16b0::/64 via ::, HE, 01w2d12h
C       2001:470:1f0b:16b0::/64 via ::, internal, 01w2d11h
C       fe80::/10 via ::, internal, 01w2d11h
C       fe80::c2f7:40a/128 via ::, HE, 01w2d12h

And a basic network connectivity test, aka ping:

fg2 # execute ping6-options reset

fg2 # execute ping6 weberblog.net
PING weberblog.net(2a01:488:42:1000:50ed:8588:8a:c570) 56 data bytes
64 bytes from 2a01:488:42:1000:50ed:8588:8a:c570: icmp_seq=1 ttl=56 time=9.13 ms
64 bytes from 2a01:488:42:1000:50ed:8588:8a:c570: icmp_seq=2 ttl=56 time=11.4 ms
64 bytes from 2a01:488:42:1000:50ed:8588:8a:c570: icmp_seq=3 ttl=56 time=9.57 ms
64 bytes from 2a01:488:42:1000:50ed:8588:8a:c570: icmp_seq=4 ttl=56 time=10.3 ms
64 bytes from 2a01:488:42:1000:50ed:8588:8a:c570: icmp_seq=5 ttl=56 time=10.1 ms

--- weberblog.net ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss, time 4045ms
rtt min/avg/max/mdev = 9.136/10.145/11.487/0.806 ms

That’s it. Thanks for watching. ;) Don’t forget to hit the subscribe button.

Featured image “Make It Count” by Mr. Nixter is licensed under CC BY-NC 2.0.


Workaround for Not Using a Palo Alto with a 6in4 Tunnel


Of course, you should use dual-stack networks for almost everything on the Internet. Or even better: IPv6-only with DNS64/NAT64 and so on. ;) Unfortunately, still not every site has native IPv6 support. However, we can simply use the IPv6 Tunnel Broker from Hurricane Electric to bridge the gap in the meantime.

Well, wait… Not when using a Palo Alto Networks firewall, which lacks 6in4 tunnel support. Sigh. Here’s my workaround:

Note that this post is one of many related to IPv6. Click here for a structured list.

Please note that my approach only works if you have at least two public IPv4 addresses, which is not the case on most residential ISP connections. :( Since I am using the Palo in my lab, which has a couple of public legacy IP addresses, it works quite well. Here is the idea:

  • Use a Cisco router to terminate the 6in4 tunnel to HE.
  • Use two “untrust” interfaces on the Palo: one for legacy IP to the Internet, one for IPv6 to the Cisco router. I am using the first routed /64 from HE for the transfer segment between the Palo and the router, while routing the /48 to the Palo. (Note that you get at least one /64 and one /48 from HE.)
  • Since both interfaces are within the same “untrust” zone, your policies work as usual. You don’t have to distinguish between the outgoing interfaces nor between the Internet Protocols. Finally!
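
The addressing plan from the list above can be checked with a short Python sketch (values taken from the router and Palo configs shown in this post; the stdlib ipaddress module does the math):

```python
import ipaddress

# Transfer segment between the Cisco router (::1) and the Palo (::2):
# HE's routed /64 used as a point-to-point-style network.
transfer = ipaddress.IPv6Network("2001:470:1f0b:1024::/64")
cisco = ipaddress.IPv6Address("2001:470:1f0b:1024::1")
palo  = ipaddress.IPv6Address("2001:470:1f0b:1024::2")
assert cisco in transfer and palo in transfer

# HE's routed /48 is forwarded to the Palo, which can carve it
# into 65536 further /64s for its internal networks:
routed = ipaddress.IPv6Network("2001:470:765b::/48")
print(routed.num_addresses // 2**64)        # 65536 possible /64 subnets
print(next(routed.subnets(new_prefix=64)))  # 2001:470:765b::/64
```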

Here’s a rough sketch:

Cisco Router with 6in4

This is my Cisco router config. I am using a Cisco 2811 (revision 3.0), IOS version 15.1(4)M12a. Probably nothing new for you: a default IPv4 route to the ISP, a default IPv6 route into the tunnel, and the /48 routed onwards to the Palo Alto:

interface Tunnel0
 description Hurricane Electric IPv6 Tunnel Broker
 no ip address
 ipv6 address 2001:470:1F0A:101A::2/64
 ipv6 enable
 tunnel source 193.24.227.12
 tunnel mode ipv6ip
 tunnel destination 216.66.80.30
!
interface FastEthernet0/0
 description ISP
 ip address 193.24.227.12 255.255.255.224
!
interface FastEthernet0/1
 description PA-220-eth1/1
 no ip address
 ipv6 address 2001:470:1F0B:1024::1/64
 ipv6 enable
!
ip route 0.0.0.0 0.0.0.0 FastEthernet0/0 193.24.227.1
!
ipv6 route 2001:470:765B::/48 FastEthernet0/1 2001:470:1F0B:1024::2
ipv6 route ::/0 Tunnel0

Show of routes:

router1#show ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is 193.24.227.1 to network 0.0.0.0

S*    0.0.0.0/0 [1/0] via 193.24.227.1, FastEthernet0/0
      193.24.227.0/24 is variably subnetted, 2 subnets, 2 masks
C        193.24.227.0/27 is directly connected, FastEthernet0/0
L        193.24.227.12/32 is directly connected, FastEthernet0/0
router1#
router1#show ipv6 route
IPv6 Routing Table - default - 7 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static route
       B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
       I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea, IS - ISIS summary
       D - EIGRP, EX - EIGRP external, NM - NEMO, ND - Neighbor Discovery
       l - LISP
       O - OSPF Intra, OI - OSPF Inter, OE1 - OSPF ext 1, OE2 - OSPF ext 2
       ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2
S   ::/0 [1/0]
     via Tunnel0, directly connected
C   2001:470:1F0A:101A::/64 [0/0]
     via Tunnel0, directly connected
L   2001:470:1F0A:101A::2/128 [0/0]
     via Tunnel0, receive
C   2001:470:1F0B:1024::/64 [0/0]
     via FastEthernet0/1, directly connected
L   2001:470:1F0B:1024::1/128 [0/0]
     via FastEthernet0/1, receive
S   2001:470:765B::/48 [1/0]
     via 2001:470:1F0B:1024::2, FastEthernet0/1
L   FF00::/8 [0/0]
     via Null0, receive

Palo Alto with 2x Untrust Interfaces

I am using a PA-220 with PAN-OS 8.1.7 in this lab. Two hardware layer 3 interfaces, one with IPv4-only directly attached to the ISP, the other one with IPv6-only plugged into the Cisco router. Note that both interfaces are of the same “untrust” security zone:

Default IPv6 route pointing to the Cisco router:

One policy to rule them all:

Likewise, the traffic log shows both Internet Protocols from this single policy:

CLI show of routes:

weberjoh@pa> show routing route

flags: A:active, ?:loose, C:connect, H:host, S:static, ~:internal, R:rip, O:ospf, B:bgp,
       Oi:ospf intra-area, Oo:ospf inter-area, O1:ospf ext-type-1, O2:ospf ext-type-2, E:ecmp, M:multicast


VIRTUAL ROUTER: default (id 1)
  ==========
destination                                 nexthop                                 metric flags      age   interface          next-AS
0.0.0.0/0                                   193.24.227.1                            10     A S              ethernet1/1
193.24.227.0/27                             193.24.227.9                            0      A C              ethernet1/1
193.24.227.9/32                             0.0.0.0                                 0      A H
193.24.227.224/27                           193.24.227.225                          0      A C              ethernet1/5.224
193.24.227.225/32                           0.0.0.0                                 0      A H
::/0                                        2001:470:1f0b:1024::1                   10     A S              ethernet1/2
2001:470:1f0b:1024::/64                     2001:470:1f0b:1024::2                   0      A C              ethernet1/2
2001:470:1f0b:1024::2/128                   ::                                      0      A H
2001:470:765b::/64                          2001:470:765b::1                        0      A C              ethernet1/5.224
2001:470:765b::1/128                        ::                                      0      A H
total routes shown: 10

Ping from the trust interface (selected with its source IPv6 address):

weberjoh@pa> ping inet6 yes source 2001:470:765b::1 host weberblog.net
PING weberblog.net(webernetz.net) from 2001:470:765b::1 : 56 data bytes
64 bytes from webernetz.net: icmp_seq=0 ttl=55 time=12.1 ms
64 bytes from webernetz.net: icmp_seq=1 ttl=55 time=5.85 ms
64 bytes from webernetz.net: icmp_seq=2 ttl=55 time=5.90 ms
64 bytes from webernetz.net: icmp_seq=3 ttl=55 time=6.58 ms
64 bytes from webernetz.net: icmp_seq=4 ttl=55 time=5.79 ms
^C
--- weberblog.net ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4043ms
rtt min/avg/max/mdev = 5.794/7.252/12.126/2.453 ms, pipe 2

Traceroute. It’s “ipv6 yes” rather than “inet6 yes” as with ping. Uh:

weberjoh@pa> traceroute ipv6 yes source 2001:470:765b::1 host weberblog.net
traceroute to weberblog.net (2a01:488:42:1000:50ed:8588:8a:c570), 30 hops max, 40 byte packets
 1   (2001:470:1f0b:1024::1)  2.255 ms  1.797 ms  2.436 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * ae4.cr-nashira.cgn1.bb.godaddy.com (2a01:488:bb::f2)  5.888 ms
 7  ae0.cr-artemis.cgn3.hosteurope.de (2a01:488:bb::a3)  5.665 ms  5.618 ms  5.599 ms
 8  2a01:488:42::a1e:1e77 (2a01:488:42::a1e:1e77)  5.578 ms  5.566 ms  5.683 ms
 9  webernetz.net (2a01:488:42:1000:50ed:8588:8a:c570)  6.255 ms  6.238 ms  6.187 ms

Works. Good.

Conclusion

Obviously, I am not happy that Palo Alto Networks has not implemented 6in4 tunnels so far. It shouldn’t be that hard.

However, due to the good design of having security zones summing up multiple interfaces, as well as a single security policy set that is able to handle IPv4 and IPv6 traffic, this workaround is feasible.

(Note that on a FortiGate firewall it’s vice versa: They have 6in4 tunnels but distinct security policies – one for v4 and another one for v6. That is: Quite simple to run a tunnel to HE while quite stupid to have different policy sets. In summary, they aren’t better.)

Featured image “255/365 Umleitung – Selzer Kerb vom 12. bis 16. September” by Frank Hamm is licensed under CC BY-NC-ND 2.0.

Juniper ScreenOS with a 6in4 Tunnel


Yes, I know I know, the Juniper ScreenOS devices are Out-of-Everything (OoE), but I am still using them for a couple of labs. They simply work as a router and VPN gateway as well as a port-based firewall. Perfect for labs.

For some reason I had another lab without native IPv6 Internet. Hence I used the IPv6 Tunnel Broker one more time. Quite easy with the SSGs, since HE offers a sample config. But even through the GUI it’s just a few steps:

Note that this post is one of many related to IPv6. Click here for a structured list.

I am using a SSG 140 with ScreenOS 6.3.0r27.0. Prerequisite is a static IPv4 address on the Internet facing “untrust” interface.

The “Example Configuration” from Hurricane Electric is already almost complete. You simply have to replace the “untrust” keyword with your layer 3 untrust interface:

Step-by-Step through the GUI

Anyway, doing it by hand through the GUI involves these steps:

  1. Creating a new tunnel interface within the “Untrust” zone.
  2. Enabling IPv6 type “host” on that tunnel interface, IPv6 address as the “Client IPv6 Address” from the HE tunnel information.
  3. Disable NUD, the Neighbor Unreachability Detection.
  4. Enable and configure the 6in4 tunnel aka “IPv6 in IPv4 Tunneling Encapsulation Settings”.
  5. Add a (permanent) default route.
  6. Add IPv6 subnets to your internal interfaces with “Allow RA Transmission” and so on as always.
  7. Add security policies as always.

GUI Screenshots:

CLI Commands

CLI commands incl. user subnet config. I am using ethernet0/8 as my untrust interface and bgroup0/0 as my trust interface:

set interface "tunnel.1" zone "Untrust"
set interface "bgroup0/0" ipv6 mode "router"
set interface "bgroup0/0" ipv6 ip 2001:470:6d:a1::1/64
set interface "bgroup0/0" ipv6 enable
set interface "tunnel.1" ipv6 mode "host"
set interface "tunnel.1" ipv6 ip 2001:470:6c:a1::2/64
set interface "tunnel.1" ipv6 enable
set interface tunnel.1 tunnel encap ip6in4 manual
set interface tunnel.1 tunnel local-if ethernet0/8 dst-ip 216.66.86.114
set interface bgroup0/0 ipv6 ra link-address
set interface bgroup0/0 ipv6 ra transmit
set interface bgroup0/0 ipv6 nd nud
unset interface tunnel.1 ipv6 nd nud
set interface tunnel.1 ipv6 nd dad-count 0
set route ::/0 interface tunnel.1 gateway 2001:470:6c:a1::1 permanent

 

Up and Running

Just a few IPv6 related CLI commands (link for some more):

ssg-> get interface tunnel.1
Interface tunnel.1:
  description tunnel.1
  number 20, if_info 16168, if_index 1, mode route
  if_signature 0x4e53434e
  sess token 4, flow flag 0x60 if flag 0xc00203 flag2 0x0
  link ready, admin status up
  ipv6 is enable/operable, host mode.
  ipv6 operating mtu 1480, learned mtu 0
  ipv6 Interface-ID: 00000000c118e30a
  ipv6 fe80::c118:e30a/64, link local, PREFIX
  ipv6 2001:470:6c:a1::2/64, global aggregatable, STATEFUL
  ipv6 ff02::1:ff00:2, solicited-node scope
  ipv6 ff02::1:ff18:e30a, solicited-node scope
  vsys Root, zone Untrust, vr trust-vr
  hwif tunnel flag 0xc00200 flag2 0x0 flag3 0x10000000, vsys Root
  admin mtu 1480, operating mtu 1480, default mtu 1500
  *ip 0.0.0.0/0
  *manage ip 0.0.0.0
  pmtu-v4 disabled, pmtu-v6 enabled(1480),
  ping disabled, telnet disabled, SSH disabled, SNMP disabled
  web disabled, ident-reset disabled, SSL disabled

  OSPF disabled  OSPFv3 disabled  BGP disabled  RIP disabled  RIPng disabled
  mtrace disabled
  PIM: not configured  IGMP not configured
  MLD not configured
  NHRP disabled
  bandwidth: physical 0kbps, configured egress [gbw 0kbps mbw 0kbps]
             configured ingress mbw 0kbps, current bw 0kbps
             total allocated gbw 0kbps
tunnel: local ethernet0/8, remote 216.66.86.114
  encap: IP6IN4_MANUAL (2)
  keep-alive: off, interval 10(using default), threshold 3(using default)
      status: last send 0, last recv 0
ssg->
ssg->
ssg-> get route v6


IPv6 Dest-Routes for <untrust-vr> (0 entries)
--------------------------------------------------------------------------------------
H: Host C: Connected S: Static A: Auto-Exported
I: Imported R: RIP/RIPng P: Permanent D: Auto-Discovered
N: NHRP
iB: IBGP eB: EBGP O: OSPF/OSPFv3 E1: OSPF external type 1
E2: OSPF/OSPFv3 external type 2 trailing B: backup route


IPv6 Dest-Routes for <trust-vr> (5 entries)
--------------------------------------------------------------------------------------
         ID                                   IP-Prefix       Interface
                                                Gateway   P Pref    Mtr     Vsys
--------------------------------------------------------------------------------------
*         3                                        ::/0           tun.1
                                      2001:470:6c:a1::1  SP   20      1     Root
*         1                         2001:470:6c:a1::/64           tun.1
                                                     ::   C    0      0     Root
*         5                       2001:470:6d:a1::1/128       bgroup0/0
                                                     ::   H    0      0     Root
*         4                         2001:470:6d:a1::/64       bgroup0/0
                                                     ::   C    0      0     Root
*         2                       2001:470:6c:a1::2/128           tun.1
                                                     ::   H    0      0     Root

ssg->
ssg->
ssg-> get ndp
usage: 3/2048 miss: 0 always-on-dest: disabled
states(S): N Undefined, X Deleted, I Incomplete, R Reachable, L Stale, D Delay,
P Probe, F Probe forever S Static, A Active, I Inactive, * persistent
--------------------------------------------------------------------------------
IPv6 Address                            Link-Layer Addr S Interface    Age      Pk
2001:470:6d:a1::dcfb:123                001395243404   R bgroup0/0    00h00m17s 0
fe80::d842:5672                         0000d8425672   A*tunnel.1     00h00m01s 0
fe80::213:95ff:fe24:3404                001395243404   R bgroup0/0    00h00m23s 0
ssg->
ssg->

 

Happy IPv6-firewalling! ;D

Featured image “Rabštejn – Lightpaint” by david_drei is licensed under CC BY-NC-ND 2.0.

6in4 Traffic Capture


Since my last blogposts covered many 6in4 IPv6 tunnel setups (1, 2, 3), I took a packet capture of some tunneled IPv6 sessions to get an idea of what these packets look like on the wire. Feel free to download this small pcap and have a look at it yourself.

A couple of spontaneous challenges from the pcap round things off. ;)

Note that this post is one of many related to IPv6. Click here for a structured list.

First of all, this is the pcap. Zipped with 7z, 19.5 KByte:

Setup

Two different tunnel endpoints were involved, both with tunnels from Hurricane Electric and its tunnel broker:

  • 193.24.227.10, a Juniper SSG 140, tunneling to 216.66.86.114 for IPv6 prefix 2001:470:6d:a1::/64
  • 193.24.227.12, a Cisco router, tunneling to 216.66.80.30 for IPv6 prefixes 2001:470:1F0B:1024::/64 and 2001:470:765B::/48

Some IPv4 packets were intentionally left within the trace, just to have some for reference. Within the trace you’ll find tunneled IPv6 connections with the following protocols:

  • Ping aka ICMPv6 echo-request
  • DNS via UDP
  • DNS via TCP
  • NTP
  • HTTP
  • HTTPS aka TLS
  • Syslog

IP Protocol 41

Basically, every IPv6 packet is encapsulated in IPv4 using IP protocol number 41. This is *not* UDP or TCP, but a protocol of its own. Ref: IANA – Assigned Internet Protocol Numbers. Looking at it with Wireshark reveals the outer IPv4 packet with “Protocol: IPv6 (41)”, as well as the inner IPv6 packet with its actual payload:
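
To make the encapsulation concrete, here is a small hand-rolled Python sketch (pure stdlib, deliberately simplified: no checksums, no IPv4 options) that peels the outer IPv4 header off a 6in4 packet exactly as described above:

```python
import struct

def extract_6in4(ipv4_packet: bytes) -> bytes:
    """Return the inner IPv6 packet of a 6in4 (IP protocol 41) frame."""
    ihl = (ipv4_packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    proto = ipv4_packet[9]             # protocol field, must be 41
    if proto != 41:
        raise ValueError("not a 6in4 packet (protocol %d)" % proto)
    return ipv4_packet[ihl:]           # inner IPv6 packet follows the IPv4 header

# Minimal hand-crafted example: a 40-byte IPv6 header (version 6,
# next header 59 = No Next Header, zeroed addresses) ...
ipv6_inner = struct.pack("!IHBB16s16s", 6 << 28, 0, 59, 64, bytes(16), bytes(16))
# ... wrapped in a 20-byte IPv4 header with protocol 41, using the
# tunnel endpoints from this post as source/destination:
ipv4_outer = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(ipv6_inner),
                         0, 0, 64, 41, 0,
                         bytes([193, 24, 227, 12]), bytes([216, 66, 80, 30]))

inner = extract_6in4(ipv4_outer + ipv6_inner)
print(inner[0] >> 4)  # 6 → the decapsulated packet is IPv6
```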

Note that Wireshark displays the IPv6 source and destination addresses in the packet list, NOT the IPv4 ones. This can be confusing, especially when using display filters such as “ip”. Normally this shows ONLY IPv4 packets, but since those tunneled IPv6 packets are carried within IPv4, they are still present:

You can filter for some of those mentioned layer 7 protocols such as DNS, NTP, or ICMPv6:

TCP stream 0 shows an HTTP session, TCP stream 1 an HTTPS one:

Spontaneous Challenges

While preparing some screenshots for this post, I came across some ideas for a packet challenge. Just for fun, as always. However, they are not related to those 6in4 packets, but quite generic. Please comment below for the answers!

  1. What’s the serial number of the Juniper SSG 140?
  2. What reference time source is used on the stratum 1 NTP server?
  3. Which operating system sent the ping?
  4. Which server (vhost) was accessed in TCP stream 4?
  5. How many authoritative DNS answers were sent from my lab?
  6. What are the authoritative name servers?
  7. Which DNSSEC algorithm is weberlab.de using?
  8. Which DNS client sent a cookie? What’s its value?
  9. What’s the first HTML line from the answer in TCP stream 2?
  10. How many different server TLS certificates are in the trace?
  11. What are the subject alternative names of the 1st certificate in TCP stream 5?

Have fun! ;)

Featured image “Charli Blake” by Thomas Hawk is licensed under CC BY-NC 2.0.

Palo Alto Networks Feature Requests


This is a list of features missing from the next-generation firewall from Palo Alto Networks, from my point of view (though I have not that many compared to other vendors such as Fortinet). Let’s see whether some of them will find their way into PAN-OS in the next years…

This is a living list. I’ll update it whenever I discover something new.

  • Possibility to disable the “application dependency warning” messages on a per-rule basis. They appear after each commit. Sometimes they are correct – often they aren’t. I have customers with thousands of these warnings while the whole security ruleset is sound and working. In the end, nobody reads these warnings anymore, which defeats their purpose.
  • IPv6 DHCPv6 Prefix Delegation: In order to operate a Palo Alto at German residential ISP connections, DHCPv6-PD is mandatory. (Sample here.) Since it is working on fairly old Juniper ScreenOS firewalls and even on FortiGates, it shouldn’t be such a big problem to add. Report.
  • IPv6 6in4 tunnel support. Again, working with ScreenOS and FortiGates out of the box. Report.
  • Email Server Profile with SMTP authentication. That is: Possibility to use a smart host rather than own internal SMTP servers. Report.
  • Precise CLI output showing whether or not NTP authentication was successful. Details here.
  • Grouping of policy entries rather than displaying all at once. Added in PAN-OS 9.0.
  • Dashboard widget to write down some notes. Report.

Featured image “Baustellentick?” by Dennis Skley is licensed under CC BY-ND 2.0.

DNS Capture: UDP, TCP, IP-Fragmentation, EDNS, ECS, Cookie


It’s not always the simple DNS case of “single query – single answer, both via UDP”. Sometimes there are more options or bigger messages that look and behave differently on the network. For example: IP fragmentation for larger DNS answers that do not fit into a single UDP datagram (hopefully not anymore after the DNS flag day 2020), DNS via TCP, or some newer options within the EDNS space such as “EDNS Client Subnet” (ECS) or DNS cookies.
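
Both ECS and the DNS cookie travel inside the EDNS(0) OPT pseudo-record in the additional section. As a rough sketch of the RFC 6891 wire format (simplified, with a hypothetical example cookie value), here is how such an OPT record is assembled:

```python
import struct

def edns_opt_record(udp_payload_size, options):
    """Build an EDNS(0) OPT pseudo-RR (simplified RFC 6891 wire format).

    options: list of (option-code, option-data) tuples, e.g.
    code 8 = EDNS Client Subnet (RFC 7871), code 10 = COOKIE (RFC 7873).
    """
    rdata = b"".join(struct.pack("!HH", code, len(data)) + data
                     for code, data in options)
    return (b"\x00"                                     # NAME: root
            + struct.pack("!HH", 41, udp_payload_size)  # TYPE 41 (OPT), CLASS = UDP size
            + struct.pack("!I", 0)                      # TTL field: ext-RCODE/version/flags
            + struct.pack("!H", len(rdata)) + rdata)

# A client cookie is 8 bytes; this value is made up for illustration.
opt = edns_opt_record(4096, [(10, bytes.fromhex("f16ce6e62f7d465c"))])
print(len(opt))  # 23 bytes: 11-byte fixed part + 4-byte option header + 8-byte cookie
```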

I won’t explain any details about those options, but I am publishing a pcap with that kind of packets along with some Wireshark screenshots. Feel free to dig into it.

(Please note that I have a couple of downloadable pcaps on my blog, especially this one which looks at DNS and DNSSEC packets. However, with this post here I am looking into other DNS details.)

Download the pcap capture file here. 7-zipped, 7 KB:

I am using dig version 9.11.3-1ubuntu1.7 (older versions do not send the EDNS cookie) on Ubuntu 18.04.2 LTS. The Wireshark screenshots are from version 3.0.2. Note my custom Wireshark columns showing some special packet details that are relevant for DNS, likewise the TCP stream and UDP stream columns.

Common DNS: UDP

Find some very basic DNS queries/responses in UDP streams 11-14. The first one has the “ad” flag (authentic data via DNSSEC), the second a DNSSEC failure aka SERVFAIL. The third and fourth: A and AAAA records:

dig @2606:4700:4700::1111 sigok.verteiltesysteme.net
dig @2606:4700:4700::1111 sigfail.verteiltesysteme.net
dig @2620:fe::fe formel1.de
dig @2620:fe::fe erfpop.de aaaa

Full listing:

weberjoh@vm22-lx2:~$ dig @2606:4700:4700::1111 sigok.verteiltesysteme.net

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2606:4700:4700::1111 sigok.verteiltesysteme.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37343
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;sigok.verteiltesysteme.net.    IN      A

;; ANSWER SECTION:
sigok.verteiltesysteme.net. 60  IN      A       134.91.78.139

;; Query time: 17 msec
;; SERVER: 2606:4700:4700::1111#53(2606:4700:4700::1111)
;; WHEN: Tue Jun 18 14:58:02 UTC 2019
;; MSG SIZE  rcvd: 71

weberjoh@vm22-lx2:~$ dig @2606:4700:4700::1111 sigfail.verteiltesysteme.net

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2606:4700:4700::1111 sigfail.verteiltesysteme.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 64199
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;sigfail.verteiltesysteme.net.  IN      A

;; Query time: 29 msec
;; SERVER: 2606:4700:4700::1111#53(2606:4700:4700::1111)
;; WHEN: Tue Jun 18 14:58:06 UTC 2019
;; MSG SIZE  rcvd: 46

weberjoh@vm22-lx2:~$ dig @2620:fe::fe formel1.de

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2620:fe::fe formel1.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1764
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;formel1.de.                    IN      A

;; ANSWER SECTION:
formel1.de.             3600    IN      A       85.25.234.253

;; Query time: 19 msec
;; SERVER: 2620:fe::fe#53(2620:fe::fe)
;; WHEN: Tue Jun 18 14:58:09 UTC 2019
;; MSG SIZE  rcvd: 55

weberjoh@vm22-lx2:~$ dig @2620:fe::fe erfpop.de aaaa

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2620:fe::fe erfpop.de aaaa
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36076
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;erfpop.de.                     IN      AAAA

;; ANSWER SECTION:
erfpop.de.              300     IN      AAAA    2606:4700:30::6818:6291
erfpop.de.              300     IN      AAAA    2606:4700:30::6818:6391

;; Query time: 24 msec
;; SERVER: 2620:fe::fe#53(2620:fe::fe)
;; WHEN: Tue Jun 18 14:58:15 UTC 2019
;; MSG SIZE  rcvd: 94

Wireshark screenshots (quite boring at this time ;D):

Bigger Sizes: IP Fragmentation & TCP

Now it’s getting a bit more interesting. Querying for records that are bigger requires either IP fragmentation (there is no fragmentation in UDP, hence IP must do it) or the fallback to TCP with its three-way handshake. Note that IP fragmentation behaves a bit differently for IPv4 and IPv6. At least for IPv6 there is a huge discussion about whether the fragmentation header should be dropped at *any* border router/firewall anyway. Keep that in mind when troubleshooting! Links:

At least TCP is straightforward and without problems. (Note that zone transfers via AXFR and IXFR are carried over TCP by default.) For my test queries I used IPv6 and legacy IP, both via UDP (forcing IP fragmentation) and via TCP:

dig -6 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp
dig -4 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp
dig -6 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp
dig -4 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp
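
One detail worth knowing when looking at the TCP streams in the pcap: per RFC 1035, DNS over TCP prefixes every message with a two-byte length field, because TCP is a byte stream without record boundaries, while UDP needs no such prefix (one datagram, one message). A minimal Python sketch of this framing:

```python
import struct

def frame_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with the two-byte length field used over TCP."""
    return struct.pack("!H", len(message)) + message

def unframe_tcp(stream: bytes) -> bytes:
    """Extract the first DNS message from a TCP byte stream."""
    (length,) = struct.unpack("!H", stream[:2])
    return stream[2 : 2 + length]

# Minimal 12-byte DNS header (ID 0x1234, RD flag set, one question counted)
# just to have something to frame; a real query would append the question.
msg = b"\x12\x34" + b"\x01\x00" + b"\x00\x01" + b"\x00\x00" * 3
framed = frame_tcp(msg)
print(framed[:2].hex())  # 000c → 12 bytes of DNS message follow
```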

Full listing (quite long dnskeys, not of interest):

weberjoh@vm22-lx2:~$ dig -6 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> -6 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49824
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: f16ce6e62f7d465c2a33773e5d08fc11654e05c083606c96 (good)
;; QUESTION SECTION:
;weberlab.de.                   IN      DNSKEY

;; ANSWER SECTION:
weberlab.de.            60      IN      DNSKEY  257 3 10 AwEAAd3v/e0irXYKOwtYEB3VPe7z99qvi5le9/y1XXyplp5y/5xaqrm/ relG8pgx8GsNW2IgviJKAJ6UiU45ERKoH+fz2qf2SUFHFWwkweiWyLZ4 EZHhowviCEx94P4OswNKXmdYHe38rlHPa+3OypW9gYfR9lhCKK3neCPq 8/aFFsTTI7dQ+Q2kERWiCMCybl4WOwsBo/RlnPM4yufMKIlABiM5NWQP NmI6jYzAYpYoyUhd9HnnIIDlNQ89HpXQdFmysMraXYb7qDOoOEiOodtt KH0y/vtJ2SRU05RF4AEumacIUzAi5LL2cMQxC7t7rlDI4X42NRfOLAqG uOeclFjzqz3OdAJWeg/AAnSbb02AGCkQ370TX1hWveAXt6xpPWOLgHXS LIF/lz+wl+Dm8ZNWDnn5zEJuEj3xova1g8zmRXJOmqA6VhGqewxF8c+y KeNEOHz4X4/RLmWHIuEbvboP00Dk5A9bhyZGVsytOJg+NwhFQtvBWLmD 82FFtfSt2vmbFFNwAZOnRZWJOG9L7TFcGIm1OEULmohUyFLsBGMXDFOu 1k0o6pqm495tsBuMyJNpfdQoPwOkUpsKi6jmNq6vRjvvNiJbcFylTQrq HGTGuOopuUsBbUXj/nOr4I6j42k6GDIuTyLDkaVrdrxXmGnfNnStdqWm vHXo/YFwdls9bcT7
weberlab.de.            60      IN      DNSKEY  256 3 10 AwEAAdBU3CjxUKw7SeYza7cxyq/Xg3znVQsMzuF/UeLaigOubtJHhxhL +m129IxQkTKo8JRIXcKXD+aViztiml8+8BPCXFNPftFpdFCzBRNGHj/c a1g/Flck6v5avafB/hGqbWKY2LEGKb5ktYWGj8JB0mrKGqDZVPyieC0d YVv02iOaOvUhdl7QtgVybR3V6gHlhoG0BxG+GbjUp+NyPClbuMOIwflb VGB5946PyQGQgnGNX2L1MHumOaYC/D3UnyzQZNMmqj85GwDNPwEeDfLq 6wm1BUfx7MwwcEVuO2B0YmUyiPiSfUoGTwm2P1nGNMhlYij3bY9VvyxC qPQnK0s5Tr0=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 13179 weberlab.de. w2TKIWrDQs3Wcvql0/ovRLl/4p5WqB3JKD0OqWVDOeRziol1sGwhdm4E dr02rZbX4HUDGLIxy2/cvP/CtesdWyDaO4x4T4YkN5mT54UQ2wvNff5o PwwCZRH6OPzyB66Gjlx5BLMXJQN0n5eV86jGPHSCIwjEiEaIZqc92o3d YPEOZz4b67ok4t625JGhbrwLuiQWs0sb6ZQUxkbC4sA/5Hk9R9HIewVX NaGesZpKsuvLyLA+txY9a9T67e6Zqwh0wrhILDu/A/XQkLlgi5CUMRlF VyY+QiNx7+8RRYKbtlnSuxIhAR+ax6YY2/ISTxXqdXIIM4HUju4K+Dy9 elECeKWhVhKL5UkkijegqldCpc2OVCqaAuGznETIoro5HIC0XCcwzY03 D+fevfXokQWHJ0lDbuzUfwEDrC3tYHBgTGFSNvLlwAYoxSST1GEBRUSi kGGMu1PGroH6HSQpTTPuT4w6ze1VGqMlvwrWxYoC2Wo3v0myxZKHfaE5 LoZ3rS+38hJ5Rn8SphGdtfKj3LCgyUa0cspnPlnWfrT3Gb0F6+gFUu0W UqFE1f/uXVXdpFrrhdsOa7JwCDlN8MCBD4bE8KvZ44W/QafYzjB9xRW6 OLKzb2OKaHDrPS4nNTrPNd27u1ckM9za7/8Xzzj77xv/7n3DxDtdxZwe 3cIP8pk3nrs=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 36935 weberlab.de. pTxMnEVpxiufOPOuJsnKHMj89lMzzWjszQ3ndtWEaKP73OElE16hQbZl uEkBQwQEC7uB5qnEntTXP5SqGQVKLxC7qNE6cyKHnHOaLFc6M7ZGIdPx 4zNAweqKWt57GZ3P7usfiMKCCkCDZh6dEzOm+Gt/T44RZQ2HCrp01hWU 1aDVh/WjEJGxnpeKral6aV7go6SChtYQKB0QtoychkpQnRa2kBkm4JsA g+9qTdiAdw09HhJvHWUpFM9bpDGMWwcnlf8HqY0xW2ob3vDNo7+6BXAf zVC3YuWmPlZvzvcC0xt3s5BgvCEnt+HEn3E0mfpKVVGnoL7U/ZbK7/tT SaA/6w==

;; Query time: 10 msec
;; SERVER: 2001:470:765b::a25:53#53(2001:470:765b::a25:53)
;; WHEN: Tue Jun 18 14:58:25 UTC 2019
;; MSG SIZE  rcvd: 1730

weberjoh@vm22-lx2:~$ dig -4 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> -4 @ns1.weberdns.de weberlab.de dnskey +dnssec +notcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35278
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 1dc404dc580e966c943e15565d08fc16a4345a7a72133f61 (good)
;; QUESTION SECTION:
;weberlab.de.                   IN      DNSKEY

;; ANSWER SECTION:
weberlab.de.            60      IN      DNSKEY  256 3 10 AwEAAdBU3CjxUKw7SeYza7cxyq/Xg3znVQsMzuF/UeLaigOubtJHhxhL +m129IxQkTKo8JRIXcKXD+aViztiml8+8BPCXFNPftFpdFCzBRNGHj/c a1g/Flck6v5avafB/hGqbWKY2LEGKb5ktYWGj8JB0mrKGqDZVPyieC0d YVv02iOaOvUhdl7QtgVybR3V6gHlhoG0BxG+GbjUp+NyPClbuMOIwflb VGB5946PyQGQgnGNX2L1MHumOaYC/D3UnyzQZNMmqj85GwDNPwEeDfLq 6wm1BUfx7MwwcEVuO2B0YmUyiPiSfUoGTwm2P1nGNMhlYij3bY9VvyxC qPQnK0s5Tr0=
weberlab.de.            60      IN      DNSKEY  257 3 10 AwEAAd3v/e0irXYKOwtYEB3VPe7z99qvi5le9/y1XXyplp5y/5xaqrm/ relG8pgx8GsNW2IgviJKAJ6UiU45ERKoH+fz2qf2SUFHFWwkweiWyLZ4 EZHhowviCEx94P4OswNKXmdYHe38rlHPa+3OypW9gYfR9lhCKK3neCPq 8/aFFsTTI7dQ+Q2kERWiCMCybl4WOwsBo/RlnPM4yufMKIlABiM5NWQP NmI6jYzAYpYoyUhd9HnnIIDlNQ89HpXQdFmysMraXYb7qDOoOEiOodtt KH0y/vtJ2SRU05RF4AEumacIUzAi5LL2cMQxC7t7rlDI4X42NRfOLAqG uOeclFjzqz3OdAJWeg/AAnSbb02AGCkQ370TX1hWveAXt6xpPWOLgHXS LIF/lz+wl+Dm8ZNWDnn5zEJuEj3xova1g8zmRXJOmqA6VhGqewxF8c+y KeNEOHz4X4/RLmWHIuEbvboP00Dk5A9bhyZGVsytOJg+NwhFQtvBWLmD 82FFtfSt2vmbFFNwAZOnRZWJOG9L7TFcGIm1OEULmohUyFLsBGMXDFOu 1k0o6pqm495tsBuMyJNpfdQoPwOkUpsKi6jmNq6vRjvvNiJbcFylTQrq HGTGuOopuUsBbUXj/nOr4I6j42k6GDIuTyLDkaVrdrxXmGnfNnStdqWm vHXo/YFwdls9bcT7
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 13179 weberlab.de. w2TKIWrDQs3Wcvql0/ovRLl/4p5WqB3JKD0OqWVDOeRziol1sGwhdm4E dr02rZbX4HUDGLIxy2/cvP/CtesdWyDaO4x4T4YkN5mT54UQ2wvNff5o PwwCZRH6OPzyB66Gjlx5BLMXJQN0n5eV86jGPHSCIwjEiEaIZqc92o3d YPEOZz4b67ok4t625JGhbrwLuiQWs0sb6ZQUxkbC4sA/5Hk9R9HIewVX NaGesZpKsuvLyLA+txY9a9T67e6Zqwh0wrhILDu/A/XQkLlgi5CUMRlF VyY+QiNx7+8RRYKbtlnSuxIhAR+ax6YY2/ISTxXqdXIIM4HUju4K+Dy9 elECeKWhVhKL5UkkijegqldCpc2OVCqaAuGznETIoro5HIC0XCcwzY03 D+fevfXokQWHJ0lDbuzUfwEDrC3tYHBgTGFSNvLlwAYoxSST1GEBRUSi kGGMu1PGroH6HSQpTTPuT4w6ze1VGqMlvwrWxYoC2Wo3v0myxZKHfaE5 LoZ3rS+38hJ5Rn8SphGdtfKj3LCgyUa0cspnPlnWfrT3Gb0F6+gFUu0W UqFE1f/uXVXdpFrrhdsOa7JwCDlN8MCBD4bE8KvZ44W/QafYzjB9xRW6 OLKzb2OKaHDrPS4nNTrPNd27u1ckM9za7/8Xzzj77xv/7n3DxDtdxZwe 3cIP8pk3nrs=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 36935 weberlab.de. pTxMnEVpxiufOPOuJsnKHMj89lMzzWjszQ3ndtWEaKP73OElE16hQbZl uEkBQwQEC7uB5qnEntTXP5SqGQVKLxC7qNE6cyKHnHOaLFc6M7ZGIdPx 4zNAweqKWt57GZ3P7usfiMKCCkCDZh6dEzOm+Gt/T44RZQ2HCrp01hWU 1aDVh/WjEJGxnpeKral6aV7go6SChtYQKB0QtoychkpQnRa2kBkm4JsA g+9qTdiAdw09HhJvHWUpFM9bpDGMWwcnlf8HqY0xW2ob3vDNo7+6BXAf zVC3YuWmPlZvzvcC0xt3s5BgvCEnt+HEn3E0mfpKVVGnoL7U/ZbK7/tT SaA/6w==

;; Query time: 13 msec
;; SERVER: 193.24.227.238#53(193.24.227.238)
;; WHEN: Tue Jun 18 14:58:30 UTC 2019
;; MSG SIZE  rcvd: 1730

weberjoh@vm22-lx2:~$ dig -6 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> -6 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32996
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 1b9af622ab2c9740e40584c05d08fc19f07835a115ea8294 (good)
;; QUESTION SECTION:
;weberlab.de.                   IN      DNSKEY

;; ANSWER SECTION:
weberlab.de.            60      IN      DNSKEY  256 3 10 AwEAAdBU3CjxUKw7SeYza7cxyq/Xg3znVQsMzuF/UeLaigOubtJHhxhL +m129IxQkTKo8JRIXcKXD+aViztiml8+8BPCXFNPftFpdFCzBRNGHj/c a1g/Flck6v5avafB/hGqbWKY2LEGKb5ktYWGj8JB0mrKGqDZVPyieC0d YVv02iOaOvUhdl7QtgVybR3V6gHlhoG0BxG+GbjUp+NyPClbuMOIwflb VGB5946PyQGQgnGNX2L1MHumOaYC/D3UnyzQZNMmqj85GwDNPwEeDfLq 6wm1BUfx7MwwcEVuO2B0YmUyiPiSfUoGTwm2P1nGNMhlYij3bY9VvyxC qPQnK0s5Tr0=
weberlab.de.            60      IN      DNSKEY  257 3 10 AwEAAd3v/e0irXYKOwtYEB3VPe7z99qvi5le9/y1XXyplp5y/5xaqrm/ relG8pgx8GsNW2IgviJKAJ6UiU45ERKoH+fz2qf2SUFHFWwkweiWyLZ4 EZHhowviCEx94P4OswNKXmdYHe38rlHPa+3OypW9gYfR9lhCKK3neCPq 8/aFFsTTI7dQ+Q2kERWiCMCybl4WOwsBo/RlnPM4yufMKIlABiM5NWQP NmI6jYzAYpYoyUhd9HnnIIDlNQ89HpXQdFmysMraXYb7qDOoOEiOodtt KH0y/vtJ2SRU05RF4AEumacIUzAi5LL2cMQxC7t7rlDI4X42NRfOLAqG uOeclFjzqz3OdAJWeg/AAnSbb02AGCkQ370TX1hWveAXt6xpPWOLgHXS LIF/lz+wl+Dm8ZNWDnn5zEJuEj3xova1g8zmRXJOmqA6VhGqewxF8c+y KeNEOHz4X4/RLmWHIuEbvboP00Dk5A9bhyZGVsytOJg+NwhFQtvBWLmD 82FFtfSt2vmbFFNwAZOnRZWJOG9L7TFcGIm1OEULmohUyFLsBGMXDFOu 1k0o6pqm495tsBuMyJNpfdQoPwOkUpsKi6jmNq6vRjvvNiJbcFylTQrq HGTGuOopuUsBbUXj/nOr4I6j42k6GDIuTyLDkaVrdrxXmGnfNnStdqWm vHXo/YFwdls9bcT7
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 13179 weberlab.de. w2TKIWrDQs3Wcvql0/ovRLl/4p5WqB3JKD0OqWVDOeRziol1sGwhdm4E dr02rZbX4HUDGLIxy2/cvP/CtesdWyDaO4x4T4YkN5mT54UQ2wvNff5o PwwCZRH6OPzyB66Gjlx5BLMXJQN0n5eV86jGPHSCIwjEiEaIZqc92o3d YPEOZz4b67ok4t625JGhbrwLuiQWs0sb6ZQUxkbC4sA/5Hk9R9HIewVX NaGesZpKsuvLyLA+txY9a9T67e6Zqwh0wrhILDu/A/XQkLlgi5CUMRlF VyY+QiNx7+8RRYKbtlnSuxIhAR+ax6YY2/ISTxXqdXIIM4HUju4K+Dy9 elECeKWhVhKL5UkkijegqldCpc2OVCqaAuGznETIoro5HIC0XCcwzY03 D+fevfXokQWHJ0lDbuzUfwEDrC3tYHBgTGFSNvLlwAYoxSST1GEBRUSi kGGMu1PGroH6HSQpTTPuT4w6ze1VGqMlvwrWxYoC2Wo3v0myxZKHfaE5 LoZ3rS+38hJ5Rn8SphGdtfKj3LCgyUa0cspnPlnWfrT3Gb0F6+gFUu0W UqFE1f/uXVXdpFrrhdsOa7JwCDlN8MCBD4bE8KvZ44W/QafYzjB9xRW6 OLKzb2OKaHDrPS4nNTrPNd27u1ckM9za7/8Xzzj77xv/7n3DxDtdxZwe 3cIP8pk3nrs=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 36935 weberlab.de. pTxMnEVpxiufOPOuJsnKHMj89lMzzWjszQ3ndtWEaKP73OElE16hQbZl uEkBQwQEC7uB5qnEntTXP5SqGQVKLxC7qNE6cyKHnHOaLFc6M7ZGIdPx 4zNAweqKWt57GZ3P7usfiMKCCkCDZh6dEzOm+Gt/T44RZQ2HCrp01hWU 1aDVh/WjEJGxnpeKral6aV7go6SChtYQKB0QtoychkpQnRa2kBkm4JsA g+9qTdiAdw09HhJvHWUpFM9bpDGMWwcnlf8HqY0xW2ob3vDNo7+6BXAf zVC3YuWmPlZvzvcC0xt3s5BgvCEnt+HEn3E0mfpKVVGnoL7U/ZbK7/tT SaA/6w==

;; Query time: 0 msec
;; SERVER: 2001:470:1f0b:16b0::a26:53#53(2001:470:1f0b:16b0::a26:53)
;; WHEN: Tue Jun 18 14:58:33 UTC 2019
;; MSG SIZE  rcvd: 1730

weberjoh@vm22-lx2:~$ dig -4 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> -4 @ns2.weberdns.de weberlab.de dnskey +dnssec +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1754
;; flags: qr aa rd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: c6ab1a7fa365f223dd0e36455d08fc1c37dacb4464d60422 (good)
;; QUESTION SECTION:
;weberlab.de.                   IN      DNSKEY

;; ANSWER SECTION:
weberlab.de.            60      IN      DNSKEY  257 3 10 AwEAAd3v/e0irXYKOwtYEB3VPe7z99qvi5le9/y1XXyplp5y/5xaqrm/ relG8pgx8GsNW2IgviJKAJ6UiU45ERKoH+fz2qf2SUFHFWwkweiWyLZ4 EZHhowviCEx94P4OswNKXmdYHe38rlHPa+3OypW9gYfR9lhCKK3neCPq 8/aFFsTTI7dQ+Q2kERWiCMCybl4WOwsBo/RlnPM4yufMKIlABiM5NWQP NmI6jYzAYpYoyUhd9HnnIIDlNQ89HpXQdFmysMraXYb7qDOoOEiOodtt KH0y/vtJ2SRU05RF4AEumacIUzAi5LL2cMQxC7t7rlDI4X42NRfOLAqG uOeclFjzqz3OdAJWeg/AAnSbb02AGCkQ370TX1hWveAXt6xpPWOLgHXS LIF/lz+wl+Dm8ZNWDnn5zEJuEj3xova1g8zmRXJOmqA6VhGqewxF8c+y KeNEOHz4X4/RLmWHIuEbvboP00Dk5A9bhyZGVsytOJg+NwhFQtvBWLmD 82FFtfSt2vmbFFNwAZOnRZWJOG9L7TFcGIm1OEULmohUyFLsBGMXDFOu 1k0o6pqm495tsBuMyJNpfdQoPwOkUpsKi6jmNq6vRjvvNiJbcFylTQrq HGTGuOopuUsBbUXj/nOr4I6j42k6GDIuTyLDkaVrdrxXmGnfNnStdqWm vHXo/YFwdls9bcT7
weberlab.de.            60      IN      DNSKEY  256 3 10 AwEAAdBU3CjxUKw7SeYza7cxyq/Xg3znVQsMzuF/UeLaigOubtJHhxhL +m129IxQkTKo8JRIXcKXD+aViztiml8+8BPCXFNPftFpdFCzBRNGHj/c a1g/Flck6v5avafB/hGqbWKY2LEGKb5ktYWGj8JB0mrKGqDZVPyieC0d YVv02iOaOvUhdl7QtgVybR3V6gHlhoG0BxG+GbjUp+NyPClbuMOIwflb VGB5946PyQGQgnGNX2L1MHumOaYC/D3UnyzQZNMmqj85GwDNPwEeDfLq 6wm1BUfx7MwwcEVuO2B0YmUyiPiSfUoGTwm2P1nGNMhlYij3bY9VvyxC qPQnK0s5Tr0=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 13179 weberlab.de. w2TKIWrDQs3Wcvql0/ovRLl/4p5WqB3JKD0OqWVDOeRziol1sGwhdm4E dr02rZbX4HUDGLIxy2/cvP/CtesdWyDaO4x4T4YkN5mT54UQ2wvNff5o PwwCZRH6OPzyB66Gjlx5BLMXJQN0n5eV86jGPHSCIwjEiEaIZqc92o3d YPEOZz4b67ok4t625JGhbrwLuiQWs0sb6ZQUxkbC4sA/5Hk9R9HIewVX NaGesZpKsuvLyLA+txY9a9T67e6Zqwh0wrhILDu/A/XQkLlgi5CUMRlF VyY+QiNx7+8RRYKbtlnSuxIhAR+ax6YY2/ISTxXqdXIIM4HUju4K+Dy9 elECeKWhVhKL5UkkijegqldCpc2OVCqaAuGznETIoro5HIC0XCcwzY03 D+fevfXokQWHJ0lDbuzUfwEDrC3tYHBgTGFSNvLlwAYoxSST1GEBRUSi kGGMu1PGroH6HSQpTTPuT4w6ze1VGqMlvwrWxYoC2Wo3v0myxZKHfaE5 LoZ3rS+38hJ5Rn8SphGdtfKj3LCgyUa0cspnPlnWfrT3Gb0F6+gFUu0W UqFE1f/uXVXdpFrrhdsOa7JwCDlN8MCBD4bE8KvZ44W/QafYzjB9xRW6 OLKzb2OKaHDrPS4nNTrPNd27u1ckM9za7/8Xzzj77xv/7n3DxDtdxZwe 3cIP8pk3nrs=
weberlab.de.            60      IN      RRSIG   DNSKEY 10 2 60 20190711220120 20190611215721 36935 weberlab.de. pTxMnEVpxiufOPOuJsnKHMj89lMzzWjszQ3ndtWEaKP73OElE16hQbZl uEkBQwQEC7uB5qnEntTXP5SqGQVKLxC7qNE6cyKHnHOaLFc6M7ZGIdPx 4zNAweqKWt57GZ3P7usfiMKCCkCDZh6dEzOm+Gt/T44RZQ2HCrp01hWU 1aDVh/WjEJGxnpeKral6aV7go6SChtYQKB0QtoychkpQnRa2kBkm4JsA g+9qTdiAdw09HhJvHWUpFM9bpDGMWwcnlf8HqY0xW2ob3vDNo7+6BXAf zVC3YuWmPlZvzvcC0xt3s5BgvCEnt+HEn3E0mfpKVVGnoL7U/ZbK7/tT SaA/6w==

;; Query time: 0 msec
;; SERVER: 194.247.5.14#53(194.247.5.14)
;; WHEN: Tue Jun 18 14:58:36 UTC 2019
;; MSG SIZE  rcvd: 1730

weberjoh@vm22-lx2:~$

Find the two UDP sessions with IP fragmentation at UDP streams 15 & 16 (note that the IP fragments themselves are NOT directly part of the stream; again, be careful!) as well as the TCP sessions at TCP streams 0 and 1:

A fun fact from my daily business while dealing with shitty fragments:

EDNS Extensions

The extension mechanisms for DNS (EDNS), RFC 6891, offer a flexible way to extend DNS with new features. I am showing two scenarios here: RFC 7871 “Client Subnet in DNS Queries” and RFC 7873 “Domain Name System (DNS) Cookies”.

EDNS(0) Client Subnet

For the EDNS client subnet (ECS) packets I queried the Google Public DNS Resolver from one of my Linux machines, but I captured the packets on my authoritative DNS server! That is: The packets in the trace file show the resolving process from Google Public DNS to my DNS server ns1.weberdns.de. Google adds the client subnet, which in this case is a /56 IPv6 prefix on which my Linux machine resides.

dig @2001:4860:4860::8888 pa.weberlab.de
dig @2001:4860:4860::8888 pa.weberlab.de aaaa
dig @2001:4860:4860::8888 fg2.weberlab.de
dig @2001:4860:4860::8888 fg2.weberlab.de aaaa

Full listing (nothing to see here, skip it):

weberjoh@vm22-lx2:~$ dig @2001:4860:4860::8888 pa.weberlab.de

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2001:4860:4860::8888 pa.weberlab.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43536
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;pa.weberlab.de.                        IN      A

;; ANSWER SECTION:
pa.weberlab.de.         59      IN      A       193.24.227.9

;; Query time: 1068 msec
;; SERVER: 2001:4860:4860::8888#53(2001:4860:4860::8888)
;; WHEN: Mon May 27 14:40:08 UTC 2019
;; MSG SIZE  rcvd: 59

weberjoh@vm22-lx2:~$ dig @2001:4860:4860::8888 pa.weberlab.de aaaa

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2001:4860:4860::8888 pa.weberlab.de aaaa
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16871
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;pa.weberlab.de.                        IN      AAAA

;; ANSWER SECTION:
pa.weberlab.de.         59      IN      AAAA    2001:470:1f0b:1024::2

;; Query time: 1036 msec
;; SERVER: 2001:4860:4860::8888#53(2001:4860:4860::8888)
;; WHEN: Mon May 27 14:40:11 UTC 2019
;; MSG SIZE  rcvd: 71

weberjoh@vm22-lx2:~$ dig @2001:4860:4860::8888 fg2.weberlab.de

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2001:4860:4860::8888 fg2.weberlab.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63678
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;fg2.weberlab.de.               IN      A

;; ANSWER SECTION:
fg2.weberlab.de.        59      IN      A       194.247.4.10

;; Query time: 2080 msec
;; SERVER: 2001:4860:4860::8888#53(2001:4860:4860::8888)
;; WHEN: Mon May 27 14:40:19 UTC 2019
;; MSG SIZE  rcvd: 60

weberjoh@vm22-lx2:~$ dig @2001:4860:4860::8888 fg2.weberlab.de aaaa

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @2001:4860:4860::8888 fg2.weberlab.de aaaa
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14923
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;fg2.weberlab.de.               IN      AAAA

;; AUTHORITY SECTION:
weberlab.de.            59      IN      SOA     ns0.weberdns.de. webmaster.webernetz.net. 2019051753 3600 900 2419200 60

;; Query time: 79 msec
;; SERVER: 2001:4860:4860::8888#53(2001:4860:4860::8888)
;; WHEN: Mon May 27 14:40:21 UTC 2019
;; MSG SIZE  rcvd: 116

Not all resolving queries from Google Public DNS are shown here because I only captured on one of my two authoritative DNS servers. However, some queries came in twice, some over IPv6 and some over legacy IP, for whatever reason:

Domain Name System (DNS) Cookies

Finally, DNS cookies: “DNS Cookies are a lightweight DNS transaction security mechanism that provides limited protection to DNS servers and clients against a variety of increasingly common denial-of-service and amplification/forgery or cache poisoning attacks by off-path attackers”, RFC 7873. I simply queried my authoritative DNS server (hence the “aa”, authoritative answer, flag is set) four times for the same FQDN. For every transaction, new client and server cookies were generated by both sides. UDP streams 7-10.

dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional

Full listing which also shows the cookies in the dig output:

weberjoh@vm22-lx2:~$ dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47560
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 9c9f2193c4a2cfb86451e9255ced1943f2d12053a311cf67 (good)
;; QUESTION SECTION:
;fg2-mgmt.weberlab.de.          IN      AAAA

;; ANSWER SECTION:
fg2-mgmt.weberlab.de.   60      IN      AAAA    2001:470:1f0b:16b0::1

;; Query time: 11 msec
;; SERVER: 2001:470:765b::a25:53#53(2001:470:765b::a25:53)
;; WHEN: Tue May 28 11:19:31 UTC 2019
;; MSG SIZE  rcvd: 238

weberjoh@vm22-lx2:~$ dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48553
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: ec6f12881a07de8fbb97c8ee5ced1944b1230759f243747c (good)
;; QUESTION SECTION:
;fg2-mgmt.weberlab.de.          IN      AAAA

;; ANSWER SECTION:
fg2-mgmt.weberlab.de.   60      IN      AAAA    2001:470:1f0b:16b0::1

;; Query time: 11 msec
;; SERVER: 2001:470:765b::a25:53#53(2001:470:765b::a25:53)
;; WHEN: Tue May 28 11:19:32 UTC 2019
;; MSG SIZE  rcvd: 238

weberjoh@vm22-lx2:~$ dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56708
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: a500c2e3af1d06e22a15acc65ced1945c5a0147665ed30a1 (good)
;; QUESTION SECTION:
;fg2-mgmt.weberlab.de.          IN      AAAA

;; ANSWER SECTION:
fg2-mgmt.weberlab.de.   60      IN      AAAA    2001:470:1f0b:16b0::1

;; Query time: 11 msec
;; SERVER: 2001:470:765b::a25:53#53(2001:470:765b::a25:53)
;; WHEN: Tue May 28 11:19:33 UTC 2019
;; MSG SIZE  rcvd: 238

weberjoh@vm22-lx2:~$ dig @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> @ns1.weberdns.de fg2-mgmt.weberlab.de aaaa +noauthority +noadditional
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59906
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: b52650aa47cc48463fdb3f075ced19466ddac74de2a595f7 (good)
;; QUESTION SECTION:
;fg2-mgmt.weberlab.de.          IN      AAAA

;; ANSWER SECTION:
fg2-mgmt.weberlab.de.   60      IN      AAAA    2001:470:1f0b:16b0::1

;; Query time: 11 msec
;; SERVER: 2001:470:765b::a25:53#53(2001:470:765b::a25:53)
;; WHEN: Tue May 28 11:19:34 UTC 2019
;; MSG SIZE  rcvd: 238

Wireshark screenshot, again with a custom column:

That’s it. Happy wiresharking!

Further Links

Featured image “Shirley Lo” by Thomas Hawk is licensed under CC BY-NC 2.0.

Basic NTP Client Test: ntpdate & sntp

During my work with a couple of NTP servers, I had many situations in which I just wanted to know whether an NTP server was up and running or not. For this purpose, I used two small Linux tools that do almost the same: a single CLI command that does not actually update any clock but only displays the result. That is: ntpdate & sntp. Of course, the usage of IPv6 is mandatory, as is the possibility to test NTP authentication.

This article is one of many blogposts within this NTP series. Please have a look!

Refer to my “Packet Capture: Network Time Protocol (NTP)” blogpost in order to download a pcap with different NTP packets, or at least to have a look at them in the screenshots.

ntpdate

You can use ntpdate with the “-q” switch to “Query only – don’t set the clock”. This is a very basic run:

weberjoh@nb15-lx:~$ ntpdate -q ntp3.weberlab.de
server 2003:de:2016:330::dcfb:123, stratum 1, offset -0.002689, delay 0.02620
21 Mar 16:56:05 ntpdate[30627]: adjust time server 2003:de:2016:330::dcfb:123 offset -0.002689 sec

When using an FQDN with a couple of A/AAAA records, ntpdate queries all of them:

weberjoh@nb15-lx:~$ ntpdate -q ntp.weberlab.de
server 2003:de:2016:330::dcfb:123, stratum 1, offset -0.002768, delay 0.02623
server 2003:de:2016:336::dcf7:123, stratum 1, offset -0.024801, delay 0.07458
server 2003:de:2016:330::6b5:123, stratum 1, offset -0.002645, delay 0.02663
21 Mar 17:00:55 ntpdate[30631]: adjust time server 2003:de:2016:330::dcfb:123 offset -0.002768 sec

Furthermore, you can query several names at once:

weberjoh@nb15-lx:~$ ntpdate -q ntp.weberlab.de de.pool.ntp.org
server 2003:de:2016:336::dcf7:123, stratum 1, offset -0.003254, delay 0.03484
server 2003:de:2016:330::6b5:123, stratum 1, offset -0.002702, delay 0.02657
server 2003:de:2016:330::dcfb:123, stratum 1, offset -0.002803, delay 0.02618
server 81.7.4.127, stratum 2, offset 0.000870, delay 0.03607
server 217.91.44.17, stratum 2, offset -0.000245, delay 0.03961
server 212.112.228.242, stratum 2, offset -0.003942, delay 0.03146
server 81.3.27.46, stratum 3, offset -0.001936, delay 0.03152
21 Mar 17:02:30 ntpdate[30636]: adjust time server 2003:de:2016:330::dcfb:123 offset -0.002803 sec

Debugging & NTP Authentication

Even more relevant for my lab tests is the ability to test NTP authentication. Therefore I am using the debugging mode “-d”, which prints out all steps (while still not updating the local clock), in conjunction with “-a <key-id>” and “-k <keyfile>”. For this example I used two SHA1 keys from one of my NTP servers (ntp3.weberlab.de):
weberjoh@nb15-lx:~$ cat ntp3.keys
11 SHA1 c8ea1e9d5496925e12b903945a4d87c93450f37d  # SHA1 key
12 SHA1 b8ea1e9d5496925e12b903945a4d87c93450f37d  # SHA1 key
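
If you need to generate such a key yourself, one way is to pull 20 random bytes from openssl. This is just a sketch, not from my actual setup; key id 13 and the trailing comment are arbitrary examples:

```shell
# Generate a random 160-bit key (40 hex characters) in ntp.keys format.
# Key id 13 is a hypothetical example, not part of my real keyfile.
key=$(openssl rand -hex 20)
echo "13 SHA1 $key  # SHA1 key"
```

Append the resulting line to the keys file on both the server and the client, and reference the key id with “-a”.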

While key number 11 is indeed correct (receive: authentication passed):

weberjoh@nb15-lx:~$ ntpdate -a 11 -k ~/ntp3.keys -d ntp3.weberlab.de
21 Mar 17:08:34 ntpdate[30707]: ntpdate 4.2.8p4@1.3265-o Fri Jul  6 20:10:56 UTC 2018 (1)
Looking for host ntp3.weberlab.de and service ntp
host found : 2003:de:2016:330::dcfb:123
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication passed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication passed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication passed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication passed
server 2003:de:2016:330::dcfb:123, port 123
stratum 1, precision -18, leap 00, trust 000
refid [PZF], delay 0.02641, dispersion 0.00002
transmitted 4, in filter 4
reference time:    e03e3586.0962cd3a  Thu, Mar 21 2019 17:08:38.036
originate timestamp: e03e3588.28a90bc1  Thu, Mar 21 2019 17:08:40.158
transmit timestamp:  e03e3588.290dd0a3  Thu, Mar 21 2019 17:08:40.160
filter delay:  0.02678  0.02641  0.02644  0.02647
         0.00000  0.00000  0.00000  0.00000
filter offset: -0.00247 -0.00259 -0.00262 -0.00258
         0.000000 0.000000 0.000000 0.000000
delay 0.02641, dispersion 0.00002
offset -0.002595

21 Mar 17:08:40 ntpdate[30707]: adjust time server 2003:de:2016:330::dcfb:123 offset -0.002595 sec

I falsified key number 12 on purpose to test the output (receive: authentication failed):

weberjoh@nb15-lx:~$ ntpdate -a 12 -k ~/ntp3.keys -d ntp3.weberlab.de
21 Mar 17:14:13 ntpdate[30778]: ntpdate 4.2.8p4@1.3265-o Fri Jul  6 20:10:56 UTC 2018 (1)
Looking for host ntp3.weberlab.de and service ntp
host found : 2003:de:2016:330::dcfb:123
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication failed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication failed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication failed
transmit(2003:de:2016:330::dcfb:123)
receive(2003:de:2016:330::dcfb:123)
receive: authentication failed
2003:de:2016:330::dcfb:123: Server dropped: Server is untrusted
server 2003:de:2016:330::dcfb:123, port 123
stratum 1, precision -18, leap 00, trust 017
refid [PZF], delay 0.02647, dispersion 0.00002
transmitted 4, in filter 4
reference time:    e03e36d6.096369ef  Thu, Mar 21 2019 17:14:14.036
originate timestamp: e03e36db.eeb096a4  Thu, Mar 21 2019 17:14:19.932
transmit timestamp:  e03e36db.eefa3bbd  Thu, Mar 21 2019 17:14:19.933
filter delay:  0.02681  0.02658  0.02650  0.02647
         0.00000  0.00000  0.00000  0.00000
filter offset: -0.00200 -0.00217 -0.00213 -0.00214
         0.000000 0.000000 0.000000 0.000000
delay 0.02647, dispersion 0.00002
offset -0.002146

21 Mar 17:14:19 ntpdate[30778]: no server suitable for synchronization found

Mission accomplished.

sntp

Just as an alternative, you can use sntp as well. By default, sntp writes “the estimated correct local date and time (i.e. not UTC) to the standard output”:

pi@pi05-random:~ $ sntp ntp3.weberlab.de
sntp 4.2.8p10@1.3728-o Sat Mar 10 17:59:48 UTC 2018 (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
2019-03-21 17:23:56.922791 (-0100) +0.00035 +/- 0.000474 ntp3.weberlab.de 2003:de:2016:330::dcfb:123 s1 no-leap

The “kod_init_kod_db” warning is normal and can be ignored.
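
If the warning bothers you anyway, pre-creating the file that sntp tries to open should silence it. The following is a hedged sketch only; it demonstrates the idea against a temp path, while on a real system the path from the message is /var/db/ntp-kod and creating it likely requires root:

```shell
# Hypothetical workaround for the kod_init_kod_db warning: create the
# KoD database file sntp is looking for. Shown in a temp directory here;
# on a real system use /var/db/ntp-kod (root needed).
db="${TMPDIR:-/tmp}/ntp-kod"
install -m 600 /dev/null "$db"   # create an empty file with mode 600
ls -l "$db"
```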

Debugging & NTP Authentication

Fortunately, the options are quite the same when it comes to debugging and NTP authentication. That is: “-d -a <key-id> -k <keyfile>”. Having the same two keys (11 ok, 12 not ok) in place, this gives the following output for “authenticated using key id 11”:
pi@ntp2-gps:~ $ sntp -d -a 11 -k ~/ntp3.keys ntp3.weberlab.de
sntp 4.2.8p12@1.3728-o Thu Nov  8 11:39:41 UTC 2018 (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
handle_lookup(ntp3.weberlab.de,0x2)
move_fd: estimated max descriptors: 1024, initial socket boundary: 16
generate_pkt: key_id 11, key pointer 0x2416878
generate_pkt: mac_size is 20
sntp sendpkt: Sending packet to [2003:de:2016:330::dcfb:123]:123 ...
Packet sent.
sock_cb: ntp3.weberlab.de [2003:de:2016:330::dcfb:123]:123
sntp sock_cb: packet from 2003:de:2016:330::dcfb:123 authenticated using key id 11.
2019-03-21 17:37:39.872110 (-0100) +0.00241 +/- 0.001833 ntp3.weberlab.de 2003:de:2016:330::dcfb:123 s1 no-leap

as well as for “Crypto NAK“:

pi@ntp2-gps:~ $ sntp -d -a 12 -k ~/ntp3.keys ntp3.weberlab.de
sntp 4.2.8p12@1.3728-o Thu Nov  8 11:39:41 UTC 2018 (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
handle_lookup(ntp3.weberlab.de,0x2)
move_fd: estimated max descriptors: 1024, initial socket boundary: 16
generate_pkt: key_id 12, key pointer 0x10018e0
generate_pkt: mac_size is 20
sntp sendpkt: Sending packet to [2003:de:2016:330::dcfb:123]:123 ...
Packet sent.
sock_cb: ntp3.weberlab.de [2003:de:2016:330::dcfb:123]:123
Crypto NAK = 0x00000000 from 2003:de:2016:330::dcfb:123
sock_cb: handle_pkt() returned 1

Please note that there seems to have been a bug at least in sntp version 4.2.8p10, since I got the following “Segmentation fault” on another machine (while the outputs listed above were from version 4.2.8p12):

pi@pi05-random:~ $ sntp -d -a 11 -k ~/ntp3.keys ntp3.weberlab.de
sntp 4.2.8p10@1.3728-o Sat Mar 10 17:59:48 UTC 2018 (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
handle_lookup(ntp3.weberlab.de,0x2)
Segmentation fault

That’s it. ;) Happy troubleshooting.

Featured image “Reading glasses over a book” by Marco Verch is licensed under CC BY 2.0.

Basic NTP Server Monitoring

Now that you have your own NTP servers up and running (such as some Raspberry Pis with external DCF77 or GPS time sources) you should monitor them appropriately, that is: at least their offset, jitter, and reach. From an operational/security perspective it is always good to have some historical graphs that show how a service behaves under normal circumstances, so you can easily spot a problem in case one occurs. With this post I am showing how to monitor your NTP servers for offset, jitter, reach, and traffic aka “NTP packets sent/received”.

This article is one of many blogposts within this NTP series. Please have a look!

Note that I am not yet “counting NTP clients” (which is very interesting), since the NTP software itself does not hand out such values; “monstats” and “mrulist” do not fit this use case at all. I’ll cover this topic in an upcoming blogpost of its own.

Please note: While I am still using my outdated SNMP/MRTG/RRDtool installation, you can get the general idea here, but you should use your own monitoring tool for these purposes. I won’t recommend MRTG/Routers2 in 2019 anymore. Please DO NOT simply install MRTG by yourself but use other, more modern, monitoring services such as Icinga 2, Zabbix, or the like. However, everything I am showing with these tools can be monitored with any other SNMP-like system as well. You should also note that MRTG queries all values every 5 minutes. Since the monitored objects here are NOT an “average over the last 5 minutes” (such as mere interface counters from switches are) but a single value at the time of asking, all graphs show only parts of the truth, in discrete steps of 5 minutes. ;) Final note: Thanks to David Taylor and his page about Using MRTG to monitor NTP, which gave me the first ideas.

How to monitor NTP Servers?

Besides monitoring the generic Linux operating system stuff such as CPU, memory, load average, and network bandwidth, you have at least two different methods in getting some telemetry data from the NTP service to your monitoring station:

  1. Calling ntpq from the monitoring station in order to get some data from the remote NTP server. You must use the “restrict” option on the NTP server within ntp.conf to permit the monitoring station to query data, such as “restrict 2003:de:2016:120::a01:80” in my case. The monitoring server is then able to use ntpq against a remote host, such as “ntpq -c rv ntp1.weberlab.de”. In fact, these are normal NTP packets on the wire, but with NTP mode = 6, called “control message”. Refer to my blogpost “Packet Capture: Network Time Protocol“.
  2. Using some scripts on the NTP server itself to hand out data via other protocols, such as SNMP. For example, I am using the “extend-sh” option within the snmpd.conf configuration on the NTP server so that the monitoring server can query normal SNMP OIDs for certain NTP-related information.
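
The second approach can be sketched as follows. Note that this is a simplified illustration, not my exact configuration; the extension name and the extraction pipeline are placeholders. On the NTP server, snmpd.conf gets an extend line whose output the monitoring host can then read via NET-SNMP-EXTEND-MIB::nsExtendOutputFull:

```shell
# Sketch of approach 2 (names are placeholders, not my exact setup).
# On the NTP server, snmpd.conf would contain something like:
#   extend ntp-offset /bin/sh -c "ntpq -c rv | grep -o 'offset=[^,]*' | cut -d= -f2"
# The monitoring host then queries NET-SNMP-EXTEND-MIB::nsExtendOutputFull.
# The extraction pipeline itself, demonstrated on a synthetic sample line:
sample='mintc=3, offset=0.005970, frequency=-7.553, sys_jitter=0.003815,'
printf '%s\n' "$sample" | grep -o 'offset=[^,]*' | cut -d= -f2   # prints 0.005970
```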

In any case, you must use some tools to grep ‘n sed through the output to extract exactly the values you are interested in. Furthermore, you need to feed those values into your monitoring tool. I am showing some MRTG config snippets along with RRDtool graphs.

For details about offset, jitter, and reach in general please have a look at the following blogpost from Aaron Toponce, who describes all those values quite well: Real Life NTP.

Offset and Jitter

Offset: “This value is displayed in milliseconds, using root mean squares, and shows how far off your clock is from the reported time the server gave you. It can be positive or negative.” -> Since I am monitoring a stratum 1 NTP server (with a directly attached reference clock receiver), the offset shows the difference between this reference clock and the system time. Assuming that the system time is quite linear, the offset shows the variance from the received reference clock signal.

Jitter: “This number is an absolute value in milliseconds, showing the root mean squared deviation of your offsets.”

For those two values I am using ntpq on the monitoring server itself. Note that the ntpq tool changed the number of decimal places in its output some years ago. An old ntpq might return something like “offset=-0.306”, while a newer ntpq returns “offset=-0.306155”.
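
A quick check that the grep ’n sed extraction copes with both formats; both lines here are synthetic samples mimicking old and new ntpq output:

```shell
# Synthetic samples of old (3 decimal places) and new (6) ntpq output.
old='mintc=3, offset=-0.306, frequency=-7.553,'
new='mintc=3, offset=-0.306155, frequency=-7.553,'
for line in "$old" "$new"; do
  printf '%s\n' "$line" | sed 's/.*offset=//; s/,.*//'
done
# prints -0.306 and -0.306155
```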

Basically, “ntpq -c rv” displays the system variables from the NTP server, such as:

weberjoh@nb15-lx:~$ ntpq -c rv ntp2.weberlab.de
associd=0 status=0118 leap_none, sync_pps, 1 event, no_sys_peer,
version="ntpd 4.2.8p12@1.3728-o Thu Nov  8 11:50:39 UTC 2018 (1)",
processor="armv6l", system="Linux/4.14.71+", leap=00, stratum=1,
precision=-18, rootdelay=0.000, rootdisp=1.060, refid=PPS,
reftime=df9022bd.d28a1635  Fri, Nov  9 2018 16:14:05.822,
clock=df9022c2.0df2e132  Fri, Nov  9 2018 16:14:10.054, peer=61582, tc=4,
mintc=3, offset=0.005970, frequency=-7.553, sys_jitter=0.003815,
clk_jitter=0.005, clk_wander=0.002

You can see the offset and sys_jitter in the second to last line. Using grep ‘n sed you can extract those values only:

weberjoh@nb15-lx:~$ ntpq -c rv ntp2.weberlab.de | grep offset | sed s/.*offset.// | sed s/,.*//
0.004919
weberjoh@nb15-lx:~$ ntpq -c rv ntp2.weberlab.de | grep sys_jitter | sed s/.*jitter.// | sed s/,//
0.003815

Since those offset values are so small, I am monitoring them in µs rather than in ms! For my MRTG installation, the “Target” for the offset in µs has the following config:

###############################################################
################### Offset µ Microseconds #####################
###############################################################
Target[ntp2-gps-offset-us]: `ntpq -c rv ntp2.weberlab.de | grep offset | sed s/.*offset.// | sed s/,.*// && echo 0` * 1000
#Max only 0.1 seconds = 100 ms = 100000 us
MaxBytes[ntp2-gps-offset-us]: 100000
Title[ntp2-gps-offset-us]: Offset µs -- ntp2-gps
Options[ntp2-gps-offset-us]: gauge
Colours[ntp2-gps-offset-us]: DARKPURPLE#7608AA, Blue#0000FF, BLACK#000000, Purple#FF00FF
YLegend[ntp2-gps-offset-us]: Offset in microseconds (µs)
Legend1[ntp2-gps-offset-us]: Offset
Legend3[ntp2-gps-offset-us]: Peak Offset
LegendI[ntp2-gps-offset-us]: Offset:
ShortLegend[ntp2-gps-offset-us]: µs
routers.cgi*Options[ntp2-gps-offset-us]: fixunit nototal noo
routers.cgi*ShortDesc[ntp2-gps-offset-us]: Offset µs ntp2-gps
routers.cgi*Icon[ntp2-gps-offset-us]: graph-sm.gif

as well as for the jitter, in µs, too:

###############################################################
################### Jitter µ Microseconds #####################
###############################################################
Target[ntp2-gps-jitter-us]: `ntpq -c rv ntp2 | grep sys_jitter | sed s/.*jitter.// | sed s/,// && echo 0` * 1000
#Max only 0.1 seconds = 100 ms = 100000 us
MaxBytes[ntp2-gps-jitter-us]: 100000
Title[ntp2-gps-jitter-us]: Jitter µs -- ntp2-gps
Options[ntp2-gps-jitter-us]: gauge
Colours[ntp2-gps-jitter-us]: TURQUOISE#00CCCC, Blue#0000FF, DARKTURQUOISE#377D77, Purple#FF00FF
YLegend[ntp2-gps-jitter-us]: Jitter in microseconds (µs)
Legend1[ntp2-gps-jitter-us]: Jitter
Legend3[ntp2-gps-jitter-us]: Peak Jitter
LegendI[ntp2-gps-jitter-us]: Jitter:
ShortLegend[ntp2-gps-jitter-us]: µs
routers.cgi*Options[ntp2-gps-jitter-us]: fixunit nototal noo
routers.cgi*ShortDesc[ntp2-gps-jitter-us]: Jitter µs ntp2-gps
routers.cgi*Icon[ntp2-gps-jitter-us]: link-sm.gif

Note that the offset can be positive and negative! Hence the rrd file for MRTG must be tuned to support negative values:

sudo rrdtool tune /var/mrtg/ntp2-gps-offset-us.rrd --minimum ds0:-100000
# viewing the file with rrdtool:
rrdtool info /var/mrtg/ntp2-gps-offset-us.rrd
# should indicate something like this:
ds[ds0].min = -1.0000000000e+05
ds[ds0].max = 1.0000000000e+05

In the end, my offset graphs look like this: (2x DCF77 daily/weekly, having offsets of +/- 1000 µs = 1 ms, and 2x GPS daily/weekly, having offsets of +/- 1 µs!!!)

While the jitter graphs are as follows: (Again 2x DCF77 daily/weekly with jitter of about 1000 µs = 1 ms, and 2x GPS daily/weekly with jitter of about 2 µs!!!)

MRTG draws and calculates the peak only for positive values; hence the black line in the offset graphs only appears in the upper half. Furthermore, my old ntpq package on the monitoring server only gives me three decimal places. Hence the offset graph from the GPS-based NTP server looks quite discrete, since the values only swap between -2, -1, 0, +1, +2 without any intermediate states. With a newer version of ntpq, the graph would be more detailed.

Reach

Reach: “This is an 8-bit left shift octal value that shows the success and failure rate of communicating with the remote server. Success means the bit is set, failure means the bit is not set. 377 is the highest value.”

Note that monitoring this value is NOT accurate, due to the fact that this reach field is an 8-bit shift register, represented as octal values to the user. Refer to some more detailed explanations on this such as here: Understanding NTP Reachability Statistics. In fact, you can see *lower* values just after a *higher* value even though the error occurred some minutes ago. For example, a single error (one 0-bit), which is shifted towards the left of the buffer, results in octal values of: 376, 375, 373, 367, 357, 337, 277, 177, 377. That is: the second to last value of 177 is way below the very last value of 377, though it’s quite good since the last seven queries succeeded. However, for some reason, I am mostly seeing 376 followed by a 377. Don’t know why.
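The octal sequence above can be reproduced with plain shell arithmetic; a minimal sketch of the 8-bit shift register, printing every value in octal:

```shell
# 8-bit reach register: start with all polls ok (0377), miss one poll,
# then watch eight successful polls shift the 0-bit out again.
reach=$(( 0377 ))
reach=$(( (reach << 1) & 0377 ))          # failed poll shifts in a 0
printf '%o ' "$reach"
for i in 1 2 3 4 5 6 7 8; do
  reach=$(( ((reach << 1) | 1) & 0377 ))  # each successful poll shifts in a 1
  printf '%o ' "$reach"
done
printf '\n'
# prints: 376 375 373 367 357 337 277 177 377
```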

Anyway, it gives a good overview at a glance of whether your NTP server’s reference clock is reachable or not. That’s why I am monitoring it. For my DCF77-based NTP server there is only one reach value to look at (the one from the “GENERIC(0)” DCF77 module), while the GPS-based NTP server has two values: one from the PPS(0) driver that delivers the highly accurate ticks and one from the SHM(0) driver, which is fed by gpsd:

weberjoh@vm01-mrtg:~$ ntpq -c peers ntp1.weberlab.de | grep 'GENERIC(0)' | awk '{print $7}' && echo 0
377
0
weberjoh@vm01-mrtg:~$ ntpq -c peers ntp2.weberlab.de | grep 'PPS(0)' | awk '{print $7}' && ntpq -c peers ntp2.weberlab.de | grep 'SHM(0)' | awk '{print $7}'
377
377

I am using two MRTG Targets since I have two NTP servers, ntp1 with DCF77 and ntp2 with GPS/PPS. In the end I am additionally using a summary graph to display all reach lines in one single diagram:

###############################################################
########################### Reach #############################
###############################################################
Target[ntp1-dcf77-reach]: `ntpq -c peers ntp1.weberlab.de | grep 'GENERIC(0)' | awk '{print $7}' && echo 0`
MaxBytes[ntp1-dcf77-reach]: 377
Title[ntp1-dcf77-reach]: Reach -- ntp1-dcf77
Options[ntp1-dcf77-reach]: gauge integer
YLegend[ntp1-dcf77-reach]: Reach 0 - 377
Legend1[ntp1-dcf77-reach]: Reach
Legend3[ntp1-dcf77-reach]: Peak Reach
LegendI[ntp1-dcf77-reach]: Reach:
ShortLegend[ntp1-dcf77-reach]: &nbsp;
routers.cgi*Options[ntp1-dcf77-reach]: fixunit nototal noo
routers.cgi*ShortDesc[ntp1-dcf77-reach]: Reach ntp1-dcf77
routers.cgi*Icon[ntp1-dcf77-reach]: tick-sm.gif
routers.cgi*Graph[ntp1-dcf77-reach]: ntp-reach

Target[ntp2-gps-reach]: `ntpq -c peers ntp2.weberlab.de | grep 'PPS(0)' | awk '{print $7}' && ntpq -c peers ntp2.weberlab.de | grep 'SHM(0)' | awk '{print $7}'`
MaxBytes[ntp2-gps-reach]: 377
Title[ntp2-gps-reach]: Reach -- ntp2-gps
Options[ntp2-gps-reach]: gauge integer
YLegend[ntp2-gps-reach]: Reach 0 - 377
Legend1[ntp2-gps-reach]: Reach PPS
Legend2[ntp2-gps-reach]: Reach GPS
Legend3[ntp2-gps-reach]: Peak Reach PPS
Legend4[ntp2-gps-reach]: Peak Reach GPS
LegendI[ntp2-gps-reach]: Reach PPS:
LegendO[ntp2-gps-reach]: Reach GPS:
ShortLegend[ntp2-gps-reach]: &nbsp;
routers.cgi*Options[ntp2-gps-reach]: fixunit nototal
routers.cgi*ShortDesc[ntp2-gps-reach]: Reach ntp2-gps
routers.cgi*Icon[ntp2-gps-reach]: tick-sm.gif
routers.cgi*Graph[ntp2-gps-reach]: ntp-reach

routers.cgi*Title[ntp-reach]: NTP Reach Summary
routers.cgi*ShortDesc[ntp-reach]: Reach Summary
routers.cgi*Options[ntp-reach]: nototal
routers.cgi*Icon[ntp-reach]: tick-sm.gif
routers.cgi*InSummary[ntp-reach]: yes

Here are some sample graphs in the “weekly” version:

And here’s one example of this reach graph in which I discovered a DCF77 outage in Germany on my DCF77 Raspberry Pi (and another graph from my Meinberg LANTIME, which I am covering in another blogpost):

Traffic

Finally, the traffic: the iostats of the standard NTP query program “display network and reference clock I/O statistics”. Those “received packets” and “packets sent” values do not only count the NTP client/server packets from the network, but also the internal packets that ntpd uses to query the reference clock driver. Hence those packet rates do not directly reveal the number of NTP clients that are using this particular server, but at least give a rough idea. Note, however, that for example my DCF77 receiver has quite different packet rates when it’s working compared to the GPS receiver, so it’s hard to compare them directly.

Originally I wanted to get the counters to my monitoring server in the same way as the above-mentioned jitter/offset values, but my fairly old ntp package on the MRTG server wasn’t aware of those received/sent packet counters. Hence I used method number two (as explained in the beginning), in which I extended the SNMP daemon on the NTP server to run some scripts that I can then query from the monitoring server via SNMP. Note that you should use method one for this! My SNMP thing is just a workaround.

pi@ntp2-gps:~ $ ntpq -c iostats | grep 'received packets' | awk '{print $3}'
735379
pi@ntp2-gps:~ $ ntpq -c iostats | grep 'packets sent' | awk '{print $3}'
671111

That is: on the NTP server itself I added the following two lines to the /etc/snmp/snmpd.conf file (via “sudo nano /etc/snmp/snmpd.conf”) at the “EXTENDING THE AGENT” section:
extend-sh ntprx ntpq -c iostats | grep 'received packets' | awk '{print $3}'
extend-sh ntptx ntpq -c iostats | grep 'packets sent' | awk '{print $3}'

while I am using this MRTG Target querying the appropriate OIDs via SNMP:

###############################################################
########################## Traffic ############################
###############################################################
#Note that this graph shows packets rather than bytes!
#That is: MaxBytes set to a maximum of 10,000 packets per second. That should fit. ;)
Target[ntp2-gps-iostats]: 1.3.6.1.4.1.8072.1.3.2.3.1.1.5.110.116.112.114.120&1.3.6.1.4.1.8072.1.3.2.3.1.1.5.110.116.112.116.120:THISISTHEKEY@ntp2.weberlab.de::11:::2
MaxBytes[ntp2-gps-iostats]: 10000
Title[ntp2-gps-iostats]: Traffic Analysis ntpd in Packets -- ntp2-gps
YLegend[ntp2-gps-iostats]: packets per second
ShortLegend[ntp2-gps-iostats]: packets/s
routers.cgi*Options[ntp2-gps-iostats]: nomax
routers.cgi*GraphStyle[ntp2-gps-iostats]: mirror
routers.cgi*ShortDesc[ntp2-gps-iostats]: Traffic ntp2-gps

This gives you graphs like this. Note the third one from an NTP server within the NTP Pool Project that peaks when it is used:

What’s missing? -> Alerting!

Please note that I have not yet used any kind of automatic alerting function. If you’re running your own stratum 1 NTP servers you should get informed in case of a hardware or software error, or just in case your antenna is faulty. Your reach should always be 377, while the offset should be lower than 1 ms all the time, and so on. Please use your own setups to keep track of those values.

That’s it for now. Happy monitoring! ;)

Featured image “Robert Scoble, Coachella 2013 — Indio, CA” by Thomas Hawk is licensed under CC BY-NC 2.0.


Counting NTP Clients


Wherever you’re running an NTP server, it is really interesting to see how many clients are using it: at home, in your company, or worldwide at the NTP Pool Project. The problem is that ntp itself does not answer the question of how many clients it serves. There are the “monstats” and “mrulist” queries, but they are not reliable at all since they are not made for this. Hence I had to take another path in order to count NTP clients for my stratum 1 NTP servers. Let’s dig in:

This article is one of many blogposts within this NTP series. Please have a look!

Not: monstats & mrulist

If you are running an NTP server with just a few clients (let’s say: up to 100), you can use the monstats and mrulist queries to get the number of clients and their addresses. However, you should NOT use those queries on NTP servers that are under high load. They will take minutes to finish and the results are not reliable at all.

ntpq> monstats
enabled:              0x3
addresses:            12780
peak addresses:       12780
maximum addresses:    14563
reclaim above count:  600
reclaim older than:   64
kilobytes:            899
maximum kilobytes:    1024
ntpq> mrulist
Ctrl-C will stop MRU retrieval and display partial results.
1362 (0 updates)

I tried monitoring my NTP servers with the “maximum addresses” output from the monstats query, but the numbers weren’t exact at all. Even adjusting the “reclaim above count” and “reclaim older than” values did not help.

Alternative: UFW with grep ‘n sed et al.

This is my solution: I installed a simple firewall on the NTP server, the uncomplicated firewall UFW, whose only job here is to log every single NTP request to syslog. With some scripts I am grepping through the logs, sorting and uniqing ;) the source IP addresses, and finally counting them.

Since my monitoring system (MRTG w/ Routers2 and RRDtool) polls every 5 minutes, I am counting the unique NTP clients during the last 5 minutes. But since common NTP clients increase their query interval up to 1024 seconds = about 17 minutes, those clients will only be listed in the “last 5 minutes” graph every third time. Hence this graph has some ups and downs. Therefore I am counting the unique source addresses over the last 20 minutes as well, to get an idea of how many NTP clients are constantly using my NTP server. Note that a one-time peak will be correctly shown in the 5 min graph, while it will wrongly be displayed for 15 more minutes in the 20 min graph. This may happen when investigating DDoS attacks or when using the NTP servers within the NTP Pool Project. In summary, I am quite happy with displaying both lines in one graph: a view of the last 5 minutes as well as a quite realistic value of all clients (20 minutes).

Setting up UFW

Please note that I am operating my NTP servers with IPv6-only. Hence all subsequent commands are mostly IPv6 orientated. If you are still using legacy IP you need to adjust some of those steps.

Have a look at this and that documentation to get an idea. I installed UFW on some Raspberry Pis as well as on some generic Ubuntu Linux servers, each time remotely via SSH, though I would not recommend doing it that way. ;)

At first, you need to install UFW, add an allow rule for SSH so that you do not lock yourself out, and verify that everything is working up to now:

pi@ntp1:~ $ sudo apt-get update
pi@ntp1:~ $ sudo apt-get install ufw

#The firewall is currently disabled:
pi@ntp1:~ $ sudo ufw status
Status: inactive

#Adding SSH and SNMP to be allowed
pi@ntp1:~ $ sudo ufw allow ssh/tcp
Rules updated
Rules updated (v6)
pi@ntp1:~ $ sudo ufw allow snmp/udp
Rules updated
Rules updated (v6)

#You can verify the added rules with:
pi@ntp1:~ $ sudo ufw show added
Added user rules (see 'ufw status' for running firewall):
ufw allow 22/tcp
ufw allow 161/udp

#Now you can enable it (press y after the first question) and view the status:
pi@ntp1:~ $ sudo ufw enable
Command may disrupt existing ssh connections. Proceed with operation (y|n)? y
Firewall is active and enabled on system startup
pi@ntp1:~ $ 
pi@ntp1:~ $ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
161/udp                    ALLOW       Anywhere
22/tcp                     ALLOW       Anywhere (v6)
161/udp                    ALLOW       Anywhere (v6)

Normally I would have added a “sudo ufw allow ntp/udp”, or with logging “sudo ufw allow log-all ntp/udp”, but in either case UFW applies a log limit such as “-m limit --limit 3/min --limit-burst 10”. Therefore I added my NTP rules with logging (without limits) and a custom “log-prefix” text to the before6.rules. More information here. That is:

Edit the rules file via “sudo nano /etc/ufw/before6.rules” and add the following lines before the last COMMIT:
# HERE IS MY DEFINED RULE ########################################################
# allow ntp/udp with logging all packets
-A ufw6-before-input -p udp --dport 123 -j LOG --log-prefix "[UFW ALLOW NTP] "
-A ufw6-before-input -p udp --dport 123 -j ACCEPT

Followed by a “sudo ufw reload”. Using “sudo ufw show raw” shows many lines; the chain “ufw6-before-input” now has two more lines at the end:
Chain ufw6-before-input (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 ACCEPT     all      lo     *       ::/0                 ::/0
       0        0 DROP       all      *      *       ::/0                 ::/0                 rt type:0 segsleft:0
       2      144 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 135 HL match HL == 255
       3      208 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 136 HL match HL == 255
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 133 HL match HL == 255
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 134 HL match HL == 255
      79     7008 ACCEPT     all      *      *       ::/0                 ::/0                 state RELATED,ESTABLISHED
       0        0 ACCEPT     icmpv6    *      *       fe80::/10            ::/0                 ipv6-icmptype 129
       0        0 ufw6-logging-deny  all      *      *       ::/0                 ::/0                 state INVALID
       0        0 DROP       all      *      *       ::/0                 ::/0                 state INVALID
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 1
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 2
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 3
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 4
       0        0 ACCEPT     icmpv6    *      *       ::/0                 ::/0                 ipv6-icmptype 128
       0        0 ACCEPT     udp      *      *       fe80::/10            fe80::/10            udp spt:547 dpt:546
       0        0 ACCEPT     udp      *      *       ::/0                 ff02::fb             udp dpt:5353
       0        0 ACCEPT     udp      *      *       ::/0                 ff02::f              udp dpt:1900
       0        0 LOG        udp      *      *       ::/0                 ::/0                 udp dpt:123 LOG flags 0 level 4 prefix "[UFW ALLOW NTP] "
       0        0 ACCEPT     udp      *      *       ::/0                 ::/0                 udp dpt:123
       0        0 ufw6-user-input  all      *      *       ::/0                 ::/0

After a few seconds (depending on the load of your NTP server) the pkts and bytes are increasing:

4      384 LOG        udp      *      *       ::/0                 ::/0                 udp dpt:123 LOG flags 0 level 4 prefix "[UFW ALLOW NTP] "
       4      384 ACCEPT     udp      *      *       ::/0                 ::/0                 udp dpt:123

And the /var/log/syslog shows log entries with the defined [UFW ALLOW NTP] prefix:

pi@ntp1:~ $ tail -f /var/log/syslog | grep 'UFW ALLOW NTP'
May 23 10:25:36 ntp1 kernel: [652436.007855] [UFW ALLOW NTP] IN=eth0 OUT= MAC=b8:27:eb:4f:ff:14:b8:27:eb:d1:52:26:86:dd:6b:80:00:00:00:38:11:40:20:03:00:51 SRC=2003:0051:6012:0110:0000:0000:06b5:0123 DST=2003:0051:6012:0110:0000:0000:dcf7:0123 LEN=96 TC=184 HOPLIMIT=64 FLOWLBL=0 PROTO=UDP SPT=123 DPT=123 LEN=56
May 23 10:25:56 ntp1 kernel: [652456.918946] [UFW ALLOW NTP] IN=eth0 OUT= MAC=b8:27:eb:4f:ff:14:b4:0c:25:05:8e:13:86:dd:60:00:00:00:00:38:11:36:20:01:09:84 SRC=2001:0984:aee9:0006:fad1:11ff:fea0:2b2e DST=2003:0051:6012:0110:0000:0000:dcf7:0123 LEN=96 TC=0 HOPLIMIT=54 FLOWLBL=0 PROTO=UDP SPT=46097 DPT=123 LEN=56
^C

Perfect! ;)

Counting Source IP Addresses

With some small shell commands, you can extract the source IPv6 addresses only, sort them, list only the unique ones, and count them. You will get something like this:

pi@ntp1:~ $ cat /var/log/syslog | grep 'UFW ALLOW NTP' | sed s/.*SRC=// | sed s/.DST.*// | sort | uniq | wc -l
20

After a RIPE Atlas measurement with 50 clients I got this:

pi@ntp1:~ $ cat /var/log/syslog | grep 'UFW ALLOW NTP' | sed s/.*SRC=// | sed s/.DST.*// | sort | uniq | wc -l
68

Great!

Now I wanted to grep all addresses from the last 5 minutes since my default MRTG/Routers2 installation polls every 5 minutes. That is, I want to know the count of unique IPv6 source addresses during a period of 5 minutes. I found a great awk command that extracts the last 5 minutes by Alfred Tong which worked out of the box:

awk -v d1="$(date --date="-5 min" "+%b %_d %H:%M")" -v d2="$(date "+%b %_d %H:%M")" '$0 > d1 && $0 < d2 || $0 ~ d2' /var/log/syslog
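
The comparison in this awk filter is purely lexicographic on the “%b %_d %H:%M” timestamp prefix, which sorts correctly within a month thanks to the space-padded day. A self-contained sketch with hypothetical log lines and a fixed window instead of “now”:

```shell
# Filter sample syslog lines to a fixed window; lines whose timestamp
# prefix sorts between d1 and d2 (or matches d2) are kept.
matched=$(printf '%s\n' \
  'May 23 10:20:01 ntp1 outside the window' \
  'May 23 10:24:59 ntp1 inside the window' \
  'May 23 10:26:30 ntp1 inside the window' |
  awk -v d1="May 23 10:22" -v d2="May 23 10:27" '$0 > d1 && $0 < d2 || $0 ~ d2')
echo "$matched"   # only the two "inside" lines remain
```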

Combined with my grep/sort/uniq/wc pipeline from above, I’m getting the count of unique addresses during the last 5 minutes. Yeah. Since the default syslog file is rotated every day, I am grepping through both logfiles: the current one and the one from yesterday with the “.1” extension. (Though this is only needed for correct stats for a few minutes a day, the script reads both files completely at every execution. But I don’t care. ;))

pi@ntp1:~ $ awk -v d1="$(date --date="-5 min" "+%b %_d %H:%M")" -v d2="$(date "+%b %_d %H:%M")" '$0 > d1 && $0 < d2 || $0 ~ d2' /var/log/syslog /var/log/syslog.1 | grep 'UFW ALLOW NTP' | sed s/.*SRC=// | sed s/.DST.*// | sort | uniq | wc -l
7

For getting the “clients during the last 20 minutes” it’s almost the same line, but with “-20 min” instead of the “-5 min” statement.

Monitoring via SNMP

Now that I have the logs and the count of clients on the NTP server itself, I wanted to get this data into my monitoring system. I decided to use SNMP and its “EXTENDING THE AGENT” section. But before using SNMP, the user account of the snmpd must be able to read the syslog file; therefore this user must be added to the “adm” group. Please note that on newer Raspbian versions the snmpd is no longer run by a user called “snmp” but by “Debian-snmp”. Hence you must add this user to the adm group:

pi@ntp2-gps:~ $ sudo adduser Debian-snmp adm
Adding user `Debian-snmp' to group `adm' ...
Adding user Debian-snmp to group adm
Done.

You can now proceed with the SNMP extensions: edit the config via “sudo nano /etc/snmp/snmpd.conf” and add:
extend-sh ufwclients awk -v d1="$(date --date="-5 min" "+%b %_d %H:%M")" -v d2="$(date "+%b %_d %H:%M")" '$0 > d1 && $0 < d2 || $0 ~ d2' /var/log/syslog /var/log/syslog.1 | grep 'UFW ALLOW NTP' | sed s/.*SRC=// | sed s/.DST.*// | sort | uniq | wc -l
extend-sh ufwclients20 awk -v d1="$(date --date="-20 min" "+%b %_d %H:%M")" -v d2="$(date "+%b %_d %H:%M")" '$0 > d1 && $0 < d2 || $0 ~ d2' /var/log/syslog /var/log/syslog.1 | grep 'UFW ALLOW NTP' | sed s/.*SRC=// | sed s/.DST.*// | sort | uniq | wc -l

Followed by a “sudo service snmpd restart”. From the monitoring server you can walk the OIDs to find the correct ones:
snmpwalk -v 2c -c COMMUNITYSTRING udp6:ntp1.weberlab.de .1.3.6.1.4.1.8072.1.3
[...]
iso.3.6.1.4.1.8072.1.3.2.4.1.2.10.117.102.119.99.108.105.101.110.116.115.1 = STRING: "12"
iso.3.6.1.4.1.8072.1.3.2.4.1.2.12.117.102.119.99.108.105.101.110.116.115.50.48.1 = STRING: "26"
[...]
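
The long numeric index in those OIDs is simply the extension name encoded per NET-SNMP’s extend table: the length of the name followed by the ASCII code of each character (plus the output line number). A sketch reproducing the index for the “ufwclients” extension:

```shell
# Build the OID index for an extend script name:
# <base>.<name length>.<ASCII code of each character>.<line number>
name=ufwclients
oid=".1.3.6.1.4.1.8072.1.3.2.4.1.2.${#name}"
for code in $(printf '%s' "$name" | od -An -tu1); do
  oid="$oid.$code"
done
echo "$oid.1"
# prints .1.3.6.1.4.1.8072.1.3.2.4.1.2.10.117.102.119.99.108.105.101.110.116.115.1
```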

Note that under heavy load my script takes more than 2 seconds to run. This is no problem for the Pi, but it is for MRTG, which uses a default SNMP timeout of 2 seconds. Hence I increased the timeout to 11 seconds and the retries to 5. These values sit at the end of the Target line after the second colon, while the last “2” declares SNMP version 2:

::11:5::2
community@router[:[port][:[timeout][:[retries][:[backoff][:[version]]]]][|name]

One more note: this approach gives correct results for IPv6 addresses, since every single IPv6 node that sends a request is a unique IPv6 client. If you are using legacy IP, the count of unique source IPv4 addresses might not exactly match the count of NTP clients, since NAT might hide several NTP clients behind a single IPv4 address. Avoid using IPv4!

Now let’s have a look at the MRTG Target. Following are two different Targets as a reference, plus one summary graph to sum them up:

###############################################################
###################### Clients via UFW ########################
###############################################################
Target[ntp1-dcf77-ufwclients]: 1.3.6.1.4.1.8072.1.3.2.4.1.2.10.117.102.119.99.108.105.101.110.116.115.1&1.3.6.1.4.1.8072.1.3.2.4.1.2.12.117.102.119.99.108.105.101.110.116.115.50.48.1:JESUSISTHEKEY@ntp1.weberlab.de::11:5::2
MaxBytes[ntp1-dcf77-ufwclients]: 64000
Title[ntp1-dcf77-ufwclients]: Unique Source Addresses UFW last 5/20 Min -- ntp1-dcf77
Colours[ntp1-dcf77-ufwclients]: Pink#FF00AA, Darkpurple#7608AA, Yellow#FFD600, Orange#FC7C01
Options[ntp1-dcf77-ufwclients]: gauge integer
YLegend[ntp1-dcf77-ufwclients]: Number of Addresses
Legend1[ntp1-dcf77-ufwclients]: 5min Addresses
Legend2[ntp1-dcf77-ufwclients]: 20min Addresses
Legend3[ntp1-dcf77-ufwclients]: Peak 5min Addresses
Legend4[ntp1-dcf77-ufwclients]: Peak 20min Addresses
LegendI[ntp1-dcf77-ufwclients]: 5min Addresses:
LegendO[ntp1-dcf77-ufwclients]: 20min Addresses:
ShortLegend[ntp1-dcf77-ufwclients]: &nbsp;
routers.cgi*Options[ntp1-dcf77-ufwclients]: maximum nototal nomax
routers.cgi*ShortDesc[ntp1-dcf77-ufwclients]: Clients UFW ntp1-dcf77
routers.cgi*Icon[ntp1-dcf77-ufwclients]: user-sm.gif
routers.cgi*InSummary[ntp1-dcf77-ufwclients]: yes
routers.cgi*Graph[ntp1-dcf77-ufwclients]: ntp-ufwclients

Target[ntp2-gps-ufwclients]: 1.3.6.1.4.1.8072.1.3.2.4.1.2.10.117.102.119.99.108.105.101.110.116.115.1&1.3.6.1.4.1.8072.1.3.2.4.1.2.12.117.102.119.99.108.105.101.110.116.115.50.48.1:JESUSISTHEKEY@ntp2.weberlab.de::11:5::2
MaxBytes[ntp2-gps-ufwclients]: 64000
Title[ntp2-gps-ufwclients]: Unique Source Addresses UFW last 5/20 Min -- ntp2-gps
Colours[ntp2-gps-ufwclients]: Pink#FF00AA, Darkpurple#7608AA, Yellow#FFD600, Orange#FC7C01
Options[ntp2-gps-ufwclients]: gauge integer
YLegend[ntp2-gps-ufwclients]: Number of Addresses
Legend1[ntp2-gps-ufwclients]: 5min Addresses
Legend2[ntp2-gps-ufwclients]: 20min Addresses
Legend3[ntp2-gps-ufwclients]: Peak 5min Addresses
Legend4[ntp2-gps-ufwclients]: Peak 20min Addresses
LegendI[ntp2-gps-ufwclients]: 5min Addresses:
LegendO[ntp2-gps-ufwclients]: 20min Addresses:
ShortLegend[ntp2-gps-ufwclients]: &nbsp;
routers.cgi*Options[ntp2-gps-ufwclients]: maximum nototal nomax
routers.cgi*ShortDesc[ntp2-gps-ufwclients]: Clients UFW ntp2-gps
routers.cgi*Icon[ntp2-gps-ufwclients]: user-sm.gif
routers.cgi*InSummary[ntp2-gps-ufwclients]: yes
routers.cgi*Graph[ntp2-gps-ufwclients]: ntp-ufwclients

routers.cgi*Title[ntp-ufwclients]: NTP Unique Source Addresses UFW last 5/20 Min Summary
routers.cgi*ShortDesc[ntp-ufwclients]: Clients UFW Summary
routers.cgi*Options[ntp-ufwclients]: nototal noi
routers.cgi*Icon[ntp-ufwclients]: user-sm.gif
routers.cgi*InSummary[ntp-ufwclients]: yes

Uff. You’re done. ;) Congratulations!

Sample Graphs

Finally here are some graphs. This one shows normal days with internal NTP clients only (about 50-60). Note the filled pink area that shows the 5 min address count, while the purple line shows the 20 min address count which gives the sum of all current clients:

These graphs list the clients during NTP Pool Project participation (10 – 50 k!!!). As always, you have the daily/weekly/monthly/yearly graphs to show either overviews or more details:

Finally my summary graph over four different NTP servers, showing only the 20 min address counts, daily view:

Cheers!

Featured image “Wer im Alter nicht nur Peanuts zählen will, sollte sich bereits jetzt um ‘ne vernünftige Altersvorsorge kümmern – später bleibt keine Zeit mehr dafür. Und auch wenn das Thema so dröge wie langweilig erscheinen mag, ist es einfach wichtig, sich zu informie” by ppc1337 is licensed under CC BY-SA 2.0.

Monitoring a DCF77 NTP Server


Now that you’re monitoring the Linux operating system as well as the NTP server basics, it’s interesting to have a look at some more details about the DCF77 receiver. Honestly, there is only one more variable that gives a few details, namely the Clock Status Word and its Event Field. At least you have one more graph in your monitoring system. ;)

This article is one of many blogposts within this NTP series. Please have a look!

There is an event field within the clock status word that can take the following values:

  • 0 nominal
  • 1 no reply to poll
  • 2 bad timecode format
  • 3 hardware or software fault
  • 4 signal loss
  • 5 bad date format
  • 6 bad time format
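
For alerting or log output, the numeric code can be translated back into the text above; a minimal shell sketch of this lookup:

```shell
# Translate a clockvar event field code (0-6) into its meaning.
event_text() {
  case "$1" in
    0) echo "nominal" ;;
    1) echo "no reply to poll" ;;
    2) echo "bad timecode format" ;;
    3) echo "hardware or software fault" ;;
    4) echo "signal loss" ;;
    5) echo "bad date format" ;;
    6) echo "bad time format" ;;
    *) echo "unknown" ;;
  esac
}
event_text 2   # prints "bad timecode format"
```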

You can query this via ntpq and its “clockvar” request. Have a look at the first output line. The fourth number after “status=” is the event field, in this example a zero:

weberjoh@vm01-mrtg:/etc/mrtg$ ntpq -c clockvar ntp1.weberlab.de
associd=0 status=0040 , 4 events, clk_unspec,
device="RAW DCF77 CODE (Conrad DCF77 receiver module)",
timecode="-#-####--#--###---M-S12--1-4p-2--1-p1----2--412---1--81---p",
poll=1398, noreply=0, badformat=4, baddata=0, fudgetime1=884.000,
stratum=0, refid=DCFa, flags=0,
refclock_time="e03df9af.00000000  Thu, Mar 21 2019 11:53:19.000",
refclock_status="TIME CODE; (LEAP INDICATION; CALLBIT)",
refclock_format="RAW DCF77 Timecode",
refclock_states="*NOMINAL: 1d+00:43:00 (99.54%); BAD FORMAT: 00:06:51 (0.45%); running time: 1d+00:49:51"

Using grep ‘n sed again, you can extract this single event field id from all the other values:

weberjoh@vm01-mrtg:/etc/mrtg$ ntpq -c clockvar ntp1.weberlab.de | grep associd | sed s/.*status....// | sed s/,.*//
0

In the end, my MRTG config for this looks like this:

###############################################################
################# Clock Status for DCF77 0-6 ##################
###############################################################
Target[ntp1-dcf77-clockvar]: `ntpq -c clockvar ntp1.weberlab.de | grep associd | sed s/.*status....// | sed s/,.*// && echo 0`
MaxBytes[ntp1-dcf77-clockvar]: 6
Title[ntp1-dcf77-clockvar]: Status Code Clockvar -- ntp1-dcf77
Options[ntp1-dcf77-clockvar]: gauge integer
YLegend[ntp1-dcf77-clockvar]: Status Code Clockvar 0-6
Legend1[ntp1-dcf77-clockvar]: Status Code
Legend3[ntp1-dcf77-clockvar]: Peak Status Code
LegendI[ntp1-dcf77-clockvar]: Status:
ShortLegend[ntp1-dcf77-clockvar]: &nbsp;
routers.cgi*Options[ntp1-dcf77-clockvar]: fixunit nototal noo nomax
routers.cgi*ShortDesc[ntp1-dcf77-clockvar]: Status DCF77 Clockvar ntp1-dcf77
#routers.cgi*WithPeak[ntp1-dcf77-clockvar]: none
routers.cgi*Icon[ntp1-dcf77-clockvar]: tick-sm.gif
routers.cgi*Comment[ntp1-dcf77-clockvar]: 0=nominal, 1=no reply, 2=bad timecode, 3=hard or software fault, 4=signal loss, 5=bad date, 6=bad time

Note that I am using the “routers.cgi*Comment” option to display a single line with some comments underneath the RRD graphs:

Of course, this clockvar graph correlates with other stats from the DCF77 NTP server, such as the reach graph or the iostats. For example, I had this “bad timecode” in the above graph for about 1 hour. Similarly, the other graphs show some suspicious values as well, but now you know the actual root cause, namely the “bad timecode”.

However, to be honest, I am not looking at this clockvar graph that much, since the reach already shows me whether my server is up and running or not. But, you know, because I can… ;)

Featured image “Dusty” by Thomas Hawk is licensed under CC BY-NC 2.0.

Monitoring a GPS NTP Server


Beyond monitoring Linux OS and basic NTP statistics of your stratum 1 GPS NTP server, you can get some more values from the GPS receiver itself, namely the number of satellites (active & in view) as well as the GPS fix and dilution of precision aka DOP. This brings a few more graphs and details. Nice. Let’s go:

This article is one of many blogposts within this NTP series. Please have a look!

NMEA 0183

The GPS-based NTP server relies on two inputs: the NMEA sentences to get the date and time (along with much other information such as the position, which is the primary use of GPS) and the PPS pulse-per-second ticks. In order to get some live stats about the satellites and the precision of the GPS signal, we leverage some of the NMEA sentences. Since the gpsd daemon is already running on the Raspberry Pi, we can easily use another tool called “gpspipe”, which simply lists all incoming NMEA data. Sample usage of “gpspipe -r”:
pi@ntp2-gps:~ $ gpspipe -r
{"class":"VERSION","release":"3.16","rev":"3.16-4","proto_major":3,"proto_minor":11}
{"class":"DEVICES","devices":[{"class":"DEVICE","path":"/dev/ttyAMA0","driver":"u-blox","subtype":"Unknown","activated":"2019-03-21T13:34:24.189Z","flags":1,"native":1,"bps":9600,"parity":"N","stopbits":1,"cycle":1.00,"mincycle":0.25}]}
{"class":"WATCH","enable":true,"json":false,"nmea":true,"raw":0,"scaled":false,"timing":false,"split24":false,"pps":false}
$GPZDA,133425.00,21,03,2019,00,00*6E
$GPGGA,133425,5132.2081,N,01356.7869,E,2,08,1.00,197.16,M,47.711,M,,*4F
$GPRMC,133425,A,5132.2081,N,01356.7869,E,0.0471,136.251,210319,,*2E
$GPGSA,A,3,2,6,12,19,24,25,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*08
$GPGBS,133425,1.95,M,2.98,M,9.89,M*3A
$GPZDA,133426.00,21,03,2019,00,00*6D
$GPGGA,133426,5132.2082,N,01356.7868,E,2,08,1.00,196.74,M,47.711,M,,*4B
$GPRMC,133426,A,5132.2082,N,01356.7868,E,0.0557,219.635,210319,,*22
$GPGSA,A,3,2,6,12,19,24,25,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*08
$GPGSV,4,1,13,02,23,113,40,06,24,077,38,12,83,015,38,14,27,311,18*7C
$GPGSV,4,2,13,17,04,037,21,19,24,044,19,24,50,138,46,25,53,266,32*77
$GPGSV,4,3,13,29,21,198,17,32,36,295,13,123,28,151,41,127,18,126,30*7B
$GPGSV,4,4,13,128,01,102,31*70
$GPZDA,133427.00,21,03,2019,00,00*6C
$GPGGA,133427,5132.2084,N,01356.7867,E,2,08,1.00,196.66,M,47.711,M,,*40
$GPRMC,133427,A,5132.2084,N,01356.7867,E,0.1544,279.026,210319,,*2B
$GPGSA,A,3,2,6,12,19,24,25,29,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*03
$GPGBS,133429,1.95,M,2.98,M,9.89,M*36
$GPZDA,133430.00,21,03,2019,00,00*6A
$GPGGA,133430,5132.2086,N,01356.7865,E,2,08,1.00,195.86,M,47.711,M,,*4B
$GPRMC,133430,A,5132.2086,N,01356.7865,E,0.0151,297.432,210319,,*2D
$GPGSA,A,3,2,6,12,19,24,25,29,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*03
$GPZDA,133431.00,21,03,2019,00,00*6B
$GPGGA,133431,5132.2088,N,01356.7864,E,2,08,1.00,195.57,M,47.711,M,,*49
$GPRMC,133431,A,5132.2088,N,01356.7864,E,0.2654,279.194,210319,,*2A
$GPGSA,A,3,2,6,12,19,24,25,29,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*03
$GPGBS,133431,1.95,M,2.98,M,9.89,M*3F

Piping this output into grep XY -m 1 to stop after the first match, and piping it again into awk -F ',' '{print $X}' to use the comma as field separator and print only the X-th field, we can use the following NMEA messages to gather this information:
  • satellites in view: GPGSV at the fourth position
  • satellites being active: GPGGA after the seventh comma = position 8
  • GPS fix (1 = no fix, 2 = 2D fix, 3 = 3D fix, which is the best): GPGSA at position 3
  • positional dilution of precision aka PDOP (should be under 10 for a good position, but irrelevant for our NTP timing): GPGSA again, normally at position 15, but for whatever reason on my output it is at position 54.

A sample run with these four values is:

pi@ntp2-gps:~ $ gpspipe -r | grep GPGSV -m 1 | awk -F ',' '{print $4}'
13
pi@ntp2-gps:~ $ gpspipe -r | grep GPGGA -m 1 | awk -F ',' '{print $8}'
08
pi@ntp2-gps:~ $ gpspipe -r | grep GPGSA -m 1 | awk -F ',' '{print $3}'
3
pi@ntp2-gps:~ $ gpspipe -r | grep GPGSA -m 1 | awk -F ',' '{print $54}'
2.4
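If you don’t have a live GPS receiver at hand, the same extraction can be verified offline against one of the captured sentences from above (a minimal sketch; the sentence is copied verbatim from the gpspipe output):

```shell
# Extract the fix type (field 3) from a captured GPGSA sentence,
# exactly as the live "gpspipe -r | grep GPGSA -m 1 | awk ..." pipeline does.
sentence='$GPGSA,A,3,2,6,12,19,24,25,32,123,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2.0,1.0,1.7*08'
echo "$sentence" | awk -F ',' '{print $3}'
# prints: 3
```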

That’s it from the NTP server’s side of view. Now we have to bring those values into the monitoring server:

SNMP into MRTG

I am still using my fairly outdated MRTG with Routers2 and RRD installation. Please consider using other tools such as Zabbix, Icinga 2, or PRTG for monitoring purposes. However, the procedure of getting the raw values from the NTP server into some kind of SNMP monitoring system is the same.

I am using the “EXTENDING THE AGENT” section within the snmpd.conf in order to be able to poll these values through SNMP. That is: 

sudo nano /etc/snmp/snmpd.conf
adding the following lines at the EXTENDING THE AGENT section:
extend-sh gpsfix        gpspipe -r | grep GPGSA -m 1 | awk -F ',' '{print $3}'
extend-sh gpspdop       gpspipe -r | grep GPGSA -m 1 | awk -F ',' '{print $54}'
extend-sh gpssatact     gpspipe -r | grep GPGGA -m 1 | awk -F ',' '{print $8}'
extend-sh gpssatview    gpspipe -r | grep GPGSV -m 1 | awk -F ',' '{print $4}'

Followed by a

sudo service snmpd restart

Use snmpwalk on the MRTG server to find the relevant OIDs, such as:

weberjoh@jw-vm01-mrtg:~$ snmpwalk -v 2c -c THISISTHEKEY udp6:ntp2.weberlab.de .1.3.6.1.4.1.8072.1.3.2.3.1.1
iso.3.6.1.4.1.8072.1.3.2.3.1.1.5.116.101.115.116.49 = STRING: "Hello, world!"
iso.3.6.1.4.1.8072.1.3.2.3.1.1.5.116.101.115.116.50 = STRING: "Hello, world!"
iso.3.6.1.4.1.8072.1.3.2.3.1.1.6.103.112.115.102.105.120 = STRING: "2"
iso.3.6.1.4.1.8072.1.3.2.3.1.1.7.103.112.115.112.100.111.112 = STRING: "2.7"
iso.3.6.1.4.1.8072.1.3.2.3.1.1.9.103.112.115.115.97.116.97.99.116 = STRING: "04"
iso.3.6.1.4.1.8072.1.3.2.3.1.1.10.103.112.115.115.97.116.118.105.101.119 = STRING: "13"
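By the way, those long numeric OID suffixes are nothing magical: NET-SNMP indexes each “extend” entry by the name’s length followed by the ASCII code of each character. A quick sketch decoding one of them (the suffix is taken from the gpsfix row above):

```shell
# .6.103.112.115.102.105.120 = length 6, then ASCII codes of g p s f i x
suffix="103.112.115.102.105.120"
echo "$suffix" | tr '.' '\n' | awk '{ printf "%c", $1 + 0 } END { print "" }'
# prints: gpsfix
```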

Note that some of those NMEA sentences only appear every few seconds; in my case GPGSV shows up only every 10 seconds. Since the default SNMP timeout is much lower, you might run into timeouts. To overcome this, use the “-t 11” option for snmpwalk, and append “::11:5::2” to the end of each MRTG target to set the same timeout plus a retry count of 5.

In the end my two MRTG targets for all four values (that is: 2x graphs) are as follows:

Target[ntp2-gps-satellites]: 1.3.6.1.4.1.8072.1.3.2.3.1.1.9.103.112.115.115.97.116.97.99.116&1.3.6.1.4.1.8072.1.3.2.3.1.1.10.103.112.115.115.97.116.118.105.101.119:THISISTHEKEY@ntp2.weberlab.de::11:5::2
MaxBytes[ntp2-gps-satellites]: 24
Title[ntp2-gps-satellites]: GPS Satellites -- ntp2-gps
Colours[ntp2-gps-satellites]: DARKGREEN#006600, PINK#FF00FF, GREEN#00CC00, BLUE#0000FF
Options[ntp2-gps-satellites]: gauge integer
YLegend[ntp2-gps-satellites]: Satellites
Legend1[ntp2-gps-satellites]: Satellites active
Legend2[ntp2-gps-satellites]: Satellites in view
Legend3[ntp2-gps-satellites]: Peak satellites active
Legend4[ntp2-gps-satellites]: Peak satellites in view
LegendI[ntp2-gps-satellites]: Satellites active:
LegendO[ntp2-gps-satellites]: Satellites in view:
ShortLegend[ntp2-gps-satellites]: &nbsp;
routers.cgi*Options[ntp2-gps-satellites]: fixunit nomax nototal
routers.cgi*ShortDesc[ntp2-gps-satellites]: GPS Satellites ntp2-gps
routers.cgi*Icon[ntp2-gps-satellites]: globe-sm.gif

Target[ntp2-gps-fixdop]: 1.3.6.1.4.1.8072.1.3.2.3.1.1.6.103.112.115.102.105.120&1.3.6.1.4.1.8072.1.3.2.3.1.1.7.103.112.115.112.100.111.112:JESUSISTHEKEY@ntp2.weberlab.de::11:5::2
MaxBytes[ntp2-gps-fixdop]: 100
Title[ntp2-gps-fixdop]: GPS Fix 'n DOP -- ntp2-gps
Colours[ntp2-gps-fixdop]: DARKGREEN#006600, PINK#FF00FF, GREEN#00CC00, BLUE#0000FF
Options[ntp2-gps-fixdop]: gauge
YLegend[ntp2-gps-fixdop]: Fix 'n DOP
Legend1[ntp2-gps-fixdop]: Fix
Legend2[ntp2-gps-fixdop]: Positional DOP
Legend3[ntp2-gps-fixdop]: Peak Fix
Legend4[ntp2-gps-fixdop]: Peak Positional DOP
LegendI[ntp2-gps-fixdop]: Fix:
LegendO[ntp2-gps-fixdop]: PDOP:
ShortLegend[ntp2-gps-fixdop]: &nbsp;
routers.cgi*Options[ntp2-gps-fixdop]: fixunit nomax nototal
routers.cgi*ShortDesc[ntp2-gps-fixdop]: GPS Fix 'n DOP ntp2-gps
routers.cgi*HRule[ntp2-gps-fixdop]: 10 "PDOP should be under 10 for a good position (not relevant for timing)"
routers.cgi*Icon[ntp2-gps-fixdop]: globe-sm.gif

This gives the following graphs. Note the daily period for the satellites, depending on the position of the GPS antenna. Satellites graph (in the weekly view):

And the GPS fix and PDOP graph (in the weekly view as well):

Two more graphs, both in the yearly view, that show my change from a small GPS antenna to a bigger one at the beginning of February 2018. While the average number of active satellites increased from about 5 to almost 8, the daily peak for the PDOP decreased from about 33 to under 10. Very good!

That’s it. Have a nice day. :D

Featured image “Cockpit” by Roger Schultz is licensed under CC BY 2.0.

Monitoring a Meinberg LANTIME NTP Server

$
0
0

Monitoring a Meinberg LANTIME appliance is much easier than monitoring DIY NTP servers. Why? Because you can use the provided enterprise MIB and load it into your SNMP-based monitoring system. Great. The MIB serves many OIDs such as the firmware version, reference clock state, offset, client requests, and even more specific ones such as “correlation” and “field strength” in case of my phase-modulated DCF77 receiver (which is called “PZF” by Meinberg). And since the LANTIME is built upon Linux, you can use the well-known system and interfaces MIBs as well for basic coverage. Let’s dig into it:

This article is one of many blogposts within this NTP series. Please have a look!

I am working with a Meinberg LANTIME M200 with firmware build 6.24.021. Unfortunately, I am still using my outdated MRTG with Routers2 and RRDtool installation which is not able to load MIBs. ;D Hence I have constructed a couple of MRTG targets myself. It was still much easier than using bash snippets with grep ‘n sed or advanced logging features in order to count clients.

Before starting with the monitoring server you must ensure that you’ve enabled SNMP on the appropriate interface and that you’re using SNMPv3 with strong authentication and encryption. (However, I am still using plaintext SNMPv2c. Shame on me.) After that, you can have a look at the SNMP values, for example with the iReasoning MIB Browser that is capable of loading the MIB.

Linux Defaults

At first I followed my basic procedure for adding a Linux host to MRTG. I changed the icon to the clock one:

routers.cgi*Icon: clock-sm.gif
There is no SWAP available on the LANTIME; hence the following MRTG line throws an error: “MaxBytes2[ntp3.weberlab.de-memory]: 0”. I simply added the same value as for MaxBytes1, though it is not correct. But never mind:
MaxBytes2[ntp3.weberlab.de-memory]: 235347968
Finally, I added the temperature (OID: .1.3.6.1.4.1.5597.30.0.5.2.1.0), just as I am doing with other temperature graphs, e.g., for the Raspberry Pi. This is the temperature MRTG target:
###############################################################
####################### Temperature ###########################
###############################################################
Target[ntp3.weberlab.de_temp]: 1.3.6.1.4.1.5597.30.0.5.2.1.0&PseudoZero:COMMUNITYSTRING@ntp3.weberlab.de:::::2
MaxBytes[ntp3.weberlab.de_temp]: 150
Title[ntp3.weberlab.de_temp]: Temperature on ntp3.weberlab.de
Options[ntp3.weberlab.de_temp]: gauge
WithPeak[ntp3.weberlab.de_temp]: my
Colours[ntp3.weberlab.de_temp]: Red#FF0000, Blue#0000FF, Darkred#800000, Purple#FF00FF
YLegend[ntp3.weberlab.de_temp]: Temperature °C
Legend1[ntp3.weberlab.de_temp]: Temperature
Legend3[ntp3.weberlab.de_temp]: Peak Temperature
LegendI[ntp3.weberlab.de_temp]: Temperature:
ShortLegend[ntp3.weberlab.de_temp]: °C
routers.cgi*Options[ntp3.weberlab.de_temp]: fixunit nomax nopercentile nototal noo
routers.cgi*ShortDesc[ntp3.weberlab.de_temp]: Temperature
routers.cgi*InSummary[ntp3.weberlab.de_temp]: yes
routers.cgi*Icon[ntp3.weberlab.de_temp]: temp-sm.gif

Up to now I have the following graphs: CPU, load average, free memory, processes, couple of disks, interface, temperature:

Offset

Of course, the most interesting value of a stratum 1 NTP server is the offset – the difference between the local built-in clock and the reference clock, in my case the German DCF77 signal. The Meinberg OID is .1.3.6.1.4.1.5597.30.0.2.4.0. Note that in the following MRTG target I am multiplying the value by 1000 to have it displayed in µs rather than in ms:

###############################################################
################### Offset µ Microseconds #####################
###############################################################
Target[ntp3-pzf-offset-us]: 1.3.6.1.4.1.5597.30.0.2.4.0&PseudoZero:COMMUNITYSTRING@ntp3.weberlab.de:::::2 * 1000
#Max only 0.1 seconds = 100 ms = 100000 us
MaxBytes[ntp3-pzf-offset-us]: 100000
Title[ntp3-pzf-offset-us]: Offset µs -- ntp3-pzf
Options[ntp3-pzf-offset-us]: gauge
Colours[ntp3-pzf-offset-us]: DARKPURPLE#7608AA, Blue#0000FF, BLACK#000000, Purple#FF00FF
YLegend[ntp3-pzf-offset-us]: Offset in microseconds (µs)
Legend1[ntp3-pzf-offset-us]: Offset
Legend3[ntp3-pzf-offset-us]: Peak Offset
LegendI[ntp3-pzf-offset-us]: Offset:
ShortLegend[ntp3-pzf-offset-us]: µs
routers.cgi*Options[ntp3-pzf-offset-us]: fixunit nototal noo
routers.cgi*ShortDesc[ntp3-pzf-offset-us]: Offset µs ntp3-pzf
routers.cgi*Icon[ntp3-pzf-offset-us]: graph-sm.gif

And again, MRTG specific: You must tweak the RRD file in order to store negative values as well:

rrdtool info /var/mrtg/ntp3-pzf-offset-us.rrd
sudo rrdtool tune /var/mrtg/ntp3-pzf-offset-us.rrd --minimum ds0:-100000
rrdtool info /var/mrtg/ntp3-pzf-offset-us.rrd
###
ds[ds0].min = -1.0000000000e+05
ds[ds0].max = 1.0000000000e+05
###

It ends up in this nice graph:

Note that the offset stays within +/- 1.5 µs, which is about 1000 times better than my DIY Raspberry Pi with its (amplitude-modulated) DCF77 signal!

You might have noticed that I am not graphing the jitter from the LANTIME appliance. This is because the jitter values are not accessible via SNMP. ;( Feature request is pending.

PZF Correlation & Field Strength

There are two more specific status OIDs for the reference clock, in my case a “PZF” antenna, i.e., phase-modulated DCF77. Those two values are:

  • correlation with a max of 100
  • field strength with a max of 127

To be honest, I have no idea what these values are about. :D Never mind, I am graphing them:

###############################################################
############# PZF Correlation & Field Strength ################
###############################################################
Target[ntp3-pzf-correlation]: 1.3.6.1.4.1.5597.30.0.1.2.1.6.1&1.3.6.1.4.1.5597.30.0.1.2.1.8.1:COMMUNITYSTRING@ntp3.weberlab.de:::::2
MaxBytes1[ntp3-pzf-correlation]: 100
MaxBytes2[ntp3-pzf-correlation]: 127
Title[ntp3-pzf-correlation]: PZF Correlation & Field Strength -- ntp3-pzf
Colours[ntp3-pzf-correlation]: DARKGREEN#006600, PINK#FF00FF, GREEN#00CC00, BLUE#0000FF
Options[ntp3-pzf-correlation]: gauge integer
YLegend[ntp3-pzf-correlation]: Correlation & Field Strength
Legend1[ntp3-pzf-correlation]: Correlation
Legend2[ntp3-pzf-correlation]: Field Strength
Legend3[ntp3-pzf-correlation]: Peak Correlation
Legend4[ntp3-pzf-correlation]: Peak Field Strength
LegendI[ntp3-pzf-correlation]: Correlation:
LegendO[ntp3-pzf-correlation]: Strength:
ShortLegend[ntp3-pzf-correlation]: &nbsp;
routers.cgi*Options[ntp3-pzf-correlation]: fixunit nototal
routers.cgi*ShortDesc[ntp3-pzf-correlation]: Correlation & Strength ntp3-pzf
routers.cgi*Icon[ntp3-pzf-correlation]: globe-sm.gif

The resulting monthly view looks like this:

Today’s Clients & Requests

Having activated the client list logging at Statistics -> NTP Client List -> Activate Logging with the “Duration of Recording” set to “Continuously”, you can query the number of today’s clients as well as the total requests.

Note that at least the latter is kind of hard to graph with MRTG. You can either list the requests as a gauge that grows indefinitely, or you can display them like packets per second, that is, requests per second. However, this gives strange values since MRTG always calculates them “per second”. If you have only a couple of NTP clients you will end up with something like micro-requests per second, which isn’t a meaningful number.

Anyway, this is my approach with MRTG:

###############################################################
########################## Clients ############################
###############################################################
Target[ntp3-pzf-clientstoday]: 1.3.6.1.4.1.5597.30.0.2.8.8.0&PseudoZero:COMMUNITYSTRING@ntp3.weberlab.de:::::2
MaxBytes[ntp3-pzf-clientstoday]: 65536
Title[ntp3-pzf-clientstoday]: Todays Clients -- ntp3-pzf
Colours[ntp3-pzf-clientstoday]: Pink#FF00AA, Yellow#FFD600, Darkpurple#7608AA, Orange#FC7C01
Options[ntp3-pzf-clientstoday]: gauge integer
YLegend[ntp3-pzf-clientstoday]: Number of Clients
Legend1[ntp3-pzf-clientstoday]: Clients
Legend3[ntp3-pzf-clientstoday]: Peak Clients
LegendI[ntp3-pzf-clientstoday]: Clients:
ShortLegend[ntp3-pzf-clientstoday]: &nbsp;
routers.cgi*Options[ntp3-pzf-clientstoday]: nototal noo
routers.cgi*ShortDesc[ntp3-pzf-clientstoday]: Clients ntp3-pzf Today
routers.cgi*Icon[ntp3-pzf-clientstoday]: user-sm.gif
routers.cgi*InSummary[ntp3-pzf-clientstoday]: yes


###############################################################
######################### Requests ############################
###############################################################
Target[ntp3-pzf-requests]: 1.3.6.1.4.1.5597.30.0.2.8.4.0&PseudoZero:COMMUNITYSTRING@ntp3.weberlab.de:::::2
MaxBytes[ntp3-pzf-requests]: 10000
Title[ntp3-pzf-requests]: Requests -- ntp3-pzf
YLegend[ntp3-pzf-requests]: Requests
Legend1[ntp3-pzf-requests]: Requests
Legend3[ntp3-pzf-requests]: Peak Requests
LegendI[ntp3-pzf-requests]: Requests
ShortLegend[ntp3-pzf-requests]: requests
routers.cgi*Options[ntp3-pzf-requests]: nomax noo
routers.cgi*ShortDesc[ntp3-pzf-requests]: Requests ntp3-pzf

At least the “Today’s Clients” graph gives a realistic view of the clients. Note that my M200 is in the NTP Pool Project, hence the thousands of clients within a couple of seconds every time my IPv6 address appears in the pool’s DNS. This counter is reset every night, hence the drop to 0 at midnight:

The requests somehow correlate with this clients view but are hard to interpret. Please note again that this is a limitation of my MRTG solution and not of the Meinberg counter.

In case anybody’s wondering: I had no performance degradation with the “NTP Client List Logging” on the Meinberg M200, though it is not recommended by the vendor to leave it in the “Continuously” state. I have not seen any issues in the load average / CPU graphs.

Example

Here’s an example in which I used the correlation & field strength graph (left-hand side) since I had a loss of the DCF77 signal during a couple of hours. The right-hand side shows the reach graph from my DIY DCF77 Raspberry Pi NTP server:

It turned out that there was indeed an outage of the DCF77 signal during that period.

Okay, that’s it. Happy monitoring!

Featured image “Octocopter” by FaceMePLS is licensed under CC BY 2.0.

Using RIPE Atlas for NTP Measurements


If you are operating a publicly available NTP server, for example when you’re going to join the NTP Pool Project, you probably want to test whether your server is working correctly – either with a one-off measurement from hundreds of clients, or continuously to keep track of its performance. You can use the RIPE Atlas measurement platform (Wikipedia) for both use cases. Here’s how:

This article is one of many blogposts within this NTP series. Please have a look!

I have been hosting two RIPE Atlas probes in different ASes for many years. (By the way: they are the only probes in both ASes.) Hence I have enough credits to run several ongoing tests as well. Have a look at my blogpost with an overview of RIPE Atlas measurements and some stats.

Create a New Measurement

The first step is to create a new measurement. In this example I am testing my Meinberg LANTIME M200 appliance “ntp3.weberlab.de”, which is IPv6-only, hence the address family selection of IPv6. To have a realistic test from all over the world, I am checking the “resolve on probe” box. Note that the “interval” section will disappear when I am enabling the one-off measurement in the third step below:

In the second step, I am selecting the probes. I deleted the default “Worldwide 10” selection and added a manual set with 500 probes worldwide while including the “system-ipv6-works” tag. Otherwise, you’ll have too many errors from not working IPv6 probes:

Finally, I am using a one-off timing and creating the measurement:

Results

Since the measurements are publicly available you can have a look at this just created measurement #20366117 by yourself. Initially, it takes a couple of minutes (of course) until the results are available. In the end, the results were as expected: Very low offsets from all over the world, while a few probes did not succeed at all:

Ongoing Tests & other Examples

Besides these one-off measurements, I have a couple of ongoing tests as well, for example measurement #22810167, which uses 50 probes worldwide to query my NTP server every 900 seconds. Also note that with the help of these RIPE Atlas measurements I encountered an initial problem with my load balancing setup for NTP; refer to Load Balancing NTP via F5 BIG-IP LTM.

Featured image “oranges – en masse” by Georgie Sharp is licensed under CC BY-NC 2.0.

Adding your NTP Server to the NTP Pool Project


You have a running NTP server with a static IP address? What about joining the NTP Pool project by adding your server to the pool? You will give something back to the Internet community and feel good about it. ;)

It doesn’t matter whether you’re running a Raspberry Pi with GPS/DCF77 at home, or a fully-featured NTP appliance such as the ones from Meinberg in your enterprise DMZ. Just a few clicks and your server will be used by the NTP Pool’s round-robin DNS. Here’s a simple tutorial:

This article is one of many blogposts within this NTP series. Please have a look!

If you do not have an NTP server yet, follow my setup guides for the Raspberry Pi to build one, either for GPS or DCF77.

Prerequisites

  • NTP server, obviously, either stratum 1 as mentioned above, or even stratum 2 or 3 with some “good upstream servers” configured manually
  • A static IP address on the Internet, either IPv6, legacy IP or both
  • Permanent Internet connection
  • Motivation to stay within the pool for a couple of years (or at least months)

You don’t need much bandwidth for this project at all; even a 1 Mbps ISP connection is sufficient. You can configure the net speed at the pool to match your bandwidth. I myself used four independent IPv6-only NTP servers with a configured net speed of 10 Mbps each, which resulted in only 1 packet/second on average. [Ref: Stats from Participating the NTP Pool Project] You can really handle that. For official information from the NTP Pool project click here: How do I join pool.ntp.org?
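A rough back-of-the-envelope calculation supports that claim. This is only a sketch under my own assumptions: I take an NTP packet as roughly 110 bytes on the wire over IPv6 (48 bytes NTP payload plus UDP, IPv6, and Ethernet headers) and the ~1 request/second I observed:

```shell
# Request plus equally sized response, converted to Mbps:
awk 'BEGIN { bytes = 110; rps = 1; printf "%.5f Mbps\n", 2 * bytes * rps * 8 / 1e6 }'
# prints: 0.00176 Mbps
```

That is orders of magnitude below even a 1 Mbps link.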

Joining the NTP Pool

Now that’s the easy part:

  1. Login to Manage Servers
  2. Add your server by hostname or static IP address
  3. Adjust the zones in which the server is added (in case the auto-detection did something wrong)
  4. Adjust the net speed for your server
  5. Wait until the monitoring system has scored your server with 10 or more (will take a few hours after the first joining)

That’s it. From now on you will receive NTP queries from all over the world, while focused on your zone. For example, my NTP servers in Germany are in the @, europe, and DE zones.

The monitoring score for every single NTP server within the pool is publicly available at https://www.ntppool.org/en/scores/IPADDRESS, e.g. https://www.ntppool.org/en/scores/2a03:4000:35:399::1. It looks like this:

Keep an eye on your score to eliminate any potential errors. However, keep in mind that the scoring system is not that sound, as pointed out here: Stats from Participating the NTP Pool Project.

In the end, it’s good for the Internet community to have more NTP servers within the pool. As a private person, you have probably been using the pool for years without even knowing it, e.g. from your home router, IoT devices, etc. Good to be a part of the NTP Pool from the other side as well. ;D

Featured image “Pool with a view.” by david is licensed under CC BY-NC 2.0.

Stats from Participating the NTP Pool Project


I am participating in the NTP Pool Project with at least one NTP server at a time. Of course, I am monitoring the count of NTP clients that are accessing my servers with some RRDtool graphs. ;) I was totally surprised that I got quite high peaks for a couple of minutes whenever one of the servers appeared in the DNS, while the overall rate grew really slowly. I am still not quite sure why this is the case.

For one month I also logged all source IP addresses to gain some more details about its usage. Let’s have a look at some stats:

This article is one of many blogposts within this NTP series. Please have a look!

Prenotes

  • For this blogpost I took the stats from March 2019. At this time, I had four servers online:
    • ntp2: Stratum 1, Raspberry Pi 1 B+ with GPS
    • ntp3: Stratum 1, Meinberg M200
    • ntp4: Stratum 2, Raspberry Pi 3 B Rev 1.2
    • ntp5: Stratum 2, Dell PowerEdge R200, Intel(R) Pentium(R) Dual CPU E2200 @ 2.20GHz, 4 GiB DDR2 Memory
  • All were listed with a net speed of 10 Mbit/s. My actual ISP speed was 100 Mbit/s.
  • Since all servers are IPv6 only, it is quite easy to count NTP clients. Every single source IPv6 address is a single client.

Scoring

NTP servers are only used by the round-robin DNS of the pool if they have a score higher than 10. But there are some concerns about this scoring. “Points are deducted if the server can’t be reached or if the time offset is more than 100ms (as measured from the monitoring systems).” More specific: “The monitoring system works roughly like an SNTP (RFC 2030) client, so it is more susceptible by random network latencies between the server and the monitoring system than a regular ntpd server would be.”

In fact, almost once a day my scores drop dramatically, sometimes even below a score of 0, while my NTP servers are fully functional from my point of view. Here’s an example from my server ntp5. The yellow dots (offset) are increasing regularly, while the score dropped between 14-23 o’clock on April 1st, 2019:

Now, this was my point of view from my monitoring station. [Ref: Basic NTP Server Monitoring] Neither the jitter (measured in µs rather than ms!) nor the offset (in ms) had any issues. April 1st, 2019 on the left-hand side of the graphs:

This seems to be related to some routing behavior from Los Angeles (the location of the NTP monitoring station) to my network (DTAG, AS3320). Or generic network congestion. I don’t know.

That is: My overall experience with this score is mixed. To my mind, it is not reliable and should be replaced by a more profound one. Note that this discussion is not new, refer to some threads on the pool mailing list: “Why is my server score suddenly so poor?” or “Decentralised monitoring?“.

NTP Client Stats

At first, here is a weekly graph from one of my servers (ntp5) which shows the normal case all the time. That is: high peaks (up to 30 k), but only for a very short amount of time:

I am wondering why there are so many NTP clients that query the servers only *once*, at the time the IP address is listed in the DNS. I expected that NTP clients resolve the DNS name and stay on those IP addresses until the next service restart or system reboot. But obviously, they don’t. Any ideas? Bad implementations, such as the ones explained here: “How to NOT use the NTP Pool”?

Here is the summary graph of my NTP servers (ntp2, 3, 4, 5) from March 2019. It shows the maximum unique clients = IPv6 source addresses per 20 minutes. [Ref: Counting NTP Clients] That is: max clients per 20 min is about 30-40 k. Wow.

I have logged all incoming connections through my FortiGate FG-100D firewall to an external syslog-ng server. Hence I could cat ‘n grep through the raw logfiles from this whole month.

TL;DR: Four servers listed with 10 Mbit/s each on the NTP Pool. For one month, each server got 91 k requests per day (avg) = 1.05 requests per second on average. The absolute max requests per second was 1794. Unique IPv6 source addresses over all four servers: 3.8 M.

Some more details with the values per month:

Server            NTP Requests   Unique Sources   Max Requests per Second
ntp2              2556621        973931           1794
ntp3              3091628        1022346          1223
ntp4              3077001        998343           1110
ntp5              2555322        864274           1280
AVG               2820143        964723           -
AVG per Day       90972          31120            -
AVG per Second    1.05           0.36             -
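As a quick plausibility check of the AVG rows (a sketch; March 2019 has 31 days, and a day has 86400 seconds):

```shell
# Monthly average requests per server, broken down per day and per second:
awk 'BEGIN { month = 2820143; day = month / 31; printf "%d per day, %.2f per second\n", day, day / 86400 }'
# prints: 90972 per day, 1.05 per second
```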

Here are the top 10 queries-per-second timestamps from one server (ntp2). As you can see, only these top 10 exceed the 1000 queries/s rate:

weberjoh@jw-nb10-syslog-mirror:/var/log/firewalls/2003:de:2016::3/2019/03$ cat * | grep "dstip=2003:de:2016:330::6b5:123 dstport=123" | awk '{print $5,$6}' | uniq -c | sort -rg | head
   1794 date=2019-03-08 time=00:04:01
   1702 date=2019-03-08 time=00:03:12
   1547 date=2019-03-08 time=00:03:15
   1528 date=2019-03-08 time=00:03:13
   1444 date=2019-03-08 time=00:03:19
   1280 date=2019-03-08 time=00:03:17
   1266 date=2019-03-08 time=00:04:03
   1130 date=2019-03-08 time=00:03:10
   1064 date=2019-03-08 time=00:03:11
   1061 date=2019-03-08 time=00:04:00

Here’s another analysis: how often do I see how many requests/s? First column: counts per month; second column: queries/s. Listing from ntp2. That is: the vast majority is below 10 queries/s. For example, line 11 reads: 829 times per month the server got 10 requests per second. To my mind, that’s not that much.

weberjoh@jw-nb10-syslog-mirror:/var/log/firewalls/2003:de:2016::3/2019/03$ cat * | grep "dstip=2003:de:2016:330::6b5:123 dstport=123" | awk '{print $5,$6}' | uniq -c | sort -rg | awk '{print $1}' | uniq -c | sort -rg | head -20
 705352 1
 158096 2
  34087 3
  10946 4
   6838 5
   3664 6
   2054 7
   1369 8
   1076 9
    829 10
    724 11
    636 12
    573 13
    501 14
    460 15
    435 16
    412 17
    351 18
    349 19
    316 21

This is how I grepped:

### NTP requests per server address:
cat * | grep "dstip=2003:de:2016:333:221:9bff:fefc:8fe1 dstport=123" | awk '{print $15}' | sed s/srcip=// | wc -l

### Unique source IPv6 addresses per server address:
cat * | grep "dstip=2003:de:2016:333:221:9bff:fefc:8fe1 dstport=123" | awk '{print $15}' | sed s/srcip=// | sort | uniq | wc -l

### Queries per second, top 10 per server address:
cat * | grep "dstip=2003:de:2016:333:221:9bff:fefc:8fe1 dstport=123" | awk '{print $5,$6}' | uniq -c | sort -rg | head

### How often are certain queries per second, top 20 per server address:
cat * | grep "dstip=2003:de:2016:333:221:9bff:fefc:8fe1 dstport=123" | awk '{print $5,$6}' | uniq -c | sort -rg | awk '{print $1}' | uniq -c | sort -rg | head -20

### Unique source addresses over all servers for one month:
cat * | grep policy6 | grep "dstport=123" | awk '{print $15}' | sed s/srcip=// | sort | uniq | wc -l
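Since these pipelines depend on the exact FortiGate log format, here is an offline sanity check on a single fabricated log line (my assumption, matching the field positions the commands above rely on; the address is a documentation address, not a real client):

```shell
# Fields 5, 6, and 15 of the syslog line are date, time, and srcip:
line='a b c d date=2019-03-08 time=00:04:01 g h i j k l m n srcip=2001:db8::1'
echo "$line" | awk '{print $15}' | sed s/srcip=//
# prints: 2001:db8::1
```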

Requests after Leaving the Pool

I had to move my lab to another location with new IPv6 addresses. Hence I had to delete my servers since they are referenced by IP addresses rather than DNS names in the pool.

This is the NTP clients graph (zoomed in) for the first six days after I left the pool:

Clients are decreasing, while there are still a couple of hundred that are constantly using my server. Again, from a technical perspective I was expecting many more clients to keep using it constantly. I thought that once an NTP client queries the DNS name of the pool, it stays on those resolved IP addresses until a reboot of the system. But obviously, this isn’t the case for the majority of clients. One idea: maybe these clients use ntpdate rather than ntpd, called every hour via cronjob? In this case, each run would initiate a new DNS query rather than staying on the same NTP server. But that’s just an idea. I have no clue what’s going on there.

Featured image “Hourglass with coins” by Marco Verch is licensed under CC BY 2.0.


NTP Server’s Delta Time


This is a guest blogpost by Jasper Bongertz. His own blog is at blog.packet-foo.com.


Running your own NTP server(s) is usually a good idea. Even better if you know that they’re working correctly and serve their answers efficiently and without a significant delay, even under load. This is how you can use Wireshark to analyze the NTP delta time for NTP servers:

This article is one of many blogposts within this NTP series. Please have a look!

Update your Wireshark, please …

Looking at NTP server request/response performance used to be a little problematic before Wireshark version 3.0, because that’s the version that added a field called “ntp.delta_time”. The problem with delta time calculations is always that Wireshark needs to do them for request/response packet pairs; it’s not something you can do yourself with a filter. This means that a developer needs to track and match response packets for requests, calculate the delta time, and put it into a meta field in the response packet decode:

The good thing about those meta fields is that you can use them just like any other field in the decode: you can search for them, filter on them, and graph them in the I/O Graph:

By the way, you can look up all fields Wireshark knows about (including the version numbers needed to see them) at https://www.wireshark.org/docs/dfref/.

Setting up your analysis environment

So, after filtering away everything else you can take a look at the NTP delta times, e.g. by adding a custom column to Wireshark:

You might notice that I forced Wireshark to replace the NTP server IPv6 address with a much shorter name. I did that by putting a hosts file into the “profiles” directory used by Wireshark, which can easily be found by checking out the “About dialog”:

The file itself is just a normal hosts file, like the one used by the operating system, e.g.:

2003:de:2016:330::6b5:123             ntp2
2003:de:2016:330::dcfb:123            ntp3
2003:de:2016:333:1130:d52a:ece2:33fe  ntp4
2003:de:2016:333:221:9bff:fefc:8fe1   ntp5

The advantage of putting it into a Wireshark profile directory is that you don’t have to change your system communication behavior and that you can keep different hosts files per analysis task simply by putting them into different profiles. You might need to enable the network name resolution in the “View Menu” -> “Name Resolution” -> “Enable Network Name Resolution”. To make that permanent you can also configure that setting in the Wireshark preferences dialog.
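On Linux, for example, such a profile hosts file could be created like this (a sketch; the profile name "NTP" and the ~/.config path are assumptions — check Help -> About -> Folders for your actual profile directory):

```shell
# Create a Wireshark profile "NTP" with its own hosts file
# (profile name and path are examples; verify via the About dialog):
PROFILE_DIR="$HOME/.config/wireshark/profiles/NTP"
mkdir -p "$PROFILE_DIR"
cat > "$PROFILE_DIR/hosts" <<'EOF'
2003:de:2016:330::6b5:123             ntp2
2003:de:2016:330::dcfb:123            ntp3
2003:de:2016:333:1130:d52a:ece2:33fe  ntp4
2003:de:2016:333:221:9bff:fefc:8fe1   ntp5
EOF
```

Switching to this profile in Wireshark then resolves those addresses without touching the operating system's hosts file.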

NTP server performance

One of the things you may want to know about your NTP servers is their performance. So if you capture all their packets you can use the new ntp.delta_time field to easily read the delay for each request. But first, you have to make sure that you only look at requests that were answered by your server – because it may act as a client itself sometimes, which would then muddy the waters.

The testbed

Johannes provided me capture files taken at four of his NTP servers:

  • ntp2.weberlab.de, 2003:de:2016:330::6b5:123, Raspberry Pi 1 B+ w/ GPS Receiver
  • ntp3.weberlab.de, 2003:de:2016:330::dcfb:123, Meinberg LANTIME M200 Appliance with a DCF77 Receiver
  • ntp4.weberlab.de, 2003:de:2016:333:1130:d52a:ece2:33fe, Raspberry Pi 3 B
  • ntp5.weberlab.de, 2003:de:2016:333:221:9bff:fefc:8fe1, a Dell PowerEdge R200

All four servers were part of the NTP Pool Project and hence received thousands of requests per minute while listed in its round-robin DNS.

Isolating NTP server communication

The problem with NTP is that it uses UDP port 123 for both client and server, so it's less easy to tell who the server is and who the client is (compared to, let's say, HTTP, where the node responding from port 80 or 443 is the server with very high certainty). Fortunately, NTP has a field that will tell you what kind of message you're looking at (client or server):

With that field and using the IP addresses of your servers, you can isolate all packets where they either receive a client packet or send a server packet. For example, if your NTP server has the IP address 2003:de:2016:330::dcfb:123 you would use a display filter like this:

(ipv6.dst==2003:de:2016:330::dcfb:123 and ntp.flags.mode==3) or (ipv6.src==2003:de:2016:330::dcfb:123 and ntp.flags.mode==4)

Meaning: all packets sent to that IP address need to be client packets (mode 3), and all packets coming from that IP address need to be server packets (mode 4).

Comparing the response times

Let’s take a look at the “I/O graph” (found in the statistics menu of Wireshark) of the four servers – but I have to warn you:

If you do this with as many requests and servers as I did, chances are high that Wireshark will crash under some circumstances. I found it best to let it graph everything before changing settings or closing the graph dialog again. It's also a good idea to prepare the I/O Graph first without letting it draw anything (don't use the checkboxes on the left at first) by entering all the settings you need, then close Wireshark so it stores the setup. That way a crash doesn't mean you have to redo the setup each time, which can be very annoying for complicated settings.

I have graphed all four servers' NTP deltas in separate graphs with different colors and opted for the logarithmic scale because otherwise we would only see a couple of peaks and not much else:

I think it's pretty obvious that NTP2 is by far the slowest server under load. Compare that to the following graph showing the number of request packets each server had to respond to. This one isn't logarithmic, for the simple reason that I wanted to show the peaks prominently (and because the red graph would drown out everything in the lower ranges):

As you can see, the delta time peaks coincide with heavy load on the servers: when many requests arrive, all servers show higher delays in their responses.

Now let’s take a closer look at the peak of the red and blue servers (NTP2 and NTP5) right next to each other, around 2pm. Zoomed in it looks like this for the response time:

And this is the packet ratio:

So, we can deduce that the packet rates in both bursts are quite similar, but NTP2 shows much more significant delays in its answer packets. This isn't surprising, because a Raspberry Pi 1 is like David against the Dell PowerEdge Goliath. And from my point of view, David is still doing a pretty good job for such a small system.

Meinberg: Client Logging vs. No Client Logging

The Meinberg NTP server has a feature to log client requests, which could introduce additional stress on a busy server. To compare the server's delta times with client logging enabled and disabled, I chose two packet peaks that were as similar as possible within the capture files I had, close to 100 packets/s. As you can see in the delta time graphs, it doesn't really matter much whether client logging is enabled or not; the increase of only around 10 ms doesn't look significant. Under higher loads this might get worse, so you might want to disable client logging when you don't really have a use case for it.

Packet Ratio, with client logging enabled:

 

Packet Ratio, without client logging:

Client Logging enabled, Delta Time:

Client Logging not enabled, Delta Time:

NTP Server Performance Min/Max/Average

I compared the performance of the four servers by pulling all response times out of the capture files using tshark, e.g. for NTP2:

tshark -r "NTP2Only.pcapng" -2 -Y "ntp.flags.mode==4" -Tfields -e ntp.delta_time > NTP2Delta.csv

The parameters used here were:

  • -r <filename>: reads a file instead of capturing from a network card
  • -2: forces a 2-pass analysis, which is important for the delta time calculations. Without it, the field would stay empty (thanks to @PacketDetective for reminding me)
  • -Y <filter>: filters the file, in this case for all packets that are answers from the server
  • -Tfields: asks tshark to print only specific fields
  • -e <fieldname>: specifies the fields I need, in this case the NTP delta time

Sending those field values into a CSV file, one per server, allowed me to use Excel to generate Min/Max/Average values:
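If you'd rather stay on the command line, awk can compute the same statistics directly from such a CSV (a sketch; the three values below are just stand-ins for the real tshark export):

```shell
# Stand-in delta-time values in seconds, as ntp.delta_time would be exported:
printf '0.000827\n0.029432\n0.001461\n' > NTP2Delta.csv

# Min/Max/Average in milliseconds, the command-line equivalent of the Excel step:
awk 'NF { v = $1 * 1000
          if (n == 0 || v < min) min = v
          if (n == 0 || v > max) max = v
          sum += v; n++ }
     END { printf "min=%.3f max=%.3f avg=%.3f (ms)\n", min, max, sum / n }' NTP2Delta.csv
# prints: min=0.827 max=29.432 avg=10.573 (ms)
```

Running this once per server file gives the same numbers as the table below.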

             NTP2        NTP3           NTP4       NTP5
             Raspi 1 B+  Meinberg M200  Raspi 3 B  Dell PowerEdge R200
Min [ms]      0,827       0,146          0,351      0,199
Max [ms]     29,432      15,127         11,452      4,422
Average [ms]  1,461       0,566          0,790      0,425

The fastest server – unsurprisingly – is the Dell PowerEdge server. The slowest is the Raspberry Pi 1, being about three times slower than the Dell on average.

Photo by Markus Lompa on Unsplash.

Incorrectly Working IPv6 NTP Clients/Networks


During my analysis of NTP and its traffic to my NTP servers listed in the NTP Pool Project, I discovered many ICMP error messages coming back to my servers, such as port unreachable, address unreachable, time exceeded, or administratively prohibited. Strange. In summary, more than 3 % of IPv6-enabled NTP clients failed to get answers from my servers. Let's have a closer look:

This article is one of many blogposts within this NTP series. Please have a look!

I saw those ICMP packets in my traces for a while but did not think about it until I read this article from Heiko Gerstung: How to NOT use the NTP Pool. “[…] It sends an NTP client request, waits for a specific amount of time for the response and, if the response does not arrive within this time frame, closes the port and stops listening. If an NTP response arrives after the device stopped waiting for it, an ICMP Port Unreachable error message is sent to the sender of the NTP response, creating even more unnecessary traffic […].”

I wanted to have a deeper look into it. I captured all NTP traffic coming into my four NTP servers listed at the NTP Pool Project for 24 hours. (I was using my ProfiShark 1G in front of the FortiGate FG-100D that had those servers behind it.) I was quite astonished by how many different ICMP error codes I found. Note that since I am using IPv6 only, you'll only see ICMPv6 messages rather than legacy IP ones. The following is my analysis:

TL;DR: Within 24 hours my four NTP servers received 10923 ICMPv6 error messages from 5193 sources. This is about 3.11 % of all NTP clients. Seven different failure types were received. I did not expect so many different error types and codes.

ICMP Errors ‘n Errors ‘n Errors

This is how my NTP trace file looks in Wireshark: different errors distributed all over the trace. The second screenshot is filtered for "icmpv6" with a special custom column "ipv6.dst", using the 2nd field occurrence to display the original source that triggered the ICMPv6 error when my servers tried to reply:

Leveraging tshark, I gathered some statistics about it:

  • During that 24 h period, 166471 unique source addresses queried my NTP servers
  • My servers received ICMPv6 error packets from 5193 different sources!
  • Doing the math, that's about 3.11 % of all NTP clients. Quite high to my mind.
  • In summary, they received 10923 ICMPv6 error messages.
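The percentage is easy to double-check with a one-liner using the two counts from this trace:

```shell
# 5193 erroring sources out of 166471 unique NTP client addresses:
awk 'BEGIN { printf "%.3f %%\n", 5193 / 166471 * 100 }'
# prints: 3.119 %
```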

The distribution of those ICMPv6 types and codes was:

Count  Type  Code  Meaning
    385  1     0     no route to destination
    291  1     1     communication with destination administratively prohibited
   9839  1     3     address unreachable
    367  1     4     port unreachable
      3  1     5     source address failed ingress/egress policy
      1  1     6     reject route to destination
     37  3     0     hop limit exceeded in transit

WHAT? This is almost every destination unreachable code that is available! ;D Uhm. This was not expected.

For the sake of completeness: This is how I used tshark along with sort, uniq, etc.:

### Number of unique IPv6 source addresses which queried one of my four NTP servers
tshark -r ntp-outside-fortigate-MERGED-24h-nur-NTP2345-mit-ICMPv6.pcapng -Y "!icmpv6 && (ipv6.dst == 2003:de:2016:330::6b5:123 || ipv6.dst == 2003:de:2016:330::dcfb:123 || ipv6.dst == 2003:de:2016:333:1130:d52a:ece2:33fe || ipv6.dst == 2003:de:2016:333:221:9bff:fefc:8fe1)" -T fields -e ipv6.src | sort | uniq | wc -l

### Number of unique IPv6 addresses that sent an ICMPv6 error
tshark -r ntp-outside-fortigate-MERGED-24h-nur-NTP2345-mit-ICMPv6.pcapng -Y "icmpv6" -T fields -e ipv6.src | sort | uniq | wc -l

### Count of different error codes
tshark -r ntp-outside-fortigate-MERGED-24h-nur-NTP2345-mit-ICMPv6.pcapng -Y "icmpv6" -T fields -e icmpv6.type -e icmpv6.code | sort | uniq -c

    385 1       0
    291 1       1
   9839 1       3
    367 1       4
      3 1       5
      1 1       6
     37 3       0
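To label those raw type/code counts in one go, a small awk lookup can be appended behind the uniq -c pipeline (a sketch; here fed with the counts from above, mapped to the RFC 4443 meanings):

```shell
# Map ICMPv6 type/code pairs to their meaning; the input format is the
# "count type code" output of the "uniq -c" command shown above:
awk 'BEGIN { m["1 0"] = "no route to destination"
             m["1 1"] = "administratively prohibited"
             m["1 3"] = "address unreachable"
             m["1 4"] = "port unreachable"
             m["1 5"] = "failed ingress/egress policy"
             m["1 6"] = "reject route to destination"
             m["3 0"] = "hop limit exceeded in transit" }
     NF == 3 { printf "%6d  %s\n", $1, m[$2 " " $3] }' <<'EOF'
    385 1       0
    291 1       1
   9839 1       3
    367 1       4
      3 1       5
      1 1       6
     37 3       0
EOF
```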

 

Why?

To my mind, this is not only related to failures in NTP clients but to different kinds of IPv6 misbehavior in general. (For further reading, have a look at one of the many articles explaining the different types.)

  • no route to destination: Seems to be a generic IPv6 problem based on the router the end-user network is attached to. Or the source address was spoofed and never valid at all. Looking into my trace: many different sources were affected by this error, while almost all of these errors were sent by one single IPv6 router. So maybe a single central router having routing issues?
  • communication with destination administratively prohibited: Maybe some middleboxes (routers, firewalls) that do not work statefully? Probably not an NTP client problem but an administrative network issue. Looking into my trace, almost all of these errors were triggered by one single source. Hence no general problem.
  • address unreachable: Again, a generic IPv6 problem, for example when the layer 2 address is not resolvable (neighbor solicitation). But that many? I have no idea why.
  • port unreachable: OK, this seems to be related to bad NTP client configurations such as the one shown here.
  • source address failed ingress/egress policy: Why should someone block my source address?
  • reject route to destination: I found that single one in my trace: the original source of this NTP query was “fdef:ffc0:4fff:1:dd91:bb7:2a77:d84”. This was discarded by my firewall on purpose. Again an IPv6 issue, since someone configured this invalid source address.
  • hop limit exceeded in transit: Again a generic IPv6 routing issue. But those affected clients can't reach the Internet at all, can they?

If you want to have a look at those error messages you can download the trace file (with only the errors). 7zipped, 517 kb, 10923 packets:

(Trivia: Wireshark Filtering)

Note that I ran into some problems using Wireshark and tshark with display filters, since those match fields both in the ICMPv6 packet itself and in the quoted original IPv6 packet (in my case: NTP). There was a complicated discussion about this on Twitter:

Some possible solutions:

Conclusion

While I “just wanted to have a quick look at the ICMP errors” it took me a couple of hours to get just a little bit of knowledge out of this trace file. Sigh. I am not sure whether all my thoughts are 100 % correct.

Anyway, there are many different scenarios that lead to failures when it comes to IPv6 enabled NTP clients. To my mind, an error rate of more than 3 % is quite bad.

Featured image “3 o’clock” by Hani Amir is licensed under CC BY-NC-ND 2.0.

Network Time Security – New NTP Authentication Mechanism


This is a guest blogpost by Martin Langer, Ph.D. student for “Secured Time Synchronization Using Packet-Based Time Protocols” at Ostfalia University of Applied Sciences, Germany.


In many areas, the use of authentication mechanisms in NTP is important to prevent the manipulation of time information by an attacker. For many years, NTP has been offering solutions such as a symmetric key based method and the Autokey approach. However, both have serious disadvantages, which is why they are rarely used to secure NTP connections. After years of development, a new standard is to be adopted in 2020 that solves the problems of the current mechanisms and offers a real alternative. First implementations of the so-called Network Time Security protocol (NTS) are already available and interoperate with each other …

This article is one of many blogposts within this NTP series. Please have a look!
TL;DR: NTS is a new authentication scheme for NTP that fixes many issues of the previous security methods. It uses a separate TLS connection for the initial parameter and key exchange. The subsequent NTP connection is then secured by NTS extension fields. The functionality of NTP remains untouched, and the time data is not encrypted by NTS – but authenticated.

NTP – An important protocol with insufficient security

The introductory article on NTP makes it very clear: time is important to ensure the functionality of devices and processes. But what is the use of time synchronization if the time can be arbitrarily changed by attackers? Nothing at all! To protect this time information, NTP already offers two authentication modes in its current version 4. One of them is the older and still secure symmetric key approach, which unfortunately has a significant disadvantage: it does not scale. The usage of pre-shared keys always requires manual configuration of the client for each server. Simply adding new clients is therefore not possible, and changes to the server-side keys result in adjustments on all clients. These problems were meant to be solved by the Autokey method, which, however, was shown to have serious design flaws in a 2012 analysis. With this method, attackers can break a secured connection within a few seconds and modify the time data in the NTP packets. Therefore, the built-in solutions in NTP do not provide a satisfactory protection mechanism.
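To illustrate why the symmetric key approach doesn't scale, here is roughly what the manual per-server configuration looks like (hypothetical ntp.keys/ntp.conf fragments; the key ID and secret are made-up example values) — every single client needs exactly this, by hand, for every server:

```text
# /etc/ntp.keys -- shared secret, distributed manually to server AND client:
# <key id>  <type>  <secret>
20  SHA1  8a2bf4e1c09d73a5561ee2f4    # example value only

# /etc/ntp.conf on the client -- authenticate this association with key 20:
keys /etc/ntp.keys
trustedkey 20
server ntp.example.com key 20
```

Rotating the server-side secret means touching every one of these client files again.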

But what about alternatives like NTP over TLS or tlsdate? Unfortunately, TCP-based connections and tunneling concepts increase the latency and the NTP packet round-trip times. Even if these solutions offer higher security, it comes at the cost of synchronization accuracy and greater time fluctuations. The same applies to the alternative time service tlsdate, which additionally no longer works with the current TLS version 1.3.

Network Time Security (NTS), a new solution for NTP (overview)

The lack of security mechanisms in NTP already led to the development of the Autokey v2 specification in 2012. Due to the bad reputation of the previous version and the fundamentally different communication structure, Autokey v2 was renamed to Network Time Security a short time later. After years of development and a second revision of the NTS protocol [draft-06, draft-20], it is now nearing its completion.

The NTS protocol is a security extension for time protocols and currently focuses on NTP in unicast mode. It provides strong cryptographic protection against packet manipulation, prevents tracking, scales, is robust against packet loss and minimizes the loss of accuracy due to the securing process. To protect the time information, NTS uses the NTP Extension Fields (EF), in which parameter and status information are also transferred between client and time server. The secured time protocol remains untouched so that the usage of NTS in other protocols (e.g. the Precision Time Protocol (PTP)) is possible as well. This also means that the time data is not encrypted by NTS – but authenticated.

NTS consists of sub-protocols, which currently form two phases of communication (see figure 1). The first phase takes place once at the beginning of the communication and serves to negotiate parameters and exchange key material in the form of cookies. In the second phase, the NTS-secured NTP connection takes place. For this purpose, the client uses the cookies provided by the server, attaching them to its NTP requests. The client remains in this phase until the connection is terminated or until no cookies are available anymore due to repeated packet loss. In that case, the first phase is executed again.

Figure 1: Phases of the NTS secured communication

How does NTS work?

At this point, I will go into a little more detail by using a communication run.

Phase 1 – NTS Key Establishment (NTS-KE)

The first phase takes place via a TLS 1.2/TLS 1.3 connection on a separate TCP channel to protect the initial data exchange from manipulation (see figure 2). Thus NTS shifts the entire overhead of the parameter negotiation to the well-established TLS protocol and prevents possible design mistakes of implementing a custom handshake solution within NTP. Potential fragmentation of IP packets, e.g. during the transmission of large certificates, is therefore avoided. This procedure also allows the easy use of the existing PKI structure and a reliable check of the time server's identity, as long as the certificate issuer is trustworthy.

Figure 2: Separate communication channels between client and server

After completion of the TLS handshake and verification of the certificates, the negotiation of the NTS parameters takes place. This is done with so-called NTS Records (or rather TLS records) via the TLS Application Data Protocol. Among other things, the records contain connection information, crypto algorithms and a set of cookies (see figure 3a, 3b).

Figure 3a: NTS-KE phase: initial parameter and key negotiation (request message)
Figure 3b: NTS-KE phase: initial parameter and key negotiation (response message)

The connection information allows the optional negotiation of the destination server address (IPv4/IPv6; UDP port). In this case, the NTS-KE can also assign another time server, independently of the client's desired target server. This concept makes a separation of the NTS-KE from the time server possible and enables load balancing. Moreover, this can be done logically on the same machine as well as physically on separate machines (as in figure 4). In this way, several time servers can share one NTS-KE server. If no connection information is negotiated, the NTP time server can be reached on the same IP address as the NTS-KE, using the standard UDP port 123.

Figure 4: Separation of NTS-KE server and NTP time server

The crypto algorithms used are AEAD methods, which are applied later to protect the NTP packets. These algorithms use symmetric cryptography to protect the integrity of NTP packets and enable the optional encryption of data. Currently, NTS defines the AEAD algorithm AES-SIV [RFC 5297], which is insensitive to the reuse of a nonce.

To allow the time server to work statelessly, so-called cookies are used. These contain key material from the TLS connection, the negotiated crypto algorithm, and some further parameters. Both the NTS-KE server and the time server can generate the cookies and encrypt them with a secret master key. The structure and content of the cookies are not transparent to a client and depend on the server implementation. The encrypted cookies differ from each other, preventing the tracking of mobile devices (e.g. smartphones) across multiple networks at the NTS level. In addition, cookies have a lifetime determined by the server: the server generates a new master key at regular intervals but accepts older cookies for another one to two rotation periods, with rotations possibly taking place daily. This allows a smooth key refresh without the need for a new NTS-KE run.

If the client has received the cookies and parameters from the NTS-KE server, it also extracts the key material from the TLS connection and disconnects it afterward. The negotiation is now complete and phase 2 begins.

Phase 2 – NTS-secured NTP

In the second phase, the client communicates to the assigned time server via NTP. The client generates NTP requests, which are extended by the NTS content in the form of NTP Extension Fields (EF). An NTS-secured NTP request typically contains 3 to 10 extension fields, while a response packet contains two EFs (see figure 5). Currently, four NTS-EFs are defined and follow a TLV-like (Type-Length-Value) data format.

Figure 5: NTS Extension Fields for NTP

The first NTS-EF is the Unique Identifier EF. It contains random data and implements replay protection at the NTS level.

This is followed by the NTS cookie EF, which contains the oldest cookie from the client’s pool. Similar to a nonce, cookies are only used once in a request to prevent tracking. If an NTS request is sent, the cookie contained in the request is considered to be consumed.

A client always tries to hold a set of 8 cookies. This is achieved by the server returning a fresh cookie in its response, thus keeping the balance. However, if a packet loss occurs or an invalid message is discarded, cookies are missing from the pool afterward. In this case, the client inserts one or more NTS Cookie Placeholder EFs in the following request. For each placeholder EF, the client receives an additional cookie in the server's response. The size of the placeholder EF is identical to the size of the NTS Cookie EF, but it does not contain any data. This guarantees that request and response messages are always the same size and that amplification attacks are not possible.

The last extension field is the NTS Authenticator and Encrypted EF. It implements the integrity protection (the so-called authentication tag) over the NTP header and all previous extension fields. In addition to the authentication tag, it can also contain encrypted extension fields, which are usually not present in request packets. A finished NTS-secured NTP packet is much larger than an unsecured NTP packet of 48 bytes (NTP header only): it typically varies from 228 bytes to 1468 bytes (NTP header + NTS EFs). If IP fragmentation threatens, clients can request fewer cookies after packet loss.

If the server receives an NTS-secured request, it first decrypts the cookie with the master key and extracts the negotiated AEAD algorithm therein, as well as the keys contained. With this, the server now checks the integrity of the NTP packet to ensure that it is not manipulated. On success, the server generates one or more new cookies and creates the NTP response packet afterward. This always contains two NTS-EFs: The first is the Unique Identifier EF, which is taken unchanged from the request packet. The second is the NTS Authenticator and Encrypted EF, which secures the NTP packet and the previous EFs using the extracted keys, in the same way as the request. However, unlike the client, the server encrypts the cookies that are now included in this Extension Field. This procedure also protects the client from tracking because an attacker cannot extract the cookies from a response message. If the packet is finalized, it is sent back to the client.

After receiving the NTP packet, the client checks the Unique Identifier EF first. If a replay attack is excluded, the integrity check of the packet takes place. The client already knows the required key and the AEAD algorithm. If the check is successful, the encrypted cookies are automatically decrypted and exported. The client adds the new cookies to its pool and releases the time information for synchronization with NTP.

This communication process is repeated with the next request. In order to reduce the execution of the first phase, clients can also store whole sessions or cookies locally on the hard disk when the time service is stopped. This enables a reestablishment of the connection at a later time. It is also possible to send the same cookie repeatedly in case of connection problems, as these do not lose their validity after use. However, this is at the expense of data privacy, as it enables the tracking of a client.

NTS implementations and interoperability

Currently (Q4/2019), there are seven known implementations of NTS, which are in different stages of development. These include NTPsec, Ostfalia, Cloudflare, and Chrony (Red Hat). The first proof-of-concept implementation was already developed by the Ostfalia University in cooperation with PTB in Germany in 2015. Test servers and test implementations have been publicly accessible since 2018 and the software is open source [ntp, nts]. NTPsec offers a further NTS implementation that is fully executable. However, it should be noted that the NTS specification and thus the implementations have not yet been completed. The use is therefore at your own risk.

Since 2018, the Internet Engineering Task Force (IETF) has carried out several hackathons [IETF 101, 102, 104, 105] to test interoperability. Four independent implementations have already passed these tests, and no further problems were found in the current NTS specification. The merged results can be viewed in Table 1 below:

Table 1: Summary of IETF Hackathon results (NTS interoperability test)

 

Featured image “Gold Lock” by Mark Fischer is licensed under CC BY-SA 2.0.

Network Time Security – Strengths & Weaknesses


This is a guest blogpost by Martin Langer, Ph.D. student for “Secured Time Synchronization Using Packet-Based Time Protocols” at Ostfalia University of Applied Sciences, Germany.


The Network Time Security protocol (NTS) is close to completion as an Internet standard and will replace the existing security mechanisms in NTP. The introductory article on NTS describes the basic communication process as well as the most important features. Despite high-security efforts, NTS also has its limitations. In this blogpost, I list the strengths and weaknesses of the new authentication mechanism and describe them briefly.

This article is one of many blogposts within this NTP series. Please have a look!

Overview of the NTS Properties

To start off, Table 1 summarizes the main properties of NTS, which I will describe in more detail below. Since I was involved in the standardization process from the beginning, I have a good overview of NTS. Nevertheless, I cannot guarantee the completeness of the following content.

Table 1: Strengths and weaknesses of NTS

 

The Strengths of NTS

In general, NTS contains many design decisions to ensure the best and most efficient protection of the time protocol to be secured. At this point, I would like to emphasize the features of NTS that provide the most important advantages.

Defense Against Known Attacks

There are a lot of known attacks against NTP, ranging from time shifting to the complete deactivation of the NTP client (e.g. using Kiss-o’-Death (KoD) packets). Furthermore, NTP servers have been abused repeatedly in the past to perform DDoS attacks.

NTS aims to counteract these attacks with cryptographic measures. Authenticated packets prevent spoofing and the manipulation of time data. Cryptographically generated identifiers (UID) provide replay protection and identical sizes of request and response packets prevent DDoS amplification attacks.

Design of the Key Establishment (KE)

The decoupling of the initial key establishment (phase 1) from the NTS-secured NTP connection (phase 2) offers several advantages. One of them is the optional physical separation of the two phases on different machines. This method shifts the entire overhead from the time server to the KE server and allows load balancing. Phase 1 also uses the well-established TLS 1.2/1.3 and guarantees the source authentication of the time server. Moreover, crypto algorithms and connection information to the time server can be negotiated dynamically.

Good Scalability

The KE phase also realizes another fundamental property: scalability. A network with NTS-capable clients and servers can grow flexibly so that the manual configuration of new clients is no longer necessary. In addition, both the client and server can manage several connections simultaneously. Furthermore, NTP implementations can support a hybrid mode to allow unsecured and NTS-secured connections at the same time. Of course, unsecured connections only make sense in a trusted local network.

The AEAD Algorithm (AES-SIV)

Another central element in NTS is certainly the AEAD algorithm AES-SIV, which is used for the construction of cookies and ensures the integrity of the NTP packets. The algorithm uses symmetric cryptography, benefits from AES-based hardware acceleration (AES-NI), enables the optional encryption of NTP extension fields, and is robust against the reuse of a nonce. Key lengths of 256, 384, and 512 bits are currently used in NTS. However, extending it with additional AEAD algorithms is easily possible.

Dynamic Key Rotation / Key Freshness

A preventive measure against brute-force attacks on generated cookies is dynamic key rotation. Time servers renew the master key required for cookie creation at regular intervals. However, older keys remain valid for a few rotation periods to verify request packets carrying older cookies. This also allows a smooth key exchange between server and clients without re-executing the KE phase.

If KE server and time server are separated, the same master key can be deterministically derived on both machines using an HKDF algorithm, without the necessity of communicating with each other. However, the generation of a new key based on a cryptographically secure random number (CSPRNG) always offers higher security.

Unlinkability / Privacy Protection


An NTS-secured request packet realizes privacy protection through the use of cookies, which differ from each other at the binary level and can only be used once. The client therefore discards every cookie that has already been used in a request. The server replaces used cookies with new ones, which it sends back to the client in encrypted form. This means that a potential attacker cannot read or link the cookies, which prevents the tracking of mobile devices (e.g. notebooks or smartphones) across different networks.

NTP implementations such as NTPsec use Transmit Timestamp Randomization to ensure this protection at the NTP level too. For this purpose, the client writes random data instead of the actual transmission time into the NTP packet's Transmit Timestamp field.
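
The idea in two lines of Python (a conceptual sketch, not NTPsec's actual code): the client sends 64 random bits as the transmit timestamp and later matches them against the Origin Timestamp echoed in the server's reply:

```python
import os

def randomized_transmit_timestamp() -> bytes:
    # 64 random bits instead of the real transmit time; the client keeps
    # them to recognize the matching reply.
    return os.urandom(8)

def reply_matches(origin_ts: bytes, sent_ts: bytes) -> bool:
    # The server echoes the value in the Origin Timestamp field.
    return origin_ts == sent_ts

sent = randomized_transmit_timestamp()
```
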

Stateless Server

NTS time servers always operate statelessly and can serve any number of clients without keeping per-client state. The necessary state information is stored in the cookies and is always transmitted back to the client.
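
Conceptually, the server packs everything it would otherwise have to remember (key ID, the C2S and S2C keys) into the cookie itself. A heavily simplified sketch — real NTS encrypts the cookie with AES-SIV under the rotating master key; here plain HMAC-SHA256 only authenticates it, and the layout is invented for illustration:

```python
import hashlib, hmac, os, struct

MASTER_KEY = os.urandom(32)  # known only to the server

def make_cookie(key_id: int, c2s: bytes, s2c: bytes) -> bytes:
    # All per-client state goes into the cookie: the server stores nothing.
    body = struct.pack("!I", key_id) + c2s + s2c
    tag = hmac.new(MASTER_KEY, body, hashlib.sha256).digest()
    return body + tag

def open_cookie(cookie: bytes):
    # On the next request the server unpacks its own state from the cookie.
    body, tag = cookie[:-32], cookie[-32:]
    if not hmac.compare_digest(tag, hmac.new(MASTER_KEY, body, hashlib.sha256).digest()):
        raise ValueError("bad cookie")
    key_id = struct.unpack("!I", body[:4])[0]
    return key_id, body[4:36], body[36:68]

c2s, s2c = os.urandom(32), os.urandom(32)
cookie = make_cookie(7, c2s, s2c)
```
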

Robust Against Packet Loss

An NTS-secured NTP client always tries to hold a set of eight cookies. This allows the client to send further NTP requests with unused cookies in case of packet loss. Missing cookies are then refilled by the server, which means that the KE phase does not need to be executed again.
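
The client-side bookkeeping can be sketched like this (class and method names are made up; the target of eight cookies is from the text above):

```python
from collections import deque

class CookieJar:
    TARGET = 8  # an NTS client tries to hold eight cookies

    def __init__(self, cookies):
        self.pool = deque(cookies)

    def next_cookie(self) -> bytes:
        return self.pool.popleft()      # each cookie is used only once

    def refill(self, new_cookies) -> None:
        # The server's reply carries fresh cookies to top the pool up.
        self.pool.extend(new_cookies)

    def deficit(self) -> int:
        return max(0, self.TARGET - len(self.pool))

jar = CookieJar([b"c%d" % i for i in range(8)])
jar.next_cookie(); jar.next_cookie()    # two requests lost to packet loss
missing = jar.deficit()                 # ask the server for this many extras
jar.refill([b"n%d" % i for i in range(missing)])
```
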

Optional Resilience

However, if all cookies are lost due to packet loss, the client can decide whether to send the last cookie repeatedly. This avoids the execution of the KE phase but allows the client to be tracked through different networks (loss of privacy protection).

Independent of this, the client can store the cookies locally when the time service is stopped in order to re-establish the server connection without a KE phase at a restart.

 

Disadvantages and Weaknesses

Although NTS offers many advantages, some limitations need to be considered:

Delay attacks

One of the biggest problems is delay attacks, as they cannot be prevented even with the best cryptography. NTP packets delayed by an attacker are still valid and difficult to detect. Such a delay usually leads to asymmetric one-way delays, which results in a systematic time offset at the client. Therefore, an attacker can always skew a client's time by delaying packets. An effective countermeasure is a limitation of the packet round-trip time (RTT): the maximum possible time offset of the client corresponds to about half of the RTT. If the maximum permitted RTT is set to 300 milliseconds, the possible time offset of the client is limited to approximately +/-150 milliseconds. A further measure is the use of several time servers and different communication paths (see also RFC 8633, RFC 7384).

Reduction of the Synchronization Accuracy

Another disadvantage is the reduction of synchronization accuracy by NTS. This can be minimized, but not completely prevented. Securing the NTP packets is only possible after the timestamp has been written. This means that the NTS processing time (the latency) flows completely into the calculated round-trip time. The fact that these processing times vary slightly between client and server leads to a small RTT asymmetry and therefore to a systematic error. According to initial tests, this error is typically between 20 and 100 µs, depending on the implementation and platform. Since the fluctuations are dominated by the Internet, these deviations are usually not relevant for most applications.

Only for NTP in Unicast Mode

The NTS protocol is currently designed for unicast only. Support for broadcast connections is not possible with this concept since a connected slave cannot send cookies to its master. A TESLA-based procedure (see RFC 4082), which has not yet been pursued further, may be suitable for this. Since broadcast connections are rarely used and are frequently filtered over the Internet, this is less urgent.

The Chicken-or-Egg Problem

The NTS-KE phase requires valid server certificates for a successful TLS handshake. However, if the client's time is not within the validity range of the certificate, the verification fails (even if the server certificate is valid). This aborts the current KE phase completely, and an NTS-secured time synchronization never takes place. However, deviations of a few months are normally not a problem. Only in the case of very large time deviations (e.g. when setting up a device for the first time) is it necessary to set the date manually or to perform a one-time unsecured NTP synchronization.


For computer systems without an RTC (e.g. IoT devices or developer boards) this problem goes deeper. Such devices often lose their time after shutdown and reset it to a default value after a restart (e.g. 1970/01/01). As a result, an NTS-secured NTP connection would no longer work because the date is out of the certificate’s validity range. To counteract this, NTS implementations can store time information at regular intervals (e.g. every hour) and restore it after a system start. Storing a trusted date would be sufficient to ensure the verification of the certificates.
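
A minimal sketch of such a time hint for an RTC-less device, assuming a hypothetical state file (the path, file format, and interval are made up; a real implementation would use a protected location and atomic writes):

```python
import json, os, tempfile, time

# Hypothetical state file holding the last trusted time.
STATE_FILE = os.path.join(tempfile.gettempdir(), "nts-time-hint.json")

def save_time_hint() -> None:
    # Called periodically (e.g. every hour) while the clock is trusted.
    with open(STATE_FILE, "w") as f:
        json.dump({"unix_time": time.time()}, f)

def restore_time_hint(default: float = 0.0) -> float:
    # After boot: restore at least a trusted date so that the certificate
    # validity check in the KE phase can succeed (instead of 1970/01/01).
    try:
        with open(STATE_FILE) as f:
            return float(json.load(f)["unix_time"])
    except (OSError, ValueError, KeyError):
        return default

save_time_hint()
```
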

No End-to-End Security

NTS guarantees the integrity of transmitted time information between two devices, usually a client and a server. If the server synchronizes itself with other, lower-stratum servers (higher accuracy), it can of course also use NTS for that. However, there is no guarantee that the clock on the server itself cannot be manipulated.

Higher Network Traffic / Larger NTP Packets

NTS-secured NTP packets cause a significantly higher network load compared to unsecured NTP communication. While unsecured NTP packets have a size of 48 bytes, NTS-secured NTP packets typically have a size of between 228 and 1468 bytes.

This is not a problem for the client since the transmission intervals are typically between 16 and 128 seconds. On the other hand, for large time servers that answer thousands of requests per second, this can be a greater burden.
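
A quick back-of-the-envelope calculation with the packet sizes from above (the request rate of 10,000/s is a made-up example for a busy public server):

```python
# Payload bytes per NTP packet, per the sizes stated above.
PLAIN_NTP, NTS_MIN, NTS_MAX = 48, 228, 1468

def mbit_per_s(pkt_bytes: int, requests_per_s: int) -> float:
    # Each request gets an answer, so count the packet twice (in + out).
    return 2 * requests_per_s * pkt_bytes * 8 / 1e6

plain = mbit_per_s(PLAIN_NTP, 10_000)    # ~7.7 Mbit/s
secured = mbit_per_s(NTS_MAX, 10_000)    # ~235 Mbit/s worst case
```
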

Impl. dependent: Cookie Construction / KE Separation

The structure of the cookie described in the NTS specification is not normative; implementations may use a completely different structure. Furthermore, the communication between the time server and the KE server is implementation-dependent if they are physically separated. A bad software design could allow attacks and compromise security.

Impl. dependent: NTS Stripping Attack / Downgrade Attack

This is not a direct disadvantage of NTS but should be considered. A DoS attack on the client, such as intercepting and discarding NTP packets, can force the execution of the KE phase. In this case, good implementations should not react by lowering the protection mechanisms (e.g. falling back to TLS 1.2 instead of TLS 1.3) or by switching entirely to unsecured NTP communication.

Well, you did it.  ;)

Basic TCP and UDP Demos w/ netcat and telnet


I am currently working on a network & security training, module “OSI Layer 4 – Transport”. Therefore I made a very basic demo of a TCP and a UDP connection in order to see the common “SYN, SYN-ACK, ACK” for TCP (and none of them for UDP), to use “Follow TCP/UDP Stream” in Wireshark, and so on. I wanted to show that it’s not that complicated at all. Every common application/service simply uses these data streams to transfer data aka bytes between a client and a server.

That is: Here are the Linux commands for a basic lab, a downloadable pcap, and, as always, some Wireshark screenshots:

TCP

Listening with netcat on the server on port 1337:

netcat -6 -l 1337

Verifying the listening port:

netstat -tulpen6

In my case, this looks like:

weberjoh@nb15-lx:~$ netstat -tulpen6
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode       PID/Program name
tcp6       0      0 :::22                   :::*                    LISTEN      0          21160       -
tcp6       0      0 :::1337                 :::*                    LISTEN      1000       1490116     20122/netcat
udp6       0      0 fe80::d6be:d9ff:fe4:123 :::*                                0          22715       -
udp6       0      0 2001:470:765b::b15::123 :::*                                0          22713       -
udp6       0      0 ::1:123                 :::*                                0          22711       -
udp6       0      0 :::123                  :::*                                0          22699       -

Now connecting from the client to the server with telnet:

telnet <ip> <port>

In my case, along with some text messages in both directions:

weberjoh@vm24-ns0:~$ telnet 2001:470:765b::b15:22 1337
Trying 2001:470:765b::b15:22...
Connected to 2001:470:765b::b15:22.
Escape character is '^]'.
Hello
Hi there
Greetings from the client to the server!
Thanks. Greetings back from the server to the client.
Cheers
Goodbye
^]
telnet> quit
Connection closed.

Wireshark reveals the TCP flags in the Info column for connection establishment and termination. Have a look at the ACKs directly after each sent message, regardless of the direction. Finally, a “Follow TCP Stream” shows the raw data, coloured by the direction in which they were transmitted:
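
If you prefer code over netcat/telnet, the same exchange fits in a few lines of Python sockets (a sketch using IPv4 loopback for simplicity, while the demo above runs over IPv6; capture it and you’ll see the same handshake and teardown):

```python
import socket, threading

# Server side: the equivalent of "netcat -l".
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

def server():
    conn, _ = srv.accept()        # completes SYN, SYN-ACK, ACK
    data = conn.recv(1024)
    conn.sendall(b"Hi there: " + data)
    conn.close()                  # triggers the FIN/ACK teardown

threading.Thread(target=server, daemon=True).start()

# Client side: the equivalent of "telnet <ip> <port>".
cli = socket.create_connection(("127.0.0.1", port))   # sends the SYN
cli.sendall(b"Hello")             # each segment is ACKed by the peer
reply = cli.recv(1024)
cli.close()
```
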

UDP

Basically the same with UDP. Listening on the server on port 2311:

netcat -6 -l -u 2311

Proto type “udp6” is shown with netstat:

weberjoh@nb15-lx:~$ netstat -tulpen6
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode       PID/Program name
tcp6       0      0 :::22                   :::*                    LISTEN      0          21160       -
udp6       0      0 fe80::d6be:d9ff:fe4:123 :::*                                0          22715       -
udp6       0      0 2001:470:765b::b15::123 :::*                                0          22713       -
udp6       0      0 ::1:123                 :::*                                0          22711       -
udp6       0      0 :::123                  :::*                                0          22699       -
udp6       0      0 :::2311                 :::*                                1000       1490184     20131/netcat

Connecting from the client, using netcat (and not telnet, which is not capable of UDP):

netcat -u <ip> <port>

Now my demo, again with some text messages and umlauts:

weberjoh@vm24-ns0:~$ netcat -u 2001:470:765b::b15:22 2311
Hi over UDP
Guten Tag auch
Oh, you speak German
Kann ich auch
Sehr schön. Sogar mit Umlauten.
;)
Yup. Ciao.
Tschö
^C

Through Wireshark’s glasses: no connection establishment nor termination, no ACKs, only the raw data in both directions, one single UDP packet per sent text message. Quite easy. “Follow UDP Stream” works as well:
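
And the Python socket version of the UDP demo (again a sketch on IPv4 loopback; note there is no listen()/accept() and every sendto() becomes exactly one datagram on the wire):

```python
import socket

# "Server": just a bound datagram socket, like "netcat -l -u".
srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))                   # no handshake, no listen()
port = srv.getsockname()[1]

# "Client", like "netcat -u <ip> <port>".
cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.sendto("Hi over UDP".encode(), ("127.0.0.1", port))
msg, addr = srv.recvfrom(1024)
# Umlauts survive, since UTF-8 bytes go through unchanged.
srv.sendto("Sehr schön. Sogar mit Umlauten.".encode(), addr)
reply, _ = cli.recvfrom(1024)
```
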

pcap

Have a look at the corresponding pcap, if you like. 7zipped, 1 KB:

Featured image “Slices of rye bread with butter on a wooden board” by Marco Verch Professional Photographer and Speaker is licensed under CC BY 2.0.
