One of the great many frustrations I hear frequently from people in the network community is in dealing with customers who practice “traceroute engineering” when opening trouble cases. At CacheFly we’re extremely lucky in that we almost never have customers who do this, but the problem certainly exists. Worse yet, 9 times out of 10 people make fundamentally incorrect assumptions, inferences and conclusions based on something as simple as a traceroute.
Enter Richard Steenbergen. One of the truly gifted people out there when it comes to running a network, Richard is the CTO and founder of nLayer. A few years ago Richard gave a fantastic presentation at NANOG on how to *really* read a traceroute, and a bunch of other smart people walked away with a much better understanding of how traceroute actually works. Richard has since converted his presentation into an article which is very easy to digest and yet provides all the info from his original presentation. I’ve included a couple of interesting excerpts below, but for a proper education, be sure to read the full PDF. It’s a total of 20 pages including some graphics, and well worth your time if you ever look at a traceroute for work or pleasure.
To understand queuing delays, first you must understand the nature of interface utilization. For
example, a 1GE port may be said to be “50% utilized” when doing 500Mbps, but what this actually
means is 50% utilized over some period of time (for example, over 1 second). At any given instant,
an interface can only be either transmitting (100% utilized), or not transmitting (0% utilized).
When a packet is routed to an interface but that interface is currently in use, the packet must be
queued. Queuing is a natural function of all routers, and normally contributes very little latency to
the overall forwarding process. But as an interface approaches the point of saturation, the
percentage of packets which must be queued for significant periods of time increases exponentially.
The amount of queuing delay that can be caused by a congested interface depends on the type of
router. A large carrier-class router typically has a significant amount of packet buffering, and can add
many hundreds or thousands of milliseconds of latency when routing over a congested interface. In
comparison, many enterprise-class devices (such as Layer 3 switches) typically have very small
buffers, and may simply drop packets when congested, without ever causing a significant increase in latency.
Matt: Once you’ve read this once or twice, it makes all the sense in the world. But did you ever really look at a GigE port doing 512kbps and think of it as operating at 100% utilization?
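To put rough numbers on the excerpt above, here is a minimal Python sketch of the arithmetic. It is my own illustration, not something from Richard's article: the function names, the 1500-byte packet size and the buffer sizes are all assumptions picked to make the math easy to follow.

```python
# Back-of-the-envelope sketch: utilization is an average over a time window,
# and a full buffer adds (buffer size / link rate) of delay.
# All packet sizes and buffer sizes below are hypothetical.

def serialization_time_s(packet_bytes, link_bps):
    """Seconds the interface is busy (100% utilized) sending one packet."""
    return packet_bytes * 8 / link_bps

def average_utilization(packet_sizes_bytes, link_bps, window_s):
    """Fraction of the window the interface spent transmitting."""
    busy_s = sum(serialization_time_s(size, link_bps) for size in packet_sizes_bytes)
    return busy_s / window_s

def max_queue_delay_ms(buffer_bytes, link_bps):
    """Worst-case wait for a packet that arrives behind a completely full buffer."""
    return buffer_bytes * 8 / link_bps * 1000

GIGE_BPS = 1_000_000_000

# About 41,667 packets of 1500 bytes in one second is roughly 500 Mbps,
# which is what a graph would call "50% utilized".
half_loaded = [1500] * 41667
print(round(average_utilization(half_loaded, GIGE_BPS, 1.0), 3))

# A deep (assumed 125 MB) buffer versus a shallow (assumed 1 MB) buffer.
print(round(max_queue_delay_ms(125_000_000, GIGE_BPS), 1))
print(round(max_queue_delay_ms(1_000_000, GIGE_BPS), 1))
```

Each 1500-byte packet keeps a 1GE port fully busy for about 12 microseconds, so the "50%" figure is just the sum of those busy slices divided by the measurement window. The buffer calculation shows why a deep-buffered router can add a second of latency under congestion, while a shallow-buffered switch runs out of queue after a few milliseconds and starts dropping instead.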
One of the most basic concepts of routing on the Internet is that there is absolutely no guarantee of
symmetrical routing of traffic flowing between the same end-points but in opposite directions. Regular
IP forwarding is done by destination-based routing lookups, and each router can potentially have its own
idea about where traffic should be forwarded.
As we discussed earlier, Traceroute is only capable of showing you the forward path between the source
and destination you are trying to probe, even though latency incurred on the reverse path of the ICMP
TTL Exceed packets is part of the round-trip time calculation process. This means that you must also
examine the reverse path Traceroute before you can be certain that a particular link is responsible for
any latency values you observe in a forward Traceroute.
Asymmetric paths most often start at network boundaries, because this is where administrative policies
are most likely to change. For example, consider the following Traceroute:
3 te1-1.ar2.DCA3.gblx.net (188.8.131.52) 0.719 ms 0.560 ms 0.428 ms
4 te1-2-10g.ar3.DCA3.gblx.net (184.108.40.206) 0.574 ms 0.557 ms 0.576 ms
5 sl-st21-ash-8-0-0.sprintlink.net (220.127.116.11) 100.280 ms 100.265 ms 100.282 ms
6 18.104.22.168 (22.214.171.124) 102.037 ms 101.876 ms 101.892 ms
7 sl-bb20-dc-15-0-0.sprintlink.net (126.96.36.199) 101.888 ms 101.876 ms 101.890 ms
This Traceroute shows a 100ms increase in latency between Global Crossing in Ashburn VA and Sprint in
Ashburn VA, and you’re trying to figure out why. Obviously distance isn’t the cause for the increased
latency, since these devices are both in the same city. It could be congestion between Global Crossing
and Sprint, but this isn’t guaranteed. After the packets cross the boundary between Global Crossing and
Sprint, the administrative policy is also likely to change. In this specific example, the reverse path from
Sprint to the original Traceroute source travels via a different network, which happens to have a
congested link. Someone looking at only the forward Traceroute would never know this though, which is
why obtaining both forward and reverse Traceroutes is so important to proper troubleshooting.
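The reverse-path point is easier to see with a toy model. The sketch below is my own illustration, assuming the round-trip time reported for each hop is simply the forward one-way delay plus the reverse one-way delay. The hop names and every delay value are made up and are not measurements from the trace above.

```python
# Toy model: the RTT traceroute reports for a hop is forward delay plus
# reverse delay, so a slow reverse path looks like a slow forward link.
# Hop names and one-way delays are hypothetical.

def hop_rtts_ms(hops):
    """Return [(name, rtt_ms)] where each hop is (name, forward_ms, reverse_ms)."""
    return [(name, fwd + rev) for name, fwd, rev in hops]

def largest_jump(rtts):
    """Return (hop_name, increase_ms) for the biggest RTT increase between consecutive hops."""
    best_name, best_delta = None, 0.0
    for (_, prev_rtt), (name, cur_rtt) in zip(rtts, rtts[1:]):
        delta = cur_rtt - prev_rtt
        if delta > best_delta:
            best_name, best_delta = name, delta
    return best_name, best_delta

# Forward delays stay tiny at every hop. Replies from the last two hops
# come back over a different network with an (assumed) congested link.
HOPS = [
    ("gblx-ar2", 0.25, 0.25),
    ("gblx-ar3", 0.30, 0.30),
    ("sprint-ash", 0.35, 100.0),
    ("sprint-dc", 1.0, 100.9),
]

rtts = hop_rtts_ms(HOPS)
for name, rtt in rtts:
    print(f"{name:12s} {rtt:8.2f} ms")

name, delta = largest_jump(rtts)
print(f"largest jump at {name}: {delta:.2f} ms")
```

In this model the forward path never exceeds 1 ms one-way, yet the output shows a roughly 100 ms jump at the first Sprint hop. Nothing in the forward trace alone can tell you whether that jump belongs to the link you are looking at or to the path the replies took home.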
Matt: These are just the tip of the iceberg. In the full document, Richard dives into MPLS tunnels, ECMP, serialization and queuing delays, and much more.
The full article is located here: http://cluepon.net/ras/traceroute.pdf