In "End-to-End Routing Behavior in the Internet", Vern Paxson describes the methodology and analysis of 40,000 "traceroute" measurements made between 37 Internet sites during the autumns of 1994 and 1995. Twenty years of communication network studies had produced an abundance of literature on routing protocols and algorithms, but very little on routing behavior. In Paxson's view this study was the first to focus on "the end-to-end dynamics of routing information". Specifically, the effort was interested in the virtual "data link" path between Internet hosts A and B (symbolically shown as A => B). At any instant in time this virtual path is seen by the data link layer as a single "route", but it is really a sequence of hops between routers from A to B. The study examined how stable this virtual route is, or in other words, how the route behaves over time. An Internet measuring utility named "traceroute" was loaded on the participating sites. Its use is not explained; rather, the reader is directed to W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols for further discussion. Overall, the author assumes considerable knowledge of network communications. Given the article's publication in the IEEE/ACM Transactions on Networking, that assumption may be valid, but less informed readers will likely need several passes through the article before reaching a level of understanding.
Table I lists the Internet sites that participated: 26 in North America, 8 in Europe, and 3 in the Pacific. Figures 1 and 2 show maps of the sites and the links traversed during the study. Two sets of data were collected. The first, collected in 1994 and referred to as D1, was followed the next year by a second (D2), whose design was adjusted to allow for additional participating sites and for paired measurements. In a paired measurement, the reverse route B => A was measured immediately after the measurement of the virtual path A => B concluded. The D2 data was collected using "burst" measurement techniques. "Poisson Arrivals See Time Averages" (PASTA) principles were used to allow comparison of D1 and D2 even though they had different sampling rates. At the time of the study there were more than 50,000 hosts, but Paxson argues that the 37 participating sites "reflect a significant cross-section of the behavior". The study uncovered a number of shortcomings, which Paxson believes could be overcome with better collection and analysis tools.
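The PASTA idea rests on scheduling measurements so that the gaps between them are exponentially distributed, which makes the measurement times a Poisson process. The following is a minimal sketch of such a schedule in Python. The function name, the one-hour mean interval, and the ten-day duration are illustrative assumptions and are not taken from Paxson's paper.

```python
import random

def poisson_schedule(mean_interval_s, duration_s, seed=None):
    """Return measurement start times (in seconds) whose gaps are
    exponentially distributed with the given mean, i.e. a Poisson process.

    All parameter values used with this function are hypothetical.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_interval_s          # average measurements per second
    times = []
    t = rng.expovariate(rate)             # time of the first measurement
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate)        # independent exponential gap
    return times

# Hypothetical example: one measurement per hour on average, for ten days.
schedule = poisson_schedule(mean_interval_s=3600, duration_s=10 * 86400, seed=1)
```

Because the gaps are memoryless, the fraction of such measurements that observe a given state approaches the fraction of time the network spends in that state, regardless of the mean interval chosen. That is why data sets sampled at different rates can still be compared.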
Nearly 7000 "traceroutes" between 27 sites were undertaken during the D1 experiments. During the 1995 experiment, 33 sites attempted over 5 times as many "traceroutes". For both measurements over 90% of the hosts were able to connect to the distant end of the route. Where the routing resulted in "substandard performance", it was usually attributed to routing loops, erroneous routing, or rapidly changing routes. Routing loops can be one of three types: forwarding, information, or "traceroute". A forwarding loop occurs when a packet sent out by a router returns to it, while an information loop is a specific case of the forwarding loop in which the packet returns because of some information previously propagated by the sending router. Traceroute loops differ in that the measurement utility reports the same sequence of router hops more than once. Routing algorithms are designed to avoid forwarding loops, so when they occur, they usually resolve quickly. Persistent loops frequently come in clusters, with the author citing the Washington DC area, where there is a high degree of interchange between Internet service providers (ISPs). There was one case where packets took a totally wrong path, from Connecticut to England by way of Israel. Another abnormality is called fluttering, which occurs when a route rapidly oscillates or changes. This is a more serious situation, because it creates unstable network paths, which may be unstable in one direction or in both. If fluttering occurs in only one direction, it results in errors in return-trip bandwidth estimates. More importantly, different routes have different propagation times, resulting in packets arriving at the distant end out of order. There were also a half dozen instances where the packet did not reach its destination because its time-to-live (TTL) counter expired. Temporary network outages were the principal cause of the loss of a series of packets.
Across the two experiments, around half of the "traceroutes" experienced no lost packets and fewer than 2% experienced more than 6 lost packets. Table II provides a summary of the routing pathologies just discussed.
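To make the "traceroute" loop concrete, the sketch below scans a list of reported hops for a sequence of routers that repeats back to back. It is only an illustration under stated assumptions: the function name and the router labels are hypothetical, and the rule used here (a cycle of at least two routers seen twice in a row) is one simple heuristic rather than the detection method used in the paper.

```python
def find_repeated_cycle(hops, min_period=2):
    """Return the first sequence of hops that appears twice in a row,
    or None if no such repetition exists.

    hops: router identifiers in the order traceroute reported them.
    min_period: shortest cycle length to report (assumed value; a single
    router listed twice in a row is not treated as a loop here).
    """
    n = len(hops)
    for start in range(n):
        # A cycle of length `period` needs 2 * period hops from `start`.
        for period in range(min_period, (n - start) // 2 + 1):
            cycle = hops[start:start + period]
            if hops[start + period:start + 2 * period] == cycle:
                return cycle
    return None

# Hypothetical traceroute output in which r2 and r3 pass packets back and forth.
looping = ["r1", "r2", "r3", "r2", "r3", "r2"]
clean = ["r1", "r2", "r3", "r4"]
print(find_repeated_cycle(looping))  # ['r2', 'r3']
print(find_repeated_cycle(clean))    # None
```

A repeated cycle in a single traceroute does not by itself show that the loop persisted; deciding that requires looking at how long the repetition lasts across successive measurements.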
Paxson offers stability as a key property of end-to-end routing, which he defines as how frequently routes change over time. He defines two attributes of stability: prevalence and persistence. Prevalence is a measure of how likely you are to observe again the route you just observed. For instance, if you saw the same route 10 out of 11 times, then there is a 10/11 chance that you'll see it the next time through. Persistence is a measure of time, specifically how long you'd expect to wait before the route changes. For instance, using the example of 10/11 prevalence, if the time for the 11 observations is measured in days then the route is also persistent; if it's measured in seconds, then it isn't. For data reduction, three levels of granularity were selected to help account for shifting workload between interchangeable hosts: hostname (any change), city, and autonomous system (AS) collections (major change). Of the route changes observed at host granularity, 57% also occurred at city granularity, and 36% at AS granularity. Figure 6 plots the cumulative probability against the prevalence of the dominant route for all three granularities, as observed over more than 1,000 virtual paths during D2. The same route was observed 82% of the time for over half of the paths measured. Table III summarizes the persistence at six time scales, ranging from seconds and minutes up to days. It is noted that for routes that persist for days, 50% last for less than 7 days, while the other "50% account for 90% of the total route lifetimes."
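The distinction between prevalence and persistence can be worked through in a few lines of Python. The sketch below computes the prevalence of the dominant route from a list of observations, and separately measures how long each unbroken run of the same route lasted. The function names and the sample data are assumptions made for illustration; in particular, the run duration used here (time between the first and last observation in a run) is only a lower bound on how long a route was really in place, and is not the estimator used in the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby

def dominant_prevalence(routes):
    """Return (dominant_route, fraction of observations that showed it)."""
    route, count = Counter(routes).most_common(1)[0]
    return route, Fraction(count, len(routes))

def run_durations(observations):
    """Group time-ordered (timestamp, route) pairs into runs of the same
    route and return (route, last_timestamp - first_timestamp) per run."""
    runs = []
    for route, group in groupby(observations, key=lambda obs: obs[1]):
        times = [t for t, _ in group]
        runs.append((route, times[-1] - times[0]))
    return runs

# The 10-out-of-11 example from the text, with hypothetical route labels.
routes = ["A"] * 10 + ["B"]
print(dominant_prevalence(routes))  # ('A', Fraction(10, 11))

# The same 11 observations, one per day versus one per second.
daily = [(day * 86400, r) for day, r in enumerate(routes)]
per_second = [(sec, r) for sec, r in enumerate(routes)]
print(run_durations(daily)[0])       # ('A', 777600)  i.e. nine days
print(run_durations(per_second)[0])  # ('A', 9)       i.e. nine seconds
```

Both data sets give the same 10/11 prevalence, yet only the daily one shows a route lasting for days, which is the sense in which a prevalent route need not be a persistent one.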
"Major" asymmetries, or a lack of equal pairs of propagation time, are of interest because it is common to assume that the one way propagation time between two hosts is half that of the roundtrip time (RTT). Asymmetries impact router utility, accounting, troubleshooting as well as network measurements. Asymmetry is also caused when ISP routing schemes are based on economic decisions rather than on efficient operations. Asymmetries are common, even at the autonomous system level, were 30% of the paired measurements observed different virtual paths. Fortunately, the majority of asymmetries are confined to a "single hop", with only 33% differing by 2 or more "hops".
Paxson concludes that "widespread variation" is the constant theme of the study, meaning that different site pairs behave very differently. Because there is no "typical" Internet site, there is no "typical" Internet path.