lasse220

Arista 7150S: Our PTP capable switch of choice - should it be yours?

In this article, I look at how we used Timebeat and the Qulsar Qg2 in our lab to discover and quantify the error introduced by the transparent and boundary clock functionality of the Arista 7150S series switch, and at a few interesting things we found in the process. I'm confident the results obtained from this widely deployed switch will surprise you. If you have these switches, you need to keep reading. At a glance, timekeeping seems like an easy thing to deploy, but in reality it is very complicated to get right, and errors are hard to detect.


(Note: Since we first published this article we have learned two things and updated the content accordingly. Firstly, Arista told us that transparent clock mode in the 7150S is not supported between interfaces that do not run at the same link speed, and that this could be the reason we initially saw unusual results in one of our tests. With this in mind, we have changed the tests so that all links run at the same speed (1Gbps). Secondly, we learned from our friends at Meinberg that the PHY matters more than we had considered. That is, a baseline test containing 1 cable and 1 SFP compared with another test containing 2 cables and 3 SFPs will, "ceteris paribus", introduce error, and this must be considered and accounted for. With this in mind, we've changed all of the tests to have the same number of cables and SFPs.)


It's been a great start to 2021. We've been working on improving the capability of our clock steering servos and filters to make Timebeat the most accurate clock synchronisation solution in the world. The results we see are beyond what we hoped we could achieve when we started out in November last year. Our solution is now so accurate that manufacturers of specialised high precision network cards use our software to test the accuracy of the hardware they produce for things like 5G networks - a sector where accuracy is a key requirement.

We've also begun to extend the capability of our software so that it synchronises and monitors not only clocks in Windows and Linux, but now also switches and network equipment that act as boundary and transparent clocks. We've developed an exciting relationship with Qulsar, which makes an amazingly accurate yet cost-effective grandmaster clock that is widely deployed in mobile networks all over the world.

Many of our customers in the financial services space rely on switches from Arista Networks. Arista was one of the first companies to launch a switch that had very low port-to-port latency and which could function as a PTP boundary and transparent clock.

One of the switches we see deployed most widely in this space is the Arista 7150S switch which forms part of Arista's "Ultra low latency" switch line-up. It is advertised to have a port-to-port latency of just 350-380ns.

To test the impact the switch has on timing accuracy, I've built a lab where we first establish a baseline.


Baseline - no switch, just cables




Here we use the Qulsar Qg2 to feed PPS to a clock on a NIC in testhost03 (a). Then we use testhost03 to provide a PTP feed to testhost04 which is where Timebeat first synchronises a clock on a NIC (b), from here Timebeat synchronises the system clock (c) which in turn synchronises the clock on a second installed NIC (d). On the second NIC we receive a PTP feed from the Qulsar Qg2 which we only monitor. That is, we compare how well clock (d) in the diagram above compares with the PTP feed, but we don't use the PTP feed from the Qg2 to steer/synchronise the clock.

We have also introduced a device to join two SFPs. We do this so that when we move these SFPs to an Arista switch in the tests that follow, no new error is introduced that is not accounted for elsewhere. This will give us a more "honest" view of the Arista's capabilities.

The result of the baseline test as seen on testhost04 is shown below.



What we can see from the baseline test is that the PTP feed that testhost04 receives at (b) from testhost03 (derived from a PPS feed on testhost03 from the same Qg2) matches the PTP feed received directly from the Qg2 very well. This provides us with excellent assurance that no error in the time chain exists before we introduce the Arista switch between testhost03 and testhost04. Accuracy for both sources has a mean of about 0ns, is unbiased and is maintained consistently in the low nanosecond range.
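As a rough illustration of what "a mean of about 0ns" and "low nanosecond range" mean in practice, here is a minimal Python sketch that summarises a series of offset samples. The function name and the sample values are hypothetical and chosen purely for illustration; they are not measurements from this lab and this is not how Timebeat computes its statistics.

```python
import statistics


def summarise_offsets(offsets_ns):
    """Return the mean, standard deviation and worst-case absolute offset in ns."""
    return {
        "mean_ns": statistics.fmean(offsets_ns),      # bias: should sit close to 0
        "stdev_ns": statistics.pstdev(offsets_ns),    # spread around the mean
        "max_abs_ns": max(abs(x) for x in offsets_ns) # worst single excursion
    }


# Hypothetical offset samples in nanoseconds, NOT data from the tests above.
samples = [3, -2, 1, -4, 2, 0, -1, 1]
summary = summarise_offsets(samples)
print(summary)
```

A mean close to zero indicates that the source is unbiased, while the standard deviation and the worst-case value show how tightly the offset is held around that mean.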

We now introduce the Arista switch between the two hosts by changing the lab setup as per the diagram below.


Arista 7150S - Transparent clock mode


Here we have configured the Arista to function as a transparent clock. What we expect to happen is that the Arista measures the time between a PTP packet ingressing on one port and egressing on another. The switch then adds this value to the "correctionField" in the PTP header, so that the receiving party can take into account how long the packet spent inside the switch - this is called the "residence time". By removing the residence time in both directions, the intention is that the error caused by the variable delay of packets in the switch buffers can be discounted from the PTP calculation. It should then be as though the switch wasn't in the path, and we would therefore expect to see results similar to those we obtained in the baseline experiment.
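To make the role of the correctionField concrete, here is a minimal Python sketch of the standard end-to-end (delay request-response) offset calculation, with and without the residence-time correction. The function name, the timestamps and the residence times are all hypothetical values chosen for illustration, not measurements from this lab, and any Follow_Up correction is assumed to be folded into the Sync correction for simplicity.

```python
def ptp_offset_ns(t1, t2, t3, t4, corr_sync=0, corr_delay_resp=0):
    """Estimate the client's offset from the master in nanoseconds.

    t1: master sends Sync           t2: client receives Sync
    t3: client sends Delay_Req      t4: master receives Delay_Req
    corr_sync / corr_delay_resp: residence times accumulated in the
    correctionField by transparent clocks in each direction.
    """
    master_to_client = (t2 - t1) - corr_sync
    client_to_master = (t4 - t3) - corr_delay_resp
    mean_path_delay = (master_to_client + client_to_master) / 2
    return master_to_client - mean_path_delay


# Hypothetical scenario: the client clock is really 25 ns ahead, cabling adds
# 100 ns each way, and the switch holds the Sync for 350 ns but the Delay_Req
# for 900 ns (an asymmetric, queue-dependent delay).
t1, t2 = 0, 0 + 100 + 350 + 25
t3, t4 = 10_000, (10_000 - 25) + 100 + 900

uncorrected = ptp_offset_ns(t1, t2, t3, t4)
corrected = ptp_offset_ns(t1, t2, t3, t4, corr_sync=350, corr_delay_resp=900)
print(uncorrected)  # -250.0
print(corrected)    # 25.0
```

Without the correction, the asymmetric residence times are misread as a clock offset. With the residence times subtracted, the calculation recovers the true 25 ns offset, as if the switch were not in the path.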

In this test we can see that the 7150S achieves good results. The normal delay variations that would follow the introduction of a switch are mitigated by the transparent clock function and excellent accuracy is maintained between the Timebeat PTP grandmaster and the Timebeat PTP client. The result appears in the graph below and we can see that the two sources have a low variance and a mean of 0ns.



To enable this mode in the Arista on two interfaces, we simply apply the configuration below:


It is essential here that interfaces Ethernet3 and Ethernet4 have the same link speed, or the PTP transparent clock mode will not operate correctly and a large error will be introduced.


Arista 7150S - Boundary clock mode

Another option to mitigate clock error as time is propagated throughout a network is a mode of operation called boundary clock. If we enable this mode, the Arista will act as a client towards a clock source and as a source towards clients. The first benefit of this approach is that the original clock source has to deal with fewer clients, as only the boundary clock becomes a client of the original source and in turn acts as a source for end-clients. The other main benefit is that error introduced by packet queues in the switch network between clock and client is eliminated or at least minimised. There are several drawbacks, but I will save these for a future article.

The lab network for the boundary clock experiment stays the same, but we've changed the Arista switch PTP mode to boundary clock. The results we obtain running the Arista in this mode are shown below.


We can see that the PTP feed delivered through the Arista lines up well with our "control" PTP feed directly from the Qg2, the source we only monitor as a reference.

To enable this mode in the Arista on two interfaces we simply apply the configuration below:

In PTP boundary clock mode, interfaces Ethernet3 and Ethernet4 do not have to have the same link speed. This is useful if you have a grandmaster clock that does not have a 10Gb interface but a server that relies on having one.

The results obtained highlight why the Arista PTP capable switch is considered best-in-class and, hopefully, also how using Timebeat enables network administrators to verify the accuracy of time dissemination in a network. There are a lot of things we've omitted to show in these experiments in respect of how the Timebeat filters and servos operate, but the results show that it is perfectly possible to achieve synchronisation in the single digit nanosecond range with a combination of Timebeat and Arista.

If you read the first version of this article, then you could be forgiven for believing that we are not fans of Arista. In fact the opposite is the case. Throughout our tests we have used Timebeat and the monitoring capability it has to track the PTP status of the Arista's ports, oscillator, offset etc. via the Timebeat Management System which in turn relies on Arista's cool eAPI. I provide a screenshot of this below where you can see how the state of everything PTP in the Arista is monitored in Timebeat.
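For readers who have not used eAPI, here is a minimal Python sketch of what a request for PTP status could look like. eAPI is a JSON-RPC interface served at `/command-api`, and commands are submitted through the `runCmds` method. The hostname is hypothetical, authentication is left out, and the sketch only builds the request without sending it. This is an illustration of the eAPI request format, not a description of how the Timebeat Management System talks to the switch.

```python
import json
import urllib.request


def build_eapi_request(host, commands):
    """Build (but do not send) a JSON-RPC request for Arista eAPI."""
    payload = {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": commands, "format": "json"},
        "id": "ptp-status",  # arbitrary request identifier
    }
    return urllib.request.Request(
        url=f"https://{host}/command-api",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "arista-lab.example" is a placeholder hostname, not a device from this lab.
request = build_eapi_request("arista-lab.example", ["show ptp"])
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request with credentials would return the output of each command as structured JSON, assuming eAPI is enabled on the switch and the command supports JSON output.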


One of my favourite features is how every Timebeat client connected to an Arista boundary clock automatically appears in the table at the bottom. From there you can click the client you are interested in and go directly to the dashboard relevant to that specific client. It's really very cool stuff that you will not get from anything other than the Arista-Timebeat combination.

In my next article I will be discussing how Timebeat’s revolutionary External Devices module allows you to complete the timing error chain for network devices such as boundary and grandmaster clocks all in one place, a feature you just will not find offered by any other vendor. With these new features we are well on our way to creating one platform that monitors every single link in the timing chain in one place.

If you are interested in finding out how your company can use Timebeat to understand whether clocks are synchronised accurately, or how our integration with Arista works, then feel free to get in touch. We offer both enterprise and free licenses along with a cool front-end that can be deployed on-prem or delivered as PaaS. You can download a copy on timebeat.app for free.

Lastly, I will just mention that in this set of tests we've consistently used 16 PTP transactions per second (a log message interval of -4) for both sync and delay-request/response packets. In our experience, this strikes a good balance between accuracy and the load experienced both on the client servers running Timebeat and on the network carrying the PTP traffic.
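The relationship between the interval value and the message rate is easy to check. PTP expresses message intervals as a base-2 logarithm of the period in seconds, so the rate is 2 raised to the negated interval. The function name in this short Python sketch is hypothetical.

```python
def messages_per_second(log_interval):
    """Convert a PTP log message interval into a message rate per second."""
    return 2.0 ** -log_interval


rate = messages_per_second(-4)
print(rate)  # 16.0
```

By the same formula, an interval of 0 gives one message per second and an interval of 1 gives one message every two seconds.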
