Report: Latency and Loss of Timing Messages in the Timing System

Introduction

Starting in October 2019 the ECA-Tap module was added to the gateware of a few dedicated timing receivers. The ECA-Tap module provides the following statistics at the ECA input.

  • # of messages received
  • time differences: deadline - timestamp
    • average (calculated from 'sum' and '# of messages')
    • minimum
    • maximum
  • # of late messages
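
As an illustration only, the statistics listed above can be modelled in software as a small accumulator. This is a sketch, not the ECA-Tap gateware: the class name `TapStats`, its field names, and the integer time values are hypothetical, and treating a message as late when 'deadline - timestamp' is negative is an assumption that the report does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TapStats:
    """Software model of the listed counters (hypothetical, not the gateware)."""
    count: int = 0                 # number of messages received
    dt_sum: int = 0                # sum of (deadline - timestamp)
    dt_min: Optional[int] = None   # smallest difference seen so far
    dt_max: Optional[int] = None   # largest difference seen so far
    late: int = 0                  # number of late messages

    def update(self, deadline: int, timestamp: int) -> None:
        """Account for one message; both times share the same unit."""
        dt = deadline - timestamp
        self.count += 1
        self.dt_sum += dt
        self.dt_min = dt if self.dt_min is None else min(self.dt_min, dt)
        self.dt_max = dt if self.dt_max is None else max(self.dt_max, dt)
        if dt < 0:  # assumption: "late" means the deadline has already passed
            self.late += 1

    def average(self) -> float:
        """Average difference, calculated from 'sum' and '# of messages'."""
        if self.count == 0:
            raise ValueError("no messages received yet")
        return self.dt_sum / self.count


# Made-up (deadline, timestamp) pairs, for illustration only.
stats = TapStats()
for deadline, timestamp in [(1000, 60), (2000, 1020), (3000, 3005)]:
    stats.update(deadline, timestamp)
print(stats.count, stats.average(), stats.dt_min, stats.dt_max, stats.late)
```

Keeping only 'sum' and the message count is enough to recover the average later, which is why the average does not need its own counter.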

From this, diagnostic data can be derived. Where possible, timing receivers are connected at different layers of network switches. As an example, comparing the statistics from different network layers makes it possible to extract data such as the latency or the loss of messages per layer.

PRO

For the location of the nodes in the production system see here.
| layer | date | node | direction | dt min [us] | dt max [us] | dt ave [us] | av latency [us] | total # of messages | loss rate | # of lates | late rate |
| 1 | 2019-oct-11 | DM | TX | 891.0 | 998.7 | ~ 981.5 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 879.1 | 990.1 | 941.6 | 39.9 | 5061751 | N/A | 0 | N/A |
| 2 | ident | SCU diag | RX | 876.3 | 987.3 | 938.8 | 2.8 | ident | < 1.97e-7 | 0 | < 1.97e-7 |
| 3 | ident | SCU diag | RX | 873.8 | 984.8 | 936.2 | 2.6 | ident | < 1.97e-7 | 0 | < 1.97e-7 |
| 1 | 2019-oct-15 | DM | TX | 891.0 | 998.7 | ~ 982.0 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 790.1 | 990.2 | 942.2 | 40.0 | 19697158 | N/A | 0 | N/A |
| 2 | ident | SCU diag | RX | 787.4 | 987.4 | 939.5 | 2.7 | ident | < 5.01e-8 | 0 | < 5.01e-8 |
| 3 | ident | SCU diag | RX | 784.7 | 984.8 | 936.9 | 2.6 | ident | < 5.01e-8 | 0 | < 5.01e-8 |
| 1 | 2019-oct-21 | DM | TX | 891.0 | 998.7 | ~ 982.6 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 790.1 | 990.2 | 942.2 | 40.4 | 36277751 | N/A | 0 | N/A |
| 2 | ident | SCU diag | RX | 787.4 | 987.4 | 939.8 | 2.4 | ident | < 2.76e-8 | 0 | < 2.76e-8 |
| 3 | ident | SCU diag | RX | 784.7 | 984.8 | 937.2 | 2.6 | ident | < 2.76e-8 | 0 | < 2.76e-8 |
| 1 | 2019-nov-14 | DM (1) | TX | 860.6 | 998.8 | ~ 987.4 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 827.9 | 990.1 | 938.8 | 48.6 | 106706802 | N/A | 1 | N/A |
| 2 | ident | SCU diag | RX | 825.0 | 988.3 | 936.0 | 2.8 | ident | < 9.37e-9 | 1 | = 9.37e-9 |
| 3 | ident | SCU diag | RX | 822.4 | 986.0 | 933.5 | 2.5 | ident | < 9.37e-9 | 1 | = 9.37e-9 |
| 1 | 2019-nov-25 | DM | TX | 845.3 | 998.8 | ~ 983.0 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 803.9 | 990.8 | 937.1 | 41.4 | 186760650 | N/A | 4 | N/A |
| 2 | ident | SCU diag | RX | 801.5 | 988.3 | 934.4 | 2.4 | ident | < 4.48e-9 | 4 | = 2.14e-8 |
| 3 | ident | SCU diag | RX | 799.1 | 985.9 | 931.8 | 2.4 | ident | < 4.48e-9 | 4 | = 2.14e-8 |
| 1 | 2019-dec-20 | DM | TX | < 0 | 998.7 | ~ 980.0 | N/A | -- | N/A | 0 | -- |
| 1 | ident | SCU diag | RX | 811.3 | 990.8 | 935.1 | ~40.0 | 265915522 | N/A | 5 | N/A |
| 2 | ident | SCU diag | RX | 808.7 | 988.3 | 932.2 | 2.6 | ident | < 3.76e-9 | 5 | = 1.88e-8 |
| 3 | ident | SCU diag | RX | 806.0 | 985.9 | 929.8 | 2.7 | ident | < 3.76e-9 | 5 | = 1.88e-8 |

Table: Typical numbers extracted from the production system. For each date, there are four rows of data. For a description, see the text. If the statistics have been reset, the DM (SCU) data are marked by 2 (1).

The Data Master (DM) and three SCUs for diagnostics ('SCU diag') are connected to a three-layer WR network. The 'SCU diag' nodes receive the messages sent by the DM. The above table shows the difference 'dt' (deadline - timestamp), the number of timing messages, and the number of late messages. The 'loss rate' is the ratio of the number of lost messages to the total number of messages. The 'late rate' is the ratio of the number of late messages to the total number of messages. The numbers for 'loss rate' and 'late rate' are always calculated relative to the row above. Please note:
  • dt is measured at the input of the priority queue (DM)
  • dt is measured at the input of the ECA ('SCU diag')
  • after each measurement the counters of the ECA-Tap are cleared
  • 'total # of messages' is the sum of all measurements
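
The derived columns can be checked with a few lines of Python. The sketch below rests on two inferences that the report does not state explicitly: the per-layer latency is taken as the difference in 'dt ave' between a row and the row above it (this reproduces the 2019-oct-11 values), and an entry of the form '< x' is read as the bound 1/N that applies when no event was observed in N messages (this reproduces several, but not all, of the tabulated bounds). The function name `rate_or_bound` is hypothetical.

```python
from typing import List, Tuple


def rate_or_bound(events: int, total: int) -> Tuple[str, float]:
    """Return ('=', events / total), or ('<', 1 / total) if nothing was observed.

    Assumption: with zero observed events, only an upper bound of one event
    in 'total' messages can be quoted.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if events == 0:
        return "<", 1.0 / total
    return "=", events / total


def latencies(dt_ave_us: List[float]) -> List[float]:
    """Latency per hop: dt ave of the row above minus dt ave of this row."""
    return [round(above - below, 1)
            for above, below in zip(dt_ave_us, dt_ave_us[1:])]


# dt ave [us] on 2019-oct-11: DM, then SCU diag at layers 1, 2 and 3.
print(latencies([981.5, 941.6, 938.8, 936.2]))  # [39.9, 2.8, 2.6]

# 2019-dec-20: no lost messages and 5 late messages out of 265915522.
total = 265915522
loss_sign, loss = rate_or_bound(0, total)
late_sign, late = rate_or_bound(5, total)
print(f"loss rate {loss_sign} {loss:.2e}")  # loss rate < 3.76e-09
print(f"late rate {late_sign} {late:.2e}")  # late rate = 1.88e-08
```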

UNILAC PRO

At LSB6 an experimental test setup is operated, see here.

| layer | date | node | direction | dt min [us] | latency [us] | # of messages | loss rate | # of lates | late rate |
| 1 | 2019-oct-11 | SCU PZ | TX | 3051.4 | N/A | 8e8 | N/A | 0 | < 1.25e-9 |
| 1 | ident | SCU diag | RX | 3047.4 | 4.0 | ident | < 1.25e-9 | 0 | < 1.25e-9 |
| 1 | 2019-oct-15 | SCU PZ | TX | 2992.6 | N/A | 1.51e9 | N/A | 0 | < 6.62e-10 |
| 1 | ident | SCU diag | RX | 2988.9 | 3.7 | ident | < 6.62e-10 | 0 | < 6.62e-10 |
| 1 | 2019-oct-21 | SCU PZ | TX | 519.7 | N/A | 2.66e9 | N/A | 0 | < 3.76e-10 |
| 1 | ident | SCU diag | RX | 515.6 | 4.1 | ident | < 3.76e-10 | 0 | < 3.76e-10 |
| 1 | 2019-oct-25 | SCU PZ | TX | 519.6 | N/A | 3.34e9 | N/A | 0 | < 3.00e-10 |
| 1 | ident | SCU diag | RX | 515.8 | 3.8 | ident | < 3.00e-10 | 0 | < 3.00e-10 |
| 1 | 2019-nov-14 | SCU PZ | TX | 2831.2 | N/A | 4.35e9 | N/A | 0 | < 2.30e-10 |
| 1 | ident | SCU diag | RX | 2827.5 | 3.7 | ident | < 2.30e-10 | 0 | < 2.30e-10 |
| 1 | 2019-nov-25 | SCU PZ | TX | 2499.7 | N/A | 5.44e9 | N/A | 0 | < 1.84e-10 |
| 1 | ident | SCU diag | RX | 2496.1 | 3.6 | ident | < 1.84e-10 | 0 | < 1.84e-10 |
| 1 | 2019-dec-30 | SCU PZ | TX | 1792.4 | N/A | 1.09e10 | N/A | 0 | < 9.18e-11 |
| 1 | ident | SCU diag | RX | 1788.7 | 3.7 | ident | < 9.18e-11 | 0 | < 9.18e-11 |
Table: Numbers extracted from UNILAC PRO. For each date, there are two rows of data. For a description, see the text.

Two SCUs are connected to a layer-1 WR switch. 'SCU PZ' serves as the UNILAC Pulszentrale and sends timing messages to the WR network. 'SCU diag' receives the messages. The above table shows the difference 'dt' (deadline - timestamp), the number of timing messages, and the number of late messages. The 'loss rate' is the ratio of the number of lost messages to the total number of messages. The 'late rate' is the ratio of the number of late messages to the total number of messages. Please note:
  • dt is measured at the lm32 after the messages have been written to the EBM (SCU PZ)
  • dt is measured at the input of the ECA ('SCU diag')
  • after each measurement the counters of the ECA-Tap are cleared
  • 'total # of messages' is the sum of all measurements
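
For the UNILAC PRO table, the tabulated 'latency' values agree with the difference between the 'dt min' of the sender (SCU PZ) and the 'dt min' of the receiver (SCU diag). The following sketch recomputes that column from the tabulated values; that the latency is derived this way is an inference from the numbers, and the variable names are hypothetical.

```python
# (date, dt min at SCU PZ [us], dt min at SCU diag [us]), copied from the table.
measurements = [
    ("2019-oct-11", 3051.4, 3047.4),
    ("2019-oct-15", 2992.6, 2988.9),
    ("2019-oct-21", 519.7, 515.6),
    ("2019-oct-25", 519.6, 515.8),
    ("2019-nov-14", 2831.2, 2827.5),
    ("2019-nov-25", 2499.7, 2496.1),
    ("2019-dec-30", 1792.4, 1788.7),
]

# Assumed relation: latency = dt min (TX) - dt min (RX), rounded to 0.1 us.
latency_us = {date: round(tx - rx, 1) for date, tx, rx in measurements}

for date, value in latency_us.items():
    print(f"{date}: {value:.1f} us")

# Spread of the latency over all measurement dates.
print(min(latency_us.values()), max(latency_us.values()))  # 3.6 4.1
```

Although 'dt min' itself varies between roughly 500 us and 3000 us from date to date, the difference between sender and receiver stays within a few tenths of a microsecond.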

-- DietrichBeck - 30 Dec 2019
Topic revision: r8 - 27 Dec 2019, DietrichBeck