Gateway Data Master <-> UNILAC PZ (dm-unipz)

Introduction

"dm-unipz" is the interface between the White Rabbit based Data Master and the MIL-based UNILAC 'Pulszentrale'. The task of dm-unipz is to synchronize the Data Master to the beams delivered by the UNILAC. Background information and further reading are available
  • here, focus on UNIPZ
  • 'Booster-Mode'

A dedicated How-To is available here.

dm-unipz overview.jpg
Figure: Overview on the interfaces of the gateway (see text).

An overview of the gateway is depicted in the figure above. The gateway is hosted by an SCU. Its "glue" is a firmware running in an lm32 softcore in the FPGA. The softcore communicates with three Wishbone (WB) devices: an Etherbone Master (EBM), the Event-Condition-Action unit (ECA) and the MIL-Macro. The ECA serves to receive scheduled commands from the Data Master (DM) and executes on-time actions. The actions drive the activity of the firmware via events. The MIL-Macro provides two functionalities. First, it serves as a so-called MIL-Devicebus master to a bit I/O close to the UNILAC "Pulszentrale". Second, it receives events via the so-called MIL-Eventbus. Upon reception of the "UNI_READY" event, the MIL-Eventbus receiver generates a TTL pulse that is connected to a LEMO input and subsequently timestamped in the Timestamp Latch Unit (TLU) via the ECA. Finally, the EBM serves to transmit replies to the DM. The firmware provides a dual-port RAM (not shown), which allows the software on the host system to communicate with the firmware in the lm32.

dmunipz context.JPG
Figure: Context of the gateway (see text).

The context of "dm-unipz" is given in the figure above. The gateway and a Timing Receiver (TR) are connected to the Data Master (DM) via a White Rabbit network. The gateway furthermore connects to the UNILAC Pulszentrale via Devicebus (as master) and Eventbus (as slave). An oscilloscope displays two digital pulses generated by a MIL based TIF and a White Rabbit based TR.

Timing Messages

Starting with beam-time 2022, the so-called booster mode shall be implemented. The relevant timing messages are listed in the table below.
| Event Name | Event Number | Short Description | Parameter Field | Remark |
| CMD_UNI_TCREQ | 0x15e | request TK | 63..32 (N/A), 31..0 (DM dynpar0) | dynpar0 contains the 32bit address of a block 'slow wait with timeout' |
| CMD_UNI_TCREL | 0x15f | release TK | 63..00 (N/A) | |
| CMD_UNI_BPREP | 0x161 | prepare beam | 63..00 (N/A) | the corresponding 'unprepare' is done, when beam from UNILAC has been received (or CMD_UNI_BREQ(_NOWAIT) failed) |
| CMD_UNI_BREQ | 0x160 | request beam | 63..32 (dynpar1), 31..16 (reserved), 15..8 (CPU Idx), 7..0 (thread Idx) | upon beam delivery by UNILAC, this will terminate the 'slow wait' at DM and start a corresponding thread; dynpar1 contains the 32bit address of the thread origin |
| CMD_UNI_BREQ_NOWAIT | 0x162 | request beam | 63..32 (dynpar1), 31..16 (reserved), 15..8 (CPU Idx), 7..0 (thread Idx) | upon beam delivery by UNILAC, this will start a corresponding thread; dynpar1 contains the 32bit address of the thread origin |
Table: Timing Messages used to control the DM-UNIPZ Gateway. The values CPU Idx and thread Idx are explicitly given by LSA as part of the schedule. The values dynpar0 and dynpar1 are indicated as edges in the LSA schedule, but the actual values are written to the timing message by the DM firmware on the fly at run-time.
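
The bit layout of the parameter field of CMD_UNI_BREQ and CMD_UNI_BREQ_NOWAIT can be illustrated with a short sketch. This is not taken from the firmware or the DM sources; the names BreqParam, pack_breq_param and unpack_breq_param are hypothetical, and only the bit positions come from the table above.

```python
from typing import NamedTuple


class BreqParam(NamedTuple):
    """Decoded 64-bit parameter field (hypothetical helper type)."""
    dynpar1: int     # bits 63..32: 32-bit address of the thread origin
    cpu_idx: int     # bits 15..8
    thread_idx: int  # bits 7..0


def pack_breq_param(dynpar1: int, cpu_idx: int, thread_idx: int) -> int:
    """Build the 64-bit parameter field; reserved bits 31..16 stay zero."""
    if not 0 <= dynpar1 <= 0xFFFFFFFF:
        raise ValueError("dynpar1 must fit into 32 bits")
    if not 0 <= cpu_idx <= 0xFF:
        raise ValueError("cpu_idx must fit into 8 bits")
    if not 0 <= thread_idx <= 0xFF:
        raise ValueError("thread_idx must fit into 8 bits")
    return (dynpar1 << 32) | (cpu_idx << 8) | thread_idx


def unpack_breq_param(param: int) -> BreqParam:
    """Split the 64-bit parameter field into its documented sub-fields."""
    return BreqParam(
        dynpar1=(param >> 32) & 0xFFFFFFFF,
        cpu_idx=(param >> 8) & 0xFF,
        thread_idx=param & 0xFF,
    )
```

For example, pack_breq_param(0x12345678, 2, 5) yields 0x1234567800000205, and unpack_breq_param recovers the three fields from that value.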

Procedure

The following procedure is applied (somewhat simplified):
  1. the DM prepares the gateway via a Timing Message
  2. the DM tells the gateway to request the 'Transfer Kanal' (TK) via a Timing Message
  3. the gateway requests the TK from UNIPZ via Devicebus (Modulbus I/O)
  4. the gateway starts waiting for acknowledgement or timeout from UNIPZ
  5. UNIPZ signals an acknowledgement (or "not ok") after the TK has been prepared
  6. the gateway reads the acknowledgement from UNIPZ via Devicebus (Modulbus I/O); otherwise: timeout
  7. the gateway instructs the DM to continue with its schedule
  8. loop (1 or more iterations driven by the DM following a schedule provided by LSA)
    1. the DM tells the gateway to request beam via a Timing Message
    2. the gateway requests beam from UNIPZ via Devicebus (Modulbus I/O)
    3. the gateway starts waiting for the MIL Event "READY_TO_SIS" or timeout
      1. UNIPZ sends a MIL Event "READY_TO_SIS" 10ms prior to beam delivery (or "not ok")
      2. the MIL-Macro of the gateway receives the Event "READY_TO_SIS"
      3. the MIL-Macro generates a TTL pulse
      4. (via a Lemo cable, the TTL pulse is guided to bidirectional I/O)
      5. the time of the incoming TTL pulse is latched via the Timestamp Latch Unit (TLU) connected to the ECA
      6. the ECA generates an event and triggers an action towards the lm32, indicating the "READY_TO_SIS" event
      7. the lm32 receives the ECA event that includes the timestamp from the TLU, t_Evt
    4. the gateway adds an offset of exactly 1.5ms to the timestamp: t_flex = t_Evt + 1.5ms
    5. the gateway instructs the DM to continue the schedule exactly at t_flex
    6. the gateway releases the beam request at UNIPZ via Devicebus (Modulbus I/O)
    7. the DM continues scheduling events starting exactly at t_flex. NB: this part of the schedule is not aligned to BuTiS T0 ticks but starts exactly at t_flex
    8. the beam is transferred from UNILAC to SIS18
      1. exactly 10ms after the MIL Event "READY_TO_SIS", and in coincidence with
      2. exactly 8.5ms after t_flex
  9. the DM tells the gateway to release the TK and continues with its 'normal' schedule. NB: from here on, the schedule is again aligned to BuTiS T0 ticks
  10. the gateway releases the TK at UNIPZ via Devicebus (Modulbus I/O)
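
The timing relations in the loop above can be checked with a few lines of arithmetic. The sketch below assumes timestamps are integers in nanoseconds; the function and constant names are illustrative and not part of the firmware.

```python
# Offsets taken from the procedure above, expressed in nanoseconds.
T_FLEX_OFFSET_NS = 1_500_000      # t_flex = t_Evt + 1.5 ms
BEAM_AFTER_EVT_NS = 10_000_000    # beam arrives 10 ms after "READY_TO_SIS"


def t_flex(t_evt_ns: int) -> int:
    """Time at which the DM continues its schedule."""
    return t_evt_ns + T_FLEX_OFFSET_NS


def t_beam(t_evt_ns: int) -> int:
    """Time at which the beam is transferred from UNILAC to SIS18."""
    return t_evt_ns + BEAM_AFTER_EVT_NS


def beam_delay_after_t_flex(t_evt_ns: int) -> int:
    """Delay between t_flex and the beam transfer; independent of t_Evt."""
    return t_beam(t_evt_ns) - t_flex(t_evt_ns)
```

For any t_Evt, beam_delay_after_t_flex returns 8_500_000 ns, which is the 8.5 ms stated in the procedure.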

Firmware

FSM

dm-unipz.png
Figure: FSM of the firmware. Shown are states and transitions. Implicitly, all states may transit to the ERROR state (transitions not shown). Description see text.

The figure above depicts states and transitions of a Finite State Machine. As soon as the firmware is loaded in the lm32, it is in the initial S0 state and performs a basic initialization. The states and their transitions are described below. For details on Entry-, Do- and Exit actions please check the source code.
  • S0: Initial State. Firmware performs basic initialization.
    • Initialization successful: automatic transition -> IDLE
    • Initialization failed: automatic transition -> FATAL
  • FATAL: This state is entered whenever a non-recoverable error is detected. Examples of such errors are a missing ECA or MIL-Macro. It is impossible to recover from such a situation; this is a final state.
  • IDLE: Basic (unconfigured) state. In this state the firmware does not react to MIL events or ECA actions. The firmware can only be controlled by commands via the DP-RAM. This state is also safe for uploading new firmware to the lm32 softcore.
    • command "configure" -> CONFIGURED
  • CONFIGURED: After undergoing the process of configuration within the entry-action, the firmware is configured.
    • command "configure" -> CONFIGURED
    • command "idle" -> IDLE
    • command "startop" -> OPREADY
  • OPREADY: This should be the normal state for all operational situations including failed transfers from UNILAC to SIS (a failed transfer does not cause a transition to the ERROR state).
    • command "stopop" -> STOPPING (-> CONFIGURED)
  • STOPPING: This is an intermediate state handling a clean transition from OPREADY -> CONFIGURED. If in this state, there is an automatic transition to CONFIGURED.
  • ERROR: This state is entered whenever a severe error is detected. An example of such an error is a physically disconnected MIL Devicebus to the Bit I/O close to UNILAC PZ.
    • command "recover" -> IDLE
    • autorecovery mode : If in state ERROR, the FW tries autorecovery ERROR -> IDLE -> CONFIGURED -> OPREADY
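
The command-driven transitions listed above can be summarized as a lookup table. The following is a minimal sketch, not the firmware implementation: it assumes that a command which is not listed for the current state leaves the state unchanged, and it omits the implicit transitions to ERROR as well as the automatic S0 -> IDLE/FATAL transitions.

```python
# (state, command) -> next state, as listed in the state description above.
TRANSITIONS = {
    ("IDLE", "configure"): "CONFIGURED",
    ("CONFIGURED", "configure"): "CONFIGURED",
    ("CONFIGURED", "idle"): "IDLE",
    ("CONFIGURED", "startop"): "OPREADY",
    ("OPREADY", "stopop"): "STOPPING",
    ("ERROR", "recover"): "IDLE",
}

# Intermediate states that are left automatically.
AUTOMATIC = {"STOPPING": "CONFIGURED"}


def apply_command(state: str, command: str) -> str:
    """Return the state reached after a command, following automatic steps.

    Assumption: unknown (state, command) pairs are ignored.
    """
    nxt = TRANSITIONS.get((state, command), state)
    return AUTOMATIC.get(nxt, nxt)


def autorecover(state: str) -> str:
    """Replay the autorecovery path ERROR -> IDLE -> CONFIGURED -> OPREADY."""
    for command in ("recover", "configure", "startop"):
        state = apply_command(state, command)
    return state
```

Starting from ERROR, autorecover ends in OPREADY. FATAL has no outgoing entries in the table, so it remains a final state in this sketch.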

Status

In December 2017, the gateway was deployed to the production system for the first time. The gateway SCU
  • has a MIL Devicebus connection to the Modulbus I/O of the UNILAC Pulszentrale at LSB6.
  • receives MIL Events from the UNILAC Pulszentrale at LSB6.
  • receives Timing Messages from the Data Master at BG2
  • communicates with the Data Master BG2 via the White Rabbit production network.

The gateway was operated between 19 December 2017 and 3 January 2018. About 492 thousand "dry injections" from the UNILAC Pulszentrale to the new White Rabbit based timing system have been achieved successfully. However, there are a couple of issues that need to be addressed.

| issue | value | status | description | recommended action | updated status May 2018 |
| lm32 latency mean | 3us | ok | latency of lm32 to react on MIL events | already considered in configuration of firmware | irrelevant, use TLU timestamping |
| lm32 latency jitter | 170ns sdev | ok | standard deviation, required 1us | within specs, TLU would improve value | N/A |
| lm32 latency min | 2.7us | ok | shortest possible reaction time | not required | N/A |
| lm32 latency max | 3.3us | ok | longest reaction time. be aware of lm32 latency excess (see below) | not required | N/A |
| lm32 latency max-min | 0.6us | ok | max range of jitter, required 1us. be aware of lm32 latency excess (see below) | within specs, TLU would improve value | N/A |
| MIL Eventbus error | < 1E-7 | ok | failure to receive MIL event (not observed) | not required | same |
| GMT latency mean | 999.995us | ok | execution of first WR timing event after MIL event. Specified value 1000.0us | set to specified value via lm32 config | as specified |
| GMT latency jitter | 170ns sdev | ok | standard deviation, required 1us, ok | within specs, TLU would improve value | better than 100 ns (TLU timestamping) |
| GMT latency min | 999.7us | ok | shortest reaction time | not required | N/A |
| GMT latency max | 1000.3us | ok | longest reaction time. be aware of lm32 latency excess (see below) | not required | N/A |
| GMT latency max-min | 0.6us | ok | max range of jitter, required 1us. be aware of lm32 latency excess (see below) | within specs, TLU would improve value | better than 200 ns (TLU timestamping) |
| lm32 latency excess | 1E-4 | not ok | Wishbone bus blocked due to CPU access which causes a latency around 50us. Rate depends on CPU program. Results in partial or total beam loss during transfer. | TLU for latching time of MIL event | N/A |
| MIL Devicebus error | 3E-5 | not ok | error in communication with modulbus I/O. Results in failure of transfer (failed beam request, loss of beam, dry cycle...) | try MIL expander for device bus | MIL expander implemented |
| EB read error | 5E-5 | not ok | timeout error when reading from Data Master via timing network. Results in beam loss. Workaround: try 2nd read in case of timeout avoids deadlock (but still beam loss) | Forward Error Correction | '2nd chance' implemented |
| EB write error | 2E-5 | fatal | error when writing to Data Master. Presently, this results in an unrecoverable "deadlock" halting the Data Masters thread. | Forward Error Correction. Workaround: timeout handling at DM | timeout handling in DM implemented |
| total error | 2E-4 | bad | Under the assumption "EB write errors" can be handled by timeout treatment, the present failure rate is about 2E-4: With injections at 1 Hz, one injection per hour fails. | see above | expected to have improved (needs to be remeasured) |
Table: Table with performance data and issues based on 492000 "dry" injections from UNILAC to SIS18 (more details see text).
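
The "total error" row can be related to the individual rows with a small calculation. The sketch below assumes that the total is simply the sum of the three "not ok" rates (EB write errors excluded, as stated in the table) and that the rates are independent; this interpretation is an assumption, not something stated by the measurement itself.

```python
# Per-injection error rates from the table (EB write errors excluded).
ERROR_RATES = {
    "lm32 latency excess": 1e-4,
    "MIL Devicebus error": 3e-5,
    "EB read error": 5e-5,
}


def total_error_rate(rates: dict) -> float:
    """Sum of the individual per-injection error rates (assumed independent)."""
    return sum(rates.values())


def failures_per_hour(rate: float, injection_rate_hz: float = 1.0) -> float:
    """Expected number of failed injections per hour at a given repetition rate."""
    return rate * injection_rate_hz * 3600.0
```

The sum is 1.8E-4, consistent with the rounded 2E-4 in the table. At 1 Hz, a rate of 2E-4 corresponds to about 0.72 failed injections per hour, in line with the statement that roughly one injection per hour fails.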

The table above presents the current performance and issues with respect to the transfer of beam from the UNILAC to the new White Rabbit based Data Master. For the data presented in the table, the DM-UNILAC gateway was operated for an extended period of time over Christmas and New Year 2017/2018. The latency numbers are identical to the ones measured with the integration setup in the Programmentwicklungsraum, as expected. In the following, relevant issues are discussed.

lm32 latency excess
The firmware polls the Wishbone GSI_MIL_SCU for the incoming MIL event EVT_READY_TO_SIS. While the jitter values for unperturbed operation are ok, the maximum latency occasionally increases drastically to 50us and more. This happens if the Wishbone connection from the lm32 to GSI_MIL_SCU is blocked by a third party, for example when the status of the gateway lm32 is read from the host system while a transfer is in progress. A possible solution is to timestamp the MIL event using the TLU of a timing receiver. This would also further reduce the jitter and the spread of min-max values.
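
Why hardware timestamping helps can be illustrated with a toy model. The sketch below is an assumption about how the compensation works, not a description of the actual firmware: in the polled case the start time is derived from the moment the lm32 detects the event, corrected by the mean latency of 3us from the table; in the TLU case it is derived from the latched event time itself. All names are hypothetical and times are in nanoseconds.

```python
OFFSET_NS = 1_000_000          # specified GMT latency: 1000.0 us
MEAN_POLL_LATENCY_NS = 3_000   # mean lm32 latency from the table: 3 us


def start_time_polled(t_evt_ns: int, poll_latency_ns: int) -> int:
    """Start time when the event is detected by polling.

    The detection time includes the actual polling latency; only the mean
    latency can be compensated in advance.
    """
    t_detect_ns = t_evt_ns + poll_latency_ns
    return t_detect_ns + OFFSET_NS - MEAN_POLL_LATENCY_NS


def start_time_tlu(t_evt_ns: int) -> int:
    """Start time when the event time is latched in hardware by the TLU."""
    return t_evt_ns + OFFSET_NS


def timing_error_ns(start_ns: int, t_evt_ns: int) -> int:
    """Deviation of the start time from the specified t_evt + OFFSET."""
    return start_ns - (t_evt_ns + OFFSET_NS)
```

In this model a polling latency of 50us produces a timing error of 47us, whereas the TLU-based start time has no error as long as the software reacts before the offset has elapsed.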

EB read/write errors
These errors are not observed in the integration system in the Programmentwicklungsraum, but only in the production system. The most likely cause is that WR switches occasionally drop the low priority packets between the gateway and the Data Master in favour of high priority traffic of the timing system itself. To some extent, the error rate might be enhanced due to the present (December 2017) operation mode of the timing system.
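
The '2nd chance' workaround for EB read errors listed in the table amounts to retrying a read once after a timeout. A minimal sketch of that pattern is given below; the callable read_once is a hypothetical stand-in for the actual Etherbone read and is assumed to raise TimeoutError on timeout.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def read_with_second_chance(read_once: Callable[[], T], attempts: int = 2) -> T:
    """Call read_once up to `attempts` times, retrying only on TimeoutError.

    The last TimeoutError is re-raised if all attempts fail, so the caller
    can treat the transfer as failed instead of waiting forever.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return read_once()
        except TimeoutError as err:
            last_error = err
    raise last_error
```

As noted in the table, such a retry avoids a deadlock but does not prevent beam loss for the affected injection.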

MIL Devicebus errors
These errors are not observed in the integration system in the Programmentwicklungsraum, but only in the production system. This needs to be investigated. The simplest explanation might be that the MIL cables between BG2 and LSB6 are too long. In this case, the use of MIL expanders might cure the problem.

Summary
From the point of view of the gateway, a transfer of beam from UNILAC to SIS18 can be achieved. The communication between the UNILAC Pulszentrale, the gateway and the Data Master can be achieved with a time of 1000.0(6) us as specified (specs: 1ms fixed latency and 1us uncertainty). The scenario has been tested in the production system with the real UNILAC Pulszentrale. To date (January 2018), the total error rate is about 2E-4 and there is good hope that this can be reduced further. Most critical are EB write errors, which presently cannot be caught and result in a halt of the facility.

-- DietrichBeck - 25 January 2022
Topic revision: r10 - 31 Jan 2022, DietrichBeck