Documentation Data Master Test System

Hardware

SuperMicro fel0069

The data master for the test system is hosted on the SuperMicro fel0069 (fel0069.acc.gsi.de), which is equipped with two PEXARIA5d. One of these is the data master. The other one is used to analyze the timing messages with snoop. The SuperMicro is accessible with ssh root@fel0069.acc.gsi.de via the ASL cluster or other hosts. The management interface (ILO) is accessible via https://fel0069i.acc.gsi.de.
  1. fel0069.acc.gsi.de
    • IP: 10.248.2.137
    • Location: BG2.009, Rack BG2A.A9, Slot 28
  2. pexaria248t (dev/wbm0), Data master
    • IP: 192.168.131.184
    • MAC: 00:26:7b:00:08:0b
    • Name: pexaria248t
    • CID: 55 0113 0012 0
    • PEXARIA5d, Series EE
  3. pexaria305t (dev/wbm1), Timing receiver for snoop
    • IP: 192.168.131.241
    • MAC: 00:26:7b:00:08:44
    • Name: pexaria305t
    • CID: 55 0113 0069 4
    • PEXARIA5d, Series EE
After power on, the IP addresses of the two PEXARIA5d are set by a WR init script, because this host has no BootP service.

On dev/wbm0:
init show
-- built-in script --
(empty)
-- user-defined script --
ip set 192.168.131.184

On dev/wbm1:
init show
-- built-in script --
(empty)
-- user-defined script --
ip set 192.168.131.241

White Rabbit Switch nwt0473m66

Location: BG2.009, Rack BG2A.A9, Slot 29
Configuration: blank. Not an access switch or distribution switch!
Access via tsl101, like other switches.
Name: nwt0473m66.timing.acc.gsi.de
IP: 192.168.21.219
Connections: fibre optic cable from wri2 to pexaria248t, fibre optic cable from wri3 to pexaria305t. Network cable to acc network for management.

Software

The pexarria5 can be flashed with gateware independently of the lm32 firmware.

PXE Boot

PXE boot is currently configured to use the current Yocto RAM disk from /common/tftp/csco/pxe/yocto/current.

The SuperMicro is configured for PXE boot and nfsinit with links (following https://www-acc.gsi.de/wiki/Timing/Intern/TimingSystemHowToHintsForFECS).
  1. On the ASL cluster, links in folder /common/tftp/csco/pxe/pxelinux.cfg for PXE boot: 0AF80289 to fel0069, fel0069 to yocto.
  2. On the ASL cluster, links in folder /common/export/nfsinit/fel0069/ for nfsinit.
Connections: network cable to acc network. For management: network cable to acc network.
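
The link name 0AF80289 follows the PXELINUX convention of looking up a configuration file named after the client's IPv4 address in uppercase hexadecimal. A minimal sketch for deriving such a name is shown here; the function name ip_to_pxe_hex is hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address into the
# uppercase hexadecimal file name that PXELINUX looks up in pxelinux.cfg.
ip_to_pxe_hex() {
    # Split the address at the dots into four octets.
    IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
    # Print each octet as two uppercase hex digits.
    printf '%02X%02X%02X%02X\n' "$o1" "$o2" "$o3" "$o4"
}

ip_to_pxe_hex 10.248.2.137
```

For fel0069 (10.248.2.137) this prints 0AF80289, the name of the first link in the list above.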

To prepare the libraries for the Yocto environment for fel0069, use modules/ftm/tests/tools/copyLibraries.sh. This script needs to download dm-cmd, dm-sched, libcarpedm from the workspace of a Jenkins job. Details are described in copyLibraries.sh.

After reboot of fel0069:
  1. On each host that needs access to fel0069, copy the public ssh key to fel0069. This has to be done for every user / host combination that needs ssh access to fel0069.
    ssh-copy-id -i .ssh/id_rsa.pub root@fel0069.acc.gsi.de
    For most users this is pre-configured with tg-backdoor-yocto in /common/export/nfsinit/.
  2. Test (check before automated testing starts)
    ssh -t root@fel0069.acc.gsi.de "saft-ctl tr1 -ij"
    (get info on saftbus version on fel0069)
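
Before automated tests start, the key-based login can also be checked without risking a password prompt. The sketch below assumes OpenSSH; check_fel_access is a hypothetical helper name. The option BatchMode=yes makes ssh fail instead of asking for a password, so a missing key shows up as a non-zero exit status.

```shell
# Hypothetical helper: succeed only if key-based ssh login as root works.
check_fel_access() {
    host="${1:-fel0069.acc.gsi.de}"
    # BatchMode=yes disables password prompts; ConnectTimeout limits the wait.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
        echo "ssh access to ${host}: ok"
    else
        echo "ssh access to ${host}: FAILED (run ssh-copy-id first?)"
        return 1
    fi
}

# Usage: check_fel_access fel0069.acc.gsi.de
```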

There is a script in bel_projects, modules/ftm/tests/tools/datamasterInit.sh, which initializes fel0069 after reboot / power on. If it is NOT started from acopc042.gsi.de, adaptations are necessary. datamasterInit.sh also compiles the lm32 firmware located in modules/ftm/ftmfw. Call
datamasterInit.sh FEL0069 ftm.bin 8

for 8 thread firmware or
datamasterInit.sh FEL0069 ftm.bin 32

for 32 thread firmware.

Firmware Images

Datamaster: build with make ftm in bel_projects root folder. Current version (example):

Project     : ftm
Platform    : pexaria5 +db[12] +wrex1
FPGA model  : Arria V (5agxma3d4f27i3)
Source info : fallout_uniftm-5488
Build type  : developer preview
Build date  : Wed Mar 20 11:35:52 CET 2024
Prepared by : Mathias Kreider 
Prepared on : acopc050
OS version  : 5.14.0-1056-oem x86_64 GNU/Linux
Quartus     : Version 18.1.0 Build 625 09/12/2018 SJ Standard Edition

  2921ffe06 DM: GW: new test: uniftm DM with ECA set to one core and aggreation 520k RAM (4 cores worth) to one core
  bef4246d8 DM: GW: uniftm DM with ECA set to one core and aggreation 390k RAM (3 cores worth) to one core
  66bee807f DM: GW: successful synthesis of 4core Ring DM with ECA
  41b60df99 uniftm: initial
  ec506f53a Makefile: add target uniftm

Detecting Firmwares ...

Found 4 RAMs, 4 holding a Firmware ID


********************
* RAM @ 0x04120000 *
********************
UserLM32
Stack Status:                                                       
Project     : ftm
Version     : 9.0.0
Platform    : 
Build Date  : Thu Sep 26 16:16:24 CEST 2024
Prepared by : martin Martin Skorsky 
Prepared on : ACOPC042
OS Version  : Linux Mint 21.3  Linux 6.2.0-39-generic x86_64
GCC Version : lm32-elf-gcc(GCC)4.5.3 (build 230320-92a789-1dd6)
IntAdrOffs  : 0x10000000
SharedOffs  : 0x500
SharedSize  : 98304
ThreadQty   : 8
FW-ID ROM will contain:

   f9ef529dd tools/copyLibraries.sh: add descriptions for source and target.
   8941cb882 tools/copyLibraries.sh: add description. Use ~/Downloads for libcarpedm etc.
   00aaed7a1 tools/getBoostVersion.sh: add coment describing script.
   a5682c58c copyLibraries.sh: script to copy Yocto libraries to fel0069 stagging area.
   f44935345 datamaster-test.tex: add description for test_Cpu0Cpu1.py.
*****
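
When checking which firmware is loaded, the fields of interest in a listing like the one above are Version and ThreadQty. A small sketch for extracting one field from such a listing is given below. It assumes the listing is available on standard input in the "Name : value" layout shown above; fw_field is a hypothetical helper name, and how the listing is obtained is deliberately left out.

```shell
# Hypothetical helper: print the value of one "Name : value" field
# from a firmware ID listing read on standard input.
fw_field() {
    # Split each line at the first colon, trim name and value,
    # print the value of the first matching field and stop.
    awk -v want="$1" '
        {
            pos = index($0, ":")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == want) { print value; exit }
        }'
}

# Example with an excerpt of the listing above.
listing='Project     : ftm
Version     : 9.0.0
ThreadQty   : 8'
printf '%s\n' "$listing" | fw_field Version
printf '%s\n' "$listing" | fw_field ThreadQty
```

With the excerpt used here, the two calls print 9.0.0 and 8.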

Pexaria dev/wbm1 for snoop: https://github.com/GSI-CS-CO/bel_projects/releases/download/fallout-v6.0.1/pexarria5.rpd

Update: https://github.com/GSI-CS-CO/bel_projects/releases/download/fallout-v6.1.2-rc1/falloutv6_1_2-rc1-pexarria5.rpd

Access the Data Master

dm-cmd tcp/fel0069.acc.gsi.de
dm-sched tcp/fel0069.acc.gsi.de

The datamaster tools dm-cmd and dm-sched are used on the Jenkins slave. This ensures that the current build from the repository is used. The connection to the datamaster uses socat and the address tcp/fel0069.acc.gsi.de.

Remote Snoop of Timing Messages

Snoop via remote ssh: set up ssh without password by transferring the public key of user@host to root@fel0069.acc: /.ssh/authorized_keys with ssh-copy-id.
Snoop with Python 3: the tests using python3 / pytest read the command for snooping from the environment variable SNOOP.
  • Example for local environment: saft-ctl tr0 -xv snoop 0 0 0
  • Example for remote environment on fel0069.acc.gsi.de: ssh -t root@fel0069.acc.gsi.de 'saft-ctl tr1 -xv snoop 0 0 0'
    saftbusd on fel0069 monitors dev/wbm1 as tr1.
The tests add an additional parameter for the number of seconds to snoop.
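
The two example commands differ only in the ssh wrapper and the timing receiver name. As a sketch, the SNOOP variable could be set like this before starting the tests; snoop_command is a hypothetical helper, and only the two command lines themselves are taken from this page. Using tr1 for any remote host is an assumption that holds for fel0069.

```shell
# Hypothetical helper: print the snoop command for a local or remote setup.
# Without an argument the local timing receiver tr0 is used; with a host
# name the command runs via ssh on that host with tr1 (as on fel0069).
snoop_command() {
    if [ -z "${1:-}" ]; then
        echo "saft-ctl tr0 -xv snoop 0 0 0"
    else
        echo "ssh -t root@$1 'saft-ctl tr1 -xv snoop 0 0 0'"
    fi
}

# The tests read the command from SNOOP and add the number of seconds to snoop.
SNOOP="$(snoop_command fel0069.acc.gsi.de)"
export SNOOP
echo "$SNOOP"
```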

Required versions

  1. Python 3.6.9
  2. pytest 6.1.2
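
A quick way to check an environment against these versions is sketched below. It assumes that the listed versions are minimum requirements (this page does not say whether newer versions are also acceptable) and that GNU sort with the option -V is available; version_ge is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on the version sort (-V) of GNU sort: the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed Python against the version listed above.
py_version="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || true)"
if [ -n "$py_version" ] && version_ge "$py_version" 3.6.9; then
    echo "Python $py_version is sufficient"
else
    echo "Python 3.6.9 or newer not found"
fi
```

The same function can be used for pytest, for example version_ge "$installed_pytest" 6.1.2, once the installed version has been determined.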

Jenkins on builder.acc.gsi.de / Jenkins Slave tsl021.acc.gsi.de

There are Jenkins jobs on https://builder.acc.gsi.de which build the datamaster lm32 firmware and the tools and run tests on these binaries. Each job has the following steps:
  1. Preparation
  2. Build the tools, currently with Boost 1.75 (other environments use Boost 1.69)
  3. Build the lm32 firmware with lm32-toolchain and load this into the datamaster fel0069.acc.gsi.de
  4. Run the tests with these components
  5. Report test results
There are jobs for the branches fallout, dm-summer-update-2022, dm-merge-Dez23, dm-fallout-tests for 8 threads and 32 threads.

The script for the build steps (in the repository: modules/ftm/tests/tools/jenkinsBuild.sh) is
# create links needed for Rocky-9 environment
cd res/rocky-9
./generate_soft_links.sh
export PATH=$PATH:$(pwd)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)
cd ../..
./fix-git.sh
# make all prerequisites: 1. hdlmake and lm32-toolchain, 2. etherbone, 3. eb-tools, 4. test-tools for ftm.
make
make etherbone
make tools
cd modules/ftm/tests
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$WORKSPACE/ip_cores/etherbone-core/api/.libs/:$WORKSPACE/modules/ftm/lib/
EBPATH1=../../../ip_cores/etherbone-core/api make prepare
export PATH=$PATH:$WORKSPACE/tools/:$WORKSPACE/modules/ftm/bin/:$WORKSPACE/modules/ftm/analysis/scheduleCompare/main/
# build ftm lm32 firmware
THR_QTY=8 PATH=$PATH:$HOME/.local/bin:$WORKSPACE/lm32-toolchain/bin/ make -C $WORKSPACE/syn/gsi_pexarria5/ftm/ ftm.bin
# load the required lm32 firmware into fel0069
$WORKSPACE/syn/gsi_pexarria5/ftm/fwload_all.sh tcp/fel0069.acc.gsi.de $WORKSPACE/syn/gsi_pexarria5/ftm/ftm.bin
# run all tests
OPTIONS='--runslow' make remote

Links for the Jenkins Jobs in view tests. These jobs may change to other branches.
  1. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-fallout-tests/ All tests in modules/ftm/tests, branch dm-fallout-tests.
  2. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-fallout-tests_thread32/ All tests in modules/ftm/tests, branch dm-fallout-tests, ftm.bin modified for 32 threads.
  3. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-dryrun-2024/ All tests in modules/ftm/tests, branch dm-dryrun-2024.
  4. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-dryrun-2024_thread32/ All tests in modules/ftm/tests, branch dm-dryrun-2024, ftm.bin modified for 32 threads.
  5. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-new-logging-jul24/ All tests in modules/ftm/tests, branch dm-new-logging-jul24.
  6. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-new-logging-jul24_thread32/ All tests in modules/ftm/tests, branch dm-new-logging-jul24, ftm.bin modified for 32 threads.
  7. (disabled) https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-merge-Dez23/ All tests in modules/ftm/tests, branch dm-merge-Dez23.
  8. (disabled) https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-merge-Dez23_thread32/ All tests in modules/ftm/tests, branch dm-merge-Dez23, ftm.bin modified for 32 threads.
  9. (disabled) https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-summer-update-2022/ All tests in modules/ftm/tests, branch dm-summer-update-2022.
  10. (disabled) https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_dm-summer-update-2022_thread32/ All tests in modules/ftm/tests, branch dm-summer-update-2022, ftm.bin modified for 32 threads.
  11. https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job/test_module_ftm_datamaster_fallout/ All tests in modules/ftm/tests, branch fallout.
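
The job names in this list follow a common pattern: the prefix test_module_ftm_datamaster_, the branch name, and the suffix _thread32 for the 32-thread variant. A sketch that builds the URL from branch and thread count follows. job_url is a hypothetical helper; it assumes that the jobs without suffix are the 8-thread variants, and not every combination it can produce exists as a job (for the branch fallout, only the job without suffix is listed).

```shell
# Hypothetical helper: build the Jenkins job URL for a branch and thread count.
job_url() {
    branch="$1"
    threads="${2:-8}"
    base="https://builder.acc.gsi.de/jenkins/job/timing/view/tests/job"
    name="test_module_ftm_datamaster_${branch}"
    # Jobs for the 32-thread firmware carry the suffix _thread32.
    if [ "$threads" = "32" ]; then
        name="${name}_thread32"
    fi
    echo "${base}/${name}/"
}

job_url dm-fallout-tests
job_url dm-dryrun-2024 32
```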

The jobs use a lockable resource named 'fel0069' to ensure that only one job at a time runs tests against the datamaster fel0069.acc.gsi.de. Running jobs concurrently against one datamaster breaks the tests. The filesystem used by the Jenkins slave tsl021 is not directly accessible. To access the files of a build job, log in to Jenkins, browse the workspace, and download the files.
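
Within Jenkins, the lockable resource 'fel0069' provides the serialization of test runs. For manual test runs from a single host, the same idea can be approximated with a lock directory, as in the sketch below. This is an illustration only and not part of the test setup: with_dm_lock and DM_LOCK_DIR are hypothetical names, and the lock protects only processes on the same host, not runs started from Jenkins or from other machines.

```shell
# Hypothetical helper: run a command while holding an exclusive lock.
# mkdir is atomic, so only one process at a time can create the directory.
with_dm_lock() {
    lock_dir="${DM_LOCK_DIR:-${TMPDIR:-/tmp}/fel0069.lock}"
    # Wait until no other run holds the lock.
    until mkdir "$lock_dir" 2>/dev/null; do
        sleep 1
    done
    # Run the command, remember its exit status, then release the lock.
    status=0
    "$@" || status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Usage (from modules/ftm/tests): with_dm_lock make remote
```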

Reasons for test failures
  1. All tests fail: the firmware version is wrong, e.g. the datamaster tools need version 9.0.0 but the firmware is 8.0.4.
  2. Snoop with saft-ctl fails. Indication: 72 of 538 tests fail. Possible reason: ssh access to fel0069 fails.
  3. The datamaster is slow for an unknown reason. In this case about 29 tests fail.
Topic revision: r20 - 01 Oct 2024, MartinSkorsky