Pseudo SRAM Access from lm32

Introduction

The W968D6DA provides 256 Mbit (32 MByte) of Pseudo SRAM (datasheet). It has a 32-bit address width and 16 data lines. At present, it is only used in the Exploder5 form factor.

Some measurements were performed in early 2019 and are presented here.

Measurements using SignalTap

As the PSRAM is rarely used, the present Exploder5 VHDL connects it to the so-called device crossbar. It is therefore not directly connected to the lm32; access goes via Wishbone with two crossbars ('top-crossbar', 'device-crossbar') in between. Since measurements via the CPU are not useful in this setup, the measurements were done using Quartus SignalTap.

Time Values
| Mode | Type | total | lm32 RAM access | 1st lm32 intercycle gap | 2nd lm32 intercycle gap | lm32 to PSRAM WB controller | PSRAM WB controller to lm32 | PSRAM WB controller memory access |
| random access | copy (write) | 368 | 24 | 32 | 32 | 112 | 72 | 96 |
| random access | copy (read) | 368 | 24 | 32 | 32 | 112 | 72 | 96 |
| burst | write | 296 | N/A | N/A | 8 | 120 | 72 | 96 |
Table: Access time values for different modes. Numbers are per 32bit word [ns].
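The total in each row is the sum of the individual phases. The following minimal Python sketch checks this and shows which share of an access is spent in the PSRAM WB controller memory access itself. The short dictionary keys are labels chosen for this sketch only; they are not names from the VHDL.

```python
# Phase durations in ns per 32-bit word, copied from the table above.
# The keys ("ram", "gap1", ...) are shorthand introduced for this sketch.
PHASES = {
    "random access copy (write)": {
        "ram": 24, "gap1": 32, "gap2": 32,
        "to_ctrl": 112, "from_ctrl": 72, "mem": 96,
    },
    "random access copy (read)": {
        "ram": 24, "gap1": 32, "gap2": 32,
        "to_ctrl": 112, "from_ctrl": 72, "mem": 96,
    },
    # N/A entries of the table are simply left out.
    "burst write": {"gap2": 8, "to_ctrl": 120, "from_ctrl": 72, "mem": 96},
}

# Totals as listed in the 'total' column.
TOTALS = {
    "random access copy (write)": 368,
    "random access copy (read)": 368,
    "burst write": 296,
}


def total_ns(phases):
    """Sum of all phases of one access in ns."""
    return sum(phases.values())


def memory_share(phases):
    """Fraction of the access spent in the PSRAM controller memory access."""
    return phases["mem"] / total_ns(phases)


for mode, phases in PHASES.items():
    print(f"{mode}: {total_ns(phases)} ns total, "
          f"{memory_share(phases):.0%} in the memory access")
```

In all three rows the memory access accounts for 96 ns, so most of the time per word is spent on the path between the lm32 and the PSRAM WB controller.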

Bandwidth
| Mode | Type | connection | bandwidth [Mbit/s] | Comment |
| random access (w) | register --> PSRAM | WB with 2 x-bars @ 62.5 MHz | 108 | measured |
| random access (r/w) | copy RAM <-> PSRAM | WB with 2 x-bars @ 62.5 MHz | 87 | measured |
| random access (r/w) | copy RAM <-> PSRAM | WB direct @ 62.5 MHz | 175 | extrapolated |
| burst (r/w) | N/A | DMA controller @ 62.5 MHz | 290 | extrapolated |
| burst (r/w) | N/A | DMA controller @ 125 MHz | 800 | theoretical maximum for direct connection DMA/PSRAM |
Table: Bandwidth for different modes and connection modes. Measured values for the present Exploder5 are given in the first two rows.
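The two measured bandwidths follow directly from the access times per 32-bit word (368 ns and 296 ns). A minimal Python sketch of the conversion in both directions; the function names are chosen for illustration:

```python
WORD_BITS = 32  # all numbers on this page are per 32-bit word


def bandwidth_mbit_s(access_ns, bits=WORD_BITS):
    """Bandwidth in Mbit/s for one word transferred every access_ns.

    1 bit/ns equals 1000 Mbit/s, hence the factor 1000.
    """
    return bits / access_ns * 1000.0


def access_ns(bandwidth, bits=WORD_BITS):
    """Access time in ns per word for a bandwidth given in Mbit/s."""
    return bits / bandwidth * 1000.0


# Measured access times (ns) converted to bandwidth.
for t in (368, 296):
    print(f"{t} ns per word -> {bandwidth_mbit_s(t):.0f} Mbit/s")

# Extrapolated and theoretical bandwidths converted back to access times.
for bw in (175, 290, 800):
    print(f"{bw} Mbit/s -> {access_ns(bw):.0f} ns per word")
```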

Figures

Measurements from lm32 CPU

As a complement to the measurements using SignalTap, some numbers have been determined using a simple program running on the lm32 of an Exploder5.
| matrix | register (w) | shared RAM (w) | PSRAM (w) |
| register (r) | N/A | 795 (40) | 99 (320) |
| shared RAM (r) | 443 (72) | 322 (96) | N/A |
| PSRAM (r) | 95 (336) | N/A | 51 (624) |
Table: Pseudo SRAM access measured using an lm32 program. Rows give the source that is read (r), columns the destination that is written (w). Numbers are per 32-bit word and given as bandwidth in MBit/s (in brackets: access time in nanoseconds). See the text for an explanation.

The numbers slightly underestimate the bandwidth (= overestimate the access time), as the overhead for timestamping and handling of 'for loops' is included. The measurement protocol is attached.
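The size of this overhead can be estimated by comparing a CPU value with the corresponding SignalTap value. The Python sketch below does this for a write from a register to PSRAM, assuming that the 320 ns from the lm32 program and the 296 ns seen with SignalTap (108 Mbit/s) describe the same kind of access. The function name is chosen for illustration.

```python
def loop_overhead(cpu_ns, signaltap_ns):
    """Return (absolute overhead in ns, overhead relative to SignalTap).

    cpu_ns:       access time per word as measured by the lm32 program
    signaltap_ns: access time per word as seen on the bus with SignalTap
    """
    extra = cpu_ns - signaltap_ns
    return extra, extra / signaltap_ns


# register (r) -> PSRAM (w): 320 ns from the lm32 program, 296 ns from SignalTap.
# Treating both as the same access is an assumption of this sketch.
extra_ns, relative = loop_overhead(320, 296)
print(f"overhead: {extra_ns} ns per word ({relative:.1%})")
```

Under this assumption, timestamping and loop handling add a few tens of nanoseconds per word, which is small compared to the PSRAM access itself.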

Conclusion

  • non-optimized: PSRAM provides random access at about 100 MBit/s (296 ns to 368 ns per 32-bit word).
  • optimized:
    • using a direct connection and a DMA controller will allow random access at up to 500 MBit/s (~72 ns per 32-bit word) or 800 MBit/s (~40 ns per 32-bit word).
    • PSRAM will then be as fast as 'direct access' of the lm32 to its own shared memory, which provides a bandwidth of up to 800 MBit/s (~40 ns).

Verdict

A direct connection and a DMA controller would provide about the same performance as access to lm32 RAM today.
A hardware implementation of true (faster) SRAM does not make much sense.
True SRAM would only be an option if, at some time in the future, one migrated to a different softcore architecture.

-- DietrichBeck, MathiasKreider - 21 Feb 2019
| Attachment | Size | Date | Who | Comment |
| lm32-ram-performance.txt | 862 bytes | 21 Feb 2019 - 09:27 | DietrichBeck | protocol of lm32 RAM access |
| psram2ram.png | 124 K | 21 Feb 2019 - 09:05 | DietrichBeck | random access (read) from PSRAM @ Exploder5 |
| ram2psram.png | 124 K | 21 Feb 2019 - 09:06 | DietrichBeck | random access (write) to PSRAM @ Exploder5 |
| reg2psram.png | 120 K | 21 Feb 2019 - 09:06 | DietrichBeck | burst (write) to PSRAM @ Exploder5 |
Topic revision: r3 - 21 Feb 2019, DietrichBeck