The SPI bus is a de facto standard for transmitting data to and from devices. In its most basic state, it has three wires: MOSI (Master Out, Slave In), MISO (Master In, Slave Out), and a clock signal. Each chip also has a Chip Select pin to indicate whether you are talking to this specific device or not. With a single device on the bus, you can simply tie it low and talk only to that device.
Of course, multiple devices can be added. In this case, each device needs its own Chip Select pin, so the total number of pins rises to 3 + the number of devices. One accelerometer and one flash chip require MOSI, MISO, Clock, and two Chip Select pins.
When data throughput becomes a necessity, you can choose a more advanced option, such as QuadSPI, OctoSPI, or even HexadecaSPI, which has a whopping 16 data lines.
The STM32H563ZI (available on the Nucleo-H563ZI) has one OctoSPI interface. You can place a single OctoSPI device on the bus and perform reads and writes directly through the HAL, or put two devices on it and handle the Chip Select yourself. Some microcontrollers, like the STM32H723 (available on the Nucleo-H723ZG), have two OctoSPI interfaces. This allows you to use two OctoSPI devices, each with eight data lines, a Clock, and a hardware-managed Chip Select. The number of reserved GPIOs rises to 10 per interface: IO0 to IO7, a Clock, and a Chip Select pin. To have two OctoSPI devices, you would theoretically require 20 GPIOs reserved solely for the OctoSPI function. Why would you need two chips in the first place? Maybe one of those devices is a NAND flash chip and the second one is external SRAM, giving you storage space for images in the most demanding graphics applications.
Use case
My client is using an STM32U585, a beast of a chip with 2 megabytes of flash and a hardware encryption engine, all running at 160 MHz. The microcontroller is in the U family, short for ultra-low power. The specific need is to have two flash modules, each a QuadSPI flash device with 128 megabits of storage. The immediate question is: why? Why not use a single 256-megabit device? The answer is simple: this particular component is used in multiple other projects, and adding another component reference is a bad idea (testing, validation, and keeping stock). Since this device has no strict low-power or size constraints, the solution was to use two flash chips.
When explaining the use of two OctoSPI devices above, I left out one critical piece of information. The H723, and indeed the U585, both have dual OctoSPI interfaces, and those interfaces can be multiplexed. That means the data lines (IO0 to IO7) and the clock are shared between the two devices, while each OctoSPI interface keeps its own Chip Select pin. Only the two Chip Select pins are needed to determine which chip is being addressed, so two OctoSPI devices need 11 GPIOs in total: eight data lines, one clock, and two Chip Selects. Even more important, the devices being used here are only QuadSPI, meaning that we will be using IO0 to IO3, one clock, and two Chip Select pins, for a total of 7 GPIOs to handle two devices, which is more than reasonable.
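To make the pin arithmetic explicit, here is a minimal sketch in C. The function name `xspi_pin_count` is my own, not part of any ST library, and it only counts data lines, clock, and Chip Select (the optional DQS line is left out).

```c
#include <stdbool.h>

/* Number of GPIOs needed for `devices` memories with `data_lines` IO lines each.
 * Dedicated:   every device gets its own data lines, clock, and Chip Select.
 * Multiplexed: data lines and clock are shared; only Chip Select is per device.
 * Hypothetical helper for illustration; the optional DQS pin is not counted. */
unsigned xspi_pin_count(unsigned data_lines, unsigned devices, bool multiplexed)
{
    if (multiplexed) {
        /* shared IO + one shared clock + one Chip Select per device */
        return data_lines + 1u + devices;
    }
    /* IO + clock + Chip Select, repeated for every device */
    return devices * (data_lines + 1u + 1u);
}
```

With these assumptions, two dedicated OctoSPI devices come to 20 pins, two multiplexed OctoSPI devices to 11, and two multiplexed QuadSPI devices to 7.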
Multiplexing devices like this saves IO, but it also has disadvantages. By definition, it isn’t possible to talk to the two devices simultaneously, but this isn’t a requirement for this project, so the multiplexed SPI solution works fine. Besides, these flash chips can have reasonably long write cycles, so while one chip is busy with its internal write cycle, we can send data to the second flash chip, speeding up the process.
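The interleaving argument can be illustrated with a small timing model. This is only a sketch: `total_write_time` is a hypothetical helper, the time units are arbitrary, and it assumes pages alternate between the chips, which a real driver may or may not do.

```c
#include <stdint.h>

/* Simulate writing `pages` pages over one shared bus to 1 or 2 chips.
 * transfer_time: how long the bus is occupied sending one page (assumed value).
 * program_time:  how long a chip is busy internally afterwards (assumed value).
 * Returns the time at which the last chip finishes, or 0 for invalid input. */
uint32_t total_write_time(uint32_t pages, uint32_t chips,
                          uint32_t transfer_time, uint32_t program_time)
{
    uint32_t chip_ready[2] = {0u, 0u}; /* when each chip ends its internal write */
    uint32_t bus_free = 0u;            /* when the shared bus is next available */
    uint32_t finished = 0u;

    if (chips < 1u || chips > 2u) {
        return 0u;
    }
    for (uint32_t i = 0u; i < pages; i++) {
        uint32_t chip = i % chips;     /* alternate between the chips */
        /* A transfer needs both the bus and the target chip to be idle. */
        uint32_t start = (bus_free > chip_ready[chip]) ? bus_free : chip_ready[chip];
        bus_free = start + transfer_time;
        chip_ready[chip] = bus_free + program_time;
        if (chip_ready[chip] > finished) {
            finished = chip_ready[chip];
        }
    }
    return finished;
}
```

With a transfer time of 1 and a program time of 3, four pages take 16 units on one chip and 9 units when alternated across two chips, because the bus is free while each chip programs.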
Configuration
We will use a Nucleo board, more precisely, the Nucleo-U575ZI-Q, to perform a proof of concept. The difference between the U575 and U585 is the flash size and the lack of an embedded hardware encryption engine, which are of little importance for a proof of concept design. Using STM32CubeMX, it was simple enough to create a single Quad-SPI interface.
To get this up and running, we need to select the mode. I’ve created a standard configuration to compare the standard and multiplexed versions. A single chip is used, connected to the QuadSPI port, so I’m using QuadSPI, and not the multiplexed version. Next comes the clock configuration. This device has two OctoSPI ports, so we have the choice between Port1 and Port2. Port1 it is. The same goes for the Chip Select: I’m using Port1. Finally, the data lines. I’m only using four of the eight lines, so I’m using Port1 IO0 to IO3. This is the standard configuration you would see on any Quad-SPI or Octo-SPI device. When the device is configured, we have a GPIO setting that looks like this:
Now, let’s look at my use case: two devices. We can use the traditional double OctoSPI interface, essentially doubling the number of lines needed. There is, however, another way: the multiplexed version. Let’s change the configuration to the following:
Now, our mode has the “Multiplexed” tag at the end. Our clock and Data lines are all multiplexed since they will be shared between two devices. The Chip Select line, however, isn’t, for obvious reasons. Our GPIO configuration looks identical, as we don’t use additional pins.
Once this is done, we can move on to OctoSPI2. Configuring this peripheral is extremely easy since everything is greyed out except for one choice. We will be using QuadSPI, multiplexed mode. We will use the same clock, Port1, and the same data lines, Port1. As for Chip Select, the only choice is Port2. Looking at the GPIO usage for OCTOSPI2, we can see a single line:
Finally, we need to configure the chip itself. This particular device falls into the Micron category and is a flash chip. For the size, we need to enter 24. This doesn’t correspond to the number of megabits or megabytes, but rather to the number of address bits, in two to the power of x format. Two to the power of 24 is 16777216 bytes, or 16384K, which is 128 megabits. Interestingly, on my version of CubeMX, you have to enter a numerical value for the U5. However, the H5 uses a different strategy, using a pull-down list with human-readable values (128MBits, which equates to HAL_XSPI_SIZE_128MB).
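If you want to double-check the value for another chip, the conversion is easy to sketch. The helper below is hypothetical (it is not a HAL function) and assumes, as described above, that the field is the base-2 logarithm of the capacity in bytes.

```c
#include <stdint.h>

/* Convert a flash capacity in megabits to the power-of-two exponent of its
 * size in bytes. Returns 0 if the byte capacity is not a power of two.
 * Hypothetical helper for illustration only. */
uint32_t device_size_exponent(uint32_t megabits)
{
    uint64_t bytes = ((uint64_t)megabits * 1024u * 1024u) / 8u;
    uint32_t exponent = 0u;

    /* Reject zero and anything that is not an exact power of two. */
    if (bytes == 0u || (bytes & (bytes - 1u)) != 0u) {
        return 0u;
    }
    while ((bytes >> exponent) > 1u) {
        exponent++;
    }
    return exponent;
}
```

For the 128-megabit chips used here this gives 24, and a single 256-megabit device would give 25.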
Code
By default, CubeMX generates two OctoSPI interfaces, &hospi1 and &hospi2. Reading and writing are performed as normal, using two different OctoSPI interfaces. Our current driver implementation takes a virtual address and then performs HAL calls using one device or the other. If one device is busy on the bus, the other will not launch a command, returning that the driver is currently unavailable.
HAL_OSPI_Command(&hospi1, &command, HAL_OSPI_TIMEOUT_DEFAULT_VALUE);
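The client's driver is not shown here, but the idea of routing a virtual address and refusing to start while the bus is busy can be sketched as follows. Every name in this block (`flash_route`, `flash_bus_acquire`, `flash_status_t`, and so on) is hypothetical, and the layout assumes the two 128-megabit chips are placed back to back in the virtual address space.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CHIP_SIZE_BYTES (1u << 24)  /* 128 Mbit per chip */
#define CHIP_COUNT      2u

typedef enum { FLASH_OK, FLASH_BUSY, FLASH_BAD_ADDRESS } flash_status_t;

typedef struct {
    uint32_t chip;    /* 0 would use &hospi1, 1 would use &hospi2 */
    uint32_t offset;  /* address inside the selected chip */
} flash_target_t;

/* One flag for the whole bus: the IO lines and clock are shared. */
static atomic_flag bus_in_use = ATOMIC_FLAG_INIT;

/* Map a virtual address onto a chip index and an offset in that chip. */
flash_status_t flash_route(uint32_t vaddr, flash_target_t *target)
{
    if (vaddr >= CHIP_COUNT * CHIP_SIZE_BYTES) {
        return FLASH_BAD_ADDRESS;
    }
    target->chip = vaddr / CHIP_SIZE_BYTES;
    target->offset = vaddr % CHIP_SIZE_BYTES;
    return FLASH_OK;
}

/* Try to claim the shared bus; report FLASH_BUSY instead of waiting. */
flash_status_t flash_bus_acquire(void)
{
    return atomic_flag_test_and_set(&bus_in_use) ? FLASH_BUSY : FLASH_OK;
}

/* Release the bus once the HAL call on the selected handle has completed. */
void flash_bus_release(void)
{
    atomic_flag_clear(&bus_in_use);
}
```

A caller would route the address, acquire the bus, issue the HAL command on the matching handle, and release the bus afterwards.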
One difference here resides in memory mapping. Of course, these peripherals allow you to use memory mapping, but the two devices are in entirely different regions, which is understandable under the circumstances. While I would have liked an option to make the two devices mappable on a contiguous region, it isn’t practical if one chip is NAND flash and the other is SRAM. So, two different regions it is, then. The memory view looks like this (also showing off the impressive 768KB of internal SRAM):
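Since the two windows are not contiguous, a thin translation layer can hide the gap. The sketch below is an illustration only: `flash_mapped_address` is a hypothetical helper, and the two base addresses are my assumption of where the OCTOSPI1 and OCTOSPI2 windows sit on this family, so check them against the reference manual or the device header before relying on them.

```c
#include <stdint.h>

#define FLASH_CHIP_BYTES  (1u << 24)   /* 128 Mbit per chip */
#define OSPI1_MAPPED_BASE 0x90000000u  /* assumed OCTOSPI1 window, verify for your part */
#define OSPI2_MAPPED_BASE 0x70000000u  /* assumed OCTOSPI2 window, verify for your part */

/* Translate a contiguous virtual offset (0 .. 32 MiB) into the CPU address
 * inside the matching memory-mapped window. Returns 0 when out of range. */
uint32_t flash_mapped_address(uint32_t vaddr)
{
    if (vaddr < FLASH_CHIP_BYTES) {
        return OSPI1_MAPPED_BASE + vaddr;
    }
    if (vaddr < 2u * FLASH_CHIP_BYTES) {
        return OSPI2_MAPPED_BASE + (vaddr - FLASH_CHIP_BYTES);
    }
    return 0u;
}
```

A read that crosses the 16 MiB boundary would still have to be split in two by the caller, since the addresses jump from one window to the other.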
The good and the bad
The best point about this, and the entire reason why multiplexed SPI exists in the first place, is pin usage. Adding a second chip requires only one more GPIO, the Chip Select line. That goes for standard SPI, QuadSPI and OctoSPI; one line is all you need. Of course, such an advantage comes with hindrances, the first being speed. Multiplexing changes nothing regarding bus speed, but while talking to one chip, you cannot perform background DMA transfers to the other. Speed wasn’t an issue for this particular client, so this isn’t a problem. The memory mapping location might be a problem for some, but once again, for this project, drivers and planning made using two separate chips less of a hassle (there would be a write operation spanning two devices).