64K: *That* ought to be enough for everyone.

DMA-based SPI on SHARC ADSP-21489

If you have resorted to Google after having uselessly pored over AD’s reference manuals, hoping to find at least a proper hint about how to approach this, if not a bit of code, it’s OK, you can stop now. It’s right here. Your quest is complete.

In the hope that no poor soul will have to bang his head against the desk trying to piece together the countless pieces of this puzzle, I did a quick write-up to show you how to do SPI communication with DMA on ADSP-21489. This should probably work (with obvious adaptation) on any DSP in the same family, and it should be easy to adapt if your DSP is the SPI slave, not the master, as below.

Programmer’s view of the SPI peripheral

Ignore the mumbo-jumbo about MOSI, MISO and all that stuff that hardware engineers mumble about, because we will not care about it here. To the programmer, the SPI interface basically looks like this:

  • There are two buffers, one for RX, one for TX. These will be either TXSPI and RXSPI for core-driven transfers (i.e. without DMA) or two buffers of your own choosing, and the IISPI register points at either one of them, if you’re using DMA (why there’s only one pointer will become obvious in a minute).
  • The buffers are filled and, respectively, emptied, synchronously, based on a clock that runs at a frequency given through the SPIBAUD register…
  • …and only when the SPI slave is selected, which you can do automatically by poking the relevant bits in SPIFLG.
    (Note: if you are dealing with a weird peripheral, you can also drive CS manually, via a GPIO pin, but we won't cover that here).
  • Transfers can be 8, 16 or 32 bits long, although SHARC uses a somewhat non-obvious packing scheme.

Various parameters about how this happens can be adjusted through the SPICTL register.

If you’re using DMA, the programmer’s view of the transfer process is slightly complicated by the fact that there is a single DMA channel, so a DMA transfer either transmits or receives, never both. This is workable (SPI is full-duplex on the wire, but a command-and-response device rarely needs meaningful data to flow in both directions at once), but you need to write a little more boilerplate because the setup is different for DMA receive and DMA transmit.

The DMA system works by independently walking over a user-provided buffer; it keeps an index that starts at 0 (i.e. the beginning of the buffer), and it increases it by a user-supplied value after each SPI transfer. It looks very much like this, except the CPU core is not involved in any of it: you just tell the DMA controller when to start a transfer, and you get a notification (via an interrupt or a bit in a control register) when it's done:

    for (dma_idx = 0; dma_idx < dma_count; dma_idx += dma_increment)
        dma_transfer(&user_buffer[dma_idx]);

You have control over user_buffer, dma_count and dma_increment above. You also have some control over how much is transferred (received or transmitted) in dma_transfer, but that depends on the peripheral you’re using. For SPI, you can set the length of one transfer to be 8, 16 or 32 bits. This suggests that any value other than 1 for dma_increment is meaningless, but it’s not always true. E.g. if your SPI peripheral is a 4-channel ADC, you can use dma_increment = 4 to get the samples interleaved in a single buffer.
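To make the loop above concrete, here is a minimal host-side sketch of the same walk. It is a model of the description in this article, not SHARC code: `dma_walk` and `incoming` are names made up for illustration, and each "transfer" simply stores the next incoming word at `user_buffer[dma_idx]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the DMA walk described above (hypothetical helper,
 * not part of any AD library). One incoming word is stored per transfer;
 * the return value is the number of transfers that were made. */
static size_t dma_walk(uint32_t *user_buffer, size_t dma_count,
                       size_t dma_increment, const uint32_t *incoming)
{
    size_t transfers = 0;

    assert(dma_increment > 0); /* a zero stride would never terminate */

    for (size_t dma_idx = 0; dma_idx < dma_count; dma_idx += dma_increment)
        user_buffer[dma_idx] = incoming[transfers++];

    return transfers;
}
```

With an increment of 4, two walks that start one word apart fill slots 0, 4, ... and 1, 5, ... of the same buffer, which is the kind of strided layout the ADC example refers to.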

Overall, the whole process looks like this:

  1. General setup: set baud rate, automatic/manual slave select, flush TX, RX buffers and DMA FIFO, set transmission parameters (i.e. who’s master and who’s slave, what interrupts to get, how long are the words being transferred and so on. We’ll get to a more detailed example in a minute).
  2. Set up DMA:
    1. Tell the DMA controller where to get the data from, by setting IISPI to point at your TX buffer
    2. Tell the DMA controller how much to increment its index by, by setting IMSPI. Do note that IMSPI is given in words, not bytes: the next transfer will be done from address <IISPI> + <current index> (i.e. make sure your buffers are word-aligned!).
    3. Tell the DMA controller how many transfers to make by setting CSPI to the desired upper limit: when the DMA index reaches CSPI, the transfer stops and you get notified through an interrupt.
  3. Send command from master: enable the SPI peripheral, then enable DMA in TX mode (i.e. leaving the SPIRCV bit of the SPIDMAC register unset) to send the data.
  4. Wait for the transfer to complete: you can either be notified of that through an interrupt, or just poll the SPIFE bit of the SPISTAT register. Your choice.
  5. Transfer answer from device: disable the SPI peripheral, flush the DMA FIFO and set up DMA again:
    1. Tell the DMA controller where to put the data, by setting IISPI to point at your RX buffer
    2. Tell the DMA controller how much to increment its index by, by setting IMSPI.
    3. Tell the DMA controller how many transfers to make by setting CSPI to the desired upper limit
      (Note: AD’s documentation points out a way to do this without disabling the SPI peripheral, but I haven't tried it. It’s probably the only part of their SPI-related documentation that makes sense.)
  6. Enable the SPI peripheral, then enable DMA in RX mode (i.e. set the SPIRCV bit of the SPIDMAC register)
  7. Wait for the transfer to complete. This is done in the same manner as above.

That’s it.
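Steps 4 and 7 ("wait for the transfer to complete") are where a polling loop can hang forever if the peripheral never reports completion. Here is a rough sketch of a bounded poll, assuming you pass in a pointer to the status register and the completion bit yourself (on the real part that would be SPISTAT and SPIFE from AD's headers). The helper name `spi_wait_done` and the `max_polls` limit are my own inventions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Poll *status until any bit in done_mask is set, giving up after
 * max_polls reads. Returns true if the bit was seen, false on timeout.
 * Hypothetical helper: the register and mask are supplied by the caller. */
static bool spi_wait_done(const volatile uint32_t *status, uint32_t done_mask,
                          unsigned long max_polls)
{
    assert(status != NULL);

    while (max_polls-- > 0) {
        if (*status & done_mask)
            return true;
    }

    return false;
}
```

A caller would treat a `false` return as an error and reset the port rather than spin indefinitely. What a sensible poll limit looks like depends on your baud rate and transfer length, so I will not suggest a number.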

The Code

There is no finished technical documentation that does not have code. Never trust documentation that does not come with code examples. If it does not have any code, the documentation has probably been out of date for years now.

The code below has no portability layer. If you want to use it in a real-life project, you will probably want to modify it so that:

  • Transmission parameters (e.g. baud rate) can be supplied by the user in a standard format
  • Specific RX and TX functions are provided. Depending on your application and personal taste, you may want to keep the polling loop and subsequent port resetting as a separate function, or seamlessly integrate them in specific RX and TX functions. I opted for the version that’s messier, but makes the algorithm more obvious.

We’ll start by writing a type definition of a SHARC SPI port

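        #include <stdint.h>

        /* Pointers to the memory-mapped registers of one SPI port */
        struct sharc_spi_port {
            volatile uint32_t *spictl;
            volatile uint32_t *spiflg;
            volatile uint32_t *spibaud;
            volatile uint32_t *spidmac;
            volatile uint32_t *spistat;
            volatile uint32_t *iispi;
            volatile uint32_t *imspi;
            volatile uint32_t *cspi;
            volatile uint32_t *cpspi;
            volatile uint32_t *txspi;
            volatile uint32_t *rxspi;
        };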

This can be used to statically initialize two variables, struct sharc_spi_port spi_a, spi_b, so that we don’t end up writing port-specific code.
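As an illustration of that static initialization, here is a sketch that points two ports at a mock register bank, so the rest of the code can be exercised on a host machine. The `mock_spi_regs` struct and the `MOCK_PORT` macro are hypothetical; on real hardware each pointer would instead hold the corresponding register address from AD's headers. The port struct is repeated here, with volatile-qualified pointers, only so that the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

struct sharc_spi_port {
    volatile uint32_t *spictl, *spiflg, *spibaud, *spidmac, *spistat;
    volatile uint32_t *iispi, *imspi, *cspi, *cpspi, *txspi, *rxspi;
};

/* Hypothetical stand-in for one port's registers (host-side testing only) */
struct mock_spi_regs {
    uint32_t spictl, spiflg, spibaud, spidmac, spistat;
    uint32_t iispi, imspi, cspi, cpspi, txspi, rxspi;
};

/* Build a port description whose pointers target the mock bank m */
#define MOCK_PORT(m) {                                            \
    .spictl = &(m).spictl,   .spiflg = &(m).spiflg,               \
    .spibaud = &(m).spibaud, .spidmac = &(m).spidmac,             \
    .spistat = &(m).spistat, .iispi = &(m).iispi,                 \
    .imspi = &(m).imspi,     .cspi = &(m).cspi,                   \
    .cpspi = &(m).cpspi,     .txspi = &(m).txspi,                 \
    .rxspi = &(m).rxspi }

static struct mock_spi_regs mock_a, mock_b;

/* One statically initialized description per port */
static struct sharc_spi_port spi_a = MOCK_PORT(mock_a);
static struct sharc_spi_port spi_b = MOCK_PORT(mock_b);
```

Code written against `struct sharc_spi_port` then works unchanged with either the mock or the real registers, which is the point of describing a port through pointers.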

Based on the above, the init function could look as follows:

        static void sharc_spi_init(struct sharc_spi_port *p)
        {
            /* Baud rate divisor */
            *(p->spibaud) = 4;

            /* SPI Flag 0 is select output */
            *(p->spiflg) = DS0EN;

            /* Clear TX and RX buffers */
            *(p->spictl) = TXFLSH | RXFLSH;

            /* Clear DMA FIFO */
            *(p->spidmac) = FIFOFLSH;

            /* Clear SPI status reg */
            *(p->spistat) = 0xFF;

            /* Set DMA chain parameters (no chaining) */
            *(p->cpspi) = 0;

            /*
             * - Initiate transfer by DMA
             * - If RX buffer is full, get more data, overwrite previous
             *   contents
             * - No delay between word transfers
             * - When TX buffer is empty, send 0
             * - SPI word length is 16 bits
             * - Send words MSB first
             * - DSP is SPI Master
             */
            *(p->spictl) = TIMOD2 | GM | SENDZ | WL16 | MSBF | SPIMS;
        }

We’re going to use a single buffer called spi_samples for both TX and RX. You can use separate buffers if you need to; it not only simplifies things a lot, but it also means that you can process a buffer while transmitting the next command.

        static void spi_dma_send(struct sharc_spi_port *p, uint32_t *data,
                                 unsigned int len)
        {
            /* len counts words, memcpy wants a size in units of char */
            memcpy(&spi_samples[0], data, len * sizeof(*data));

            crt_spi_count = len;

            /* DMA starts at the beginning of the buffer */
            *(p->iispi) = (uint32_t)&spi_samples[0];
            *(p->imspi) = 1; /* Each read increments DMA index reg by 1 */
            *(p->cspi) = crt_spi_count; /* Transfer this many words */

            /* Enable SPI. Needs to be done before enabling DMA */
            *(p->spictl) |= SPIEN;

            /* Enable DMA with interrupt on transfer, transfer from internal memory */
            *(p->spidmac) = SPIDEN | INTEN;
        }

The corresponding RX function would be:

        static void spi_dma_recv(struct sharc_spi_port *p, unsigned int len)
        {
            /* Start from a clean buffer */
            memset(&spi_samples[0], 0, sizeof(spi_samples));

            crt_spi_count = len;

            /* DMA starts at the beginning of the buffer */
            *(p->iispi) = (uint32_t)&spi_samples[0];

            /* Each write increments DMA index reg by 1 */
            *(p->imspi) = 1;

            /* Transfer this many words */
            *(p->cspi) = crt_spi_count;

            /* Enable SPI. Needs to be done before enabling DMA */
            *(p->spictl) |= SPIEN;

            /* Enable DMA with interrupt on transfer, transfer into internal
             * memory */
            *(p->spidmac) = SPIDEN | INTEN | SPIRCV;
        }

And we can piece this together in a simple echo loop for a fictitious SPI device that echoes back everything it gets when you send it 0xCAFE:

        static void rhd_query(struct device_drv *drv)
        {
            uint32_t cmd = 0xCAFE;

            while (1) {
                /* (Re)configure the port: the reset at the end of each pass
                 * clears SPICTL */
                sharc_spi_init(drv->spi_port);
                spi_dma_send(drv->spi_port, &cmd, 1);

                /* Change from TX to RX DMA. */
                /* You will want to integrate this in a separate function, or in
                 * spi_dma_{send|recv} */

                while (!(*(drv->spi_port->spistat) & SPIFE)) {
                    /* Twiddle thumbs, yield to another task etc. */
                }

                /* Disable SPI and DMA, clear status */
                *(drv->spi_port->spictl) = 0x00;
                *(drv->spi_port->spidmac) = 0x00;
                *(drv->spi_port->spistat) = 0xFF;

                sharc_spi_init(drv->spi_port);
                spi_dma_recv(drv->spi_port, 1);

                while (!(*(drv->spi_port->spistat) & SPIFE)) {
                    /* Twiddle thumbs, yield to another task etc. */
                }

                /* Disable SPI and DMA, clear status */
                *(drv->spi_port->spictl) = 0x00;
                *(drv->spi_port->spidmac) = 0x00;
                *(drv->spi_port->spistat) = 0xFF;
            }
        }

That’s about it. I trust the astute reader will be able to work out more serious real-life requirements (i.e. doing something useful while the DMA transfer is happening) on their own.
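For what it's worth, one possible shape for "doing something useful" is a small state machine that you poll from your main loop instead of spinning on the status register. The sketch below is only that, a sketch: every name in it (`spi_xfer`, `spi_xfer_poll`, the callbacks and the mock context) is hypothetical. On the DSP, `start_tx` and `start_rx` would wrap the send and receive functions above, including the port reset between the two phases, and `is_done` would test SPIFE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum spi_xfer_state { SPI_XFER_IDLE, SPI_XFER_TX, SPI_XFER_RX, SPI_XFER_DONE };

/* One command/response exchange, driven by caller-supplied callbacks */
struct spi_xfer {
    enum spi_xfer_state state;
    void (*start_tx)(void *ctx); /* kick off the TX DMA */
    void (*start_rx)(void *ctx); /* reset the port, kick off the RX DMA */
    bool (*is_done)(void *ctx);  /* has the current DMA finished? */
    void *ctx;
};

/* Advance the exchange by at most one step; never blocks */
static enum spi_xfer_state spi_xfer_poll(struct spi_xfer *x)
{
    switch (x->state) {
    case SPI_XFER_IDLE:
        x->start_tx(x->ctx);
        x->state = SPI_XFER_TX;
        break;
    case SPI_XFER_TX:
        if (x->is_done(x->ctx)) {
            x->start_rx(x->ctx);
            x->state = SPI_XFER_RX;
        }
        break;
    case SPI_XFER_RX:
        if (x->is_done(x->ctx))
            x->state = SPI_XFER_DONE;
        break;
    case SPI_XFER_DONE:
        break;
    }
    return x->state;
}

/* Mock callbacks for trying the state machine on a host machine */
struct mock_ctx { int tx_started; int rx_started; bool done; };

static void mock_start_tx(void *c)
{
    struct mock_ctx *m = c;
    m->tx_started++;
    m->done = false;
}

static void mock_start_rx(void *c)
{
    struct mock_ctx *m = c;
    m->rx_started++;
    m->done = false;
}

static bool mock_is_done(void *c)
{
    return ((struct mock_ctx *)c)->done;
}
```

The main loop calls `spi_xfer_poll` once per pass and does other work in between; when it returns `SPI_XFER_DONE`, the receive buffer is ready to be processed.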

NOTE: This article is republished from my old website, in the interest of preserving it after my Geocities-era personal place on the Internet is gone.