ESP8266 SPI Flash protocol

I was wondering whether the ESP8266 SPI flash could be used for atomic (or sort-of-atomic) write operations - so I wanted to understand how the underlying SPI operations work.

TLDR

API call	SPI command	Note
`spi_flash_erase_sector`	`0x20`	Repeated status reads (command `0x05`) until erase completes
`spi_flash_read`	`0xBB`	Dual SPI
`spi_flash_write`	`0x02`	Single status read (command `0x05`), which reported completion immediately

Read on to learn more...

Some of the different SPI flash chips found on ESP8266 modules

The ESP-12F on my WeMos D1 Mini uses a Winbond W25Q32BV
The SparkFun Thing uses an Adesto AT25SF041
A black ESP-01 I bought on eBay uses a Berg Micro 25q80a

These different chips seem to have enough commands in common that they can be substituted for one another.

Dual and Quad SPI

Traditional SPI has 'MOSI' and 'MISO' lines, with data travelling in opposite directions. If you don't know what MOSI and MISO mean, go and review the Wikipedia article on SPI first, otherwise the rest of this explanation won't make much sense!

The flash chips listed above support 'Dual SPI' which basically means, instead of MOSI and MISO always going in opposite directions, certain commands cause both lines to temporarily be used in the same direction. Here's an example:

The first byte on MOSI is 10111011 or 0xBB in hex - what the Winbond datasheet calls "Fast Read Dual I/O".

When that command is received, MISO changes direction for 16 clock cycles, so the master can output 24 bits of address in 12 clock cycles, followed by 8 bits of control options in 4 clock cycles. Each clock carries two bits, so you read the bits alternately, MISO-MOSI-MISO-MOSI, which makes the address above 001111111011000000000000 or 0x3FB000 and the control options 0x00.

After this, MOSI changes direction and MISO returns to its usual direction, so both lines are used by the slave to send data to the master. In the example above, the data returned by the slave is all ones, or 0xFFFFFFFF.
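To make the bit interleaving concrete, here's a rough sketch of how the bytes captured on the two lines could be merged back into ordinary bytes. The function name `mergeDualLines` is my own invention for illustration, and it assumes the line normally called MISO carries the higher bit of each pair, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: merge the bytes seen on the two data lines during a
// Dual I/O phase. 'miso' and 'mosi' hold one byte per 8 clocks, MSB first.
// On each clock the MISO line carries the higher bit of the pair and the
// MOSI line the lower bit, so 8 clocks produce 16 bits (two output bytes).
std::vector<uint8_t> mergeDualLines(const std::vector<uint8_t>& miso,
                                    const std::vector<uint8_t>& mosi) {
  assert(miso.size() == mosi.size());
  std::vector<uint8_t> out;
  for (std::size_t i = 0; i < miso.size(); ++i) {
    uint16_t word = 0;
    for (int bit = 7; bit >= 0; --bit) {
      const uint16_t hi = (miso[i] >> bit) & 1u;
      const uint16_t lo = (mosi[i] >> bit) & 1u;
      word = static_cast<uint16_t>((word << 2) | (hi << 1) | lo);
    }
    out.push_back(static_cast<uint8_t>(word >> 8));
    out.push_back(static_cast<uint8_t>(word & 0xFF));
  }
  return out;
}
```

Feeding it the line bytes from the read capture in the Results section (7C 00 on MISO, 74 00 on MOSI) gives 3F B0 00 00, which is the address 0x3FB000 followed by the 0x00 control byte.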

Quad SPI adds an extra two lines between master and slave, allowing four bits to be transferred per clock cycle. When the Arduino IDE offers the choice between flash modes 'DIO' and 'QIO', this is what you're choosing between (whether you'll have the choice depends on the board you have selected).

The practical speed benefits of Quad SPI over Dual SPI seem surprisingly modest: speed test results reported on esp8266.com show read times reducing by only 14% when changing from dual to quad SPI.

Programs execute from SPI flash

As you may know, user programs can be up to a megabyte, but there are only 64 kilobytes of instruction memory. This means quite a lot of data gets shuttled over the SPI bus under normal operation.

I haven't looked into the details, but I assume the ESP8266 does some sort of multitasking, so it can look after all the wifi maintenance stuff in the background while the user's program runs. Anyway, in my tests I found that even when I put a long delay in my program, the SPI bus wasn't always silent during that time.

How I tested the SPI behaviour

Initially, I tried to use a 1.5MHz bandwidth two-channel scope to record MOSI and MISO, inferring the clock transitions from changes in the data lines. This didn't work very well, for three reasons. You can't capture a 40MHz bus with a 1.5MHz scope. If you slow the bus down far enough to capture it, your program will keep resetting with watchdog timer failures. And even if you ignore those, you'll most likely get a recording of code being loaded, as there's a lot of that going on over the bus.

So I moved to a 20MHz bandwidth 8-channel scope. This allowed me to also capture clock, chip select, and a pin I configured to act as a trigger when my test started.

I used an ESP-12F on a breakout board that exposes all pins (including the SPI flash pins)

Here's the program I used:

extern "C" {
#include "spi_flash.h" // Provides SPI_FLASH_SEC_SIZE (usually 4096)
}

extern "C" uint32_t _SPIFFS_end; // Usually points to 0x405FB000
const uint32_t START_ADDRESS_BYTES = (uint32_t)&_SPIFFS_end - 0x40200000;
const uint32_t START_ADDRESS_FLASH_SECTORS = START_ADDRESS_BYTES/SPI_FLASH_SEC_SIZE;

const int led = 2;
const int LED_ON = LOW;
const int LED_OFF = HIGH;

void setup() {
  pinMode(led, OUTPUT);
  digitalWrite(led, LED_OFF);
  Serial.begin(115200);
  Serial.println("Started.");
}

void loop() {
  delay(500);
  Serial.println("Testing SPI flash...");

  noInterrupts();
  digitalWrite(led, LED_ON); // We can trigger our scope on this.

  // Reduce SPI clock speed so it fits in our scope's bandwidth
  uint32_t clkbefore = SPI0CLK;
  SPI0CLK = 0x00D43002;

  spi_flash_erase_sector(START_ADDRESS_FLASH_SECTORS);
  uint32_t readBuf[1] = {0};
  spi_flash_read(START_ADDRESS_BYTES, readBuf, sizeof(readBuf));
  uint32_t writeBuf[1] = {0xF0F0F0F0};
  spi_flash_write(START_ADDRESS_BYTES, writeBuf, sizeof(writeBuf));

  SPI0CLK = clkbefore; // Restore SPI clock speed.
  digitalWrite(led, LED_OFF);
  interrupts();
}
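For the curious, here's a sketch of how the value written to SPI0CLK might break down. It assumes the clock register layout used by the ESP8266 Arduino core's SPI clock helpers (6-bit L, 6-bit H, 6-bit N, 13-bit pre-divider, and a top bit meaning "same as source clock") also applies to the flash controller, and that the source clock is 80MHz. Both are assumptions on my part, and the struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed field layout of the SPI clock register (not verified for SPI0).
struct SpiClockFields {
  uint32_t l;         // bits 0-5
  uint32_t h;         // bits 6-11
  uint32_t n;         // bits 12-17: divider minus one
  uint32_t pre;       // bits 18-30: pre-divider minus one
  bool equalsSource;  // bit 31: run at the source clock, ignore dividers
};

SpiClockFields decodeSpiClock(uint32_t reg) {
  SpiClockFields f;
  f.l = reg & 0x3Fu;
  f.h = (reg >> 6) & 0x3Fu;
  f.n = (reg >> 12) & 0x3Fu;
  f.pre = (reg >> 18) & 0x1FFFu;
  f.equalsSource = (reg >> 31) != 0;
  return f;
}

// Resulting bus frequency in Hz under the assumed layout.
uint32_t spiClockHz(uint32_t reg, uint32_t sourceHz) {
  assert(sourceHz > 0);
  const SpiClockFields f = decodeSpiClock(reg);
  if (f.equalsSource) return sourceHz;
  return sourceHz / ((f.pre + 1) * (f.n + 1));
}
```

Under those assumptions, 0x00D43002 decodes to a pre-divider of 54 and a divider of 4, so spiClockHz(0x00D43002, 80000000) works out to roughly 370kHz, comfortably inside the scope's bandwidth.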

Results

Sector erase

spi_flash_erase_sector is called to erase sector 0x3FB, producing the command 20 3F B0 00 on MOSI, followed by repeated status checks (MOSI 05 00, with MISO returning FF 02 and FF 03) until the status reports that the operation is complete (MISO FF 00).
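As a sanity check on those bytes, here's a small sketch that rebuilds the erase command from the sector number and picks apart the status byte. The names are hypothetical. The bit meanings (bit 0 is BUSY, bit 1 is the write enable latch) are taken from the Winbond datasheet's description of its first status register, and I'm assuming the other chips behave the same way.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint32_t kSectorSize = 4096;      // same value as SPI_FLASH_SEC_SIZE
constexpr uint8_t kCmdSectorErase = 0x20;   // command seen on MOSI
constexpr uint8_t kStatusBusy = 0x01;       // status bit 0: operation in progress
constexpr uint8_t kStatusWriteEnable = 0x02;  // status bit 1: write enable latch

// Build the 4-byte frame: command, then the 24-bit byte address, MSB first.
std::array<uint8_t, 4> sectorEraseFrame(uint32_t sector) {
  const uint32_t addr = sector * kSectorSize;
  assert(addr <= 0xFFFFFFu);  // must fit in a 24-bit address
  return {{kCmdSectorErase,
           static_cast<uint8_t>(addr >> 16),
           static_cast<uint8_t>(addr >> 8),
           static_cast<uint8_t>(addr)}};
}

bool isBusy(uint8_t status) { return (status & kStatusBusy) != 0; }
bool isWriteEnabled(uint8_t status) { return (status & kStatusWriteEnable) != 0; }
```

With sector 0x3FB this produces 20 3F B0 00, matching the capture. Read this way, a status of 03 is busy with the write enable latch set, 02 has the latch set but BUSY clear, and 00 is idle.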

Data read

spi_flash_read is called to read 32 bits from the start of that sector, address 0x3FB000. The scope capture is shown in the 'Dual SPI' section, as the read is performed with Dual SPI: BB 74 00 FF FF on MOSI and 00 7C 00 FF FF on MISO. A read of 0xFFFFFFFF is the expected result for NOR flash that has just been erased.

Data write

spi_flash_write is called to write 32 bits to the same address; this produced 02 3F B0 00 F0 F0 F0 F0 on MOSI and 00 00 00 00 00 00 00 00 on MISO, indicating that dual SPI was not used. Look how many clock cycles there are compared to the read! A status check (MOSI 05 00, MISO 00 00) reported completion immediately.
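The difference in clock cycles can be worked out with a bit of arithmetic. This is only a sketch with made-up function names. It assumes the read has no extra dummy clocks beyond the control byte, which is what the 5-byte capture suggests.

```cpp
#include <cassert>
#include <cstdint>

// Clocks needed to move 'bytes' bytes at 1 (single), 2 (dual) or 4 (quad)
// bits per clock.
uint32_t clocksFor(uint32_t bytes, uint32_t bitsPerClock) {
  assert(bitsPerClock == 1 || bitsPerClock == 2 || bitsPerClock == 4);
  return bytes * 8 / bitsPerClock;
}

// 0xBB read: command on one line, then address + control byte and the data
// on two lines.
uint32_t dualReadClocks(uint32_t dataBytes) {
  return clocksFor(1, 1) + clocksFor(3 + 1, 2) + clocksFor(dataBytes, 2);
}

// 0x02 write: command, address and data all on one line.
uint32_t singleWriteClocks(uint32_t dataBytes) {
  return clocksFor(1, 1) + clocksFor(3, 1) + clocksFor(dataBytes, 1);
}
```

For 4 bytes of data, dualReadClocks(4) gives 40 clocks and singleWriteClocks(4) gives 64, which lines up with the 5 bytes per line seen for the read and the 8 bytes seen for the write.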

Raw scope capture

Want to look at the data yourself in more detail? You can download it as a 15 megabyte, 2 million row CSV - but you won't be able to load it in Excel because it's too big.

Published

19 September 2016