Here's the explanation: The serial register chain that loads the GPIO configurations has hold timing violations between each of the GPIOs. The timing violations corrupt the configuration bitstream, but do so in a (more or less) predictable way. Unfortunately, the timing is close to the boundary between what I've been calling an "independent hold violation" and a "dependent hold violation". An independent hold violation is a hold violation for both rising edge and falling edge transitions. Its effect is to slide an extra bit forward at the boundary between GPIOs, effectively moving two spaces in the shift register on one clock. The "dependent hold violation" (in this case) is a hold violation that only happens on a falling edge transition. Its effect is to move 0 bits two spaces forward on one clock cycle, but not 1 bits. Also unfortunately, the distribution of the two kinds of hold violation is pretty random and varies from chip to chip. So the only way to know how to configure the GPIO is to run tests that establish which kind of hold violation exists where along the chain. That's what the test above is doing.
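To make the two failure modes concrete, here is a toy Python model of a shift register with a hold violation at one flop. It is only a sketch under my own assumptions, not the actual Caravel netlist or calibration code: the function name `shift_in`, the 6-flop chain, and the choice of flop 3 as "the first flop of the next GPIO" are all made up for illustration. A violating flop is modeled as capturing the value its upstream neighbor is just now taking on (independent), or doing so only when that neighbor is falling from 1 to 0 (dependent).

```python
def shift_in(bits, length, violations=None):
    """Clock `bits` into a chain of `length` flops and return the final state.

    `violations` maps a flop index k (k >= 1) to "independent" or
    "dependent". This is an illustrative model, not the real hardware.
    """
    violations = violations or {}
    reg = [0] * length
    for b in bits:
        # Ideal behavior: every flop takes the old value of its upstream neighbor.
        nxt = [b] + reg[:-1]
        for k in sorted(violations):
            old_up, new_up = reg[k - 1], nxt[k - 1]
            if violations[k] == "independent":
                # Race-through on both edges: flop k sees the upstream
                # flop's NEW value, so the bit moves two spaces.
                nxt[k] = new_up
            else:
                # Race-through only on a falling edge (1 -> 0): a 0 moves
                # two spaces, a 1 does not. For single bits that is old AND new.
                nxt[k] = old_up & new_up
        reg = nxt
    return reg


stream = [1, 0, 1, 1, 0, 0]
ideal = shift_in(stream, 6)
indep = shift_in(stream, 6, {3: "independent"})
dep = shift_in(stream, 6, {3: "dependent"})
print(ideal, indep, dep)
```

Comparing the three results shows why the configuration has to be pre-distorted differently for each case: with the independent violation, everything past the boundary is the same stream advanced by one bit, while with the dependent violation the corruption depends on the data pattern itself.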
The test checks each channel from 1 to 18, then 36 to 19. The first chain goes up the right side of the chip, and the second chain goes up the left side. The second chain programs in the reverse direction of the way the GPIOs are numbered. Channels 0 and 37 are the first GPIOs in the serial chain, thus the last to be programmed, and there is no timing violation going into either one, so those don't need to be checked.
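The test order described above can be written down in a couple of lines of Python. This is just a sketch of the ordering as stated in the text; the variable names are mine and are not taken from the calibration script.

```python
# First chain (right side of the chip): channels 1 up to 18.
right_chain = list(range(1, 19))
# Second chain (left side): channels 36 down to 19, because this chain
# programs in the reverse direction of the GPIO numbering.
left_chain = list(range(36, 18, -1))

# Channels 0 and 37 are skipped: nothing feeds into them through a
# violating boundary, so there is nothing to calibrate.
test_order = right_chain + left_chain
print(test_order)
```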
Each channel is tested first assuming it has an independent hold violation. If that test fails, it is tested again assuming that it has a dependent hold violation. In some cases, that will be sufficient to calibrate the part. But some chips will have hold violations that are right on the boundary between a dependent and independent hold violation, and the type of hold violation will tend to flip from one to the other randomly. That's what happened in your test above. The GPIO failed when assuming an independent hold violation, passed when assuming a dependent hold violation, and then flipped on the next test and failed again.
There is one thing that can be done about such cases: Lower the power supply voltage to try to shift the timing of the hold violation. Usually, a lower power supply will help resolve such cases. The default power supply is 1.6V (which was the best overall voltage when we tested a number of different chips). You can run
make run PART=XYZ VOLTAGE=1.5
and see if you get better results. A voltage of 1.4 will most likely fail completely because the SPI flash chip stops running at that voltage. I have been able to run the Caravel chip and development board at VOLTAGE=1.43.
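If you want to automate the retry-and-lower-the-voltage loop, something like the following Python sketch could work. Everything here is hypothetical scaffolding rather than part of the Caravel tooling: `run_test` and `calibrate` are hooks you would have to supply yourself, and repeating each test a few times to catch a flipping violation is my own addition, not something the existing test does. Only the `make run PART=... VOLTAGE=...` command line comes from the text above.

```python
def classify_channel(run_test, channel, repeats=3):
    """Try "independent" first, then "dependent", as described above.

    run_test(channel, assumption) -> bool is a hypothetical hook that
    programs the chain under that assumption and checks the result.
    An assumption is only accepted if it passes `repeats` times in a row,
    so a violation that flips between runs comes back as "unresolved".
    """
    for assumption in ("independent", "dependent"):
        if all(run_test(channel, assumption) for _ in range(repeats)):
            return assumption
    return "unresolved"


def make_command(part, voltage):
    """Build the command line quoted in the text for a given part and voltage."""
    return ["make", "run", f"PART={part}", f"VOLTAGE={voltage}"]


def first_stable_voltage(part, voltages, calibrate):
    """Return the first voltage at which calibrate(command) reports success.

    calibrate(command) -> bool is a hypothetical hook that runs the command
    and decides whether every channel calibrated cleanly.
    """
    for voltage in voltages:
        if calibrate(make_command(part, voltage)):
            return voltage
    return None
```

A call might look like `first_stable_voltage("XYZ", [1.6, 1.5, 1.45], calibrate)`, keeping the candidates above the 1.4 level where the SPI flash stops running. How `calibrate` decides success is left open on purpose: I am not assuming the make target reports a failed calibration through its exit status, so you may need to parse its output instead.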
In the worst case, you may not be able to find a voltage where the hold timing becomes stable, or you might find that stabilizing one hold violation just causes another unstable hold violation to appear somewhere else. In that case, you're probably best off just trying a different chip.
Hopefully that explanation helps clear things up!