Kavya Sreedhar

03/14/2023, 10:12 PM
@Tim Edwards I originally had the diagnostic passing at 1.45V for all my IOs and configured them as independent/dependent with the file copied by make get_config. However, today on the same chip and at the same voltage, I’m now observing some configuration failures. In particular, IO[13] in low fails for both independent and dependent but previously passed with independent, and IO[32] in high likewise fails for both independent and dependent but previously passed with independent. Any ideas on what might be good to check? I have confirmed with a multimeter that the flexy pins are in contact with the pads on the chip. Thanks for the help!
All that said, one of my user project tests that uses these (now failing) GPIOs still seems to work with the previous gpio_config_def.py file, which is a little confusing. I’m not sure whether these new failures are the reason that other user project tests of mine are failing.
the diff between the files is here:
< # configuration failed in gpio[13], anything after is invalid
---
> # IO configuration chain was successful
28,33c28,33
< ['IO[13]', H_UNKNOWN],
< ['IO[14]', H_UNKNOWN],
< ['IO[15]', H_UNKNOWN],
< ['IO[16]', H_UNKNOWN],
< ['IO[17]', H_UNKNOWN],
< ['IO[18]', H_UNKNOWN],
---
> ['IO[13]', H_INDEPENDENT],
> ['IO[14]', H_INDEPENDENT],
> ['IO[15]', H_INDEPENDENT],
> ['IO[16]', H_INDEPENDENT],
> ['IO[17]', H_INDEPENDENT],
> ['IO[18]', H_INDEPENDENT],
36c36
< # configuration failed in gpio[32], anything before is invalid
---
> # IO configuration chain was successful
43,56c43,56
< ['IO[32]', H_UNKNOWN],
< ['IO[31]', H_UNKNOWN],
< ['IO[30]', H_UNKNOWN],
< ['IO[29]', H_UNKNOWN],
< ['IO[28]', H_UNKNOWN],
< ['IO[27]', H_UNKNOWN],
< ['IO[26]', H_UNKNOWN],
< ['IO[25]', H_UNKNOWN],
< ['IO[24]', H_UNKNOWN],
< ['IO[23]', H_UNKNOWN],
< ['IO[22]', H_UNKNOWN],
< ['IO[21]', H_UNKNOWN],
< ['IO[20]', H_UNKNOWN],
< ['IO[19]', H_UNKNOWN],
---
> ['IO[32]', H_INDEPENDENT],
> ['IO[31]', H_INDEPENDENT],
> ['IO[30]', H_INDEPENDENT],
> ['IO[29]', H_INDEPENDENT],
> ['IO[28]', H_INDEPENDENT],
> ['IO[27]', H_INDEPENDENT],
> ['IO[26]', H_DEPENDENT],
> ['IO[25]', H_INDEPENDENT],
> ['IO[24]', H_INDEPENDENT],
> ['IO[23]', H_INDEPENDENT],
> ['IO[22]', H_INDEPENDENT],
> ['IO[21]', H_DEPENDENT],
> ['IO[20]', H_INDEPENDENT],
> ['IO[19]', H_INDEPENDENT],
also, I have IO[25] configured as a MGMT output which still toggles as expected, so I’m not sure if the configuration is actually fine and am trying to understand why the diagnostic is now failing (and if that might cause some issues)
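A quick way to list which IOs changed hold type between two saved copies of the config file is sketched below. It is only an illustration: it assumes the entries keep the `['IO[n]', H_...]` shape shown in the diff above, and the helper names `parse_entries` and `changed_ios` are hypothetical, not part of the caravel_board tooling. The sample strings reuse three entries from the diff.

```python
import re

# Matches entries shaped like: ['IO[13]', H_INDEPENDENT],
# (assumed format, taken from the diff in this thread)
ENTRY = re.compile(r"\['IO\[(\d+)\]',\s*(H_[A-Z]+)\]")


def parse_entries(text):
    """Map IO number -> hold-type name for every entry found in text."""
    return {int(num): kind for num, kind in ENTRY.findall(text)}


def changed_ios(old_text, new_text):
    """Return {io: (old_kind, new_kind)} for IOs present in both texts
    whose hold type differs."""
    old, new = parse_entries(old_text), parse_entries(new_text)
    return {
        io: (old[io], new[io])
        for io in sorted(old.keys() & new.keys())
        if old[io] != new[io]
    }


# Three entries copied from the diff: the earlier passing run...
previous = """
['IO[13]', H_INDEPENDENT],
['IO[21]', H_DEPENDENT],
['IO[26]', H_DEPENDENT],
"""
# ...and the later failing run.
current = """
['IO[13]', H_UNKNOWN],
['IO[21]', H_UNKNOWN],
['IO[26]', H_UNKNOWN],
"""

for io, (before, after) in changed_ios(previous, current).items():
    print(f"IO[{io}]: {before} -> {after}")
```

In practice the two strings would be read from the two gpio_config_def.py copies, for example with `pathlib.Path(...).read_text()`.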

Tim Edwards

03/15/2023, 2:17 AM
The evidence does strongly suggest that the original diagnostic was good, and that it's the diagnostic itself that is failing. To what extent do you actually need to keep running diagnostic on this part? If you're re-running the diagnostic because you continue to see failures you don't understand, it could be that there are (occasional) failure points at IO[13] and IO[32]. Intermittent failures are by far the hardest to diagnose and pin down. There could be contributing factors that make the failure more likely at some times than others, including ambient temperature. Shifts over time (like aging effects or electromigration) are pretty unlikely over the short time and mild conditions (compared to real stress tests) in which you've been testing the part.

Kavya Sreedhar

03/15/2023, 2:18 AM
I don’t need to keep running the diagnostic on this part, but I was trying the different chips I had, and I observed that IO[13] and above and IO[32] and below were failing on all the new chips. So out of curiosity, I tried the original chip I had been using again and saw these new failures.
These new failures do not seem to be intermittent: the diagnostic failed consistently at different voltages when I ran it five times on the original chip, and it also failed on five different chips I tried.
As a result, I’m not sure how to configure these other chips.

Tim Edwards

03/15/2023, 1:09 PM
I am failing to come up with any mechanism that would do that short of an issue with the Nucleo board, and that seems unlikely. There's one thing I can suggest, which hopefully isn't just a time sink: I have another method that I worked up for calibrating chips. It's not as well developed as the existing method, but it has the advantage of being blazingly fast by comparison; it will calibrate a chip in a few seconds. The main issue is that I did not add a way to provide feedback on where an error occurred. But it should be pretty obvious whether or not it is having the same issue on every chip. I will post the code in a bit.
This tarball should be untarred in the firmware_vex directory of the caravel_board repository. How to run it is described in the README file. It works a bit differently from the calibration test you've been using, because it uses the automatic transfer instead of the bit-bang. But at least setting up and running the calibration is pretty simple. The output is a list of channels with dependent hold violations (it assumes that all the others are independent, and doesn't report on channels 0 and 37, which are assumed to work properly). If it never reports anything in the range of 13 to 32, then the issue is persistent, and in the hardware somehow.
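As a rough illustration of how that output could be read, the sketch below expands a list of reported dependent channels into a per-channel table, following the description above: reported channels are dependent, the rest are taken as independent, and channels 0 and 37 are not reported on. The function name `hold_types`, the plain-integer input, and the label strings are assumptions made for this example; the tool's actual output format is described in its README, not here. The sample input reuses channels 21 and 26, which were dependent in the earlier passing configuration.

```python
def hold_types(dependent_channels, first=0, last=37):
    """Build {channel: label} from a list of channels reported as dependent.

    Assumptions for this sketch: channels run from `first` to `last`, the two
    end channels are never reported and are assumed to work properly, and
    every unreported channel in between is independent.
    """
    reported = set(dependent_channels)
    out_of_range = [ch for ch in reported if not first < ch < last]
    if out_of_range:
        raise ValueError(f"unexpected channels reported: {sorted(out_of_range)}")

    table = {}
    for ch in range(first, last + 1):
        if ch in (first, last):
            table[ch] = "assumed ok"
        elif ch in reported:
            table[ch] = "dependent"
        else:
            table[ch] = "independent"
    return table


# Hypothetical tool output, given here as plain integers.
table = hold_types([21, 26])
for ch in (0, 13, 21, 26, 32, 37):
    print(f"channel {ch}: {table[ch]}")
```

Running the same expansion for each chip makes it easy to compare tables side by side and see whether the result differs from chip to chip.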