Maker Pro

AT25128 Mass Erase / Write Lock

Hello,
I am using the AT25128B EEPROM and I have observed some strange behavior. I store data in approximately 15% of the addresses, and what mysteriously happens is that all of the stored data is erased and reads back as 0xFF. When diagnosing the issue I also see that the BP0 and BP1 bits of the status register are set to 1, which locks out future writes. In the application all EEPROM data writes are completed 2 bytes at a time, so it would take numerous "accidental" writes of 0xFF to cause this issue. Also, nothing in the software sends the opcode to write to the status register.

I can answer many other questions about the design if you need it:
-master is a 16-bit microcontroller
-VCC=+5V
-10kOhm pull up to +5V on chip select
-10kOhm pull up to +5V on WP and HOLD
-10kOhm pull up to +5V on HOLD
-10kOhm pull up to +5V on Master Receive
-10kOhm pull down on Master Transmit
-10kOhm pull down on clock
-EE and microcontroller all reference the same VCC and Ground

I have not observed any goofy behavior on VCC, clock, chip select, serial out, etc on power up or power down.

Has anyone experienced weird behavior around EEPROM data being completely lost or the status register getting changed?
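
For anyone reading a raw status byte while chasing this, here is a minimal sketch of how it could be decoded. The bit positions and protected address ranges are my reading of the AT25128B datasheet (WPEN in bit 7, BP1 in bit 3, BP0 in bit 2, write enable latch in bit 1, busy flag in bit 0), so treat them as assumptions and check them against the datasheet revision for your part. The name `decode_status` is made up for illustration.

```python
# Bit masks assumed from the AT25128B datasheet; verify before relying on them.
WPEN = 0x80      # write-protect enable (works together with the WP pin)
BP1 = 0x08       # block protect bit 1
BP0 = 0x04       # block protect bit 0
WEL = 0x02       # write enable latch
NOT_RDY = 0x01   # 1 while an internal write cycle is in progress

# Assumed protected address range per (BP1, BP0) level on a 16 KB part.
PROTECTED_RANGE = {
    0b00: None,               # nothing protected
    0b01: (0x3000, 0x3FFF),   # upper quarter
    0b10: (0x2000, 0x3FFF),   # upper half
    0b11: (0x0000, 0x3FFF),   # whole array
}


def decode_status(sr):
    """Split a status byte into named fields (hypothetical helper)."""
    level = (sr >> 2) & 0b11  # bits 3:2 hold BP1:BP0
    return {
        "wpen": bool(sr & WPEN),
        "bp_level": level,
        "protected": PROTECTED_RANGE[level],
        "write_enabled": bool(sr & WEL),
        "busy": bool(sr & NOT_RDY),
    }


# A status byte of 0x0C has BP1 and BP0 both set, the locked state described above.
print(decode_status(0x0C))
```

With both BP bits set, the whole array is reported as protected, which matches the "locks out future writes" symptom.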
 
A quick glance at the datasheet doesn't seem to indicate any opcodes that do a bulk erase of the chip.

Barring some faulty code that is writing 64-byte pages of FF to the entire chip, two things come to mind: (1) do you have decoupling capacitors in the circuit, near the EEPROM as well as the micro? And (2) could electrostatic discharge be causing your issues? Or maybe it is a damaged/faulty EEPROM. Do you have another you can try?
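
To put rough numbers on how much stray writing that would take, here is a back-of-the-envelope sketch. It assumes the part is 128 Kbit (16,384 bytes) with the 64-byte pages mentioned above, and uses the 15% usage and 2-byte write size from the original post. The variable names are only for illustration.

```python
CAPACITY_BYTES = 128 * 1024 // 8   # 128 Kbit part -> 16384 bytes
PAGE_BYTES = 64                    # page size mentioned above
USED_PERCENT = 15                  # share of addresses holding data (from the post)
WRITE_BYTES = 2                    # application write size (from the post)

# Page writes needed to overwrite the whole array with 0xFF.
page_writes_full_chip = CAPACITY_BYTES // PAGE_BYTES

# 2-byte writes needed to overwrite the whole array.
two_byte_writes_full_chip = CAPACITY_BYTES // WRITE_BYTES

# 2-byte writes needed to overwrite only the used area (rounded up).
used_bytes = CAPACITY_BYTES * USED_PERCENT // 100
two_byte_writes_used_area = (used_bytes + WRITE_BYTES - 1) // WRITE_BYTES

print(page_writes_full_chip, two_byte_writes_full_chip, two_byte_writes_used_area)
```

Either way it would take hundreds to thousands of well-formed write commands, which is why a single glitched transaction seems an unlikely explanation.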

What about this:
-10kOhm pull up to +5V on WP and HOLD
-10kOhm pull up to +5V on HOLD
Do you have two pull-ups on HOLD? Or WP & HOLD connected together? Are they connected to I/O pins on the micro?
 
Thank you for your reply.

Atmel does have some proprietary methods of mass erasing the chip, but they do not share this procedure. They shared some of the details with me, and from what they told me I am not accidentally entering this mass erase protocol.

I have a decoupling cap right across the EEPROM chip VCC and GND. The microcontroller has more than 10 decoupling caps on all of the different VDD pins.

After I unlock the EEPROM, everything works as normal again: I can read and write. I have considered ESD, but after looking into it, it does not appear to be the cause. I have tested ESD in the past.
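
For readers wondering what "unlock" involves, here is a minimal sketch of the sequence as I understand it: a write enable command followed by a status register write that clears BP1 and BP0. The opcodes (WREN 0x06, WRSR 0x01, RDSR 0x05) are my reading of the AT25128B datasheet, and `FakeEeprom` is a made-up stand-in for the real SPI transfer so the example runs on a PC. It does not model the WPEN/WP hardware protection or write-cycle timing.

```python
WREN = 0x06  # set write enable latch (assumed opcode)
WRSR = 0x01  # write status register (assumed opcode)
RDSR = 0x05  # read status register (assumed opcode)


def unlock_frames(new_status=0x00):
    """Return the two SPI frames, each sent in its own chip-select cycle."""
    return [bytes([WREN]), bytes([WRSR, new_status & 0xFF])]


class FakeEeprom:
    """Hypothetical stand-in for the device; models only the status register."""

    def __init__(self, status=0x0C):  # 0x0C: BP1 and BP0 set
        self.status = status

    def transfer(self, frame):
        op = frame[0]
        if op == WREN:
            self.status |= 0x02                 # set the write enable latch
        elif op == WRSR and self.status & 0x02:
            self.status = frame[1] & 0x8C       # only WPEN, BP1, BP0 stick; latch clears
        elif op == RDSR:
            return bytes([self.status])
        return b""


dev = FakeEeprom()
for frame in unlock_frames():
    dev.transfer(frame)
print(dev.transfer(bytes([RDSR])).hex())
```

Note that in this model a WRSR without a preceding WREN is ignored, which is one reason an accidental status register write would need more than a single stray byte on the bus.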

I am mass producing this piece of electronics. I have only seen this issue occur on 7 out of approximately 5,000 units. I produced tens of thousands more of these units back on the A revision of the AT25128. I have never seen this on the A revision. Atmel made a die shrink from rev A to rev B. I have talked with Atmel directly to get ideas.

That was a typo I made. WP and HOLD do indeed share the same pull-up resistor to +5V. HOLD does not have two different pull-up resistors. WP and HOLD do not tie back to the microcontroller.
 
If you have an I/O pin available, tie WP to it and use it to write-disable the chip when it isn't being written to.
 
Thank you for the great idea. I considered that also to prevent accidental writes/operations. Unfortunately our microcontroller has no spare outputs. I may need to free up some I/O pins if I can't resolve this issue soon.
 
We have observed the same behaviour on several devices too. The root cause is still unclear, because there is no write access to the SR in our SW. Have you been able to solve the problem with the AT25128B yet?
 
Are you reading back all 0xFFs in the EEPROM also?

I still do not have a solution with the AT25128B device. I am changing manufacturers and switching to ST to hopefully resolve this issue. I haven't determined the root cause yet, but I am getting close. When I put the device in the cold chamber (-35°C to -40°C), I would periodically read back data from the EEPROM showing the majority of addresses as 0x00 when they previously contained other data at room temp. When it is in this state I cannot write to the EEPROM (at least I haven't been able to yet). I put 5 AT25128B devices in the cold chamber and got this behavior on 3 of them (i.e. 3 out of 5 failed). When I then brought the temperature back up to room temp, the areas which were falsely reading back 0x00 read back the good data again. Of course this is not the same behavior as locking the status register and reading back all 0xFFs, but I believe it is related. The next test I will run is to attempt to reprogram the microcontroller (SPI master) while it is in this state where it is reading back 0x00s and see if it locks the status register. After that, the next test is to put the ST EEPROM chip in the temp chamber.

It seems like when Atmel shrunk the die size they are now having issues at temperature extremes. I have only tested at -40°C for cold temp so I haven't verified this occurs at, say, 0°C. I had to soak at -40°C for around an hour before I saw the issue.
 
Hi John,
This is Clay at Atmel.
Would you be able to enlighten me on the way you can force this problem to occur?
I would like to try to see this problem in our laboratory.
Thank you,
 
I did the following:
- AT25128B was soldered down on a PCB assembly which is connected to a microcontroller SPI master (original post has more details)
- At room temp, I successfully wrote all the data to EEPROM that I need in the application
- I verified this by reading back the data and it matched what I wrote
- I stuck the entire PCB assembly (it is in a water tight enclosure) with a power, ground, CAN harness in the temperature chamber and set it to -40°C
- I powered up the device and read back the data successfully over CAN
- (I have some CAN messaging in place to read and write to EEPROM)
- I soaked at -40°C for 1 hour for the assembly to reach -40°C
- After that, about every 10 minutes I would read back the data successfully
- After another hour, all of a sudden several locations were reading back 0x00 instead of the valid data
- At this time I could not write to EEPROM
- The software application I was running did not have a way of reading the status register via the CAN messaging I had set up so I could not read back the status register at that time
- I turned off the chamber and allowed the unit to warm up
- After an hour or so the data read back all the valid data again

I was able to do this on three different AT25128B ICs.
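
For anyone trying to reproduce this, the periodic readback check in the steps above could be sketched as follows. This is only an illustration of comparing a readback against a known-good image and sorting the mismatches into "reads 0x00", "reads 0xFF" and "other"; it is not the code running on my unit. `classify_readback` is a made-up name, and how the bytes are fetched (CAN in my case) is left out.

```python
def classify_readback(expected, actual):
    """Compare two equal-length byte strings and bucket the mismatching addresses."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must be the same length")
    result = {"ok": 0, "reads_0x00": [], "reads_0xFF": [], "other": []}
    for addr, (want, got) in enumerate(zip(expected, actual)):
        if got == want:
            result["ok"] += 1
        elif got == 0x00:
            result["reads_0x00"].append(addr)   # the cold-chamber symptom
        elif got == 0xFF:
            result["reads_0xFF"].append(addr)   # the field "erased" symptom
        else:
            result["other"].append(addr)
    return result


# Made-up data purely to show the call; not values from the failing units.
golden = bytes([0x12, 0x34, 0x56, 0x78])
readback = bytes([0x12, 0x00, 0xFF, 0x79])
print(classify_readback(golden, readback))
```

Logging the three buckets separately, together with the status register if it can be read, would help show whether the cold-chamber 0x00 symptom and the field 0xFF symptom ever appear together.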
 
Hi John, have you ever run any power-on/power-off tests in the cold chamber? Could it be a problem with the recommended POR?
 
I have not performed those tests in the temperature chamber yet. I performed a lot of power cycling, brown-out voltage, and similar tests at room temperature before I ever went into the cold chamber, and I never saw any anomalies.

DiSa, did your issue result in all memory locations being written 0xFF when the status register got locked out?
 
Would you please list all of the markings from the failed devices (all three lines)?
 
I will provide the markings in the coming weeks. Right now I want to keep these assemblies full production intent for the remainder of my testing. Since they are sealed assemblies I cannot view the EEPROM chip right now. Unsealing the assembly introduces the slight risk of cracking a trace on the PCB and may cause a separate issue. This last week was the very first time I have been able to reproduce the issue so these failed assemblies are very valuable to me.

This issue has occurred on at least 80 different devices in the field over the last year. I was never able to reproduce the issue until now.
 
I got more to fail. Here are the markings of parts which failed at -40°C.
DUT 1 ATMLP352 5DBD B .3X2692A
DUT 2 ATMLP416 5DBD B .3Y4690A
DUT 3 ATMLP348 5DBD B .3X1297B
DUT 4 ATMLP482 5DBD B .3W2975C

The third row markings were difficult to make out on the last two devices so what is written above is my best guess.
 
On these four DUTs I was able to recreate the issue easily every time, typically within an hour of being at -40°C. I modified the four PCBs that had this issue by removing the AT25128 parts and soldering on M95128 parts (the ST equivalent EEPROM). Last week I spent several consecutive days in the temp chamber with these ST parts at -40°C and could not recreate the failure. I still plan to test on a larger scale; however, this is the first step toward gaining confidence in moving away from the Atmel parts.
 