SSD MX300, can't finish Device self test

Kilobyte Kid

Hello,

I've been using Crucial SSDs for many years without any problems until today, when I noticed that my latest one, a Crucial_CT525MX300SSD1, showed some warnings in S.M.A.R.T.:

 

- Raw Read Error Rate 55
- Reallocated NAND Block Count 19
- Reallocation Event Count 19 Events
- SMART Off-line Scan Uncorrectable Errors 1

 

see full S.M.A.R.T.:

1 Raw Read Error Rate 55 Errors/Page
5 Reallocated NAND Block Count 19 NAND Blocks
9 Power On Hours Count 6114 Hours
12 Power Cycle Count 2007 Power Cycles
171 Program Fail Count 0 NAND Page Program Failures
172 Erase Fail Count 0 NAND Block Erase Failures
173 Block Wear-Leveling Count 8 Erases
174 Unexpected Power Loss Count 14 Unexpected Power Loss events
180 Unused Reserved Block Count 1914 Blocks
183 SATA Interface Downshift 0 Downshifts
184 Error Correction Count 0 Correction Events
187 Reported Uncorrectable Errors 0 ECC Correction Failures
194 Enclosure Temperature 30 Current Temperature (C)
48 Highest Lifetime Temperature (C)
196 Reallocation Event Count 19 Events
197 Current Pending ECC Count 0 ECC Counts
198 SMART Off-line Scan Uncorrectable Errors 1 Errors
199 Ultra-DMA CRC Error Count 0 Errors
202 Percentage Lifetime Remaining 100 % Lifetime Remaining
206 Write Error Rate 0 Program Fails/MB
210 RAIN Successful Recovery Page Count 88 TUs successfully recovered by RAIN
246 Cumulative Host Sectors Written 3859849066 512 Byte Sectors
247 Host Program Page Count 120630587 NAND Page
248 FTL Program Page Count 91101288 NAND Page
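As a rough cross-check of the listing above, the raw counters can be turned into more readable figures. This is only a sketch: the write-amplification formula (host pages plus FTL pages, divided by host pages) and the idea that the original reserve pool equals used plus unused blocks are assumptions, not something stated in this thread.

```python
# Raw values copied from the S.M.A.R.T. listing above.
SECTOR_BYTES = 512
host_sectors_written = 3_859_849_066   # attribute 246
host_program_pages = 120_630_587       # attribute 247
ftl_program_pages = 91_101_288         # attribute 248
reallocated_blocks = 19                # attribute 5
unused_reserved_blocks = 1914          # attribute 180

# Total data written by the host, in bytes and decimal terabytes.
host_bytes_written = host_sectors_written * SECTOR_BYTES
host_tb_written = host_bytes_written / 1e12

# Assumed formula: WAF = (host pages + FTL pages) / host pages.
write_amplification = (host_program_pages + ftl_program_pages) / host_program_pages

# Assumption: the original reserve pool is used + unused blocks.
reserve_used_fraction = reallocated_blocks / (reallocated_blocks + unused_reserved_blocks)

print(f"Host writes: {host_tb_written:.2f} TB")
print(f"Write amplification (assumed formula): {write_amplification:.2f}")
print(f"Reserve blocks used: {reserve_used_fraction:.1%}")
```

Under these assumptions the drive has taken just under 2 TB of host writes and used about 1% of its reserve blocks, which fits the description of a mostly idle drive.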

 

What worries me is that "SMART Off-line Scan Uncorrectable Errors 1" appeared either after the firmware update or during the last self-test after the update. Since the last firmware update (done through the Storage Executive app, which completed without any error) I CANNOT finish a Device Self-Test (neither type): at about 30% I get an error saying that the test failed, without further explanation. I tried other apps to run the S.M.A.R.T. self-test and they all failed at about 30% as well. My other Crucial SSDs pass the test without problems. Since the last FW update I also cannot see the SSD in the BIOS. During work in Windows it seems to perform well, but I do not know if the SSD is still reliable considering all the failures and errors above.

 

What is weird is that this SSD is idle nearly all the time, while my other Crucial SSDs have been working 1000% harder for about 5-7 years without any error in S.M.A.R.T.

Can you please help me with this issue or let me know if I should return the SSD under warranty?

 

Thank You for any advice or help.

10 Replies
JEDEC Jedi

Re: SSD MX300, can't finish Device self test

The SMART self-tests can get stuck if the drive is being used (whether due to the OS accessing the drive or due to the SSD's own internal routines). Most times the self-tests run without issue, but I've encountered this on some systems with hard drives. You could try booting from another drive and running the Self-Test to see if it is able to finish. My guess is the MX300 needs a Secure Erase to reset the SSD. I had to do this after encountering Reallocated Blocks which got stuck reallocating even after the firmware update was installed. It seems the SSD gets stuck with old behavior until a Secure Erase is performed (at least in my experience). If you haven't done so yet, update the firmware to M0CR070 first before performing a Secure Erase. After updating to M0CR060 I don't believe our MX300s had any more issues.

 

Make sure to back up your system before performing a Secure Erase, as it will wipe all data from the drive.
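For anyone who prefers to run the self-test outside Storage Executive, here is a minimal sketch using smartctl from smartmontools. This is a different tool from the one used in this thread, and the device path `/dev/sdX` is a placeholder you would replace with your own drive. The helper names are made up for illustration.

```python
import subprocess


def selftest_commands(device: str) -> dict:
    """Build smartctl command lines for one drive (device is a placeholder path)."""
    return {
        # Start the long/extended self-test; it runs inside the drive.
        "start_extended": ["smartctl", "-t", "long", device],
        # Show the self-test log, including progress and any failure.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Show the vendor attribute table (5, 187, 196, 197, 198, ...).
        "attributes": ["smartctl", "-A", device],
    }


def run(command: list) -> str:
    """Run one command and return its text output (usually needs admin rights)."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


# Example usage (not executed here, since it needs a real drive):
#   cmds = selftest_commands("/dev/sdX")
#   print(run(cmds["start_extended"]))
#   ... wait for the estimated test duration, keeping the drive idle ...
#   print(run(cmds["selftest_log"]))
```

Running it while booted from another drive, as suggested above, keeps the OS from touching the SSD during the test.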

Kilobyte Kid

Re: SSD MX300, can't finish Device self test

Thank you very much for your answer. This SSD is not the system disk; I use it only to store photos, so it is used very sporadically. I tried restarting the computer several times to cut off any processes that might interfere with the S.M.A.R.T. device self-test, but the result is the same: it always ends in failure at about 30%.

 

How do you recommend performing the "Secure Erase"? Is it the same as "Sanitize drive" in Storage Executive?

 

About the FW update: I had not performed any FW update since purchase, so the drive was still on the factory FW until today, when I updated to M0CR070.

 

Should I worry about these values:

 

- Raw Read Error Rate 55
- Reallocated NAND Block Count 19
- Reallocation Event Count 19 Events
- SMART Off-line Scan Uncorrectable Errors 1

 

I'm a little bit scared about "SMART Off-line Scan Uncorrectable Errors 1", which appeared today after updating to M0CR070, or around that time...

 

Thank You in advance.

JEDEC Jedi

Re: SSD MX300, can't finish Device self test

The Uncorrectable Errors are from the bad blocks not being reallocated quickly enough. SSDs auto-correct errors up to a point and should reallocate a block once it reaches the maximum safe limit. If something inhibits the drive from properly reallocating the failing block, or if the errors accumulate too quickly, then you get the Uncorrectable Errors, which are bad. Hopefully the new firmware minimizes either scenario.

 

I think the Secure Erase is the only option here as it will reset the drive to factory defaults.   I believe the "Sanitize" option in CSE also performs a Secure Erase.

 

After the Secure Erase, see if the Extended Self-Test will finish successfully. 

 

Just pay attention to the SMART attributes after the Secure Erase to see how the drive is working.  It is normal for some blocks to get reallocated, but if you continue to see Uncorrectable Errors accumulating, then that is bad since those errors are reaching the filesystem & data.   For our MX300's some of which had thousands of Uncorrectable Errors and 30 Reallocated Blocks, the drives have been running fine for a year after the firmware update (M0CR060) and the Secure Erase.

Kilobyte Kid

Re: SSD MX300, can't finish Device self test

Once again, thank you very much for your help. I'll perform the Secure Erase, but I'm afraid that something is going wrong with the SSD.

 

Before the erase I wanted to back up all the data. Some important data were copied to two separate locations, and files that I was able to copy on the first run could not be copied to the second destination a few hours later. The file manager reported an I/O error, and several files (JPEGs) are now corrupted and can't be copied or viewed at all. The following value has increased from 0 to 287 since this morning:

 

187 Reported Uncorrectable Errors 287 ECC Correction Failures

 

I do not know if the FW update caused this or if the SSD is going to die...

 

It is also weird that I can't see the SSD in the BIOS after the FW update.

 

It is still under warranty; should I return it to the local dealer?

 

Once again, thank You for Your help.

JEDEC Jedi

Re: SSD MX300, can't finish Device self test

Out of curiosity did any more blocks get reallocated or are there any Pending Blocks since your original post (attributes 5, 196 & 197)?

 

You should Secure Erase the MX300 and see how it works since you likely want to remove your data from the SSD before exchanging it anyway.  Remember to try the long/extended self test as well.   If the issue persists after the Secure Erase, then yes I would RMA it if it is still under warranty.  If you are concerned about more failures, then I would transfer a lot of test data to the drive to see if it happens again (after the firmware update & Secure Erase).   A little extra wear on the drive to be sure it is working correctly is worth it in my opinion.  Just make it non-personal data in case the drive fails completely & you cannot erase it again.

 

I experienced these types of issues with the pre-M0CR060 firmwares on several MX300s and the issues seemed to have stopped after I updated the firmware and Secure Erased the drives.  I think something of the old firmware or internal behavior remains until after the Secure Erase resets the SSD on some drives.   Or the SSD might just be bad (if the issues persist after the firmware update & Secure Erase).
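The test-data transfer suggested above could be sketched roughly like this: write files of random data to the SSD, remember their checksums, and verify them again later. The function names, file names, and sizes are made up for illustration. Note that a read-back right after writing may be served from the OS cache rather than from the SSD, so the verification is more meaningful after a reboot or some hours later.

```python
import hashlib
import os
from pathlib import Path


def write_test_files(target_dir, count, size_bytes):
    """Write `count` files of random data; return {path: SHA-256 hex digest}."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    digests = {}
    for i in range(count):
        data = os.urandom(size_bytes)
        path = target / f"testdata_{i:04d}.bin"
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the drive
        digests[path] = hashlib.sha256(data).hexdigest()
    return digests


def verify_test_files(digests):
    """Re-read every file; return paths that are unreadable or whose hash changed."""
    bad = []
    for path, expected in digests.items():
        try:
            actual = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        except OSError:
            bad.append(path)  # I/O error while reading
            continue
        if actual != expected:
            bad.append(path)  # silent corruption
    return bad
```

After each verification pass, compare attributes 5, 187, 196 and 197 with their earlier values to see whether the errors have returned.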

Kilobyte Kid

Re: SSD MX300, can't finish Device self test

Again, thanks a lot. I'll try everything you recommended and we will see. None of the mentioned attribute values have changed since this morning, only 187:

 

ID Description Attribute Data Units

1 Raw Read Error Rate 55 Errors/Page
5 Reallocated NAND Block Count 19 NAND Blocks
9 Power On Hours Count 6124 Hours
12 Power Cycle Count 2007 Power Cycles
171 Program Fail Count 0 NAND Page Program Failures
172 Erase Fail Count 0 NAND Block Erase Failures
173 Block Wear-Leveling Count 8 Erases
174 Unexpected Power Loss Count 14 Unexpected Power Loss events
180 Unused Reserved Block Count 1914 Blocks
183 SATA Interface Downshift 0 Downshifts
184 Error Correction Count 0 Correction Events
187 Reported Uncorrectable Errors 391 ECC Correction Failures
194 Enclosure Temperature 31 Current Temperature (C)
48 Highest Lifetime Temperature (C)
196 Reallocation Event Count 19 Events
197 Current Pending ECC Count 0 ECC Counts
198 SMART Off-line Scan Uncorrectable Errors 1 Errors
199 Ultra-DMA CRC Error Count 0 Errors
202 Percentage Lifetime Remaining 100 % Lifetime Remaining
206 Write Error Rate 0 Program Fails/MB
210 RAIN Successful Recovery Page Count 88 TUs successfully recovered by RAIN
246 Cumulative Host Sectors Written 3859882234 512 Byte Sectors
247 Host Program Page Count 120631624 NAND Page
248 FTL Program Page Count 91104743 NAND Page
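
To make the comparison with the first post explicit, the two listings can be diffed by attribute ID. This is a small sketch with the raw values typed in by hand from this thread (for attribute 194 only the current temperature is used); the helper name is made up.

```python
# Raw values by attribute ID, copied from the first listing in this thread.
first = {
    1: 55, 5: 19, 9: 6114, 12: 2007, 171: 0, 172: 0, 173: 8, 174: 14,
    180: 1914, 183: 0, 184: 0, 187: 0, 194: 30, 196: 19, 197: 0, 198: 1,
    199: 0, 202: 100, 206: 0, 210: 88,
    246: 3859849066, 247: 120630587, 248: 91101288,
}

# Second listing: same as the first except for the values that differ.
second = {
    **first,
    9: 6124, 187: 391, 194: 31,
    246: 3859882234, 247: 120631624, 248: 91104743,
}


def changed_attributes(before, after):
    """Return {attribute ID: (old value, new value)} for every changed attribute."""
    return {
        attr_id: (before[attr_id], after[attr_id])
        for attr_id in sorted(before)
        if after.get(attr_id) != before[attr_id]
    }


changes = changed_attributes(first, second)
for attr_id, (old, new) in changes.items():
    print(f"{attr_id}: {old} -> {new}")
```

Apart from the expected counters (power-on hours, temperature, and the write counters), only attribute 187 has moved, while 5, 196 and 197 are unchanged.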
Crucial Employee

Re: SSD MX300, can't finish Device self test

@Yaromil so the SMART values you've reported actually look perfectly fine. Did you notice Storage Executive says your drive is healthy, or is it reporting as bad? From what I can see all the numbers you have look entirely normal for a drive that's been in use for several years. You've used up only 19 of the roughly 2000 reserved blocks your drive shipped with.


The Self Test, to be honest, is a pretty hit-or-miss tool. It doesn't always work, and I almost wish it hadn't been added, because when it doesn't work, people start to doubt drives that have otherwise had zero issues and most likely are perfectly fine.

The SMART Off-line Scan Uncorrectable Error count simply means the last self test you ran was unable to complete for some reason, so the wording of "error" is a little misleading. This value is set to 0 when a drive leaves the factory, and it's reset to 0 at the beginning of a new self test. Since your test isn't able to finish, it's reported as a 1, but this again isn't any indication there are problems with the drive.

The Sanitize is a good idea, and you'd ideally want to run the self test in an entirely different computer with the SSD as a secondary drive to help rule out other variables. However this seems like a lot of effort and time put into a drive that's otherwise working perfectly fine, and has healthy SMART attributes.





Crucial_Benny, Micron CPG Support, US


JEDEC Jedi

Re: SSD MX300, can't finish Device self test

187	Reported Uncorrectable Errors	391	ECC Correction Failures

@Crucial_Benny It appears Attribute 187 "Reported Uncorrectable Errors" has increased since @Yaromil's original post. I'm not sure, but I think this appeared after the firmware update. I'm guessing this is why the self-test is failing.

Kilobyte Kid

Re: SSD MX300, can't finish Device self test

@Crucial_Benny @HWTech

 

Thank You very much for the comprehensive answer. Yes the SE app is showing:

Drive3 - Good Health
Crucial_CT525MX300SSD1

I found at least one particular file (a JPEG photo) which I can't copy or view: the computer freezes or the file manager reports an I/O error. Every time I try to access this file, the following value increases:

 

187 Reported Uncorrectable Errors 391 ECC Correction Failures
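
To find out whether more files are affected, one option is to try reading every file on the drive and note which ones raise an I/O error. This is only a sketch with a made-up function name. Keep in mind that, as described above, each read of a bad file may raise attribute 187 again and could stall the machine, so it is best run after the important data has been copied elsewhere.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every file under `root` in full; return [(path, error text)] for failures."""
    unreadable = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read in chunks so large photos do not fill memory.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                unreadable.append((path, str(exc)))
    return unreadable


# Example usage (the path is a placeholder for the photo drive):
#   for path, error in find_unreadable_files("D:\\Photos"):
#       print(path, error)
```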

 

It is possible that the problem with this particular file was present before the FW update, because I use this SSD very sporadically and only to store photos, so I don't work with the files often enough to know whether the problem existed in the past. I access this SSD only once every few months when emptying my SD card; other than that, the SSD is idle 99% of the time. My other Crucial SSDs (M4 and MX100) have been working like workhorses for 8-10 hours per day (Windows, programming work) for many years without a single issue. That is why I'm trying to figure out what is going on, so that I don't lose trust in Crucial products, which have been my top storage choice for many years.

 

Once again, Thank You!