"Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Kilobyte Kid


Hi,

 

We're running a system with a RAID setup consisting of a RAID 1 of 2 drives and a RAID 5 of 6 drives, all 1 TB Crucial SSDs. When we run operations that access certain sectors in the RAID, we get these errors at the OS level:

 

 MR_MONITOR[6303]: <MRMON111> Controller ID:  0   Unrecoverable medium error during recovery:   **bleep**   Port 0 - 3:0:2      Location   0xee48ef2#012Event ID:111
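For anyone who wants to track these events over time, here is a minimal Python sketch that pulls the fields out of a line like the one above. The field layout is inferred only from this one quoted line, not from controller documentation, and `parse_event` and `EVENT_RE` are hypothetical names. The `#012` is assumed to be an escaped newline (octal 012), which is how syslog daemons such as rsyslog commonly render control characters.

```python
import re

# The log line quoted in this post, copied verbatim.
SAMPLE = (
    " MR_MONITOR[6303]: <MRMON111> Controller ID:  0   "
    "Unrecoverable medium error during recovery:   **bleep**   "
    "Port 0 - 3:0:2      Location   0xee48ef2#012Event ID:111"
)

# Assumption: this layout is inferred from the single line above.
EVENT_RE = re.compile(
    r"Controller ID:\s*(?P<controller>\d+)\s+"
    r"(?P<message>[^:]+):\s+"
    r".*?(?P<device>Port\s+\S.*?)\s+"
    r"Location\s+(?P<location>0x[0-9a-fA-F]+)"
    r".*?Event ID:\s*(?P<event_id>\d+)",
    re.DOTALL,
)


def parse_event(line: str) -> dict:
    """Extract controller, message, device, location and event ID."""
    # Undo the syslog escaping of the embedded newline.
    text = line.replace("#012", "\n")
    match = EVENT_RE.search(text)
    if match is None:
        raise ValueError("line does not look like the quoted MR_MONITOR event")
    return {
        "controller": int(match.group("controller")),
        "message": match.group("message").strip(),
        "device": match.group("device"),
        # Kept as a plain integer; what the location addresses is not
        # stated in the log line itself.
        "location": int(match.group("location"), 16),
        "event_id": int(match.group("event_id")),
    }


if __name__ == "__main__":
    print(parse_event(SAMPLE))
```

Grouping the parsed events by `device` and `location` would show whether the errors keep hitting the same drive and the same area.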

 

2018-11-07 18_50_37-Clipboard.jpg

The RAID software also reports errors periodically (see image above).

 

We have run the Micron Storage Executive tool, and all the checks report that the drives are OK (SMART OK).

 

2018-11-07 18_59_34-Clipboard.png

 

 

 

We wonder where the problem is: the RAID reports problems with the drives, but the drives seem OK. The drives' firmware is at MU01, and MU02 is available.

 

What do you think the problem could be? Can the firmware update solve it? (It seems that this firmware update includes "Corrected SMART attribute threshold values".)

 

 

Thanks and regards

7 Replies
JEDEC Jedi

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Provide screenshots for the SMART Attributes of each SSD if you are able.

Kilobyte Kid

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

smart_test_drive_crucial.PNG
smart_test_drive_micron.PNG

Crucial Employee

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Hello, and thank you for your question. Updating the firmware is a good idea, as the update calls out a fix for SMART data readings that might not be reported correctly. This is where I would start, to see if it fixes the issue.





Crucial_AgentC, Micron CPG Support, US


How do I know what memory to buy?
Shop for your region: US | UK | EU | France | Global
I think my memory is bad. What do I do now?
FAQs and Top Forum Solutions
We want your feedback! Post in the Suggestion Box
Did a user help you? Say thanks by giving Kudos!
Still need help? Contact Customer Service
Want to be a Super User?
Kilobyte Kid

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Hi,

 

Thanks for your answer! Yes, this seems like the first thing to do. I think we will give it a try; I'll keep you posted.

 

Thx!

JEDEC Jedi

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Your mdraid0:8 SSD has some Uncorrectable Errors listed. These Uncorrectable Errors are most likely the source of your problem. Ideally, bad blocks should be reallocated before any Uncorrectable Errors show up, so that they do not affect your filesystem or data, but it appears that enough sections failed too quickly, before the controller realized the blocks needed to be reallocated.

 

If you continue having problems after the firmware update, I would suggest pulling this SSD and performing a Secure Erase on it to reset it. You may also want to write to the whole drive to see whether any more errors occur or whether it reallocates more blocks. Reallocated Blocks are fine as long as they are not accompanied by Uncorrectable Errors. Then, if things seem fine, perform another Secure Erase to reset the SSD one more time and add it back into your RAID to see if that resolves your problem; otherwise, it looks like you may need to replace this SSD. If your RAID is not backed up, then I would install a new drive instead, so you don't risk another drive failing while you test this one.
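The rule of thumb above (reallocations alone are fine, new Uncorrectable Errors are not) can be sketched in Python as a before/after comparison around the full-drive write test. This is only an illustration: `SmartCounters` and `verdict_after_write_test` are hypothetical names, the counter values in the usage lines are made up, and you would fill in the real numbers by hand from whatever SMART tool you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartCounters:
    """Two SMART counters read off the drive (hypothetical container)."""
    reallocated_blocks: int
    uncorrectable_errors: int


def verdict_after_write_test(before: SmartCounters, after: SmartCounters) -> str:
    """Compare counters taken before and after writing the whole drive."""
    new_uncorrectable = after.uncorrectable_errors - before.uncorrectable_errors
    new_reallocated = after.reallocated_blocks - before.reallocated_blocks
    if new_uncorrectable > 0:
        # Errors the drive could not correct appeared during the test.
        return "replace"
    if new_reallocated > 0:
        # Blocks were remapped without uncorrectable errors: acceptable.
        return "reuse (new reallocations only)"
    return "reuse"


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    before = SmartCounters(reallocated_blocks=3, uncorrectable_errors=5)
    after = SmartCounters(reallocated_blocks=4, uncorrectable_errors=5)
    print(verdict_after_write_test(before, after))
```

Taking the difference between the two readings means the errors already logged before the test do not count against the drive a second time.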

Crucial Employee

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

@HWTech's suggestion of secure erasing the drive is a good one. My guess is the errors are a result of the RAID controller timing out on requests for new writes to the SSD. A secure erase will wipe the drive clean of all data at a controller level, so it should theoretically perform as if it were new out of the box.

You need to realize the M550 and even the Micron 1100 are client-grade storage devices. They're not designed to run around the clock in 24/7 enterprise environments. If they're subjected to continuous erase operations, the controller can get backed up, which in a normal single-drive scenario would simply mean a loss in performance as it tries to catch up, but when you throw the drive into a RAID array this could mean timeouts and more serious issues as write requests lag behind the other drives.

If you continue to have problems with the drive, it's probably time to retire it from mission-critical or write-intensive operations. It should theoretically still work great in normal work environments or as static read storage. If you end up buying a replacement, I would highly recommend you look at enterprise-class SSDs like the Micron 5200. They are designed for and better suited to 24/7 operation and high-erase environments, and they aren't that much more expensive than consumer SATA devices.

https://www.micron.com/~/media/documents/products/product-flyer/5100_product_brief.pdf






Crucial_Benny, Micron CPG Support, US


Kilobyte Kid

Re: "Unrecoverable medium error during recover" in a Crucial SSD drives RAID

Hi Benny, thanks for your update. Yes, when the drives were installed, the previous sysadmin considered that, even though they are not intended for a critical 24/7 environment, they could run smoothly. However, I think it wasn't a good decision. We will think about it; upgrading the drives seems like a good thing to do!