M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

hmh
Binary Boss

[ Edited ]

The M550 (all released firmware) and the M500 (all released firmware up to MU04) can corrupt data when QUEUED TRIM is used.

 

Since Crucial is not urging everyone to update to MU05 and is taking its time with the M550 update, I assume that Windows does not issue QUEUED TRIM by default, and therefore does not trigger the issue (yet).

 

I have no idea about the Intel RST enhanced Windows drivers, MacOS or FreeBSD. Chances are they cannot trigger the bug, but you might want to check with the vendors.

 

EDIT: The M500 MU05 firmware still has the QUEUED TRIM data-killer bug; there are no safe firmware versions.

EDIT: This is a problem on several outdated versions of the Linux kernel, for the 3.12, 3.13, 3.14 and 3.17 branches. Linux releases older than 3.12 will NOT trigger the bug. Recent releases of the 3.12, 3.13, 3.14 and 3.17 branches have a blacklist in place and will NOT trigger the bug. The 3.15 and 3.16 kernels also have the blacklist, and won't trigger the firmware bug.

 

Dangerous kernels:

  • 3.12 (before 3.12.29);
  • 3.13 (before 3.13.7);
  • 3.14 (before 3.14.20);
  • 3.17 (before 3.17.1) - regression in the blacklist, fixed in 3.17.1.

 

Safe kernels:

  • anything before 3.12;
  • 3.12.29 and later;
  • 3.13.7 and later;
  • 3.14.20 and later;
  • 3.15 (all);
  • 3.16 (all);
  • 3.17.1 and later.

Bug workaround for any kernel version (tanks performance down to a crawl on most workloads): disable NCQ in the kernel command line, by adding the libata.force=noncq parameter in the bootloader.
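
For anyone who wants to confirm the workaround is actually in effect, here is a minimal Python sketch that checks a kernel command line string for a global noncq entry. The function name `has_global_noncq` is made up for illustration. It assumes the usual libata.force syntax of a comma-separated list of [ID:]VAL entries, where an entry without an ID applies to every port.

```python
def has_global_noncq(cmdline):
    """Return True if libata.force carries a 'noncq' entry without a port ID."""
    prefix = "libata.force="
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        # libata.force takes a comma-separated list of [ID:]VAL entries.
        for value in token[len(prefix):].split(","):
            # An entry such as "1.00:noncq" only covers one port, so it is
            # not counted as the global workaround here.
            if value == "noncq":
                return True
    return False
```

On a running system the string to check can be read from /proc/cmdline, for example `has_global_noncq(open("/proc/cmdline").read())`.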

 

The "uname -r" command will tell you the Linux kernel release you're running on.
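
The two lists above can also be checked mechanically. The following is only a sketch, and the names `FIRST_SAFE_POINT_RELEASE`, `parse_release` and `is_listed_dangerous` are invented for this example. It looks only at the upstream version number, so it cannot tell whether a distribution kernel has had the blacklist backported, and it knows nothing about kernels outside the lists above.

```python
import platform
import re

# First safe point release of each affected branch, taken from the lists above.
FIRST_SAFE_POINT_RELEASE = {
    (3, 12): 29,
    (3, 13): 7,
    (3, 14): 20,
    (3, 17): 1,
}


def parse_release(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    # A release such as "3.17" has no patch component; treat it as patch 0.
    return int(major), int(minor), int(patch or 0)


def is_listed_dangerous(release):
    """Return True if the release falls in one of the dangerous ranges above."""
    major, minor, patch = parse_release(release)
    first_safe = FIRST_SAFE_POINT_RELEASE.get((major, minor))
    return first_safe is not None and patch < first_safe


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    running = platform.release()
    print(running, "listed as dangerous:", is_listed_dangerous(running))
```

Treat a True result as a reason to look closer, not as proof: a distribution kernel may carry the blacklist even though its version string falls inside a dangerous range.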

74 Replies
Tracer Lite

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

Thanks for mentioning this. This was raised in another thread, but perhaps ignored because most people are using Windows and some people have seen a drop in 4K performance with MU05.

http://forum.crucial.com/t5/Solid-State-Drives-SSD/The-newest-response-from-Crucial-about-the-4K-per...

 

The specific references were:

http://comments.gmane.org/gmane.linux.ide/56084

http://www.spinics.net/lists/linux-ide/msg48361.html

 

JEDEC Jedi

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)


hmh wrote:

I had a much larger post where I tracked the relevant kernel releases where the M500 and M550 were blacklisted for QUEUED TRIM, etc. The forum ate it before I could post, and I won't hunt down that information again.


Thanks for the information anyway :) As for the disappearing long posts and emails, my personal remedy for that is the "Ctrl+C Ctrl+V strategy" ;)

______________________________________

FAQs and Top Forum Solutions
Did a user help you? Say thanks by giving Kudos!
How do I know what memory to buy?
Still need help? Contact Crucial Customer Service
Remember to regularly backup your important data!

Bit Baby

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

I have a Crucial_CT480M500SSD1 (M500, 2.5-inch, 480 GB) with firmware MU05, and the problem still persists.

 

Recent kernels (3.14.4 and 3.14.5) restrict the QUEUED TRIM workaround by firmware version, which can get you into REAL TROUBLE after deleting large files, BROKEN PARTITION TABLE included.

 

 

hmh
Binary Boss

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

xarafaxz, have you contacted Crucial technical support directly about it? Do you have a sequence of commands that reproduces the issue?

 

EDIT: the regression has already been reported upstream to the Linux kernel, and an Oracle employee is going to report it directly to Micron. https://bugzilla.kernel.org/show_bug.cgi?id=71371

 

How to know you hit the bug: The SSD will **bleep** over itself, corrupt data, and return something like this in the kernel log:

 

ata4.00: failed command: READ FPDMA QUEUED
ata4.00: cmd 60/08:d0:88:da:9f/00:00:09:00:00/40 tag 26 ncq 4096 in
             res 40/00:e4:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
ata4.00: status: { DRDY }
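
A small Python sketch for spotting that signature in saved kernel log text follows. The names `FAILED_NCQ_READ` and `find_failed_ncq_reads` are hypothetical, and the pattern is based only on the excerpt above. A match is a hint rather than proof of this bug, since a failed READ FPDMA QUEUED can have other causes as well.

```python
import re

# Matches e.g. "ata4.00: failed command: READ FPDMA QUEUED" anywhere in a
# line, so timestamp or syslog prefixes in front of it do not matter.
FAILED_NCQ_READ = re.compile(r"(ata\d+\.\d+): failed command: READ FPDMA QUEUED")


def find_failed_ncq_reads(log_text):
    """Return the sorted ATA device names that logged a failed queued read."""
    return sorted(set(FAILED_NCQ_READ.findall(log_text)))
```

The text to scan could come from the output of dmesg or from a file such as /var/log/kern.log.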
Tracer Lite

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

Hi, can you post all the steps needed to reproduce this bug, so that we can confirm that it affects MU05?

I'm sure it would help convince Crucial to find a solution if multiple people can reproduce the bug using MU05.

 

hmh
Binary Boss

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

I believe the issue has been taken directly to Micron, but reporting it to Crucial would be helpful.

 

Anyway, from the Linux bug report I linked in my previous post, all you need to do is to write to the filesystem if online discard mode is enabled. I.e. booting is enough(!!!).

 

Note that even "fstrim" will trigger the bug: once Linux detects that it can send QUEUED TRIM, it uses the queued version for everything. It is not just "online discard" mode that is dangerous.

 

I will ask for a per-device override to force-disable this feature in Linux. It is pretty obvious that it will be needed.

JEDEC Jedi

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

hmh, this is far beyond my experience but did you/anyone confirm that this is related to firmware rather than let's say faulty individual units?

I could try to replicate the issue with my M500, which I am pretty sure works well. Could you please guide me through how to replicate the issue? Never mind, I should manage to follow your description ;-)

 

Excuse me if I am on completely the wrong track. I don't want to interfere with this thread, but I do not follow Linux communities, and I have seen individual M4 drives that reported READ FPDMA QUEUED errors, although those drives were reporting other SMART errors too.


JEDEC Jedi

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

[ Edited ]

Okay, so I have recently downloaded Ubuntu with kernel 3.13.0-24-generic

discard added to fstab

sudo fstrim -v /
/: 107803365376 bytes were trimmed

 

No reference to READ FPDMA QUEUED in /var/log/kern.log

 

Everything seems to work fine with M500 MU05. Is there anything I am missing there?


Bit Baby

Re: M500/M5x0 QUEUED-TRIM data corruption alert (mostly for Linux users)

What you missed is that in 3.13.0-24-generic this bug is known to the kernel developers, and therefore Queued TRIM is blacklisted for the M500.

 

Also check the details of the bug report mentioned earlier: https://bugzilla.kernel.org/show_bug.cgi?id=71371

 

If you downloaded the latest kernel from e.g. http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14.5-utopic/ where the blacklist is disabled, you would see the errors appear.