SD cards are the literal worst.
they've expanded to be the size of small hard drives, and devices like the rpi keep using them as boot media, but they:
- use garbage tier low endurance flash cells internally
- have little to no overprovisioning for wear
- perform only the most basic wear levelling
- have no protocol level integrity checking
- have few internal error correction features, if any
- decay comparatively quickly without patrol scrubs
- do not perform patrol scrubs
- cannot do PLP
I've been mad about this pretty much forever. Don't use SD cards for stuff where there's literally any other option.
@gsuberland I agree with you, but what other options are there at the moment? I guess USB drives might work.
@kelpana USB flash drives are better, although they still have issues. CompactFlash is ok, SATA/UAS or NVMe is preferred where you can (but obviously much more complex). For stuff where you only need firmware and a small amount of persistent storage (e.g. for a config block) it can make more sense to use EEPROMs, select parts that meet your longevity needs, and implement your own wear levelling / scrubbing where needed.
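A minimal sketch of the "implement your own wear levelling" idea for a small config block, modelled in Python against a `bytearray` standing in for the EEPROM. The slot layout, `SLOT_SIZE`, `NUM_SLOTS`, and the function names are all assumptions made up for illustration. Each write goes to the next slot in a ring with an incremented sequence number and a CRC, and a read picks the newest slot whose CRC checks out. Sequence wrap-around and scrubbing are not handled here.

```python
import struct
import zlib

SLOT_SIZE = 16   # hypothetical layout: 4-byte sequence + 8-byte payload + 4-byte CRC
NUM_SLOTS = 8    # hypothetical: writes are spread across this many slots
ERASED = b"\xff" * SLOT_SIZE


def pack_slot(seq: int, payload: bytes) -> bytes:
    """Serialise one slot: sequence number, payload, then CRC32 over both."""
    body = struct.pack("<I8s", seq, payload)
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_slot(raw: bytes):
    """Return (seq, payload), or None if the slot is erased or fails its CRC."""
    if raw == ERASED:
        return None
    body = raw[:12]
    (crc,) = struct.unpack("<I", raw[12:])
    if zlib.crc32(body) != crc:
        return None
    return struct.unpack("<I8s", body)


def load(eeprom: bytearray):
    """Scan all slots; return (slot_index, seq, payload) of the newest valid one."""
    best = None
    for i in range(NUM_SLOTS):
        rec = unpack_slot(bytes(eeprom[i * SLOT_SIZE:(i + 1) * SLOT_SIZE]))
        if rec is not None and (best is None or rec[0] > best[1]):
            best = (i, rec[0], rec[1])
    return best


def store(eeprom: bytearray, payload: bytes) -> int:
    """Write payload to the slot after the newest valid one; return the slot used."""
    cur = load(eeprom)
    if cur is None:
        slot, seq = 0, 1
    else:
        slot, seq = (cur[0] + 1) % NUM_SLOTS, cur[1] + 1
    eeprom[slot * SLOT_SIZE:(slot + 1) * SLOT_SIZE] = pack_slot(seq, payload)
    return slot
```

Because the previous slot is left untouched until the ring comes back around, a corrupted or half-written newest slot falls back to the last good config rather than losing it.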
@kelpana @gsuberland USB-attached SATA or NVMe SSDs.
USB flash drives are the exact same (or often worse). Don't use them either.
@manawyrm @gsuberland I know some USB flash drives are bad, or even just SD cards. But what can be used to interface to a microcontroller project? Is it even possible to interface NVMe in a small system?
@gsuberland i wish they made small capacity (full physical size) sd cards that do proper wear leveling and use good quality and overprovisioned flash
maybe with a built in indicator for how healthy the flash is?
would be very useful to a lot of people
something I've been thinking about a lot is engineering panic signals into embedded stuff.
put a rail supervisor on the incoming supply, and leave enough capacitance on the low side of your buck(-boost) reg to keep the device running for a few milliseconds when the power is pulled. treat the supervisor IC's PG output as a panic signal and have it trigger an IO interrupt on your MCU/SoC which tells the filesystem code to ensure integrity, flush, and halt IO.
we think of PLP as being a fancy enterprise feature on PCs, but on embedded stuff it's actually legit important 'cos the user often just yanks the power.
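A rough sizing sketch for the "enough capacitance to keep running for a few milliseconds" part, assuming the load behind the regulator draws a roughly constant current and the rail is allowed to droop from its nominal voltage to some minimum before the MCU browns out. The function names and example numbers are assumptions, and this ignores capacitor ESR, tolerance, and derating, so treat the result as a lower bound.

```python
def holdup_capacitance(load_current_a: float, holdup_time_s: float,
                       v_nominal: float, v_min: float) -> float:
    """Capacitance in farads needed to ride through holdup_time_s.

    Uses the constant-current approximation C = I * t / dV, where dV is the
    droop the rail can tolerate before the device stops working.
    """
    droop = v_nominal - v_min
    if droop <= 0:
        raise ValueError("v_nominal must be greater than v_min")
    return load_current_a * holdup_time_s / droop


def holdup_time(cap_f: float, load_current_a: float,
                v_nominal: float, v_min: float) -> float:
    """Inverse of the above: seconds of runtime a given capacitance buys."""
    return cap_f * (v_nominal - v_min) / load_current_a


if __name__ == "__main__":
    # Hypothetical example: 100 mA load, 5 ms to flush, 3.3 V rail allowed to sag to 3.0 V.
    c = holdup_capacitance(0.1, 0.005, 3.3, 3.0)
    print(f"{c * 1e6:.0f} uF")
```

With those example numbers the answer comes out at about 1.7 mF, which shows why the panic handler needs to be fast: every extra millisecond of flush time costs real board space.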
@kelpana @manawyrm @gsuberland wouldn't NAND or NOR flash be better? If you have a Linux kernel, you have UBIFS
@kelpana @manawyrm @gsuberland (also: are eMMC as bad as SD cards? I felt like they had better properties)
@raito @kelpana @gsuberland Yes, if you have a Linux kernel, you suddenly have more good options.
"Microcontroller project" just didn't sound like Linux to me :)
@manawyrm @kelpana @gsuberland agreed; as we talked about RPIs in the start, I was considering the large spectrum of microcontroller projects
Nonetheless, I suppose that even ESP32 grade stuff could get access to a good implementation of something like UBIFS?
@gsuberland it’s true! I know about this and I still yank the power sometimes before I realize it
@gsuberland what’s really annoying is that they could do a lot of that, but don’t because it’s marginally cheaper and for most consumers, it’s difficult to notice the regularity of their failures.
eMMC is essentially the same stuff but soldered on, but because it’s always used in a large scale, companies will notice all of that stuff, and therefore money will be spent on reliability.
realistically, the only thing SD cards couldn’t do is PLP
@ignaloidas yeah, it's all entirely doable, just costs money. the PLP stuff can be implemented on the host device but again most don't bother.
@gsuberland what I’m hearing from this is that I should really, really not be using the SD card in my camera as long term storage. guess who’s backing it up as soon as they get home today?
@carbontwelve oh yeah absolutely. you can expect to see bit-rot on SD cards for anything that was written longer than a year or two ago.
@gsuberland As far as I've ever been able to determine, there's no way for the host to inquire whether the SD card has flushed all of its dirty RAM buffers to flash, such that the host knows e.g. that it's safe to power the card down.
@brouhaha yup, afaik there's no guarantee. all you can do is try to keep it powered as long as possible after the final write command.
@gsuberland oh dear haha. Yeah this is going on five years. Didn’t have a computer to copy it over to until this year and so it’s just sat on the shelf, forgotten!
@carbontwelve oof. fingers crossed for you.
our wedding photographer gave us our photos on a USB stick. every single photo on it is now unreadable. (naturally I made backups immediately once we got them!)
@gsuberland @raito @kelpana
Look at this datasheet: https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2007301503_Samsung-KLM8G1GETF-B041_C499918.pdf
You can also often configure/partition eMMC flash to work in pseudo SLC mode (trading size for reliability).
I know that at least some Micron eMMC has working built-in data refresh... Not sure about other vendors.
@manawyrm oh neat. I know the eMMC protocol has much better guarantees about operation timings and caching, so that helps a lot with designing stuff for data integrity.
but what are the alternatives ?
last few years, quality of the housing on thumb drives has deteriorated, so that many thumb drives often go bad ?
iirc, circa 1990, hard drive era:
each datum on at least two drives [drive failure], that are kept in separate physical locations [fire, flood], and if done with "backup software" [remember that??] you check the data regularly [to make sure the backup software saved the data AND that you can get it back!! LOL]
@failedLyndonLaRouchite read the thread, plenty of chat about alternatives.
@gsuberland
"honest question"
this is a great post, but why don't you tell us what to do (rail super .. gibberish to me !!) ?
@failedLyndonLaRouchite this is more a thread with advice for hardware designers than consumers.
@gsuberland Yep this is something I've started to design into my architecture, prototyping on the trigger crossbar and then scaling to a next-generation system with more capabilities on my next project.
Luckily there's not a whole lot to worry about in terms of data integrity because MicroKVS is a) seldom written and b) designed to not corrupt data if interrupted at any point.
At some point I need to build a board where I can test that assumption: have one MCU that controls power to another and the target is just constantly booting up, doing KVS reads and writes, and being randomly reset/power cycled.
And see if I can make it break.
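The same experiment can be modelled in software before building the board. This is a sketch under loud assumptions: it is not MicroKVS, just a hypothetical append-only log with a CRC per record, and it assumes a power cut leaves a clean prefix of the record in flash with the rest still erased (0xFF). Real flash can tear writes in messier ways, which is the reason to do the hardware test too. Instead of random resets it sweeps every possible cut point, since that is cheap in a model.

```python
import struct
import zlib

FLASH_SIZE = 4096  # hypothetical size of the simulated flash region


def encode(key: bytes, value: bytes) -> bytes:
    """One log record: key length, value length, key, value, CRC32 of all of it."""
    body = struct.pack("<BB", len(key), len(value)) + key + value
    return body + struct.pack("<I", zlib.crc32(body))


def replay(flash: bytes):
    """Rebuild the key/value map, stopping at erased space or the first bad CRC."""
    data, pos = {}, 0
    while pos + 2 <= len(flash):
        klen, vlen = flash[pos], flash[pos + 1]
        if klen == 0xFF:          # erased flash, nothing more was written
            break
        end = pos + 2 + klen + vlen + 4
        if end > len(flash):
            break
        body = flash[pos:end - 4]
        (crc,) = struct.unpack("<I", flash[end - 4:end])
        if zlib.crc32(body) != crc:
            break                 # torn or corrupt record: ignore it and everything after
        data[body[2:2 + klen]] = body[2 + klen:]
        pos = end
    return data


def sweep_power_cuts():
    """Cut power after every possible byte of an update; return (cut, value seen)."""
    old = encode(b"cfg", b"old-value")
    new = encode(b"cfg", b"new-value")
    results = []
    for cut in range(len(new) + 1):
        flash = bytearray(b"\xff" * FLASH_SIZE)
        flash[:len(old)] = old                       # fully committed earlier
        flash[len(old):len(old) + cut] = new[:cut]   # only `cut` bytes landed
        results.append((cut, replay(bytes(flash)).get(b"cfg")))
    return results
```

The property to check is that every cut point yields either the old value or the new one, never garbage and never a missing key.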
@gsuberland feels like the same is true for USB flash drives. The ones I’ve bought more recently also keep failing for me no matter how expensive they were or if they claimed to be extra resilient. Yet I still have a drive that’s easily 10+ years old and been through a lot of abuse which still works fine :(
@gsuberland Why low side? High side seems better wrt brownout detection.
@dascandy rail supervisor on incoming supply is high side?
@gsuberland microSD cards also have a tendency to snap in half if you so much as look at them. We had this happen twice during a university project.
@apicultor someone else asked the same. it would give you better error detection and correction, and scheduled scrubs would help resolve the decay issues, but those scrubs also increase the write wear rate so you're still left using media that will die pretty quickly.
@GNUmatic generally better because they're more commonly used in industrial devices, network appliances, and professional equipment. manufacturers are generally more transparent about the technical specs of their devices. CFast in particular has much better capabilities at the protocol level.
@GNUmatic traditional CF still tends to have limited write longevity and wear levelling capabilities compared to something like a SATA or NVMe SSD, but the specsheet will also generally tell you that instead of just containing marketing material like consumer SD cards.
eMMC is also a vastly superior option to SD.
@gsuberland Plus, the whole industry keeps stalling all technological advancements. I have yet to see a single SD UHS-III card on the market, let alone SD Express. I heard there are some UHS-II cards available, but card readers with the additional pins and data channels are rare. Apparently some DSLRs support them?
@smochi the faster write time also tends to come with a tradeoff in poorer retention periods, especially for high density cards. they're fine if you just want to store some stuff and then copy it off within a month or two, but once you get into the 1-2 year mark you really start running into bit-rot.
@gsuberland Are you sure that scrubbing causes writes (other than to update the last scrub timestamp)?
@apicultor what I mean is that if the goal is to get around bitflips and bitrot and write wear issues with ZFS, the correction itself will cause more write wear.
it's better than not using ZFS, but it accelerates the overall inevitability of complete failure.
@gsuberland Ah, legit, yes.
What about SD cards that don't suck — as in, ones that have their flash configured in SLC mode so you get bonkers durability like 5K TBW (for the largest ones, reducing proportionally by capacity):
https://www.transcend-info.com/embedded/product/embedded-memory-cards/usd230i
https://www.transcend-info.com/embedded/product/embedded-memory-cards/usd240i
Not quite as impressive, but still ~2.8K TBW if I did my math right (it's expressed in hours of HD video at 26 Mbit/s):
https://www.transcend-info.com/product/memory-card/usd350v
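For anyone wanting to check that kind of conversion, here is a small sketch of the arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits) and continuous recording at the stated bitrate. The function names are made up, and the hours figure from the spec sheet is not quoted in the thread, so none is hard-coded here.

```python
def video_hours_to_tbw(hours: float, bitrate_mbit_s: float = 26.0) -> float:
    """Terabytes written by recording `hours` of video at the given bitrate."""
    bytes_written = hours * 3600 * bitrate_mbit_s * 1e6 / 8
    return bytes_written / 1e12


def tbw_to_video_hours(tbw: float, bitrate_mbit_s: float = 26.0) -> float:
    """Inverse: hours of recording that add up to `tbw` terabytes written."""
    return tbw * 1e12 * 8 / (bitrate_mbit_s * 1e6) / 3600


if __name__ == "__main__":
    # One hour at 26 Mbit/s is 11.7 GB, so an endurance rating in hours
    # converts to TBW at 0.0117 TB per hour.
    print(video_hours_to_tbw(1))
    print(tbw_to_video_hours(2800))
```

Working backwards, ~2.8K TBW corresponds to roughly 239,000 hours of recording at 26 Mbit/s.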
@apicultor sure, but the better solution is just to not engineer systems that need you to very specifically pick special cards. an SD with long write longevity still only solves a few items off the list.
@gsuberland and, in the case of the RasPi, get flogged to death with constant tiny writes because they’re using a filesystem designed for spinning platters of rust
@jpm @gsuberland oh oops which other file system should I swap in?