I have read through a few threads that seem to indicate that there are problems booting from NAND on the mini210. Could some kind person summarise the issue? I am trying to decide whether or not to buy a large quantity of these devices. Thank you, Dave
Mini210 boot from NAND
Hi Dave, I can throw some light on this for you. There was a perceived issue that showed up as uncorrectable ECC NAND errors. These were mostly caused by mismatched versions of bootloader and kernel: slight differences in the way they handle NAND caused the kernel driver to think good NAND was bad. This was easily remedied by making sure that the kernel, kernel sources and superboot all come from the same software/source code DVD, then reburning with the lowformat= yes option in the friendlyarm.ini file. That is the only major issue with the NAND that I have seen, and it turns out to be a non-issue.

I have seen people mention other uncorrectable ECC NAND errors, but I believe those are either existing bad blocks or normal wear and tear, and have to do with how the yaffs2 fs deals with the NAND. When it mounts the partition, it rebuilds the filesystem: yaffs2 scans the whole mtd partition in 1MB blocks, then scans backwards looking for data, essentially doing a ton of 8k page reads too, so it will mark out all of the bad blocks the first time you boot after a flash. It does all of this in about 10 seconds. You can speed this up by creating a checkpoint: yaffs2 is set up to create a checkpoint when you unmount, or you can force checkpoint creation by running the 'sync' command at the command line or from a script.

I have been testing the NAND on and off over the last few weeks and have really been putting it through its paces. Any errors I've seen on the NAND were caused by me and were easily remedied.
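For anyone who wants to trigger the 'sync' step from a script rather than the shell, here is a minimal sketch in Python. The function name `write_and_sync` and the file path are made up for illustration; `os.sync()` simply asks the kernel to flush pending writes, the same request the 'sync' command makes. Whether that produces a yaffs2 checkpoint on a given kernel build is taken from the description above, not something this snippet can verify.

```python
import os


def write_and_sync(path, payload):
    """Write payload to path, then ask the kernel to flush all filesystems.

    Hypothetical helper: os.sync() is the scripted equivalent of running
    'sync' at the command line (Unix only, Python 3.3+).
    """
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        # Push this file's data to the storage device first.
        os.fsync(handle.fileno())
    # Then flush everything else that is still pending.
    os.sync()
```

Calling it once after a batch of writes, for example `write_and_sync("/data/settings.bin", b"...")` with a path of your own, should be enough; syncing after every small write only adds wear.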
Reggie, Thank you very much for the feedback. I will now remove the highlighted red areas from my purchase spreadsheet! Cheers, Dave
This SoC comes - like the S3C6410 SoC - with more than one type of error correction unit. One is for the older SLC NAND (Single Level Cell). It generates a short ECC and can only correct one-bit errors and detect two-bit errors. Modern MLC NANDs (Multi Level Cell) need much stronger protection: if one cell fails, more than one bit is lost at once. So this SoC comes with a second correction unit that uses a stronger correction algorithm (Reed Solomon). And every piece of software (ROM code, bootloader and the running OS) must be consistent in using the same correction unit *and* the same checksum layout *and* the same bad block handling. Can you feel the pain? For the Reed Solomon correction unit there is still no support in the mainline Linux kernel. It might be dangerous to rely on vendor patches, because you are stuck with their kernel. I'm still fighting with MLC NAND support for my Tiny6410. The ROM code of this CPU destroys the factory bad block markers! m( And both devices (SoC and NAND device) are from the same vendor...
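To make the "correct one bit, detect two bits" behaviour concrete, here is a simplified sketch in Python. It is not the checksum layout of this SoC or of any real driver, and it is not the Reed Solomon unit; the function names are invented for illustration. It uses the basic Hamming idea: store the XOR of the positions of all set bits plus one overall parity bit, and assume the stored ECC itself is undamaged.

```python
def compute_ecc(data):
    """Return (position_xor, parity) over every set bit in data."""
    position_xor = 0
    parity = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                position_xor ^= byte_index * 8 + bit
                parity ^= 1
    return position_xor, parity


def check_and_correct(data, stored_ecc):
    """Compare data against stored_ecc; return (status, data)."""
    position_xor, parity = compute_ecc(data)
    syndrome = position_xor ^ stored_ecc[0]
    parity_changed = parity != stored_ecc[1]
    if not parity_changed and syndrome == 0:
        return "ok", data
    if parity_changed and syndrome < len(data) * 8:
        # Odd number of flips: assume exactly one, at bit index 'syndrome'.
        fixed = bytearray(data)
        fixed[syndrome // 8] ^= 1 << (syndrome % 8)
        return "corrected", bytes(fixed)
    # Parity unchanged but syndrome non-zero: two bits flipped.
    return "uncorrectable", data
```

A single flipped bit changes the parity and leaves its own position in the syndrome, so it can be repaired. Two flipped bits cancel out in the parity but leave a non-zero syndrome, which is why they can be detected but not fixed. This also shows why bootloader and kernel have to agree on the layout: the same data checked against an ECC computed a different way looks uncorrectable.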
ARMWorks with FriendlyARM is changing to 1G SLC only for Mini210s (or 256M if an OEM wants it). This raises the pain threshold to reasonable levels :-)
The mini210S has got 1, 4, 8, 12 and 16-bit ECC correction; they're using 16-bit correction on the 210S boards, at least on the 4GB versions. In general, any NAND driver has the potential to destroy the factory bad-block markers. Looking at the various versions of s3c_nand.c around the interwebs, it would appear that Samsung doesn't use BBT handling at all, so it doesn't really care about the factory bad blocks; it will just re-mark them as bad when it tries to write something to them the first time and fails. It's potentially annoying if you've got bad blocks but certainly not fatal, especially given that the yaffs2 fs does this on first boot after a burn.
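The "re-mark on first failed write" behaviour can be pictured with a toy model in Python. This is only a simulation with invented names (`SimulatedNand`, `write_skipping_bad`); it is not code from s3c_nand.c or yaffs2, and it assumes a bad block always fails its first write, which real hardware does not guarantee.

```python
class SimulatedNand:
    """Toy NAND: writes to physically bad blocks fail, nothing is pre-marked."""

    def __init__(self, num_blocks, failing_blocks):
        self.num_blocks = num_blocks
        self.failing = set(failing_blocks)  # physically bad, unknown to the driver
        self.marked_bad = set()             # what the driver has learned so far
        self.blocks = {}

    def write_block(self, index, payload):
        if index in self.failing:
            return False
        self.blocks[index] = payload
        return True


def write_skipping_bad(nand, start, payload):
    """Write payload to the first usable block at or after start."""
    for index in range(start, nand.num_blocks):
        if index in nand.marked_bad:
            continue  # already known bad, skip without trying
        if nand.write_block(index, payload):
            return index
        # First failed write: mark the block bad and move on.
        nand.marked_bad.add(index)
    raise IOError("no good block left")
```

With blocks 2 and 3 bad, the first write starting at block 2 fails twice, marks both and lands in block 4; later writes skip 2 and 3 straight away. The lost factory markers cost a couple of failed writes, not the data.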