Hi,

I'm using a Tiny6410 board running embedded Linux with the regular U-Boot, booting from NAND flash.

Config:
------------------------------------
U-Boot 1.1.6 (Apr 6 2011 - 14:17:30) for FriendlyARM MINI6410
CPU: S3C6410@532MHz
Fclk = 532MHz, Hclk = 133MHz, Pclk = 66MHz, Serial = CLKUART (SYNC Mode)
Board: MINI6410
DRAM: 256 MB
Flash: 0 kB
NAND: 2048 MB
------------------------------------

My application runs correctly for a couple of days or weeks. Then, suddenly, the board is no longer able to boot. The console (TTY) shows that U-Boot is stuck in the menu, and the key selection to boot the kernel has no effect: the board restarts and stays in the U-Boot menu. Sometimes it reports a bad CRC warning when analysing the NAND. It seems that the NAND flash has been altered.

The only way to get my application working again is to reflash the NAND with the kernel (zImage), UBIFS and EXT3. Has anyone already faced this problem? My application is an embedded system, so I cannot restore deployed systems this way.

Regards,
Eric3
Tiny6410, impossible to boot again after some time
And let me guess: you erase your NAND completely prior to writing its new content? If yes, I would also guess your NAND has reached the end of its lifetime.
Hi Juergen,

Thanks for your interest. No, I never erase the NAND. The only thing I've done is program the NAND several times with new code. I have already deployed 20 systems based on this Tiny6410 core module, I would say without any problems. We just update the firmware by writing the new code to flash using the Tiny board and the SD card boot. I'm quite sure this is not a problem of the NAND reaching its "end of life".

Eric3
Hi eric,

> No, I never erase the NAND.

You did. To program a NAND you *must* erase it first. And if your bad block management is broken or nonexistent, after such an erase the NAND *seems* to work again, only to start failing again very fast. How fast a NAND wears out depends on how often you change the data on it. *Each* change needs an erase. So, if you have high filesystem activity on this NAND memory (for example log files), it may need a few weeks or only a few days to reach the end of life of this kind of memory.
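To get a feel for the time scales mentioned above, here is a rough back-of-the-envelope sketch in Python. The endurance figure (5,000 erase cycles per block), the erase rate (one erase every 10 seconds) and the block count (2048) are illustrative assumptions, not datasheet values or measurements from this board.

```python
# Rough NAND lifetime estimate. All numbers below are illustrative assumptions.

SECONDS_PER_DAY = 24 * 60 * 60


def days_until_worn(endurance_cycles, erases_per_day, blocks_sharing_load=1):
    """Days until the busiest block reaches its erase-cycle limit.

    blocks_sharing_load=1 models no wear leveling: one block takes every erase.
    A larger value models ideal wear leveling spread over that many blocks.
    """
    if erases_per_day <= 0:
        raise ValueError("erases_per_day must be positive")
    return endurance_cycles * blocks_sharing_load / erases_per_day


# Assumption: an MLC block survives about 5,000 erase cycles.
ENDURANCE = 5_000

# Assumption: a log file forces one block erase every 10 seconds.
erases_per_day = SECONDS_PER_DAY / 10

no_leveling = days_until_worn(ENDURANCE, erases_per_day)
ideal_leveling = days_until_worn(ENDURANCE, erases_per_day, blocks_sharing_load=2048)

print(f"without wear leveling: {no_leveling:.1f} days")
print(f"ideal wear leveling over 2048 blocks: {ideal_leveling / 365:.1f} years")
```

Under these assumptions a single block hammered without wear leveling wears out in under a day, while the same load spread evenly over the whole device lasts years. Real lifetimes depend on the actual part, the driver and the filesystem.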
Hi Juergen,

OK, I agree: I do erase the NAND. What I meant by "I never erase the NAND" is that I don't do it myself; I'm simply using the provided "superboot-20110405.bin" (I have no idea what its source contains). Maybe this is one of my problems. But I saw "Bad Nand CRC" with a completely new Tiny core module.

My idea was that the problem is more likely caused by the absence of a proper Linux shutdown before power-down. I read the literature on it and it could explain my troubles. I've tried to reproduce the bug, without any success, so I'm not sure the patch I've made solves my problem. Any idea?

Eric3
CRC errors can have many sources. One source can be wrong formatting of the image one writes to the NAND memory. For example, a JFFS2 image must be generated in accordance with the erase block size of the NAND. If not, you will get many funny filesystem errors later on, including CRC errors.

And yes, another cause can be an improper shutdown. Older flash-aware filesystems are still fighting with this issue. But you seem to use UBIFS; it's currently the best solution for that issue.

The next cause can be the use of recent flash devices. They shrink the die structure more and more, which also increases the error rate. Recent NANDs need a much stronger checksum (ECC). What checksum generator does your NAND driver use? The 1-bit or the 4/8-bit type?

Your Mini6410 system comes with a 2 GiB NAND. Does it use the K9WAG08U1B or a different type?
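As a small illustration of why altered flash content shows up as a CRC error, the following Python sketch stores a CRC32 next to a block of data and then flips a single bit, as a worn or disturbed NAND cell might. The 2048-byte page, the bit position and the helper names are hypothetical; this is not U-Boot's or any filesystem's actual on-flash layout.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulates a flash bit error)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def crc_ok(data: bytes, stored_crc: int) -> bool:
    """Compare the CRC32 of the data read back with the CRC stored at write time."""
    return zlib.crc32(data) == stored_crc


# Hypothetical 2048-byte page with a known pattern.
page = bytes(range(256)) * 8
stored_crc = zlib.crc32(page)

# Reading back unchanged data passes the check.
print("clean read passes:", crc_ok(page, stored_crc))

# A single flipped bit anywhere in the page makes the check fail.
damaged = flip_bit(page, 12345)
print("read with one flipped bit passes:", crc_ok(damaged, stored_crc))
```

A CRC only detects the damage. Correcting it is the job of the ECC, which is why the strength of the ECC (1-bit versus 4/8-bit) matters for newer NAND parts.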
Hello Juergen,

Sorry for my late reply, I had to switch to another project that took all my energy!

Just to give you more details: this morning I wanted to restart my application. My 6410 core CPU module had been lying on a table since its last use. When I put it on the Tiny6410 dev kit, I was not able to start the system. See PB_samsung.png (LOST).

After several tries, I decided to flash the CPU module again. No problem with flashing. See PB_samsung.png (FLASH).

And no problem starting my system after flashing. See PB_samsung.png (START).

In addition, to answer your question regarding the NAND driver, you can have a look in the menu config. Below is some information from my running system:

S3C NAND Driver, (c) 2008 Samsung Electronics
MLC nand initialized, 2011 ported by FriendlyARM
S3C NAND Driver is using hardware ECC.
NAND device: Manufacturer ID: 0xec, Chip ID: 0xd5 (Samsung NAND 2GiB 3,3V 8-bit)
Creating 3 MTD partitions on "NAND 2GiB 3,3V 8-bit":
0x000000000000-0x000000400000 : "Bootloader"
0x000000400000-0x000000c00000 : "Kernel"
0x000000c00000-0x000080000000 : "File System"
UBI: attaching mtd2 to ubi0
UBI: physical eraseblock size: 1048576 bytes (1024 KiB)
UBI: logical eraseblock size: 1032192 bytes
UBI: smallest flash I/O unit: 8192
UBI: VID header offset: 8192 (aligned 8192)
UBI: data offset: 16384
UBI: max. sequence number: 0
UBI: volume 0 ("FriendlyARM-root") re-sized from 165 to 2008 LEBs
UBI: attached mtd2 to ubi0
UBI: MTD device name: "File System"
UBI: MTD device size: 2036 MiB
UBI: number of good PEBs: 2032
UBI: number of bad PEBs: 4
UBI: number of corrupted PEBs: 0
UBI: max. allowed volumes: 128
UBI: wear-leveling threshold: 4096
UBI: number of internal volumes: 1
UBI: number of user volumes: 1
UBI: available PEBs: 0
UBI: total number of reserved PEBs: 2032
UBI: number of PEBs reserved for bad PEB handling: 20
UBI: max/mean erase counter: 1/0
UBI: image sequence number: 92685634
UBI: background thread "ubi_bgt0d" started, PID 641
PPP generic driver version 2.4.2
PPP Deflate Compression module registered
PPP BSD Compression module registered
PPP MPPE Compression module registered
NET: Registered protocol family 24

Thank you for your help,

Regards,
Eric
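The figures in the UBI attach log above can be cross-checked with a little arithmetic. The Python sketch below uses only numbers copied from that log and relies on UBI's general layout, where each physical eraseblock (PEB) starts with an erase-counter header and a VID header, and the logical eraseblock (LEB) is what remains after the data offset. The variable names are my own.

```python
# Values copied from the boot log in the post above.
PEB_SIZE = 1048576        # physical eraseblock size in bytes
MIN_IO = 8192             # smallest flash I/O unit (one NAND page)
VID_HDR_OFFSET = 8192     # VID header offset
DATA_OFFSET = 16384       # offset where LEB data starts inside a PEB
GOOD_PEBS = 2032
BAD_PEBS = 4
PART_START = 0x00C00000   # start of the "File System" MTD partition
PART_END = 0x80000000     # end of the "File System" MTD partition

# The erase-counter header occupies the first page and the VID header the
# second, so data starts after two pages.
assert VID_HDR_OFFSET == MIN_IO
assert DATA_OFFSET == 2 * MIN_IO

# Usable bytes per eraseblock: physical size minus the two header pages.
leb_size = PEB_SIZE - DATA_OFFSET
print("logical eraseblock size:", leb_size)

# Number of eraseblocks in the partition, and its size in MiB.
total_pebs = (PART_END - PART_START) // PEB_SIZE
print("PEBs in partition:", total_pebs)
print("partition size in MiB:", (PART_END - PART_START) // (1024 * 1024))

# Every PEB in the partition is counted as either good or bad.
assert total_pebs == GOOD_PEBS + BAD_PEBS
```

The computed LEB size and PEB count agree with the "logical eraseblock size: 1032192 bytes" and "MTD device size: 2036 MiB" lines, so the geometry UBI detected is at least self-consistent. This says nothing about whether the image written by the flashing tool was built for the same geometry.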
I would continue to guess your NAND memory is broken. And: You shouldn't erase the NAND in a hard way. You must use the corresponding UBI tool to do so, because you need to keep the block's erase counters. And I don't know how reliable the MLC driver is ("MLC nand initialized, 2011 ported by FriendlyARM").
Hi Eric3, I'm facing the exact same problem (see http://www.friendlyarm.net/forum/topic/6150#lastpost). Did you manage to solve it? What did you do? Thanks in advance.