I've got three Mini2440s, all running the same software and Qt application, but one of them freezes. It does not generate a segmentation fault, so somewhere the Mini is getting stuck and can't get out of it. I do have the watchdog enabled to help recover from this problem, but I want to resolve it without relying on the watchdog to "fix" the problem. I've added debug values all over the code and output them on the UART to see where the software freezes, but the pattern is not consistent. I'm starting to think it could be hardware related, because two boards are running fine and the third one isn't. I'm using barebox with PTXdist. Barebox and the kernel are executed from the NAND and the Qt application is executed from the SD card.
1.) Do you guys have any idea how I can go about resolving this problem?
2.) Could the FLASH have bad sectors? How can I check for this in barebox?
3.) Could the RAM have bad sectors? How can I check for this in barebox?
Any ideas or help will be appreciated. Thanks
Mini2440 Freeze
1. Can you run any programs on the bad Mini2440?
2. As far as I am aware, Barebox sets up a proper bad block table (BBT), so checking the flash by hand shouldn't be necessary. Try running the kernel from SD as well?
3. There must be some mechanism by which the machine can tell, when RAM is accessed, whether it is working or not. From http://computer.howstuffworks.com/ram1.htm: "Other functions of the memory controller include a series of tasks that include identifying the type, speed and amount of memory and checking for errors."
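For the RAM question, a crude user-space check is to fill a large buffer with fixed bit patterns and read it back. The sketch below assumes a Python 3 interpreter is available on the target (not a given on a PTXdist image); the function names, the 8 MiB size and the pattern set are arbitrary choices for illustration. It only exercises pages the kernel happens to hand out, so a clean run proves little, and a dedicated tool such as memtester (or a memory test command in barebox, if your build includes one) would be more thorough.

```python
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)  # all zeros, all ones, alternating bits
CHUNK = 64 * 1024                    # fill in chunks to avoid a second big buffer
TEST_MIB = 8                         # arbitrary size; keep well below free RAM


def find_bad_offsets(buf, pattern, limit=16):
    """Return up to `limit` offsets in buf whose byte differs from pattern."""
    if buf.count(bytes([pattern])) == len(buf):
        return []  # fast path: every byte matches
    bad = []
    for offset, value in enumerate(buf):
        if value != pattern:
            bad.append(offset)
            if len(bad) >= limit:
                break
    return bad


def fill(buf, pattern):
    """Overwrite every byte of buf with pattern, one chunk at a time."""
    chunk = bytes([pattern]) * CHUNK
    for start in range(0, len(buf), CHUNK):
        length = min(CHUNK, len(buf) - start)
        buf[start:start + length] = chunk[:length]


def pattern_test(size_bytes, passes=1):
    """Write and verify each pattern; return (pattern, offset, value) errors."""
    buf = bytearray(size_bytes)
    errors = []
    for _ in range(passes):
        for pattern in PATTERNS:
            fill(buf, pattern)
            for offset in find_bad_offsets(buf, pattern):
                errors.append((pattern, offset, buf[offset]))
    return errors


if __name__ == "__main__":
    found = pattern_test(TEST_MIB * 1024 * 1024)
    print("errors found: %d" % len(found))
    for pattern, offset, value in found:
        print("pattern 0x%02X offset %d read 0x%02X" % (pattern, offset, value))
```

Running it on all three boards and comparing the results is more telling than a single run on the suspect one.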
Yeah, everything works fine, except that it freezes after a while. The time until the freeze varies from right after startup to a few hours, and sometimes it takes several days.
Can you check whether the amount of SDRAM the kernel "believes" it has matches what the board physically provides? Maybe you have a 64 MiB system and the kernel tries to use 128 MiB. It would then crash whenever memory fills up after a while and the kernel has to use the non-existent SDRAM above 64 MiB.
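One way to compare the two sizes on a running system is to read `MemTotal` from `/proc/meminfo` and any `mem=` argument from `/proc/cmdline`. A minimal sketch, assuming a Python 3 interpreter on the target (the same parsing is easy to port if there isn't one) and assuming the board is a 64 MiB variant; `EXPECTED_MIB` and the helper names are made up for illustration.

```python
import re

EXPECTED_MIB = 64  # assumption: physical SDRAM fitted on this board


def memtotal_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo text, or None."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def cmdline_mem_mib(cmdline_text):
    """Return the size of a mem=<n>[K|M|G] kernel argument in MiB, or None."""
    match = re.search(r"(?:^|\s)mem=(\d+)([KMG]?)", cmdline_text, re.IGNORECASE)
    if not match:
        return None
    value, unit = int(match.group(1)), match.group(2).upper()
    factor = {"": 1.0 / (1024 * 1024), "K": 1.0 / 1024, "M": 1, "G": 1024}[unit]
    return value * factor


def kernel_sees_too_much(total_kib, expected_mib=EXPECTED_MIB):
    """True if the kernel reports more RAM than is physically fitted."""
    return total_kib > expected_mib * 1024


def main():
    try:
        with open("/proc/meminfo") as f:
            total = memtotal_kib(f.read())
        with open("/proc/cmdline") as f:
            mem_arg = cmdline_mem_mib(f.read())
    except OSError as err:
        print("could not read /proc: %s" % err)
        return
    print("MemTotal: %s KiB, mem= argument: %s MiB" % (total, mem_arg))
    if total is not None and kernel_sees_too_much(total):
        print("kernel reports more than the expected %d MiB" % EXPECTED_MIB)


if __name__ == "__main__":
    main()
```

`MemTotal` normally comes out a little below the physical size, because the kernel image and reserved areas are not counted, so a 64 MiB board should show somewhat less than 65536 kB. A value well above that on the freezing board would support this theory.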
Sure sounds like hardware. If you swap SD cards, does the problem follow the card or stay with the board? Are the boards mounted? Is it possible that there is some stress or torque on the bad board, so that with heating and expansion a cold joint or cracked trace produces a fault? Are you providing power through the barrel jack or the 4-wire 2 mm connector?
Also, if you have a JTAG dongle you could take a memory snapshot when the kernel is frozen and try to analyze that (although that's certainly not an easy task).
There are probably many reasons for this to happen. How often does it happen? How much free memory do you have? Does syslog say anything? Can you attach a remote terminal and see whether anything gets printed just before the freeze? Is USB involved? I had a problem with a mains power relay corrupting a USB hub. Are you sure you are not losing power?
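To answer the free-memory question over a run of hours or days, one option is to append a timestamped `MemFree` sample to a file and force it out to storage, so that the last sample before a freeze survives the watchdog reset. A rough sketch, assuming a Python 3 interpreter on the target; the function names and the 60-second interval are placeholders, and the log path is whatever persistent location you pass on the command line.

```python
import os
import re
import sys
import time

INTERVAL_S = 60  # arbitrary sampling interval


def memfree_kib(meminfo_text):
    """Return the MemFree value (in KiB) from /proc/meminfo text, or None."""
    match = re.search(r"^MemFree:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def format_sample(timestamp, free_kib):
    """Build one log line: seconds since the epoch plus the MemFree value."""
    return "%d MemFree=%s kB\n" % (timestamp, free_kib)


def append_sample(path, line):
    """Append a line and fsync so it reaches storage before a possible hang."""
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())


def run(path, interval_s=INTERVAL_S):
    """Sample MemFree forever, one line per interval."""
    while True:
        with open("/proc/meminfo") as f:
            free = memfree_kib(f.read())
        append_sample(path, format_sample(int(time.time()), free))
        time.sleep(interval_s)


# Only starts logging when a log path is given as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    run(sys.argv[1])
```

If the logged value falls steadily until the freeze, a leak or memory exhaustion is likely; if it stays flat, that points back toward hardware or power.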