Kernel panic when rearranging NAND partitions

PatrickMeyer
I'm trying to improve an embedded system here at work, specifically by
restructuring the partition layout. I am definitely new to embedded work,
but not new to Linux in the least.

My hardware is an i.MX27 CPU on a board with 32 MB NOR flash and 512 MB NAND,
among other things. I'm running barebox 2010.04.0 and Linux 2.6.33.3 from
ptxdist 2010.04.1. It's all out of date, but I figured I should stick with
what the board manufacturer provided us until I could muster true command
of the system.

Originally, the partition layout was:

----------------------------------------------------------------------
nor_parts="256k(barebox)ro,128k(bareboxenv),2560k(kernel),-(root)"
#rootpart_nor="/dev/mtdblock3 ro"

nand_parts="256k(barebox)ro,128k(bareboxenv),3M(kernel),32M(root),32M(usr_local),32M(ifp),-(data)"
rootpart_nand="/dev/mtdblock8 ro"
----------------------------------------------------------------------

They had tried booting from NOR in the past but gave up and settled on
housing barebox on NOR and the rest on NAND. The issue is that the partition
where we store our software (ifp) fills up extremely quickly: between
unreasonably verbose log output and needless backups (I'm working on those
issues in parallel with this), 32 megs doesn't last long.

So, I modified the layout to the following:

----------------------------------------------------------------------
nor_parts="256k(barebox)ro,128k(bareboxenv)"

nand_parts="3M(kernel),-(root)"
rootpart_nand="/dev/mtdblock4 ro"
----------------------------------------------------------------------
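As a sanity check on a layout like this, the following is a minimal Python sketch (not part of barebox or the kernel) that expands an mtdparts-style partition list into byte ranges. The function name parse_parts and the UNITS table are made up for illustration, the 512 MiB device size is taken from the NAND probe line in the boot log, and the sketch ignores the optional @offset syntax that real mtdparts strings may carry.

```python
import re

# Size suffixes as used in the partition strings above (k, M); "g" is
# included on the assumption that it follows the same pattern.
UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_parts(spec, device_size):
    """Return a list of (name, start, end) byte ranges for a partition list."""
    layout = []
    offset = 0
    for entry in spec.split(","):
        m = re.fullmatch(r"(-|\d+)([kKmMgG]?)\((\w+)\)(ro)?", entry.strip())
        if m is None:
            raise ValueError(f"cannot parse partition entry: {entry!r}")
        size_text, unit, name, _ro = m.groups()
        if size_text == "-":
            # "-" means: take whatever is left on the device
            size = device_size - offset
        else:
            size = int(size_text) * UNITS[unit.lower()]
        layout.append((name, offset, offset + size))
        offset += size
    return layout


NAND_SIZE = 512 * 1024 ** 2  # 512 MiB, per the NAND probe line in the log

for name, start, end in parse_parts("3M(kernel),-(root)", NAND_SIZE):
    print(f'0x{start:012x}-0x{end:012x} : "{name}"')
```

For the new NAND layout this should give the same two ranges that the kernel reports in its "Creating 2 MTD partitions" lines, which suggests the partition string itself is being interpreted as intended.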

Now, when I boot, I get the following output:

----------------------------------------------------------------------
barebox 2010.04.0 (Oct 19 2011 - 17:46:52)

Board: Phytec phyCORE-i.MX27
cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xac (Micron NAND 512MiB 1,8V
8-bit)
Bad block table found at page 262080, version 0x01
Bad block table found at page 262016, version 0x01
imxfb@imxfb0: i.MX Framebuffer driver
cfi_protect: protect 0xc0040000 (size 131072)

Using environment in NOR Flash
Malloc space: 0xa7500000 -> 0xa7f00000 (size 10 MB)
Stack space : 0xa74f0000 -> 0xa7500000 (size 64 kB)
running /env/bin/init...

   Verifying Checksum ... OK
   Image Name:   Linux-2.6.33.3
   Created:      2011-11-02  22:06:56 UTC
   Data Size:    2114112 Bytes =  2 MB
   Load Address: a0008000
   Entry Point:  a0008000
OK
commandline: console=ttymxc0,115200 mt9v022.sensor_type=color
pcm038_otg_mode=device root=/dev/mtdblock4 ro rootfstype=jffs2
mtdparts="physmap-flash.0:256k(barebox)ro,128k(bareboxenv);mxc_nand:3M(kernel),-(root)"
arch_number: 1551

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
Linux version 2.6.33.3 (root@teckla) (gcc version 4.3.2
(OSELAS.Toolchain-1.99.3) ) #253 PREEMPT Wed Nov 2 17:06:53 CDT 2011
CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
CPU: VIVT data cache, VIVT instruction cache
Machine: ULCB-i.MX27
Memory policy: ECC disabled, Data cache writeback
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 15808
Kernel command line: console=ttymxc0,115200 mt9v022.sensor_type=color
pcm038_otg_mode=device -init=/sbin/init root=/dev/mtdblock4 ro
rootfstype=jffs2
mtdparts="physmap-flash.0:256k(barebox)ro,128k(bareboxenv);mxc_nand:3M(kernel),-(root)"
PID hash table entries: 256 (order: -2, 1024 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 32MB 32MB = 64MB total
Memory: 60576KB available (3916K code, 267K data, 112K init, 0K highmem)
Experimental preemptable hierarchical RCU implementation.
NR_IRQS:272
MXC GPIO hardware
MXC IRQ initialized
Console: colour dummy device 80x30
Calibrating delay loop... 199.06 BogoMIPS (lpj=995328)
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
regulator: core version 0.5
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Advanced Linux Sound Architecture Driver Version 1.0.21.
regulator: REG1_BKLT: 4500 <--> 5500 mV at 5000 mV 
regulator: REG2_CPU: 2640 <--> 3877 mV at 3300 mV 
regulator: REG3_CORE: 1160 <--> 1703 mV at 1450 mV 
regulator: REG4_DDR: 1440 <--> 2115 mV at 1800 mV 
regulator: REG5_PERS: 2640 <--> 3877 mV at 3300 mV 
mc34704 1-0054: Loaded
Switching to clocksource mxc_timer1
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
i.MX27 CPU frequency change support initialized 133000 399000
JFFS2 version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
msgmni has been set to 118
alg: No test for stdrng (krng)
io scheduler noop registered (default)
imx-fb imx-fb.0: i.MX Framebuffer driver
Serial: IMX driver
imx-uart.0: ttymxc0 at MMIO 0x1000a000 (irq = 20) is a IMX
console [ttymxc0] enabled
imx-uart.1: ttymxc1 at MMIO 0x1000b000 (irq = 19) is a IMX
imx-uart.2: ttymxc2 at MMIO 0x1000c000 (irq = 18) is a IMX
imx-uart.4: ttymxc4 at MMIO 0x1001b000 (irq = 49) is a IMX
imx-uart.5: ttymxc5 at MMIO 0x1001c000 (irq = 48) is a IMX
brd: module loaded
loop: module loaded
at24 1-0052: 4096 byte at24 EEPROM (writable)
physmap platform flash device: 02000000 at c0000000
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
 Intel/Sharp Extended Query Table at 0x010A
 Intel/Sharp Extended Query Table at 0x010A
 Intel/Sharp Extended Query Table at 0x010A
 Intel/Sharp Extended Query Table at 0x010A
 Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
in p cmdlinepart name cmdlinepart
p cmdlinepart name cmdlinepart
2 cmdlinepart partitions found on MTD device physmap-flash.0
Creating 2 MTD partitions on "physmap-flash.0":
0x000000000000-0x000000040000 : "barebox"
0x000000040000-0x000000060000 : "bareboxenv"
Generic platform RAM MTD, (c) 2004 Simtec Electronics
mtd-ram mtd-ram.0: registered mtd device
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xac (Micron NAND 512MiB 1,8V
8-bit)
p X&#151;AÀ&#152;&#153;&#145;Ñ name RedBoot
RedBoot partition parsing not available
in p cmdlinepart name cmdlinepart
p cmdlinepart name cmdlinepart
2 cmdlinepart partitions found on MTD device mxc_nand
Creating 2 MTD partitions on "mxc_nand":
0x000000000000-0x000000300000 : "kernel"
0x000000300000-0x000020000000 : "root"
spi_imx spi_imx.2: probed
FEC Ethernet Driver
fec: PHY @ 0x0, ID 0x00221513 -- unknown PHY!
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
mxc-ehci mxc-ehci.1: initializing i.MX USB Controller
mxc-ehci mxc-ehci.1: Freescale On-Chip EHCI Host Controller
mxc-ehci mxc-ehci.1: new USB bus registered, assigned bus number 1
mxc-ehci mxc-ehci.1: irq 54, io mem 0x10024200
mxc-ehci mxc-ehci.1: USB 2.0 started, EHCI 1.00
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: Freescale On-Chip EHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.33.3 ehci_hcd
usb usb1: SerialNumber: mxc-ehci.1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
Freescale High-Speed USB SOC Device Controller driver (Apr 20, 2007)
mice: PS/2 mouse device common for all mice
rtc-ds1307 1-0068: rtc core: registered ds1339 as rtc0
i2c /dev entries driver
i.MX SDHC driver
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
PMIC Character device: successfully loaded
ALSA device list:
  No soundcards found.
oprofile: using timer interrupt.
TCP cubic registered
NET: Registered protocol family 17
regulator_init_complete: incomplete constraints, leaving REG5_PERS on
regulator_init_complete: incomplete constraints, leaving REG1_BKLT on
rtc-ds1307 1-0068: setting system clock to 2012-10-24 19:13:06 UTC
(1351105986)
WM9711/WM9712 SoC Audio Codec 0.4
asoc: AC97 HiFi <-> imx-ssi.0 mapping ok
VFS: Mounted root (jffs2 filesystem) readonly on device 31:4.
Freeing init memory: 112K
Warning: unable to open an initial console.
Kernel panic - not syncing: No init found.  Try passing init= option to
kernel.
Backtrace: 
[<c002817c>] (dump_backtrace+0x0/0x108) from [<c03258e0>]
(dump_stack+0x18/0x1c)
 r6:c001fc80 r5:c0424a94 r4:c03fb984
[<c03258c8>] (dump_stack+0x0/0x1c) from [<c0325934>] (panic+0x50/0x150)
[<c03258e4>] (panic+0x0/0x150) from [<c0024640>] (init_post+0x100/0x198)
 r3:00000008 r2:00000005 r1:d18002a0 r0:c03a0898
[<c0024540>] (init_post+0x0/0x198) from [<c00084d4>]
(kernel_init+0x104/0x13c)
 r5:c001ffd0 r4:c001ffd0
[<c00083d0>] (kernel_init+0x0/0x13c) from [<c003c50c>] (do_exit+0x0/0x724)
 r6:00000000 r5:00000000 r4:00000000
----------------------------------------------------------------------

If I change the rootpart_nand parameter to something like mtdblock1
(following t...

PatrickMeyer
...and apparently I exceeded the max character count per post or something?

Anyway, what should end the post is:

If I change the rootpart_nand parameter to something like mtdblock1
(following the logic that (kernel) would be 0 and (root) would be 1), I
receive the following boot output (trimmed where they're the same for your
reading convenience):

----------------------------------------------------------------------
barebox 2010.04.0 (Oct 19 2011 - 17:46:52)
[...]
commandline: console=ttymxc0,115200 mt9v022.sensor_type=color
pcm038_otg_mode=device root=/dev/mtdblock1 ro rootfstype=jffs2
mtdparts="physmap-flash.0:256k(barebox)ro,128k(bareboxenv);mxc_nand:3M(kernel),-(root)"
arch_number: 1551

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
Linux version 2.6.33.3 (root@teckla) (gcc version 4.3.2
(OSELAS.Toolchain-1.99.3) ) #253 PREEMPT Wed Nov 2 17:06:53 CDT 2011
CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
CPU: VIVT data cache, VIVT instruction cache
Machine: ULCB-i.MX27
[...]
WM9711/WM9712 SoC Audio Codec 0.4
asoc: AC97 HiFi <-> imx-ssi.0 mapping ok
jffs2: Too few erase blocks (1)
List of all partitions:
1f00             256 mtdblock0 (driver?)
1f01             128 mtdblock1 (driver?)
1f02            2048 mtdblock2 (driver?)
1f03            3072 mtdblock3 (driver?)
1f04          521216 mtdblock4 (driver?)
No filesystem could mount root, tried:  jffs2
Kernel panic - not syncing: VFS: Unable to mount root fs on
unknown-block(31,1)
[...]
----------------------------------------------------------------------

The same holds for mtdblock82 or any other arbitrary number. Passing
init=5 or init=/sbin/init to the kernel doesn't change anything. Given
that I know /dev/nand0.root is actually 521216k in size, I'm confused as to
why root=/dev/mtdblock4 doesn't work. Any help?

Thanks in advance. =)
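To make the numbering question concrete, here is a small Python sketch that reads the "List of all partitions" block from the panic output above and maps each mtdblock name to its major:minor number and size. The helper name parse_partition_list is hypothetical; the input is copied from the log. It is only a way of reading the log, not a diagnosis.

```python
# Copied from the "List of all partitions" block in the second boot log.
PARTITION_LIST = """\
1f00             256 mtdblock0 (driver?)
1f01             128 mtdblock1 (driver?)
1f02            2048 mtdblock2 (driver?)
1f03            3072 mtdblock3 (driver?)
1f04          521216 mtdblock4 (driver?)
"""


def parse_partition_list(text):
    """Map block device name -> (major, minor, size in KiB)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        devno, size_kib, name = fields[0], int(fields[1]), fields[2]
        # The first column is the device number as hex: major, then minor.
        major, minor = int(devno[:-2], 16), int(devno[-2:], 16)
        table[name] = (major, minor, size_kib)
    return table


table = parse_partition_list(PARTITION_LIST)

# 512 MiB of NAND minus the 3 MiB kernel partition, in KiB.
expected_root_kib = (512 - 3) * 1024

for name, (major, minor, size_kib) in table.items():
    marker = "  <- matches 512M - 3M" if size_kib == expected_root_kib else ""
    print(f"{name}: {major}:{minor} {size_kib} KiB{marker}")
```

Read this way, the block devices appear to be numbered across all MTD devices in registration order (the two NOR partitions, then what is presumably the 2048 KiB mtd-ram device, then the two NAND partitions). That would make mtdblock1 the 128 KiB bareboxenv partition, which seems consistent with the "Too few erase blocks (1)" message, and mtdblock4 the only entry whose size matches the root partition.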

Juergen Beisert
Seems you are using a modified Barebox. Did you free its C code from any
partition assumptions on the NAND device? How did you fill the root
partition with its content? Did you try to boot via NFSroot and have a look
at how the system sees the three kinds of MTD devices (NOR, RAM and NAND)?

PatrickMeyer
What area of barebox should I examine to see whether it's presuming a
partition layout?

I used the _update script that exists in my barebox install, which utilizes
tftp to write to flash.

I am unsure how to do an NFSroot boot with my setup (ethernet for tftp and
serial for terminal I/O). Any pointers?

Juergen Beisert
Due to a chicken-and-egg problem, the platform code must register at least
the persistent environment. And since the environment partition is behind
the barebox partition, the platform code must register two fixed partitions
on the boot medium. These (offset and size) must correspond with the
settings in the mtdparts variable. It seems you changed the boot medium from
NAND to NOR. In this case you should remove all partition registration from
the C code of the platform code
(arch/arm/boards/friendlyarm-mini2440/mini2440.c, function
mini2440_devices_init()).

PatrickMeyer
Ok, so I found the function (in board/pcm038/pcm038.c) and see the
initialization code, but don't see anything troublesome there. Near the
bottom of the function, it creates self0 and env0 and then sets the boot
params to 0xa0000100 cast to a void pointer. I'm not sure what's there,
but I also don't know how to find out. Then it sets the architecture and
returns.

The thing is though, this device has always booted from NOR, and succeeds
with the original partition layout. In the event that I didn't understand
your reason for pointing me to that devices_init(), though, I am still
listening. =)

Juergen Beisert
It was just an idea of what could be wrong with your system. I would now
recommend booting this machine via NFSroot. Then you can take a look at the
running machine to see how the MTD devices are really numbered, and which
one is the correct one, from the kernel's point of view, to act as its root
filesystem.
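
Once the machine is up, the kernel's own view of the MTD devices is normally available in /proc/mtd. The following is a rough Python sketch for turning captured `cat /proc/mtd` output into a root= argument; it could be run on a host PC against text copied from the serial console. The sample text below is hypothetical: the sizes follow the boot log in this thread, but the erase sizes and the name of the RAM device are placeholders, and the helper names are invented for illustration.

```python
import re

# Hypothetical /proc/mtd contents. Sizes follow the boot log in this thread;
# the erase sizes and the "mtd-ram" name are placeholders, not measured data.
SAMPLE_PROC_MTD = '''\
dev:    size   erasesize  name
mtd0: 00040000 00020000 "barebox"
mtd1: 00020000 00020000 "bareboxenv"
mtd2: 00200000 00001000 "mtd-ram"
mtd3: 00300000 00020000 "kernel"
mtd4: 1fd00000 00020000 "root"
'''

LINE = re.compile(r'^mtd(\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text):
    """Return a list of (index, size, erasesize, name) tuples."""
    devices = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:  # the header line does not match and is skipped
            index, size, erasesize, name = m.groups()
            devices.append((int(index), int(size, 16), int(erasesize, 16), name))
    return devices


def root_argument(devices, name="root"):
    """Return the /dev/mtdblockN path for the partition with the given name."""
    for index, _size, _erasesize, dev_name in devices:
        if dev_name == name:
            return f"/dev/mtdblock{index}"
    return None


devices = parse_proc_mtd(SAMPLE_PROC_MTD)
print(root_argument(devices))
```

With real output substituted for the sample, the index next to the "root" entry is the number to use in root=/dev/mtdblockN, and the size column shows whether that entry really covers the NAND space you expect.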