2010-08-13

Formatting a PSP Memory Stick for use with a Pandora battery in Linux

Always a pain to do, and nobody seems to provide the files I want, so I'll just provide my own files (using ipl_ms.bin => no frills, just normal boot) and a short script.

Once extracted, just run something like:
root@sheeva:~/pandora# ./format_ms.sh /dev/sda
+ dd if=/dev/zero of=/dev/sda bs=512 count=32
32+0 records in
32+0 records out
16384 bytes (16 kB) copied, 0.00699201 s, 2.3 MB/s
+ parted /dev/sda
GNU Parted 1.8.8
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel msdos
(parted) mkpartfs primary fat16 32s -1s
(parted) set 1 boot on
(parted) set 1 lba off
(parted) u s
(parted) p
Model: Generic STORAGE DEVICE (scsi)
Disk /dev/sda: 3995648s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number Start End Size Type File system Flags
1 32s 3995647s 3995616s primary fat16 boot

(parted) q
Information: You may need to update /etc/fstab.

+ dd if=ipl_ms.bin of=/dev/sda bs=512 seek=16
4+1 records in
4+1 records out
2288 bytes (2.3 kB) copied, 0.00179176 s, 1.3 MB/s
+ mount /dev/sda1 tmp_mnt
+ tar -C tmp_mnt -xvf ms.tar
ISO/
ISO/VIDEO/
MEMSTICK.IND
MP_ROOT/
MP_ROOT/100MNV01/
MP_ROOT/101ANV01/
MSTK_PRO.IND
MUSIC/
PICTURE/
PSP/
PSP/GAME150/
PSP/GAME5XX/
PSP/GAME/
PSP/SAVEDATA/
PSP/COMMON/
PSP/SYSTEM/
PSP/THEME/
PSP/RSSCH/
PSP/RSSCH/IMPORT/
VIDEO/
seplugins/
+ umount tmp_mnt
+ sync
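
As a sanity check on the layout used in that trace, here's a minimal sketch of the arithmetic: the IPL goes in at sector 16 and has to end before the partition that parted starts at sector 32. The function names are mine and are not part of format_ms.sh; the numbers are the ones from the trace.

```shell
#!/bin/sh
# Hypothetical helpers (not part of format_ms.sh): check that an IPL written
# with "dd bs=512 seek=<sector>" stays inside the gap before the partition.

SECTOR_SIZE=512

# Print "<first>-<last>" sectors occupied by an image of $1 bytes at sector $2.
ipl_sector_range() {
    ipl_bytes=$1
    ipl_seek=$2
    # Round up: a partial last sector still occupies a whole sector.
    ipl_sectors=$(( (ipl_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE ))
    echo "${ipl_seek}-$(( ipl_seek + ipl_sectors - 1 ))"
}

# Succeed if the image ends before the partition start sector $3.
ipl_fits() {
    ipl_range=$(ipl_sector_range "$1" "$2")
    ipl_last=${ipl_range#*-}
    [ "$ipl_last" -lt "$3" ]
}

# Values from the trace: 2288-byte ipl_ms.bin, seek=16, partition at 32s.
ipl_sector_range 2288 16
if ipl_fits 2288 16 32; then
    echo "IPL fits before the partition"
fi
```

With the values above, the IPL occupies sectors 16 to 20 (the "4+1 records" dd reports), well clear of sector 32.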

2010-08-09

That darn iBFT iSCSI Windows installation error

If you ended up here, it's probably because you too tried to install Windows 7 or Vista on an iSCSI bootable disk using gPXE and, even though Windows setup could see your disk alright, you got one of the following errors:

"Windows cannot be installed to this disk. Setup does not support configuration of or installation to disks connected through a USB or IEEE 1394 port" (Vista)

"Windows cannot be installed to Disk <#> Partition <#>. (Show details)" -> "Windows cannot be installed to this disk. iSCSI deployment is disabled since no NICs referenced in the iBFT can be resolved to actual NT-visible devices. Windows cannot be installed to this disk. This computer's hardware may not support booting to this disk. Even if you're probably smart enough to know what you're doing, and we could definitely let you install to this disk to sort booting later, we're going to be asses about it and prevent you from overriding the idiotic setup decisions we made (by the way, did we mention the Windows recovery partition yet?). Why? Because we're Microsoft and screw you, that's why." (Windows 7)

OK. Let's forget about Microsoft's stupid decisions for a while, and attempt to work within them, to figure out how we can address the issue.

First of all, if you see the message above, then I can guarantee that, no matter how properly you think you set up your iSCSI PXE boot, you screwed something up and your iSCSI boot sequence is wrong.
And yes, getting an iSCSI boot error on a blank disk is to be expected, but NO, not all the errors you see during gPXE iSCSI boot can be safely ignored (if you manage to see them at all, but we'll come to that). There are good errors and there are bad errors.

The first thing I'll point out, if you're like me and thought you could get your dhcp/tftp server to:
  1. Supply the iSCSI disk boot parameter (that dhcp-option=net:gpxe,17,"iscsi:192.150.23.3::3260:2:sheeva:disk1" line or similar, along with the keep-san option)
  2. Attempt to boot from it and
  3. If that fails, fall back to executing WinPE/pxelinux to launch a WinPE installation image
is that such a scheme just won't work. Unless you're fiddling with the gPXE scripting options (and even then), you can only have either
  • Boot from iSCSI, then fail and hand things back over to the BIOS, or,
  • If a boot image is specified, ignore the iSCSI options provided by the dhcp server altogether and just boot from that image.
You can't just have the dhcp/tftp server alone tell gPXE: "try to boot to iSCSI and if that fails, boot from something else while keeping the iSCSI boot options", to boot WinPE for installation onto a blank iSCSI disk for instance. No siree. If you tried that, well, that was your first mistake. Not to say that this can't be achieved at all (we'll see how to do just that from the commandline below, and, in a next post, I'll try to show you how to do it automatically as well), but that can only be achieved outside of the dhcp/tftp options.

In short, if you're using dnsmasq and with something like:
dhcp-match=gpxe,175   # tags the request with net:gpxe if gPXE was supplied
dhcp-option=175,8:1:1 # turn on the keep-san option (allows installation)
dhcp-option=net:gpxe,17,"iscsi:192.150.23.3::3260:2:sheeva:disk1"
dhcp-boot=net:#gpxe,pxelinux.0 # if NOT (#) gPXE, use pxelinux.0
dhcp-boot=net:gpxe,Boot/startrom.n12 # if gPXE, use WinPE
Then, when WinPE boots, it will not have any of the options that you think gPXE should have fed it with regards to the iSCSI boot disk. In particular, the "dhcp-option=net:gpxe,17," option will be completely ignored. Yeah, that makes as little sense to me as to anybody else, but that's how gPXE works for now.

And that's also the reason why, in most of the guides you see, they'll tell you to first try to boot from an unbootable iSCSI disk with gPXE, let it fail and then use BIOS fallback to boot from an installation CD or DVD. Again, simply chaining WinPE in there from PXE does not work without additional effort that none of these guides provide.

Also, and this is the most important part if you want Windows install to accept your iSCSI disk as bootable, as long as you do not see the following lines during boot:
Booting from root path "<your iSCSI path>"

Registered as BIOS drive 0x80
Booting from BIOS drive 0x80
Boot failed
Preserving connection to SAN disk
Then it's game over, plain and simple.

Granted, those lines may be hard to spot during boot, when gPXE hands things back over to the BIOS on failure (which it should do, if you followed what I said above), as those darn BIOS makers forgot that the Pause key we have on our keyboards could be put to some good use. But if you try a few times and you don't see any mention of a BIOS drive 0x80, Windows will simply not see your iSCSI drive as bootable, simple as that.

For your reference, here's a screenshot from a VMWare diskless machine that illustrates what you should see when gPXE executes:



As long as you see the lines I highlighted above, after the iSCSI boot attempt, whatever error is thrown out will come from the iSCSI disk itself, rather than your boot process, so you can ignore it. But if you don't see the "Registered as BIOS drive" line from gPXE, you should pay very close attention to the iSCSI error you get.

So, of course, now your question is: "I'm not seeing these lines (or they're too fast for me to see). How then can I validate that my iSCSI target is good, and that it can be used for installation with gPXE?"

Well, duh, through the gPXE commandline of course, which you can enter with Ctrl-B at boot time. Gotta wish proprietary PXE was as easy to troubleshoot for power users as gPXE is. But you're in a hurry and don't want to learn about the whole gPXE/DHCP/TFTP internals, so I'll cut to the chase. The sequence of commands you are after:
dhcp net0
set keep-san 1
sanboot iscsi:<iscsi server ip>::<iscsi port>:<iscsi lun>:<iscsi target id>
# and if the above line works and you want to boot to WinPE for instance, you could
chain tftp://<server ip>/Boot/startrom.n12
  • dhcp net0 initializes DHCP and allows you to communicate with the server (for tftp, etc)
  • then the keep-san option is to ensure that Windows can see the iSCSI disk as bootable, which of course is the feature you're after
  • finally the sanboot line is the one that will tell you if something is wrong with your iSCSI access.
But first, let's see an example of what happens when everything works as expected (for an uninitialized disk):



Here, we have the Registered as BIOS drive and the Preserving connection lines so we're good. You might also want to note that I am specifically specifying that I'm using port 3260 (default for iSCSI) and that my device is on LUN2 (very non default).

Now, let's see some common errors:



0x2c0d603b is usually an indication that your iSCSI path is wrong. In the case above, I used the non-existing disk0 instead of disk1 for the target part.



Ah, 0x1d704039 (and now, aren't you glad you found this page)...
Yes, this is an error you should not get, even with a non bootable iSCSI disk. And yes, I agree that an I/O error is precisely what you'd expect from a non-bootable disk, but actually, that I/O error is unrelated to the disk being bootable or not. On the other hand, it has very much to do with trying to use an iSCSI device that cannot be used as a disk. If you are using Linux tgtd/tgtadm, that is exactly what you're going to get if you leave the sequence of four colons as is (::::), because that means that LUN 0 will be used, and LUN 0 is reserved by tgtd for the virtual controller.
In short, if you're using tgtd, your actual device LUNs start at 1, and if you're keeping the options part as '::::', then the default of LUN 0 will be used, so you're not actually accessing your disk!
This is why I'm using iscsi:192.150.23.3::3260:2:sheeva:disk1 in the line that works, because I'm trying to access the second disk I created on that specific target. Even if that was the very first disk I created with tgtadm, I would have to add 1 for the LUN in the line above, because the default of 0 is not a disk.
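
To make the field order explicit, here's a minimal sketch of a helper that assembles the root path from its parts and complains when the LUN would end up as 0. The function name and variables are made up for illustration; the layout is the iscsi:<server>:<protocol>:<port>:<lun>:<target> one used throughout this post, with the protocol field left empty.

```shell
#!/bin/sh
# Hypothetical helper: assemble the root path passed to sanboot or to
# DHCP option 17, in the form iscsi:<server>:<protocol>:<port>:<lun>:<target>.
# The protocol field is left empty, as in the examples of this post.

iscsi_root_path() {
    irp_server=$1
    irp_port=$2
    irp_lun=$3
    irp_target=$4
    # An empty or zero LUN ends up on LUN 0, which tgtd reserves for its
    # virtual controller, so warn on stderr rather than fail silently.
    if [ -z "$irp_lun" ] || [ "$irp_lun" = "0" ]; then
        echo "warning: LUN 0 is the controller on tgtd, not a disk" >&2
    fi
    echo "iscsi:${irp_server}::${irp_port}:${irp_lun}:${irp_target}"
}

# Rebuilds the path used in this post.
iscsi_root_path 192.150.23.3 3260 2 sheeva:disk1
```

Calling it with an empty port and LUN gives you the '::::' form, along with the warning.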

Then, other errors you might get are 0x0b8080a0 (Operation cancelled) or 0x2e852001 (Exec Format Error), but these should occur after you get the "Registered as BIOS drive" line, so you should be able to safely ignore them. For other errors, google is your friend.
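
If you want the codes above at hand while testing, they fit in a small lookup. This is only a sketch that restates what this post says about each code; the function name is made up and the list is in no way a complete gPXE error table.

```shell
#!/bin/sh
# Hypothetical lookup: maps the gPXE error codes discussed in this post to
# the explanation given here. Anything else falls through to the default.

gpxe_error_hint() {
    case "$1" in
        0x2c0d603b) echo "iSCSI path is probably wrong (check the target name)" ;;
        0x1d704039) echo "I/O error: the LUN is not a disk (LUN 0 on tgtd?)" ;;
        0x0b8080a0) echo "Operation cancelled: ignorable after 'Registered as BIOS drive'" ;;
        0x2e852001) echo "Exec format error: ignorable after 'Registered as BIOS drive'" ;;
        *)          echo "not covered here: search for the code" ;;
    esac
}

gpxe_error_hint 0x1d704039
```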

So, to summarize, if Windows doesn't like your iSCSI boot device, it's probably because, despite what you think, gPXE didn't find anything it could use as a bootable disk, and to find out why, you should try to boot from it using the gPXE low level commands.

In a next instalment, we'll see how we can create a nice iSCSI aware WinPE image, that we can launch from PXE, for all of our installation needs, and how we can solve the problem of automating WinPE fallback from a non bootable iSCSI disk, as well as how we can use pxelinux to boot from multiple iSCSI disks.

2010-06-25

chrooted ssh & sftp on Slackware

Since the SheevaPlug makes a nice server for one to share files online securely, while allowing you to see exactly who is accessing them, today's exercise is to set up sshd so that selected people can SFTP into it, while only seeing what you want them to see.

Now, you'll find plenty of articles on how to do that, some of them very well made, but what they won't tell you is how to sort things out when stuff's not exactly working as it should be.

First of all, this is what you want to have in your sshd_config:
# Use a non obvious port for outside connections.
# You could also do port translation on your gateway or something
# but, just so you see how it's done:
Port 22
Port 1234
Default options should be super limiting (no X11/TCP forwarding, no password auth, no root logon, etc.) and then you use the very convenient Match tag to set up the allowed remote users, as well as the less restrictive options for your own network. Thus:
# only "remote_user" can logon from the outside network, in a chrooted env
Match User remote_user
PasswordAuthentication yes
ChrootDirectory /home/chroot
# connections from inside are OK
Match Address 192.168.1.0/24
PasswordAuthentication yes
PermitRootLogin yes
AllowTCPForwarding yes
X11Forwarding yes
Then I'll assume that you pretty much followed the link I gave above to set up your /home/chroot directory and create the "remote_user" guy.

First bad surprise:
"cannot run command `/bin/bash': No such file or directory"
WTF? But I did copy bash and the script took care of my libraries.
Unfortunately, nope, the script did not take care of all the libs, and instead of reporting "library not found", the error message does not help.

Now, if you go:
# ldd /bin/bash
libtermcap.so.2 => /lib/libtermcap.so.2 (0x4004b000)
libdl.so.2 => /lib/libdl.so.2 (0x40056000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x40061000)
libc.so.6 => /lib/libc.so.6 (0x40075000)
/lib/ld-linux.so.3 (0x40000000)
and check your libraries again, you'll probably find that ld-linux.so.3 was not copied. If you do just that, you should find that you can logon to the chrooted environment at last. Yay!
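
The loader line is easy to miss because it has no "=>" in it. Here's a minimal sketch of a copy helper that picks it up anyway. This is not the script mentioned above, and both function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not the script referenced above.

# Read "ldd" output on stdin and print every absolute path it mentions.
# This catches both "name => /path (addr)" lines and the bare loader line
# ("/lib/ld-linux.so.3 (addr)"), which has no "=>".
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# Copy binary $1 and its libraries under chroot directory $2,
# preserving the directory layout.
copy_with_libs() {
    cwl_bin=$1
    cwl_root=$2
    { echo "$cwl_bin"; ldd "$cwl_bin" | ldd_paths; } | while read -r cwl_path; do
        mkdir -p "$cwl_root$(dirname "$cwl_path")"
        cp -L "$cwl_path" "$cwl_root$cwl_path"
    done
}

# Example (as root): copy_with_libs /bin/bash /home/chroot
```

Lines without an absolute path (a "not found" entry for instance) are skipped by the parser, so it's still worth eyeballing the ldd output once.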

But then you want to add sftp. To do that, you *must* have the sftp-server and the lib dependencies in your chrooted environment as well.

For Slackware, that means you need to create a /home/chroot/usr/libexec/sftp-server (if you don't know which sftp-server to pick, check the line "Subsystem sftp" in your sshd_config) and once again, copy all the library files (or use the script).
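
Finding out which sftp-server your sshd actually uses can be sketched as a one-liner around awk. The function name is made up, and the path to your sshd_config is an assumption you should adjust for your system.

```shell
#!/bin/sh
# Hypothetical helper: print the sftp-server path from the "Subsystem sftp"
# line of the config file given as $1. Keywords in sshd_config are
# case-insensitive, hence tolower(); commented-out lines do not match.

sftp_server_path() {
    awk 'tolower($1) == "subsystem" && $2 == "sftp" { print $3; exit }' "$1"
}

# Example: sftp_server_path /etc/ssh/sshd_config
```

If that prints internal-sftp rather than a path, sshd serves sftp in-process and there should be no binary to copy.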

But then, disaster strikes: whenever you try to logon with sftp, you get your connection closed. If you look at the sftp debug log, you'll see something like:
debug1: Sending subsystem: sftp
debug2: channel 0: request subsystem confirm 1
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel 0: rcvd adjust 131072
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: channel 0: close_write
debug2: channel 0: output drain -> closed
debug2: channel 0: rcvd close
debug2: channel 0: close_read
debug2: channel 0: input open -> closed
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
#0 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cfd -1)

debug3: channel 0: close_fds r -1 w -1 e 6 c -1
debug1: fd 0 clearing O_NONBLOCK
debug3: fd 1 is not O_NONBLOCK
debug1: Transferred: stdin 0, stdout 0, stderr 0 bytes in 0.3 seconds
debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.0
debug1: Exit status 1
Connection closed
Well, to cut a long story short, if you're seeing this, the problem is likely that you don't have rw permission on /dev/null (and possibly /dev/zero). Just make sure you issue a chmod 777 /home/chroot/dev/null and that should be the end of it.
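
A quick way to check this before digging through debug logs is sketched below. The function name is made up, and the mknod lines in the comments assume the usual Linux device numbers.

```shell
#!/bin/sh
# Hypothetical check: succeed if <chroot>/dev/null is a character device that
# the current user can read and write. $1 is the chroot directory; an empty
# string checks the real /dev/null.

chroot_null_ok() {
    cno_path="$1/dev/null"
    [ -c "$cno_path" ] && [ -r "$cno_path" ] && [ -w "$cno_path" ]
}

# To (re)create the nodes as root, assuming the usual Linux device numbers
# (major 1, minor 3 for null; major 1, minor 5 for zero):
#   mknod -m 666 /home/chroot/dev/null c 1 3
#   mknod -m 666 /home/chroot/dev/zero c 1 5

if chroot_null_ok ""; then
    echo "/dev/null looks usable"
fi
```

Run it as the remote user rather than root, since root passes the read/write tests regardless of the mode bits. Mode 666 should be enough for a device node; the execute bits that 777 adds serve no purpose there.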

2010-02-24

Resync'ing a RAID 1 array

Can come in handy when attempting to correct unreadable sectors on a RAID disk. E.g. if your RAID1 array is built around /dev/sda3 and /dev/sdb3 and you had unreadable sectors reported by SMART on /dev/sda (Current_Pending_Sector), but got no error during the SMART extended self test, you might want to resync the /dev/sda partition from /dev/sdb as follows:

mdadm --fail /dev/md2 /dev/sda3
mdadm --remove /dev/md2 /dev/sda3
mdadm --add /dev/md2 /dev/sda3
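
Since those are not commands you want to mistype on a live array, here's a minimal sketch of a wrapper with a dry-run switch. The function name and the DRY_RUN variable are made up for illustration; the mdadm invocations are the three above.

```shell
#!/bin/sh
# Hypothetical wrapper around the three commands above. With DRY_RUN=1 the
# commands are only printed, so you can double check array and member first.

resync_member() {
    rsm_md=$1
    rsm_dev=$2
    for rsm_action in --fail --remove --add; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "mdadm $rsm_action $rsm_md $rsm_dev"
        else
            mdadm "$rsm_action" "$rsm_md" "$rsm_dev" || return 1
        fi
    done
}

DRY_RUN=1   # set to 0 to actually run mdadm (as root)
resync_member /dev/md2 /dev/sda3
# Rebuild progress can then be followed with: cat /proc/mdstat
```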