Now I'll try the Rocks 5.2 install. I decided to go outside the defaults by re-designing the disk partitions. I know what I want, I really do. But after about 1 hour of installation, it failed because something was out of order with the partitions I created. I don't need/want a big NFS partition, because there's an external device that will be attached for that, and it seems like a big waste to set aside almost the whole hard disk for NFS on the front end and the compute nodes.
Philip warned me not to do this, but I tried it anyway. He was correct. In the manual partitioning step, I'm not sure whether the NFS partition is supposed to mount on /export or /state/partition1. I tried the latter and got an error that there was not enough space for /var, but if I let it put that on /export, it says OK. Then the Rocks install process gets to the very end, asks for the Torque Roll, and disaster happens. I think either I named the NFS partition incorrectly or I did not make it big enough. So I'll start again with default partitions, but in case you want to see it, I've uploaded the error log.
https://pj.freefaculty.org/linux/cluster/rockyFailed-1.txt
Too bad, the next attempt fails as well. But differently 🙂
The default partitioning did not solve everything. After inserting the Roll disks, the Rocks installer begins to build the distribution and an error appears.
Unable to read package metadata. This may be due to a missing repodata directory. Please ensure that your tree has been correctly generated.
https://pj.freefaculty.org/linux/cluster/rockyFailed-2.txt
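The error message itself points at a missing repodata directory, so one thing that can be checked by hand is whether a given package tree actually has one. Here is a minimal sketch in Python, assuming the tree (for example a Roll disk mounted on another machine) is reachable as an ordinary directory. The function names are made up for illustration, and I don't know whether this is what the Rocks installer tests internally; it only looks for the `repodata/repomd.xml` index that yum-style tools expect.

```python
from pathlib import Path


def has_repodata(tree_root):
    """Return True if tree_root contains repodata/repomd.xml."""
    # repomd.xml is the index file that yum-style tools read first.
    return (Path(tree_root) / "repodata" / "repomd.xml").is_file()


def trees_missing_repodata(candidate_roots):
    """Return the candidate directories that lack a repodata index."""
    # candidate_roots is any list of directories you suspect (hypothetical paths).
    return [str(root) for root in candidate_roots if not has_repodata(root)]
```

Running `trees_missing_repodata` over the mount points of each disk would at least narrow down which one, if any, was built without metadata.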
Well, what could cause this? The only Roll I'm using that is not directly from the Rocks distro is Torque. I need the Torque Roll because this system is intended to go into a MOAB system of clusters. If Torque is the problem, I guess I'll find out by rebuilding without it.
The other possibility is that the Centos disk is corrupt. I just wrote it (and verified it in k3b) but the Rocks install does not prompt to do the disk integrity check. Come to think of it, I have seen this one before. Bad Centos disk -> failed install. But now I've found another PC to boot off the Centos disk and run the disk check. It is.
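A separate check, before burning anything, is to compare the downloaded image file against the checksum published next to it on the mirror. This is not the same as the media check run from the boot disk; it only tells you the download is intact. A rough sketch, assuming the image is a local file and that you know which hash algorithm the mirror used (the `sha1` default here is just a guess, and the function names are illustrative):

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Hash a file in chunks so a large ISO never has to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex, algorithm="sha1"):
    """Compare the computed digest with a published one, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

If `matches_published` returns False, there is no point in burning the image; if it returns True, a bad burn is still possible, which is what the on-disk check is for.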
I suppose there may be something wonky with my Rocks Kernel disk. So I made another one.
Perhaps the third time is the charm. WITHOUT the Torque Roll, but with Centos-5.3_x64 and the Rocks rolls "base", "ganglia", "hpc", "area51", and "web server", Rocks does install.