GFS over AoE (or, resilient, highly scalable SAN)
Note - Work in Progress - if, whilst I'm trying to get this to work, I come across some unsolvable stumbling block, I guess I'll need to go back to the drawing board.
Objective
To create a file storage system, which allows for near limitless expansion, accessible from one or more clients.
Summary
The system will use ATA over Ethernet (AoE) to allow the storage capabilities of one or more servers to be pooled together in a RAID configuration, so that they appear as a single device to clients. This device will then have GFS installed as its filesystem, allowing multiple clients concurrent access to the file system.
Additional storage nodes can later be added to the device as required, allowing for increased storage capacity or a more fault-tolerant RAID level.
Hardware I'm using for this
2 x RHEL4 servers, each with 1TB storage space. Each 1TB is made up of 2 x 500GB SATA disks. 300GB is allocated to the server's operating system, and the remaining 700GB is striped to create a single device.
6 x RHEL4 servers, which need a shared storage space.
What I'm hoping for
Initially, I'm aiming for a 700GB device, which can be mounted on each of the servers needing the shared storage, in a RAID1 (mirrored) configuration, for fault tolerance.
Longer term, add more storage nodes, so that the mountable device can be in a RAID5/10 configuration.
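As a rough sanity check on those capacity plans, here is a minimal sketch of the usual usable-capacity arithmetic for equally sized members. The function name raid_capacity is made up for illustration, and the RAID10 case assumes two-way mirrors with an even number of members.

```shell
# Usable capacity for equally sized members at a given RAID level.
# $1 = RAID level (1, 5 or 10), $2 = number of members, $3 = size of each member
# (hypothetical helper; sizes are in whatever unit you pass in, e.g. GB)
raid_capacity() {
    case "$1" in
        1)  echo "$3" ;;                    # every member holds a full copy
        5)  echo $(( ($2 - 1) * $3 )) ;;    # one member's worth goes to parity
        10) echo $(( ($2 / 2) * $3 )) ;;    # stripes over two-way mirrored pairs (assumed)
        *)  echo "unsupported RAID level: $1" >&2; return 1 ;;
    esac
}

# Example: the initial setup, two 700GB nodes mirrored
# raid_capacity 1 2 700
```

So the two-node mirror gives the 700GB I'm aiming for, and extra nodes only add capacity once the RAID level changes.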
First thing to do is install the AoE driver on the storage nodes, so that we can present a block device on this server over the network.
At first, 'coz I don't want to risk messing any data up, I'm going to try to get this working using the ramdisk; it behaves similarly to a normal HDD, so should be easy to switch over later.
Get a copy of 'vblade' from http://sourceforge.net/projects/aoetools/ - at time of writing, it is on release 14, so that's what I'll be using. There is also a kernel level version of vblade available, called kvblade (available at the same URL), but it's still in Alpha, and I'm not prepared to take that gamble right now.
Nice and easy to install:
[jad@goofy ~]$ tar zxfp vblade-14.tgz
[jad@goofy ~]$ cd vblade-14
[jad@goofy vblade-14]$ make
gcc -Wall -g -O2 -c aoe.c
gcc -Wall -g -O2 -c linux.c
gcc -Wall -g -O2 -c ata.c
gcc -o vblade aoe.o linux.o ata.o
[jad@goofy vblade-14]$ su
Password:
[root@goofy vblade-14]# make install
install vblade /usr/sbin/
install vbladed /usr/sbin/
install vblade.8 /usr/share/man/man8/
[root@goofy vblade-14]#
Really easy to get running as well:
From the man page:
SYNOPSIS
vblade [ -m mac[,mac...] ] shelf slot netif filename
Because Coraid wrote most (all?) of the vblade software, the terminology for vblade is tied into the marvellous EtherDrive product, which uses shelves to describe the different devices, and slots to describe the positions for disks within each one.
[root@goofy ~]# vblade 2 1 eth0 /dev/ramdisk &
[1] 24555
[root@goofy ~]# pid 24555: e2.1, 32768 sectors O_RDWR
With this command, we are exporting /dev/ramdisk, via interface eth0, as shelf 2, slot 1.
The output from vblade shows us that the PID of the vblade process is 24555, exported as shelf 2, slot 1, with 32768 sectors, exactly as we'd expect.
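To see why that sector count is what we'd expect, here is a small sketch of the arithmetic. It assumes the usual 512-byte sector size, and the function name sectors_to_bytes is just for illustration.

```shell
# Convert a sector count to bytes, assuming 512-byte sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Example: the ramdisk exported above
# sectors_to_bytes 32768
```

For 32768 sectors that works out to 16777216 bytes, i.e. a 16MB ramdisk.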
I then did exactly the same on the other storage node, except using a shelf number of 1 instead of 2.
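The shelf and slot numbers are what end up in the device name (vblade reported e2.1 above), so it can help to keep the mapping in one place. This is only a sketch: the function names are made up, and the /dev/etherd location is an assumption based on where the aoe driver's install step creates its device nodes.

```shell
# Name of an exported AoE device: shelf S, slot T -> eS.T
aoe_dev_name() {
    echo "e$1.$2"
}

# Path of the matching device node on a client,
# assuming the nodes live in /dev/etherd
aoe_dev_path() {
    echo "/dev/etherd/$(aoe_dev_name "$1" "$2")"
}

# Example: the second storage node (shelf 1, slot 1)
# aoe_dev_path 1 1
```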
That's it for vblade - if only all software could be this easy to get working!
Next, it's time to install the AoE kernel module onto the clients.
Download a copy of the AoE Linux Driver from http://www.coraid.com/support/linux/ - at time of writing, the file I used was aoe6-48.tar.gz
Nice and easy to install again (but note that you'll need a 2.6.2 kernel or newer):
[jad@grumpy ~]$ tar zxfp aoe6-48.tar.gz
[jad@grumpy ~]$ cd aoe6-48
[jad@grumpy aoe6-48]$ make
ensuring compatibility ... 1
patching file linux/drivers/block/aoe/disk_attr.h
2
MODPOST
CC /home/jad/aoe6-48/linux/drivers/block/aoe/aoe.mod.o
LD [M] /home/jad/aoe6-48/linux/drivers/block/aoe/aoe.ko
make[1]: Leaving directory `/usr/src/kernels/2.6.9-5.EL-smp-i686'
[output snipped for brevity]
[jad@grumpy aoe6-48]$ su
Password:
[root@grumpy aoe6-48]# make install
ensuring compatibility ... 1 2 3 4 5 6 7 8 9 10 11 12 ok
cd aoetools-16 && make
[output again snipped for brevity]
+ install -m 664 aoeping.8 /usr/share/man/man8/aoeping.8
make[1]: Leaving directory `/home/jad/aoe6-48/aoetools-16'
n_partitions=16 \
n_shelves=10 aoetools-16/aoe-mkdevs /dev/etherd
[root@grumpy aoe6-48]#
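Since the driver needs a 2.6.2 or newer kernel, a quick check before building can save some head scratching. The following is a rough sketch only: version_ge is a made-up helper that compares the leading number of each dotted field, so anything after a dash should be stripped from the kernel release first.

```shell
# Succeed if dotted version $1 is greater than or equal to dotted version $2.
# Only the leading number of each dot-separated field is compared.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.]"); nb = split(b, y, "[.]")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0                              # versions are equal
    }'
}

# Example: check the running kernel against the driver's minimum
# version_ge "$(uname -r | cut -d- -f1)" 2.6.2 || echo "kernel too old for aoe" >&2
```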
After building and installing the driver, load the aoe module:
[root@grumpy ~]# modprobe aoe
Now, since vblade is already running on the two storage nodes, I should be able to see them from this server as well:
[root@grumpy ~]# aoe-stat
e1.1 0.016GB eth1 up
e2.1 0.016GB eth1 up
Yes - both of the storage nodes are available, which is nice.
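If you want to script that check rather than eyeball it, something like the sketch below might do. It assumes aoe-stat output in the layout shown above, with the device name in the first column and the state in the last; aoe_is_up is a made-up helper name.

```shell
# Succeed if device $1 (e.g. e1.1) is listed as "up" in aoe-stat style
# output read from stdin. Assumes name in the first column, state in the last.
aoe_is_up() {
    awk -v dev="$1" '
        $1 == dev && $NF == "up" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example: confirm both storage nodes are visible
# for dev in e1.1 e2.1; do
#     aoe-stat | aoe_is_up "$dev" || echo "$dev is not up" >&2
# done
```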
Now, download the source RPMs for GFS and the cluster packages it depends on.
You'll need a more recent kernel than the one which comes with RHEL4 (2.6.9-5); at time of writing, 2.6.9-55 seems OK. If you don't know how to upgrade your kernel, you probably shouldn't be playing with this sort of stuff yet!
You should get the latest versions available for each of these from updates.redhat.com; if any of them isn't on there (at time of writing, I think the only one that wasn't was the Perl one), pop over to ftp.redhat.com and get it from there instead.
Don't just get them all from ftp.redhat.com, 'coz when you then remember about updates, you'll kick yourself for not checking there first. Trust me, I know this to be true...
- in /enterprise/4ES/en/os/SRPMS
  - device-mapper-1.02.17-3.el4.src.rpm
  - lvm2-2.02.21-5.el4.src.rpm
- in /enterprise/4ES/en/RHCS/SRPMS
  - ccs-1.0.10-0.src.rpm
  - cman-1.0.11-0.src.rpm
  - cman-kernel-2.6.9-50.2.src.rpm
  - dlm-1.0.3-1.src.rpm
  - dlm-kernel-2.6.9-46.16.src.rpm
  - fence-1.32.45-1.0.1.src.rpm
  - gulm-1.0.10-0.src.rpm
  - iddev-2.0.0-4.src.rpm
  - magma-1.0.7-1.src.rpm
  - magma-plugins-1.0.12-0.src.rpm
  - perl-Net-Telnet-3.03-3.src.rpm
  - rgmanager-1.9.68-1.src.rpm
  - system-config-cluster-1.0.45-1.0.src.rpm
- in /enterprise/4ES/en/RHGFS/SRPMS
  - GFS-6.1.14-0.src.rpm
  - GFS-kernel-2.6.9-72.2.src.rpm
  - lvm2-cluster-2.02.21-7.el4.src.rpm
Next, install all the source RPMs you just downloaded, and build them into binary RPMs, suitable for installation.
To convert a src package to an installable RPM:
[root@daisy jad]# rpm -i ccs-1.0.10-0.src.rpm
warning: ccs-1.0.10-0.src.rpm: V3 DSA signature: NOKEY, key ID db42a60e
[root@daisy jad]# cd /usr/src/redhat/SPECS
[root@daisy SPECS]# rpmbuild -bb --rmsource --clean ccs.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.77348
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
[snipped for brevity]
Binary RPM packages have now been built:
[jad@daisy SPECS]$ ls -l /usr/src/redhat/RPMS/i686/ccs-*
-rw-r--r-- 1 root root 71251 Jun 21 13:20 ccs-1.0.10-0.i686.rpm
-rw-r--r-- 1 root root 103306 Jun 21 13:20 ccs-debuginfo-1.0.10-0.i686.rpm
-rw-r--r-- 1 root root 6322 Jun 21 13:20 ccs-devel-1.0.10-0.i686.rpm
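With this many packages, it may be worth scripting the repetitive part. The sketch below only works out which spec file goes with a given source RPM. It assumes the spec is named after the package (true for ccs.spec and cman-kernel.spec here, but not guaranteed for every package), and srpm_spec_name is a made-up helper name.

```shell
# Derive the spec file name from a source RPM file name by stripping
# ".src.rpm", then the release, then the version.
# Assumes the spec file is named <package>.spec.
srpm_spec_name() {
    base=${1##*/}            # drop any leading directory
    base=${base%.src.rpm}    # ccs-1.0.10-0
    base=${base%-*}          # ccs-1.0.10   (release removed)
    base=${base%-*}          # ccs          (version removed)
    echo "$base.spec"
}

# Example: install each source RPM, then build it from the SPECS directory
# for srpm in *.src.rpm; do
#     rpm -i "$srpm"
#     ( cd /usr/src/redhat/SPECS && rpmbuild -bb --rmsource --clean "$(srpm_spec_name "$srpm")" )
# done
```

A loop like this will still stop at the first unmet build dependency, so expect to run it more than once.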
A couple of caveats
- If you get the following error whilst running rpmbuild, you'll probably want to add a '--target i686' parameter to the command (unless you're running on something other than the i686 target, obviously):
[root@daisy SPECS]# rpmbuild -bb --rmsource cman-kernel.spec
error: Architecture is not included: i386
[root@daisy SPECS]# rpmbuild -bb --rmsource --target i686 cman-kernel.spec
Building target platforms: i686
Building for target i686
- You're almost certain to hit loads of dependencies when trying to build the RPMs. Some of them will be for RPMS you've just built, some for RPMs that are still on your list to build, and others which you'll need to get from Redhat. Just deal with them as and when it happens, and curse about it as needed.
Note: When I built fence, it needed loads and loads of things, in an almost never-ending dependency tree. It sucked. This included adding all the X stuff, for some stupid reason. That really sucked.
- You need to install GFS-kernheaders before trying to build the GFS package, but when you come to build the GFS package, you might get errors like this:
/usr/include/linux/gfs_ondisk.h:626: error: syntax error before "__be64"
/usr/include/linux/gfs_ondisk.h:628: error: syntax error before "sc_dinodes"
If you do get these errors, you need to uninstall GFS-kernheaders, and modify the source code, before rebuilding the package and reinstalling it.
The tar.gz file should still be in /usr/src/redhat/SOURCES - move it somewhere else, untar it, and edit src/gfs/gfs_ondisk.h
At about line 623, replace
__be64 sc_total;
__be64 sc_free;
__be64 sc_dinodes;
with
uint64_t sc_total;
uint64_t sc_free;
uint64_t sc_dinodes;
Rebuild the tar.gz file, put it back in /usr/src/redhat/SOURCES, rebuild GFS-kernheaders, install GFS-kernheaders, and then try to build the GFS package again.
See this bug report for more information.
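If you'd rather not edit the header by hand, the replacement above can be sketched as a sed filter. This is only an illustration: it assumes the three declarations appear exactly as shown, uses sed's -E (extended regex) option, and swap_be64_fields is a made-up name.

```shell
# Rewrite the three __be64 declarations shown above to uint64_t,
# reading the header on stdin and writing the result to stdout.
# Any other __be64 lines are left untouched.
swap_be64_fields() {
    sed -E 's/__be64([[:space:]]+sc_(total|free|dinodes);)/uint64_t\1/'
}

# Example: run from the top of the untarred source tree
# swap_be64_fields < src/gfs/gfs_ondisk.h > gfs_ondisk.h.new
# mv gfs_ondisk.h.new src/gfs/gfs_ondisk.h
```

It's worth diffing the result against the original before rebuilding the tar.gz file, just to be sure only those three lines changed.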
- Posted by jad on 19/06/2007.




