
GFS over AoE (or, resilient, highly scalable SAN)

Note - Work in Progress - if, whilst I'm trying to get this to work, I come across some unsolvable stumbling block, I guess I'll need to go back to the drawing board. 

Objective

To create a file storage system, which allows for near limitless expansion, accessible from one or more clients.

Summary

The system will use ATA over Ethernet (AoE) to allow the storage capabilities of one or more servers to be pooled together in a RAID configuration, so that they appear as a single device to clients.  This device will then have GFS installed as its filesystem, allowing multiple clients concurrent access to the file system.

Additional storage nodes can later be added to the device as required, allowing for increases in storage capacity/more fault tolerant RAID level.

Hardware I'm using for this

2 x RHEL4 servers, each with 1TB storage space. Each 1TB is made up of 2 x 500GB SATA disks. 300GB is allocated to the server's operating system, and the remaining 700GB is striped to create a single device.

6 x RHEL4 servers, which need a shared storage space.

What I'm hoping for

Initially, I'm aiming for a 700GB device, which can be mounted on each of the servers needing the shared storage, in a RAID1 (mirrored) configuration, for fault tolerance.

Longer term, add more storage nodes, so that the mountable device can be in a RAID5/10 configuration.
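To put rough numbers on those two stages, here's a minimal sketch of the usable capacity at each RAID level. The function name usable_gb and the four-node examples are made up for illustration; the sums assume every storage node contributes the same 700GB and ignore any metadata overhead.

```shell
#!/bin/sh
# usable_gb LEVEL NODES SIZE_GB
# Prints the usable capacity in GB for NODES equal-sized members of SIZE_GB each.
# (Hypothetical helper, not part of any of the tools used here.)
usable_gb() {
    level=$1
    nodes=$2
    size=$3
    case "$level" in
        1)  echo "$size" ;;                      # every member holds a full copy
        5)  echo $(( (nodes - 1) * size )) ;;    # one member's worth goes to parity
        10) echo $(( nodes / 2 * size )) ;;      # striped pairs of mirrors
        *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

usable_gb 1 2 700     # the initial two-node mirror
usable_gb 5 4 700     # a hypothetical four-node RAID5
usable_gb 10 4 700    # a hypothetical four-node RAID10
```

So the two-node mirror gives the 700GB I'm after, and the later layouts trade capacity against redundancy in the usual way.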

 


First thing to do is install the AoE driver on the storage nodes, so that we can present a block device on this server over the network.

At first, 'coz I don't want to risk messing any data up, I'm going to try to get this working using the ramdisk; it behaves similarly to a normal HDD, so should be easy to switch over later.

Get a copy of 'vblade' from http://sourceforge.net/projects/aoetools/ - at time of writing, it is on release 14, so that's what I'll be using. There is also a kernel level version of vblade available, called kvblade (available at the same URL), but it's still in Alpha, and I'm not prepared to take that gamble right now.

Nice and easy to install:

[jad@goofy ~]$ tar zxfp vblade-14.tgz
[jad@goofy ~]$ cd vblade-14
[jad@goofy vblade-14]$ make
gcc -Wall -g -O2 -c aoe.c
gcc -Wall -g -O2 -c linux.c
gcc -Wall -g -O2 -c ata.c
gcc -o vblade aoe.o linux.o ata.o
[jad@goofy vblade-14]$ su
Password:
[root@goofy vblade-14]# make install
install vblade /usr/sbin/
install vbladed /usr/sbin/
install vblade.8 /usr/share/man/man8/         
[root@goofy vblade-14]#


Really easy to get running as well:

From the man page:

SYNOPSIS
       vblade [ -m mac[,mac...] ] shelf slot netif filename

Because Coraid wrote most (all?) of the vblade software, the terminology for vblade is tied into the marvellous EtherDrive product, which uses shelves and slots to describe different devices with different slots for disks.

[root@goofy ~]# vblade 2 1 eth0 /dev/ramdisk &
[1] 24555
[root@goofy ~]# pid 24555: e2.1, 32768 sectors O_RDWR

With this command, we are exporting /dev/ramdisk, via interface eth0, as shelf 2, slot 1.

The output from vblade shows us that the PID of the vblade process is 24555, and that the device is exported as e2.1 (shelf 2, slot 1) with 32768 sectors, opened read-write, exactly as we'd expect.
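As a quick sanity check on that "32768 sectors" figure, here's a small sketch that turns a sector count into a size. It assumes the usual 512-byte ATA sector; sectors_to_mib is just a name I've picked for the example.

```shell
#!/bin/sh
# sectors_to_mib SECTORS: size in MiB, assuming 512-byte sectors.
sectors_to_mib() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

sectors_to_mib 32768
```

That works out at 16MiB for the ramdisk, which is the sort of size you'd expect for a default ramdisk.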

I then did exactly the same on the other storage node, except using a shelf number of 1 instead of 2.
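To keep the shelf/slot naming straight across the two nodes, something like the sketch below might help. It only prints the commands rather than running them. The helper names are made up, and the /dev/etherd path is an assumption about where the client-side driver puts its device nodes by default.

```shell
#!/bin/sh
# aoe_name SHELF SLOT: the AoE target name, as vblade prints it (e.g. e2.1).
aoe_name() {
    echo "e$1.$2"
}

# aoe_dev_path SHELF SLOT: where the client-side aoe driver is expected to
# create the block device (assumption: the default /dev/etherd directory).
aoe_dev_path() {
    echo "/dev/etherd/$(aoe_name "$1" "$2")"
}

# Dry run: print the export command for each storage node rather than run it.
for shelf in 1 2; do
    echo "vblade $shelf 1 eth0 /dev/ramdisk   # exported as $(aoe_name "$shelf" 1), client sees $(aoe_dev_path "$shelf" 1)"
done
```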

That's it for vblade - if only all software could be this easy to get working!


Next, it's time to install the AoE kernel module onto the clients.

Download a copy of the AoE Linux Driver from http://www.coraid.com/support/linux/ - at time of writing, the file I used was aoe6-48.tar.gz

Nice and easy to install again (but note that you'll need a 2.6.2 kernel or newer):

[jad@grumpy ~]$ tar zxfp aoe6-48.tar.gz
[jad@grumpy ~]$ cd aoe6-48
[jad@grumpy aoe6-48]$ make
ensuring compatibility ... 1
patching file linux/drivers/block/aoe/disk_attr.h
2
  MODPOST
  CC      /home/jad/aoe6-48/linux/drivers/block/aoe/aoe.mod.o
  LD [M]  /home/jad/aoe6-48/linux/drivers/block/aoe/aoe.ko
make[1]: Leaving directory `/usr/src/kernels/2.6.9-5.EL-smp-i686'

[output snipped for brevity] 

[jad@grumpy aoe6-48]$ su
Password:
[root@grumpy aoe6-48]# make install
ensuring compatibility ... 1 2 3 4 5 6 7 8 9 10 11 12 ok
cd aoetools-16 && make 

[output again snipped for brevity] 

+ install -m 664 aoeping.8 /usr/share/man/man8/aoeping.8
make[1]: Leaving directory `/home/jad/aoe6-48/aoetools-16'
n_partitions=16 \
  n_shelves=10 aoetools-16/aoe-mkdevs /dev/etherd
[root@grumpy aoe6-48]#

 After making and installing the driver, you should load the aoe driver:

[root@grumpy ~]# modprobe aoe

Now, since vblade is already running on the two storage nodes, I should be able to see them from this server as well:

[root@grumpy ~]# aoe-stat
      e1.1         0.016GB   eth1 up           
      e2.1         0.016GB   eth1 up  

Yes - both of the storage nodes are available, which is nice. 
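If you'd rather script that check than eyeball it, here's a rough sketch. It assumes aoe-stat keeps the column layout shown above (device name first, state last); aoe_is_up is a made-up helper, and the sample text is simply the output from my run.

```shell
#!/bin/sh
# aoe_is_up DEVICE: reads aoe-stat output on stdin and succeeds only if
# DEVICE is listed with a state of "up" in the last column.
aoe_is_up() {
    awk -v dev="$1" '$1 == dev && $NF == "up" { found = 1 } END { exit !found }'
}

# Sample aoe-stat output, copied from the run above.
sample='      e1.1         0.016GB   eth1 up
      e2.1         0.016GB   eth1 up'

for dev in e1.1 e2.1; do
    if printf '%s\n' "$sample" | aoe_is_up "$dev"; then
        echo "$dev is up"
    else
        echo "$dev is missing or down"
    fi
done
```

On a real client you'd pipe aoe-stat itself into the function, e.g. aoe-stat | aoe_is_up e1.1.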

 

 

 


Now, download the source RPMS.

 

You'll need a more recent kernel than the one which comes with RHEL4 (2.6.9-5); at time of writing, 2.6.9-55 seems OK. If you don't know how to upgrade your kernel, you probably shouldn't be playing with this sort of stuff yet!
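A rough way to check a kernel release string against that 2.6.9-55 figure is sketched below. It assumes the string is a RHEL4-style 2.6.9-NN release and only compares the NN part; the function names and the two example strings are mine.

```shell
#!/bin/sh
# kernel_build RELEASE: prints the numeric build after the first "-",
# e.g. 55 for "2.6.9-55.EL". Assumes a RHEL4-style 2.6.9-NN release string.
kernel_build() {
    rest=${1#*-}                # drop everything up to the first "-"
    echo "${rest%%[!0-9]*}"     # keep only the leading digits
}

# kernel_new_enough RELEASE: succeeds if the build number is at least 55.
kernel_new_enough() {
    [ "$(kernel_build "$1")" -ge 55 ] 2>/dev/null
}

for rel in 2.6.9-5.EL 2.6.9-55.EL; do
    if kernel_new_enough "$rel"; then
        echo "$rel: OK"
    else
        echo "$rel: too old, upgrade first"
    fi
done
```

On a live box you'd feed it "$(uname -r)" instead of the example strings.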

You should get the latest versions available for each of these from updates.redhat.com; if any of them isn't on there (at time of writing, I think the only one that wasn't was the Perl one), pop over to ftp.redhat.com and get it from there instead.

Don't just get them all from ftp.redhat.com, 'coz when you then remember about updates, you'll kick yourself for not checking there first. Trust me, I know this to be true... 

  • in /enterprise/4ES/en/os/SRPMS
    1. device-mapper-1.02.17-3.el4.src.rpm
    2. lvm2-2.02.21-5.el4.src.rpm
  • in /enterprise/4ES/en/RHCS/SRPMS
    1. ccs-1.0.10-0.src.rpm
    2. cman-1.0.11-0.src.rpm
    3. cman-kernel-2.6.9-50.2.src.rpm
    4. dlm-1.0.3-1.src.rpm
    5. dlm-kernel-2.6.9-46.16.src.rpm
    6. fence-1.32.45-1.0.1.src.rpm
    7. gulm-1.0.10-0.src.rpm
    8. iddev-2.0.0-4.src.rpm
    9. magma-1.0.7-1.src.rpm
    10. magma-plugins-1.0.12-0.src.rpm
    11. perl-Net-Telnet-3.03-3.src.rpm
    12. rgmanager-1.9.68-1.src.rpm
    13. system-config-cluster-1.0.45-1.0.src.rpm
  • in /enterprise/4ES/en/RHGFS/SRPMS
    1. GFS-6.1.14-0.src.rpm
    2. GFS-kernel-2.6.9-72.2.src.rpm
    3. lvm2-cluster-2.02.21-7.el4.src.rpm

 


Next, install all the source RPMs you just downloaded, and build them into binary RPMs, suitable for installation.

To convert a src package to an installable RPM:

[root@daisy jad]# rpm -i ccs-1.0.10-0.src.rpm
warning: ccs-1.0.10-0.src.rpm: V3 DSA signature: NOKEY, key ID db42a60e
[root@daisy jad]# cd /usr/src/redhat/SPECS
[root@daisy SPECS]# rpmbuild -bb --rmsource --clean ccs.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.77348
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
[snipped for brevity]

Binary RPM packages have now been built:

[jad@daisy SPECS]$ ls -l /usr/src/redhat/RPMS/i686/ccs-*
-rw-r--r--  1 root root     71251 Jun 21 13:20 ccs-1.0.10-0.i686.rpm
-rw-r--r--  1 root root    103306 Jun 21 13:20 ccs-debuginfo-1.0.10-0.i686.rpm
-rw-r--r--  1 root root      6322 Jun 21 13:20 ccs-devel-1.0.10-0.i686.rpm

 

A couple of caveats

 

  • If you get the following error whilst running rpmbuild, you'll probably want to add a '--target i686' parameter to the command (unless you're running on something other than the i686 target, obviously):

     

    [root@daisy SPECS]# rpmbuild -bb --rmsource cman-kernel.spec
    error: Architecture is not included: i386
    [root@daisy SPECS]# rpmbuild -bb --rmsource --target i686 cman-kernel.spec
    Building target platforms: i686
    Building for target i686

     

  • You're almost certain to hit loads of dependencies when trying to build the RPMs. Some of them will be for RPMs you've just built, some for RPMs that are still on your list to build, and others for RPMs which you'll need to get from Red Hat. Just deal with them as and when it happens, and curse about it as needed.

     

    Note: When I built fence, it needed loads and loads of things, in an almost never-ending dependency tree. It sucked. This included adding all the X stuff, for some stupid reason. That really sucked.

  • You need to install GFS-kernheaders before trying to build the GFS package but...when you come to build the GFS package, you might get errors like this:

     

    /usr/include/linux/gfs_ondisk.h:626: error: syntax error before "__be64"
    /usr/include/linux/gfs_ondisk.h:628: error: syntax error before "sc_dinodes"

     

    If you do get these errors, you need to uninstall GFS-kernheaders, and modify the source code, before rebuilding the package and reinstalling it.

    The tar.gz file should still be in /usr/src/redhat/SOURCES - move it somewhere else, untar it, and edit src/gfs/gfs_ondisk.h

    At about line 623, replace

    __be64 sc_total;
    __be64 sc_free;
    __be64 sc_dinodes;

     with

    uint64_t sc_total;
    uint64_t sc_free;
    uint64_t sc_dinodes;

    Rebuild the tar.gz file, put it back in /usr/src/redhat/SOURCES, rebuild GFS-kernheaders, install GFS-kernheaders, and then try to build the GFS package again.

    See this bug report for more information.
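If you don't fancy making the gfs_ondisk.h change by hand, the edit can be sketched with sed. This is only an illustration: patch_gfs_ondisk is a made-up name, it writes the patched file to stdout rather than editing in place, and it assumes the three fields are declared exactly as shown above.

```shell
#!/bin/sh
# patch_gfs_ondisk FILE: print FILE with the sc_total, sc_free and sc_dinodes
# fields changed from __be64 to uint64_t. Other __be64 fields are left alone.
patch_gfs_ondisk() {
    sed -E 's/__be64([[:space:]]+sc_(total|free|dinodes)[[:space:]]*;)/uint64_t\1/' "$1"
}

# Demonstrate on a scratch copy of the three lines plus one unrelated field.
tmp=$(mktemp)
printf '%s\n' '    __be64 sc_total;' '    __be64 sc_free;' \
              '    __be64 sc_dinodes;' '    __be64 unrelated;' > "$tmp"
patch_gfs_ondisk "$tmp"
rm -f "$tmp"
```

Redirect the output to a new file and copy it over src/gfs/gfs_ondisk.h once you've checked it looks right.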




