Sun x4540 arrived…

…but I can’t use it! 😦 I didn’t realize that it ships from the factory with two PSUs in a 240VAC-only configuration. Because we only have 110VAC, we need to get a third PSU (three PSUs let you run it on 110VAC inputs). I’m currently working with Sun to get that third PSU… stay tuned.


More on TCP/IP offloading & iSCSI performance

Spurred on by the surprising[ly poor] results I saw when previously testing TCP/IP offloads, I decided to run some more tests. This time, as a couple of people have suggested, I wanted to load up the server’s four cores (it’s a dual-socket, dual-core Opteron) before testing the network I/O. This way, we should see the benefits of the TCP/IP Offload Engine become clear as our host’s CPUs become busy dealing with application I/O.

Using the Phoronix Test Suite, I ran the stressCPU2 test for a half-hour while running both TOE-enabled and TOE-disabled tests with Oracle’s ORiON. I used the same ORiON syntax as before, namely:

./orion -run advanced -testname mytest -num_disks 1 -write 50
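For reproducibility’s sake, this is roughly how I ran the two together; take it as a sketch, since the exact Phoronix test name and prompts may differ on your install:

# kick off the CPU stress in the background, then run ORiON while the cores are busy
phoronix-test-suite run stresscpu2 &
./orion -run advanced -testname mytest -num_disks 1 -write 50
wait   # let the background stress run wind down after ORiON finishes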

Here are the results for 50% read, 50% write:

With TOE:

Maximum Large MBPS=47.94 @ Small=0 and Large=2
Maximum Small IOPS=1253 @ Small=5 and Large=0
Minimum Small Latency=3.74 @ Small=4 and Large=0

Without TOE:

Maximum Large MBPS=42.82 @ Small=0 and Large=2
Maximum Small IOPS=1098 @ Small=5 and Large=0
Minimum Small Latency=4.00 @ Small=1 and Large=0

Since I’m now seeing the opposite of yesterday’s results, I thought I’d re-run the tests. As above, I ran the tests below back-to-back, swapping the TOE driver out as quickly as possible:

With TOE:

Maximum Large MBPS=38.95 @ Small=0 and Large=2
Maximum Small IOPS=1015 @ Small=5 and Large=0
Minimum Small Latency=4.71 @ Small=1 and Large=0

Without TOE:

Maximum Large MBPS=34.98 @ Small=0 and Large=2
Maximum Small IOPS=738 @ Small=5 and Large=0
Minimum Small Latency=5.62 @ Small=1 and Large=0

This is virtually the reverse of what we saw before. Now, we see that the host — which is CPU-bound due to our artificial load generation — is having trouble filling network I/O when not utilizing the TOE drivers. When the TOE drivers are used, the host I/O returns to normal.

iSCSI & TCP/IP offload is… slower?!

Riddle me this, Batman. Testing with ORiON (Oracle I/O Numbers) gives me some pretty strange results: enabling TCP/IP offload yields slower performance than leaving it switched off! This is an HP BL25p G1 blade connecting to an iSCSI LUN on a NetApp R200 filer. The LUN is a single 20GB partition. Both pure-read and 50% read/50% write numbers are slower when TCP/IP offload is enabled.

100% reads with TCP/IP offload:

Maximum Large MBPS=98.40 @ Small=0 and Large=2
Maximum Small IOPS=5025 @ Small=5 and Large=0
Minimum Small Latency=0.96 @ Small=4 and Large=0

100% read without TCP/IP offload:

Maximum Large MBPS=106.27 @ Small=0 and Large=2
Maximum Small IOPS=5561 @ Small=5 and Large=0
Minimum Small Latency=0.85 @ Small=2 and Large=0

50% reads, 50% writes with TCP/IP offload:

Maximum Large MBPS=37.71 @ Small=0 and Large=2
Maximum Small IOPS=1468 @ Small=5 and Large=0
Minimum Small Latency=2.70 @ Small=1 and Large=0

50% reads, 50% writes without TCP/IP offload:

Maximum Large MBPS=43.00 @ Small=0 and Large=2
Maximum Small IOPS=2113 @ Small=5 and Large=0
Minimum Small Latency=1.11 @ Small=1 and Large=0

The tests are a little artificial insofar as I’m using one LUN as a target, not many. I’ll try it with more LUNs and see if there’s a change, but I don’t expect one! The host OS is RHEL 5.3 x86_64 on a dual-socket, dual-core AMD Opteron with 6GB of RAM. The iSCSI driver in use is a TCP/IP offload-enhanced version of the bnx2i driver, direct from HP.
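If you want to verify which driver build and offload features are actually in play during a run, something along these lines does the trick on RHEL (eth0 is just the interface name on my blade):

ethtool -i eth0      # shows the bound NIC driver and its version
ethtool -k eth0      # lists the offload features the driver currently advertises
iscsiadm -m session  # lists active iSCSI sessions and which transport (tcp vs. bnx2i) each is using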

Face-off: NetApp v. Sun

We currently have three VMware clusters: production, secure, and test-dev. Production and secure are both ESX 3.5 U4 clusters running NFS against a NetApp FAS3020; test-dev is an ESX 4 cluster running NFS and iSCSI against a NetApp R200. While I’m pretty happy with the FAS3020’s performance, it certainly isn’t cheap: we’re paying around $36k for a shelf that gives us roughly 8TB usable. You don’t need a calculator to tell you that’s more than $4,000/TB. Of course, that price includes support, and NetApp support for us has been nothing short of extraordinary. The R200, by comparison, is quite slow, and it’s rapidly approaching its EOS and EOL dates, so it’s going to be replaced before the end of the year.

However, and this came up at the recent New England VMUG, we are reaching an interesting point in software, particularly virtualization, where the software itself (in this case, VMware vSphere) handles enough of the “smarts” (e.g., snapshots and thin provisioning) that you have to wonder whether the price and performance you’re getting out of your filers still make sense from a cost/benefit standpoint. While it’s great to have our NetApps do the snapshots and thin-provisioned volumes, it’s a little wasteful; restoring from a snapshot typically means using something like rsync to pull back the portion of the NFS datastore you want, then either creating a new VM or moving the existing VM’s VMDKs out of the way. It’s not terrible, but it could be better.
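To make that concrete, a single-VM restore from a NetApp snapshot over NFS looks roughly like this; the mount point, snapshot name, and VM name are placeholders, not my actual paths:

# set the broken VM's files aside on the NFS datastore
mv /mnt/nfs_datastore/somevm /mnt/nfs_datastore/somevm.old
# pull the VM's directory back out of the filer's .snapshot tree
rsync -av /mnt/nfs_datastore/.snapshot/hourly.0/somevm/ /mnt/nfs_datastore/somevm/
# then re-register the restored VM (or attach its VMDKs to a new one) in vCenter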

For this reason, and some others, I’m going to give one of these Sun Fire x4540 storage servers a shot. They are mighty powerful in terms of CPU compared to a NetApp filer (six-core AMD Opterons, and more than one of them!), but decidedly lacking in traditional software support. However, you do get ZFS, and as you can read on Ben’s Cuddletech blog, you can do some pretty cool things with it. At any rate, VMware Data Recovery lets you do incremental, snapshot-based backups from within vSphere itself; it ships as a vCenter plug-in and looks to be a great thing. Between the two, the x4540 seems well worth a trial to see how it goes.

At this point, I’ve ordered a trial of the base model (8 CPUs, 32GB of RAM, 12TB of disk!) to see how it performs. I imagine I’ll leave Solaris on it and simply use it to serve some iSCSI LUNs and/or NFS datastores, stored in ZFS. I presume I’ll let VMware do most of the storage shenanigans via Data Recovery and its native snapshot support, though I’ll probably take a look at how much of that ZFS can do, too.
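For a rough idea of what the ZFS side of that would look like, the steps are something like the following; the pool layout, device names, and sizes are illustrative only, not how I’ll actually carve up the box:

# build a pool across some of the x4540's disks
zpool create tank mirror c1t0d0 c2t0d0 mirror c1t1d0 c2t1d0
# an NFS-shared filesystem to use as a VMware datastore
zfs create tank/vmware
zfs set sharenfs=on tank/vmware
# a zvol presented as an iSCSI LUN via the shareiscsi property
zfs create -V 500g tank/vm_iscsi
zfs set shareiscsi=on tank/vm_iscsi
# ZFS-level snapshots, to compare against what Data Recovery does
zfs snapshot tank/vmware@nightly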