iSCSI & TCP/IP offload is… slower?!

Riddle me this, Batman. Testing with ORiON (Oracle I/O Numbers) gives me some pretty strange results: enabling TCP/IP offload yields slower performance than leaving it switched off! This is an HP BL25p G1 Blade connecting to an iSCSI LUN on a NetApp R200 filer. The LUN is a single 20GB partition. Both the pure-read and the 50% read/50% write numbers are slower when TCP/IP offload is enabled.
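For the curious, here’s roughly how a run like this is kicked off. Treat it as a sketch rather than my exact command line: the binary name varies by ORION release, and /dev/sdb is a placeholder for wherever your iSCSI LUN shows up.

# mytest.lun lists the target device(s), one per line
echo /dev/sdb > mytest.lun

# 100% reads
./orion_linux_x86-64 -run advanced -testname mytest -num_disks 1 -matrix basic -write 0

# 50% reads / 50% writes (careful: write tests overwrite the LUN!)
./orion_linux_x86-64 -run advanced -testname mytest -num_disks 1 -matrix basic -write 50

ORION drops its results into <testname>_summary.txt, which is where lines like the ones below come from.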

100% reads with TCP/IP offload:

Maximum Large MBPS=98.40 @ Small=0 and Large=2
Maximum Small IOPS=5025 @ Small=5 and Large=0
Minimum Small Latency=0.96 @ Small=4 and Large=0

100% reads without TCP/IP offload:

Maximum Large MBPS=106.27 @ Small=0 and Large=2
Maximum Small IOPS=5561 @ Small=5 and Large=0
Minimum Small Latency=0.85 @ Small=2 and Large=0

50% reads, 50% writes with TCP/IP offload:

Maximum Large MBPS=37.71 @ Small=0 and Large=2
Maximum Small IOPS=1468 @ Small=5 and Large=0
Minimum Small Latency=2.70 @ Small=1 and Large=0

50% reads, 50% writes without TCP/IP offload:

Maximum Large MBPS=43.00 @ Small=0 and Large=2
Maximum Small IOPS=2113 @ Small=5 and Large=0
Minimum Small Latency=1.11 @ Small=1 and Large=0

The tests are a little artificial insofar as I’m using one LUN as a target, not many. I’ll try it with more LUNs and see if there’s a change, but I don’t expect one! The host OS is RHEL 5.3 x86_64 on a dual-socket, dual-core AMD Opteron with 6GB of RAM. The iSCSI driver in use is a TCP/IP offload-enhanced version of the bnx2i driver, direct from HP.
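As an aside, the stateless offloads on the NIC (checksum, segmentation and friends) can be inspected and toggled with ethtool, as sketched below. The interface name is a placeholder, and the full TOE/iSCSI offload path on these Broadcom parts depends on which driver claims the device, so ethtool only covers part of the picture.

# show which offload features are currently enabled
ethtool -k eth1

# toggle TCP segmentation offload for an A/B comparison
ethtool -K eth1 tso off
ethtool -K eth1 tso on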


2 thoughts on “iSCSI & TCP/IP offload is… slower?!”

  1. This is similar to our findings with TOE. The only time we saw any real benefit was when the load was heavy, and even then it was negligible; our customer could not justify the $500-per-HBA cost for so little benefit.

  2. Thanks, Tom. I understand that some might want to boot from their SAN fabric (we don’t), and thus might be justified in purchasing an iSCSI HBA for that reason.

    For us, though, we are seemingly more network-bound (particularly filer-bound) than we are CPU-bound. Management made a bit of a deal about making sure we had TOE-capable NICs in our Blades, but (*sigh*) we haven’t gotten around to configuring many of them.

    While ORiON generates non-trivial amounts of I/O (that’s its job, after all), it’s possible, as you suggest, that it simply isn’t taxing the server hard enough. Still, even with the additional latency introduced by the transport to and from the TOE chipset, I am somewhat shocked that it’s /slower/.

    I’ll do some more testing over the next few days and try and work the Blade quite a bit harder to see how it responds.
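
    For what it’s worth, a rough sketch of what I’ll run alongside ORION to see whether we’re actually CPU-bound (plain sysstat tools; the 5-second interval is arbitrary):

    # per-CPU utilization while ORION runs; %sys and %soft (softirq)
    # are where TCP/IP processing without offload tends to show up
    mpstat -P ALL 5

    # extended device and CPU statistics over the same window
    iostat -x 5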

