We nerds are back to dig deeper into storage and the various options and solutions out there. We're starting a multi-part blog series to help people get a better technical grasp of storage for media and entertainment. Far too many people see storage as mystic and cryptic; we aim to change that and help normal people in their battles on the subject. We get a large number of enquiries about this at Escape Studios - and for too long, too many people have been buying the wrong hardware and software for the job at hand.
So how can we help? We'll kick things off by discussing bandwidth efficiency - potentially one of the most crucial aspects of any storage platform, and a seemingly all-too-often overlooked one.
Let's start by discussing the biggest divide in storage today - sustained and unsustained storage platforms.
Sustained and Unsustained Storage Platforms Explained
A sustained storage platform is a storage unit that has a finite, locked-down limit on how much bandwidth it has - internally, externally or both. Typically speaking, these units cannot be made to operate faster without major parts upgrades, and even then they can be marred by physical limitations.
Unsustained storage platforms are just the opposite. Typically this would be a cluster or distributed system - the bandwidth (again, both in and out) can be upgraded with ease, and the platform as a whole is generally more robust and capable of expansion via built-in hardware or software functionality.
Moving past that, let's define the split between bandwidths in storage - internal and external.
Internal and External Bandwidth Explained
Internal bandwidth, on the subject of a storage platform, is how quickly the unit operates within itself. This could be a number of things. For example, in a typical JBOD storage unit there will be a single backplane connecting the disks, and a RAID controller bringing them all together. The ultimate throughput of the RAID controller is a hard limit on its internal bandwidth. Another example would be a fibre-attached unit moving data from the RAID unit to a server - right up to the point where the data leaves the unit itself.
External bandwidth would typically refer to the connectivity the machine provides out of the storage unit. For example (following the above), how quickly the endpoint of either the RAID controller or an HBA extension can be moved out onto a network to be used in a tangible fashion - at the end user's terminal. This bandwidth is the piece that's typically overlooked, which is the main problem with misinformed storage purchases today.
So we've got our definitions! Fantastic! So where do we start to see major issues here?
Common Bottleneck Issues
The biggest set of problems comes from bottlenecks in pass-through efficiency. The best (and only, really) way to ensure you're not creating these constrictions is to work from the start of the storage chain straight through to its end usage. Generally speaking, you'd start with disk speeds (so if you had 24 disks capable of 90MB/s each, you'd have a theoretical disk bandwidth of well over 2GB/s), move to RAID controller speeds (preferably something just below, but close to, the disk capability), then to internal network transmission (i.e. direct-attached served connections, optional), and then to the conversion to external bandwidth and the movement onto the network. You always want to follow these numbers top down.
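As a quick sanity check, the disk-level arithmetic above can be sketched in a few lines. This is only an illustration using the figures from the text - the constant names are mine, not from any real tool:

```python
# Illustrative sanity check using the figures from the text (24 disks at 90MB/s).
DISK_COUNT = 24
PER_DISK_MB_S = 90  # sustained sequential throughput of one disk, in MB/s

aggregate = DISK_COUNT * PER_DISK_MB_S
print(f"Theoretical disk bandwidth: {aggregate} MB/s (~{aggregate / 1000:.1f} GB/s)")
```

Running it confirms the "well over 2GB/s" claim: 24 x 90 = 2,160 MB/s.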
Bit confused? Stick with me - here's an example:
- I've got a 12TB storage unit - 12 disks capable of roughly 1GB/s in aggregate (about 85MB/s each, at the disk level)
- I'm going to have a choice of RAID controller in my unit, but I'm not going to choose something to bring all these disks together that's too far below 1GB/s. For example, I could choose a 3ware card that maxes out at 300MB/s, or an LSI card that goes to about 800MB/s. I should be picking the 800 here! Everything moves downwards; the more we cut off from the top, the less trickles down to the user!
- Great, I've got 800MB/s from 1GB/s of disk - a fairly good allowance as overhead! Now I've got to attach my very large, very fast 12TB disk unit to a server. I could push this out over Ultra320 SCSI (320MB/s) or 8Gbit fibre (800-ish MB/s), amongst other options. It would surely be foolish to use the SCSI connection here, because I'd be cutting off more than half of my speed! I'd choose the faster fibre connection to ensure that's A-OK.
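The steps above boil down to one rule: the effective bandwidth of the chain is the minimum across all of its stages. A minimal sketch, using the example's figures (the stage names and numbers are taken from the bullets above, not measured values):

```python
# Each stage in the storage chain caps throughput; the effective bandwidth
# is the minimum across all stages. Figures follow the example in the text.
chain = {
    "disks (aggregate)": 1000,  # ~1 GB/s from 12 disks, MB/s
    "RAID controller": 800,     # the LSI card, MB/s
    "8Gbit fibre link": 800,    # ~800 MB/s to the server
}

effective = min(chain.values())
bottleneck = min(chain, key=chain.get)
print(f"Effective internal bandwidth: {effective} MB/s (limited by {bottleneck})")
```

Swap in the 3ware card or the Ultra320 SCSI link and the minimum drops accordingly - which is exactly the trickle-down effect described above.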
Maximising External Bandwidth Efficiency
Great, so now I've got a server with an 800MB/s connection to my disks, all the way from top to bottom - sounds great! Here's the point at which people plug in a networking cable and call it a day. This is where it all goes wrong: typically, a large number of solutions like this have an incredibly fast terminal end at the server, but severely limit themselves externally. If you can't tell, this is the point where we've finished with internal bandwidth and have moved onto external bandwidth. A single gigabit connection has a maximum real-world throughput of about 107MB/s - not even 15% of the speed of my disk connection. So what do we do?
There are many external bandwidth options, just as there are many internal bandwidth options. Ultimately, these come down to methods of switching and the protocols they support. If you had a lovely central infrastructure, you'd typically have a single modular switch. You could attach our 800MB/s monster machine with 8x 1Gbit connections and trunk them into a single IP with something like LACP (Link Aggregation Control Protocol) to serve 8Gbit into a single IP address (if you're limited to gigabit switching), or you could push it all out over a single 10Gbit Ethernet adapter, assuming you could get a module to connect into your switch.
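Sizing that trunk is simple arithmetic. A rough sketch, assuming ~100MB/s of usable throughput per gigabit link (an approximation, not a measured figure):

```python
# Rough sizing of an external trunk to match internal bandwidth.
# Assumes ~100 MB/s usable per gigabit link - an approximation.
INTERNAL_MB_S = 800   # server-to-disk path from the example
GB_LINK_MB_S = 100    # usable throughput of one gigabit link

links_needed = -(-INTERNAL_MB_S // GB_LINK_MB_S)  # ceiling division
print(f"Gigabit links to trunk (e.g. via LACP): {links_needed}")
```

Eight links, matching the 8x 1Gbit trunk in the example - or one 10Gbit adapter with headroom to spare.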
Since a large amount of this boils down to external bandwidth - and the above really is only an example - let's determine how badly this could affect us from the end user's point of view.
Common Issues of External Bandwidth Bottlenecks
If I had a disk system capable of 400MB/s at the end of its internal bandwidth, and only a single gigabit connection, how does that harm me?
If you can only serve out 100MB/s (on your single gigabit) and a single user is hitting the connection, you'd be 100% efficient if they were attached to the network on a single gigabit link. That's fantastic! Unfortunately, if a second person jumps in there, the two of you are going to divide that 100MB/s (which is suitable for a single connection) like pie - now you're both at 50MB/s each! This becomes a ratio: 3 people in, 33MB/s each, and so on. If we had a 4-port NIC in the machine running a single 4Gbit trunk to the network, we could have 4 people connecting to our 400MB/s disk at 100MB/s each, with zero compromise on speeds. Similarly, if a 5th jumps in, we divide the 400 across 5 users - but at 80MB/s per user that's a considerably smaller hit, where previously we'd have been in some very hot water on a single connection!
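That division can be sketched as a tiny helper - a simplified model (it assumes users share the link evenly and ignores protocol overhead), with the function name being my own:

```python
def per_user_mb_s(storage_mb_s: float, link_mb_s: float, users: int) -> float:
    """Rough per-user throughput: users share the external link evenly,
    and the storage itself caps what can be served in total."""
    total = min(storage_mb_s, link_mb_s)
    return total / users

# Single gigabit link (~100 MB/s usable) in front of a 400 MB/s store:
for n in (1, 2, 3):
    print(f"{n} user(s) on 1Gbit: {per_user_mb_s(400, 100, n):.0f} MB/s each")

# 4x gigabit trunk (~400 MB/s usable) in front of the same store:
for n in (4, 5):
    print(f"{n} user(s) on 4Gbit trunk: {per_user_mb_s(400, 400, n):.0f} MB/s each")
```

The single link drops to 50 and 33MB/s per user almost immediately, while the trunk holds 100MB/s for four users and still gives 80MB/s at five.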
Using the Right Tools for the Job
So that's an example of how trickle-down efficiency can be confusing, scary, and costly. But what about over-efficiency - purchasing things you can't utilise in the fashion you want?
Take for example the following situation:
I have a 24-bay storage unit which, from controller to network, is capable of 1GB/s. I'm choosing disks to fill this unit such that I don't go below 1GB/s, while trying not to go too far above it. I could put great, fast, brand-new SSDs into this machine at a cost well above high-density cheap SATA disks, but it probably won't wind up being worth it in the end. These disks, or any other high-spec disk, are great if you can take the throughput and really utilise it. Consider that at a fifth of the price, you could populate the unit with 70-80MB/s enterprise SATA disks - use 24 of them and you'll cover the 1GB/s easily. This isn't to say the SSDs won't cover it; it's to say they'd be overkill at nearly 5 times the price. The real problem is that people are being sold solutions based on the disk technology, thinking there will be a tangible benefit on the other end. This is simply not the case, because of all the hoops the data has to go through to get to you - you will have bought something far too powerful for the job. Again, this isn't to say don't purchase SSDs - they have a time and a place.
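You can put rough numbers on that trade-off. A hypothetical comparison - the throughput and relative-cost figures are illustrative, loosely based on the example above, not real prices:

```python
# Hypothetical cost/throughput comparison for filling a 24-bay unit whose
# controller-to-network path tops out at ~1000 MB/s. Figures are illustrative.
BAYS = 24
PATH_LIMIT_MB_S = 1000

options = {
    "enterprise SATA": {"mb_s": 75, "rel_cost": 1.0},  # ~70-80 MB/s per disk
    "SSD":             {"mb_s": 400, "rel_cost": 5.0}, # ~5x the price per disk
}

for name, o in options.items():
    raw = BAYS * o["mb_s"]
    usable = min(raw, PATH_LIMIT_MB_S)  # capped by the rest of the chain
    print(f"{name}: raw {raw} MB/s, usable {usable} MB/s, "
          f"relative cost {BAYS * o['rel_cost']:.0f}")
```

Both options deliver the same usable 1,000MB/s once the chain's cap is applied - the SSDs' extra raw throughput never reaches the user, but their extra cost certainly reaches your budget.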
The biggest thing to keep in mind when perusing storage options is: get the right tool for the right job! We can't stress this enough. If you blindly wander into storage purchases you can be bitten by slower-than-expected solutions, or overly expensive ones, amongst many other things - but total efficiency, inside and out, is the first step towards making an informed purchase.
I hope this has helped you all start to grasp the gravity of the situation - keep an eye on this space for more useful tips from the world of storage!
We hold webinars from time to time to demystify some of the technology myths in the storage world of M&E. Feel free to register for the next webinar and we'll let you know when it's on.