Hooookay, well that was an interesting meeting! o_O
First, it appears that our head office is replacing their current
arrangement with a big new SAN.
The concept of a SAN utterly baffles me. As far as I can tell, a "SAN"
basically means that instead of connecting your HDs to the server that
wants to use them, you connect them to a network instead, and then the
server accesses it over the network. This has the following benefits:
* Tens of thousands of times more expensive than a direct connection.
* Hundreds of thousands of times reduced performance compared to a
direct connection.
* Radically increased complexity compared to a direct connection.
* If the storage network fails, ALL servers fail, so you're adding a new
single point of failure.
* All the servers now have to share the limited bandwidth available. One
busy server can bring all the others to a crawl.
* Significantly more "enterprisey" than a direct connection.
Actually, wait... those are all disadvantages. Really, really /big/
disadvantages. Huh, OK. So why on Earth would any sane person embark on
this course of action??
Oh, wait, I just found a real advantage: If you want to move a disk from
one server to another, you just press a button, rather than having to
physically move the device.
Oh, well, that /totally/ makes it worth it. And, uh, when will you
*ever* need to perform this operation? I mean, seriously, I've been
working at the UK site for almost 10 years now, and I've never once
wanted to do this. And even if I did, if on one day in 10 years I have
to shut both servers down to physically move a drive, I think I can
probably live with that.
Perhaps if I worked at Google, managing 20,000 "servers" [which are
really just commodity desktop PCs], having that many disks might be an
issue. And with that many machines, perhaps the massive performance hit
would be acceptable. But anywhere else? WTF?
Fortunately, this only affects HQ, so in a sense I don't have to care
about this. Even so, it still baffles me.
Second, they're replacing our tape backup system with a disk-based
backup system. That's right, they're seriously talking about
transferring over 200 GB of data from our UK site to our USA
headquarters, every night, via the Internet.
Jesus, this meeting just gets better, doesn't it?
First of all, speed. The LTO-3 tape system we currently use has a
maximum transfer rate of 80 MB/sec. At that rate, 200 GB should
theoretically take about 41 minutes (which agrees with my actual backup
logs). But our Internet connection is a piffling 5 Mbit/sec. At that
speed, 200 GB should theoretically take... 3 days + 17 hours.
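Don't take my word for it; the arithmetic is trivial. A quick Python sanity
check, using nothing but the numbers above:

    tape_rate   = 80e6       # LTO-3: 80 MB/sec, in bytes/sec
    wan_rate    = 5e6 / 8    # 5 Mbit/sec link, in bytes/sec
    backup_size = 200e9      # 200 GB, in bytes

    print(backup_size / tape_rate / 60)      # ~41.7 minutes to tape
    print(backup_size / wan_rate / 86400)    # ~3.7 days over the WAN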
And you want to do this /nightly/???
Second, what is the purpose of a backup? The purpose of having a "backup
copy" of your data is so that if your working copy dies somehow, you can
use the backup copy to get your data back.
This only works if the backup copy is more reliable than the working
copy. If your working copy is on spinning disk, and you're stupid enough
to put your backup copy on spinning disk as well... then it becomes
equally likely that the /backup copy/ will die and you'll be left with
just the working copy.
Third, having access only to the most recent backup is no good. There
are scenarios where that would be perfectly OK, but in our line of
business, we sometimes have to recover data from /months/ ago. Data that
has been accidentally deleted. Data which got corrupted. Data which we
thought we didn't need any more but actually we do. And so forth.
So it's no good at all just mirroring what's on the server onto another
server somewhere else. The /history/ must be kept. Now, there are
various ways you might achieve that, but all of them unavoidably involve
the set of backup disks being drastically larger than the total size of
the working disks. And, if we're going to continue our current policy of
never deleting old backups, then the backup disk set must continue to grow,
indefinitely. In short, we'd be replacing cheap, reliable tape with something
that's far less reliable.
And then there's the fact that you either need to keep all this disk
spinning (i.e., ever-increasing power and cabling demands), or only keep the
recent backups spinning (i.e., managing the powering of drives off and on,
which supposedly shortens their lifetime).
In all, this seems like an extremely bad idea.
Still, what do I know? Apparently not a lot.
Invisible <voi### [at] devnull> wrote:
> Hooookay, well that was an interesting meeting! o_O
> ...
> Still, what do I know? Apparently not a lot.
The problem here is that you are simply not thinking like a Systems
Administrator. (The capitalization *is* correct because the names of deities
always begin with a capital letter.) You will never get anywhere in Systems
Admin thinking like that. Instead, you need to learn to ask the sorts of
questions that are pertinent to the true concerns and goals of an
Administrative deity, such as: "How do I generate new projects and changes
to old ones at a sufficient rate such that my only job will be managing the
resulting upheaval?" For if the upheaval ever ceased, the necessity of
employing someone at astronomical cost whose only function is to manage the
upheaval could be called into question.
Also, you need to have a broader perspective with regard to how issues which
are IT matters in one sense integrate into a more comprehensive view of
world affairs. Take the overall state of the global economy, for example.
One with a properly Administrative outlook sees an opportunity to keep
numerous unemployed, technically inept friends and relatives on the payroll
as well. Dealing with trivialities like storing data securely and managing
the flow of information so as to facilitate genuinely productive tasks is a
job for the low-paid nerds.
I hope this clears things up... No! No!... I mean I hope this obfuscates
matters sufficiently.
Mike C.
On 8/3/2011 7:44, Invisible wrote:
> Perhaps if I worked at Google, managing 20,000 "servers" [which are really
> just commodity desktop PCs], having that many disks might be an issue.
From what I've read, closer to half a million servers, all with the disk
attached directly. :-)
> Second, they're replacing our tape backup system with a disk-based backup
> system. That's right, they're seriously talking about transferring over 200
> GB of data from our UK site to our USA headquarters, every night, via the
> Internet.
You generate 200G of incremental data a day? ow.
> This only works if the backup copy is more reliable than the working copy.
> If your working copy is on spinning disk, and you're stupid enough to put
> your backup copy on spinning disk as well... then it becomes equally likely
> that the /backup copy/ will die and you'll be left with just the working copy.
That's why you have two spinning disk backups that you alternate. Nothing
wrong with spinning disk backup, especially if it's more reliable than the
tape itself.
> So it's no good at all just mirroring what's on the server onto another
> server somewhere else. The /history/ must be kept. Now, there are various
> ways you might achieve that, but all of them unavoidably involve the set of
> backup disks being drastically larger than the total size of the working
> disks.
Not really. You don't actually change that much, I expect. Tapes don't have
things like hard links and directories, but spinning disks do. Make a full
backup, then an incremental once a day for a week, then a weekly
incremental, etc, until you get up to monthly.
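In case the trick isn't obvious, here's a rough sketch in Python (off the
top of my head, so the paths are invented and it ignores ACLs and the like):

    import os, shutil, filecmp

    def snapshot(src, prev, dest):
        # Walk the working copy; hard-link anything identical to the
        # previous snapshot, and copy only the files that actually changed.
        for root, dirs, files in os.walk(src):
            rel = os.path.relpath(root, src)
            dd = os.path.join(dest, rel)
            if not os.path.isdir(dd):
                os.makedirs(dd)
            for name in files:
                s = os.path.join(root, name)
                p = os.path.join(prev, rel, name)
                d = os.path.join(dd, name)
                if os.path.exists(p) and filecmp.cmp(s, p, shallow=False):
                    os.link(p, d)       # unchanged: costs a directory entry
                else:
                    shutil.copy2(s, d)  # changed: store the new version

Every snapshot is independently restorable, but an unchanged file takes up
no extra space beyond the directory entry.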
> something that's far less reliable.
I'm not sure how you know spinning disk is less reliable.
> And then there's the fact that you either need to keep all this disk
> spinning (i.e., ever increasing power and cabling demands), or only keep the
> recent backups spinning (i.e., managing powering off and powering on drives,
> which supposedly shortens their lifetime).
Once it fills up, you disconnect it and put in a new one, and you put the
old one on the shelf.
> Still, what do I know? Apparently not a lot.
You know a lot. You should spend time writing this up, especially the
bandwidth part, and send it to the people who would be able to evaluate this.
If nothing else, ask them what backup software they plan to use that will do
incremental backups over a network and keep every old backup separately
restorable.
--
Darren New, San Diego CA, USA (PST)
How come I never get only one kudo?
On 8/3/2011 7:44, Invisible wrote:
> having that many disks might be an issue.
Speaking of which, I recently read a quote about the early days of Google.
Steven Levy had seen the racks where Google's storage was running and said "If you
can imagine a college freshman made out of gigabytes, this would be his dorm
room."
--
Darren New, San Diego CA, USA (PST)
How come I never get only one kudo?
On 03/08/2011 04:30 PM, Mike the Elder wrote:
> You will never get anywhere in Systems Admin thinking like that. Instead,
> you need to learn to ask the sorts of questions that are pertinent to the
> true concerns and goals of an Administrative deity... For if the upheaval
> ever ceased, the necessity of employing someone at astronomical cost whose
> only function is to manage the upheaval could be called into question.
Pfahahahahah!
Apparently you don't realise what they pay me. :-P
Mind you, the argument works from the point of view of the Director of IT...
> Also, you need to have a broader perspective with regard to how issues
> which are IT matters in one sense integrate into a more comprehensive view
> of world affairs. Take the overall state of the global economy, for
> example. One with a properly Administrative outlook sees an opportunity to
> keep numerous unemployed, technically inept friends and relatives on the
> payroll as
Hahah. That would be funny if it wasn't true. The number of
mission-critical applications here developed by somebody's nephew in
QBASIC is ridiculous.
> well.
BULLSHIT!
Uh, I mean, BINGO!
On 03/08/2011 04:35 PM, Darren New wrote:
> On 8/3/2011 7:44, Invisible wrote:
>> Perhaps if I worked at Google, managing 20,000 "servers" [which are
>> really
>> just commodity desktop PCs], having that many disks might be an issue.
>
> From what I've read, closer to half a million servers, all with the
> disk attached directly. :-)
Sure, but not all in one place, and not with one guy in charge of it
all. I'd imagine 20,000 machines in one datacenter is not an
unreasonable guess.
And yes, from what I can tell, Google doesn't do SAN. They replicate at
the filesystem level. Then again, all their machines run the same
application. Business datacenters aren't usually like that. Even so, it
seems "obvious" to me that the network should operate above the
filesystem level, not below it.
> You generate 200G of incremental data a day? ow.
Nah, that's for a full backup. An incremental backup is more like 10 GB.
Which is still going to take a while at 5 Mbit/sec. (Once you add IP
framing, TCP overhead, VPN encryption, latency, other WAN traffic...)
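(For the record: 10 GB at 5 Mbit/sec is 16,000 seconds, or about four and a
half hours, in theory. With all that overhead on top, call it most of the
night.)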
> Nothing wrong with spinning disk backup, especially if it's more
> reliable than the tape itself.
This is the thing, really. Disks spin constantly. Quite apart from the
power that uses, it also means that at any instant they can break.
Mechanical failure, electronics failure, software glitch, whatever.
Tapes just sit there on the shelf, doing nothing. About the only thing
you need to worry about is the tape demagnetising.
Did I mention that tape is cheaper?
>> So it's no good at all just mirroring what's on the server onto another
>> server somewhere else. The /history/ must be kept. Now, there are various
>> ways you might achieve that, but all of them unavoidably involve the
>> set of
>> backup disks being drastically larger than the total size of the working
>> disks.
>
> Not really. You don't actually change that much, I expect. Tapes don't
> have things like hard links and directories, but spinning disks do. Make
> a full backup, then an incremental once a day for a week, then a weekly
> incremental, etc, until you get up to monthly.
Note that for our purposes, security information, file modification
times and so forth must also be reliably stored and retrieved.
>> something that's far less reliable.
>
> I'm not sure how you know spinning disk is less reliable.
Well, I don't /know/ for a fact. It just seems highly likely.
(We /know/ that disks have an annual failure rate of about 3% to 5%. We
don't know what the rate is for tape.)
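(And the failures add up: even at 3%, a backup set that grows to a few
dozen disks means you should expect to lose roughly one disk every year.)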
> Once it fills up, you disconnect it and put in a new one, and you put
> the old one on the shelf.
And I thought drives don't like power cycles...
>> Still, what do I know? Apparently not a lot.
>
> You know a lot. You should spend time writing this up, especially the
> bandwidth part, and send it to the people who would be able to evaluate
> this.
Heh, yeah, like I want to get yelled at.
> If nothing else, ask them what backup software they plan to use that
> will do incremental backups over a network and keep every old backup
> separately restorable.
I already know for a fact that they haven't decided how to implement
this yet. They're currently looking at options. I will of course make
sure they know what the hard requirements are.
On Wed, 03 Aug 2011 08:35:42 -0700, Darren New wrote:
> On 8/3/2011 7:44, Invisible wrote:
>> Perhaps if I worked at Google, managing 20,000 "servers" [which are
>> really just commodity desktop PCs], having that many disks might be an
>> issue.
>
> From what I've read, closer to half a million servers, all with the
> disk
> attached directly. :-)
Most recently it was suggested the number was closer to 900,000. :)
Jim
On 8/3/2011 8:54, Invisible wrote:
> They replicate at the filesystem level.
Sorta. I guess if you call the separate library with separate semantics a
"filesystem", then yes. (And yes, I do, but many wouldn't, for some reason.)
> Then again, all their machines run the same application.
Huh? No they don't. That doesn't even make sense, since then the only
application they'd be running would be "the file system." Heck, even the
file system has three or four separate applications to run it.
> Business datacenters aren't usually like that. Even so, it seems "obvious"
> to me that the network should operate above the filesystem level, not below it.
Personally, I've never figured out the draw of paying more for a terabyte on
dedicated hardware that's neither more reliable nor cheaper than just
plugging a terabyte drive into a PC.
>> Nothing wrong with spinning disk backup, especially if it's more
>> reliable than the tape itself.
>
> This is the thing, really. Disks spin constantly.
Not backup disks. No more than your backup tapes from last year do.
> Did I mention that tape is cheaper?
This is true, yes. That's the one benefit.
> Note that for our purposes, security information, file modification times
> and so forth must also be reliably stored and retrieved.
Sure. You can still do that with disks and hard links. If the security
information or file modification times change, you add it to the backup.
Good backup software will keep a checksum, and won't actually copy the data
if only the metadata has changed.
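In other words, something along these lines (just a sketch, not a
description of any particular product):

    import hashlib

    def digest(path):
        # Content checksum, read in 1 MB chunks so big files don't eat RAM.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

If the checksum matches the one recorded on the previous run, the file's
data hasn't changed even though its mtime or ownership has: record the new
metadata, hard-link the previous copy, and skip the transfer entirely.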
> (We /know/ that disks have an annual failure rate of about 3% to 5%. We
> don't know what the rate is for tape.)
Depends if they're spinning or not. I can guarantee that if you kept a tape
moving the way you say you want to keep disks moving, you'd have failures
much more frequently.
>> Once it fills up, you disconnect it and put in a new one, and you put
>> the old one on the shelf.
>
> And I thought drives don't like power cycles...
How often are you going to power cycle it? You'll turn it off when it's
full, and not turn it on again until you're ready to use it.
I think it's not so much that disks don't like power cycles (at least not
in the last 10 years or so), but that most failures happen when powering
them up or down.
In any case, spinning up the drive once a day and then turning it back off
again isn't going to significantly decrease its lifetime. Buy a disk
designed for that sort of use, rather than a "server" disk that's optimized
to not fail when it's always spinning.
> I already know for a fact that they haven't decided how to implement this
> yet. They're currently looking at options. I will of course make sure they
> know what the hard requirements are.
Well, that's all you can do, really. :-)
--
Darren New, San Diego CA, USA (PST)
How come I never get only one kudo?
On 8/3/2011 7:44 AM, Invisible wrote:
> * Tens of thousands of times more expensive than a direct connection.
>
> * Hundreds of thousands of times reduced performance compared to a
> direct connection.
>
> * Radically increased complexity compared to a direct connection.
>
> * If the storage network fails, ALL servers fail, so you're adding a new
> single point of failure.
>
> * All the servers now have to share the limited bandwidth available. One
> busy server can bring all the others to a crawl.
>
> * Significantly more "enterprisey" than a direct connection.
>
> Actually, wait... those are all disadvantages. Really, really /big/
> disadvantages. Huh, OK. So why on Earth would any sane person embark on
> this course of action??
>
>...
Hmm... So, in tech, it's the same as everything else now. Facts, reality,
and sane policy don't matter, just whether or not the idea sells? lol
On 04/08/2011 06:26 PM, Patrick Elliott wrote:
> Hmm.. So, in tech, its the same as everything else now. Facts, reality,
> and sane policy don't matter, just whether or not the idea sells? lol
What do you mean "now"? :-P
I read somewhere that way, waaaaay back in the days when people actually
bought line printers from IBM, and they cost hundreds of thousands of
dollars, there was a model that had a "field installable upgrade". For
something like 80,000 USD, an IBM engineer would come to your site,
remove one of the drive wheels, and install one with a different
diameter. This made the printer print twice as fast.
It's an engineering solution - once you realise that the goal is not to
make a better printer, but to bleed the customer dry.