Tag Archives: IT

IT Stuff: How not to deal with valuable data

Friday:

Some Facebook groups that you may find helpful:

Friday:

From Andrew’s Blog: How to recover most or all of your journalspace posts/images using Google cache

Friday:

Hi there, Slashdotters. My blog has some more information which you may find interesting.

Tuesday:

Journalspace is no more.

DriveSavers called today to inform me that the data was unrecoverable.

Here is what happened: the server which held the journalspace data had two large drives in a RAID configuration. As data is written (such as saving an item to the database), it’s automatically copied to both drives, as a backup mechanism.

The value of such a setup is that if one drive fails, the server keeps running, using the remaining drive. Since the remaining drive has a copy of the data on the other drive, the data is intact. The administrator simply replaces the drive that’s gone bad, and the server is back to operating with two redundant drives.

But that’s not what happened here. There was no hardware failure. Both drives are operating fine; DriveSavers had no problem in making images of the drives. The data was simply gone. Overwritten.

The data server had only one purpose: maintaining the journalspace database. There were no other web sites or processes running on the server, and it would be impossible for a software bug in journalspace to overwrite the drives, sector by sector.

The list of potential causes for this disaster is a short one. It includes a catastrophic failure by the operating system (OS X Server, in case you’re interested), or a deliberate effort. A disgruntled member of the Lagomorphics team sabotaged some key servers several months ago after he was caught stealing from the company; as awful as the thought is, we can’t rule out the possibility of additional sabotage.

But, clearly, we failed to take the steps to prevent this from happening. And for that we are very sorry.

So, after nearly six years, journalspace is no more.

If you haven’t yet, visit Dorrie’s Fun Forum; it’s operated by a long-time journalspace member. If you’re continuing your blog elsewhere, you can post the URL there so people can keep up with you.

We’re considering releasing the journalspace source code to the open source community. We may also sell the journalspace domain and trademarks. Follow us on twitter at twitter.com/jsupgrades for news.

As a somewhat newly-minted overnight admin for a firm that deals almost exclusively in important information, I take this story as a reminder of how important it is to keep up-to-date backups (something that I don’t do on my personal machines as much as I should).

I don’t imagine for a second that my customers would be happy to find out that I, like these people, was not keeping a non-volatile backup of my important data. I don’t know if their customers were even paying for the service, though, so this may be a non-issue in terms of cash. I didn’t look any deeper to find out.

As for my personal sites, I keep backups off site and on other media, but not as often as one might want or expect. In fact, I’m doing one now just in case.
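
To make the mirror-versus-backup point concrete, here is a minimal Python sketch. It is a toy model, not how any real RAID controller works: `MirroredVolume` and `take_backup` are names invented for illustration, and the "drives" are just dictionaries. It only shows that a mirror faithfully repeats every write, including a destructive one, while a separate copy is untouched.

```python
import copy


class MirroredVolume:
    """Toy model of a two-drive mirror: every write lands on both drives."""

    def __init__(self):
        self.drives = [{}, {}]

    def write(self, key, value):
        for drive in self.drives:
            drive[key] = value

    def wipe(self):
        # An overwrite, deliberate or not, is just another write to the mirror.
        for drive in self.drives:
            drive.clear()

    def read(self, key):
        # Either drive can answer, which is what protects against one drive dying.
        for drive in self.drives:
            if key in drive:
                return drive[key]
        return None


def take_backup(volume):
    """Copy the data somewhere the volume's own writes cannot reach."""
    return copy.deepcopy(volume.drives[0])


volume = MirroredVolume()
volume.write("post-1", "hello world")
backup = take_backup(volume)

volume.wipe()
print(volume.read("post-1"))  # None
print(backup.get("post-1"))   # hello world
```

The mirror survives a dead drive, but only the separate copy survives the wipe.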

Some Advice for IT Types

“IT is at the heart of business these days and there are real opportunities now to have a career in IT which will ultimately lead to a position on the board.”

If this is the case, why are so many IT jobs filled with people who have no idea what they are doing? I have spoken to my share of IT reps from firms all over the Fortune 1000 and Fortune 50 who had no clue what they were doing, nor any idea where they were going with their mandates. Often they had no plan of action at all.

One example really sticks out for me: a hardware changeover plan that had no “buffer”. The IT rep wanted to replace an important firewall with another one. He felt assured that he could just replace the current device with a new and wholly different one, as long as the new device was configured correctly.

This was a bad plan for two reasons:

1) There was no fallback beyond dropping the old hardware back in place.

2) The router was the MAIN ingress to their websites and mail systems.  There were no external fallbacks or alternate sites for users to visit during the downtime.  If the transition went BAD (new hardware fails and old device breaks during transition) there was no fallback.

I know, you’re thinking: Kevin, what would you have done?
I would have published a new set of DNS records with a TTL of about 15 minutes. I would have published them a week before the transition and made sure my DNS server was not behind the new router. With those records in place, if something went wrong during the switch you would have about 15 minutes of downtime while you moved the website to a new host. That’s fairly easy to deal with.

I like the idea of planning for downtime like that; you could even change the TTL on the DNS records back to 24 hours when you are done.
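
The timing behind that plan can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: `cutover_plan` is a made-up helper, the maintenance date is hypothetical, the old TTL is taken to be the 24 hours mentioned above, and resolvers are assumed to honor TTLs (not all of them do).

```python
from datetime import datetime, timedelta


def cutover_plan(cutover, old_ttl, new_ttl):
    """Return key times for a DNS-based fallback around a planned cutover.

    old_ttl and new_ttl are timedeltas. Resolvers may cache the old record
    for up to old_ttl, so the short TTL has to be published at least that
    long before the cutover for it to be in effect everywhere.
    """
    return {
        "publish_short_ttl_by": cutover - old_ttl,
        # If the switch goes wrong and the records are repointed at the
        # cutover time, caches should have expired by this point.
        "worst_case_recovery": cutover + new_ttl,
    }


plan = cutover_plan(
    cutover=datetime(2008, 7, 1, 22, 0),   # hypothetical maintenance window
    old_ttl=timedelta(hours=24),
    new_ttl=timedelta(minutes=15),
)
print(plan["publish_short_ttl_by"])   # 2008-06-30 22:00:00
print(plan["worst_case_recovery"])    # 2008-07-01 22:15:00
```

Publishing a full week ahead, rather than the bare 24 hours, simply leaves a comfortable margin over the old TTL.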

Here are some tips for outage planning:

  1. Have a fallback plan for total failure:

    If it is an internet-enabled service that users need access to, have DNS records ready that point to a “Server is down” page on the net (for web services) for when the primary server(s) are down.

    Keep offsite hard copies (by hard copies I mean copies stored on hard disk or tape).

    Keep enough cash in the IT budget to buy server time on multiple hosts should short-term downtime become extended downtime.

    Any server that is important enough to serve all your needs should have a clone on hand with all the same data, backed up every 6 to 12 hours (or less) so that if your primary server(s) go down a clone can go online in seconds.

  2. Announce the outage in as many ways as possible.  Email is never enough for big outages.  Warn users in cloud writing if you think they will read it.
  3. When the outage is going to take a machine out of service forever, contact any old admins and/or users and determine if they have stored anything important on the box.  You never know.
  4. Treat every outage as a potential crisis and be ready for complaints regardless of success or shortness of time.
  5. Confirm that all parts and plans are in order before the outage is underway. If at all possible, create a schedule and checklist for the outage, with a series of milestones and ETAs that can be delivered to end users and managers.
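
The schedule in tip 5 does not need special tooling. As a minimal sketch in Python, assuming a made-up `build_schedule` helper, invented step names and durations, and a hypothetical start time, a list of steps can be turned into milestones with ETAs like this:

```python
from datetime import datetime, timedelta


def build_schedule(start, steps):
    """Turn (description, minutes) pairs into (eta, description) milestones.

    Each ETA is the time by which that step should be finished.
    """
    schedule = []
    eta = start
    for description, minutes in steps:
        eta += timedelta(minutes=minutes)
        schedule.append((eta, description))
    return schedule


# Step names and durations are illustrative only.
steps = [
    ("Confirm backups and spare parts", 15),
    ("Take old device offline", 5),
    ("Install and test new device", 30),
    ("Announce service restored", 10),
]

for eta, description in build_schedule(datetime(2008, 7, 1, 22, 0), steps):
    print(eta.strftime("%H:%M"), description)
```

The printed list is the sort of thing that can be pasted straight into the outage announcement for users and managers.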

After all, you are the heart of the business when you are in IT, right?

Non-IT Grads don't want IT Jobs

Just read this passage and wonder at it:

Non-IT graduates think a job in IT would be “boring,” despite its good career prospects, according to the Career Development Organisation (CDO).

http://www.computerweekly.com/Articles/2008/06/24/231173/it-is-boring-say-graduates.htm

Read it again, I’ll wait.

Okay, got it? It opens with “Non-IT Graduates”, as if to say someone who went through school to get their MBA or Masters in Psychology would be interested in, or even qualified for, an IT position. I think the article is grasping for a “why not IT in the first place?” kind of feeling, but instead it comes to a screeching halt right up front with that first line. I read it as “people who were never interested in IT think that IT jobs are boring”, and you know what? They should not get into IT if they feel that way.

I’m fairly certain that there are a number of people in IT these days who got into it for the money and, through sheer personality, have excelled. Good for them. That has kept down a few really smart people in the ranks who don’t have the social skills to impress the uppers, but maybe those types will be weeded out and the more focused geeks will rise to prominence.

Time will tell I guess.

So Where is that Server Located Anyway

A new database tool has come into play, and some aspects of net access to our sites have become slow. Naturally the new web-based tool is blamed. For S & Gs I propose that the new tool is in fact the cause, leading to the usual scoffing from Mr. J.

Jared: siebel is hosted here!
Kevin: I’m just saying, what if they have managed to do this?
Jared: So, what you’re saying is everyone is insane then?
Kevin: Yes, that is what I am saying, everyone is insane.
Kevin (Thinks about it a bit) : What do I get if I am right, a cookie?
Jared: Yes you get a cookie.
Kevin (quick traceroute later): Second hop, hosted in Cupertino (far away).
Jared: You get your cookie.