123host users, you do understand that just like you pay me for hosting, I pay for a server. Just like you count on me to provide certain services, I count on a data centre to keep the server running.
I do the day-to-day maintenance and tweaks on the server and have a contract with system administrators to handle things that are too hard or too scary. Think of it as doing the basic maintenance on your car yourself, but taking it to a mechanic for the serious work.
So…what happens when you take your car to your mechanic and, when you get it back, it makes a noise? You tell them it is something they did; they say it is the way you drive. You insist; they keep fobbing you off. Eventually you demand they do a deeper investigation and gosh, look at that, it was something they did after all.
It was the same story this week with the server outage. My system admins identified the problem as something going on at the server. The data centre were adamant it wasn’t at their end. My system admins dug deeper and were confident of their diagnosis. I got back on to the data centre and dug my heels in and gosh…look at that, it was a problem at their end.
I am sure you can imagine how infuriating this is. Most customers probably don’t notice, as I don’t hear from them. Some customers want to know what is going on and are frustrated but understanding. A few customers are justifiably angry because they are trying to run a business…one decided to quit 123host 🙁 and I don’t blame them; I wouldn’t put up with it either.
I have worked hard to build up a really high level of goodwill at 123host, and to have it all undone by someone else was shattering, to say the least.
Here’s what happened, and you may be shocked at the proposed solution. At 123host an offsite backup runs 3 nights a week. That means quite a bit of data is being transferred at the time, but it doesn’t seem to cause any problems. I discovered that the data centre also runs its own backup every 10 days, which means quite a bit of data being transferred as well. Now imagine if they both happen at the same time! That’s a LOT of data being transferred, and it starts to look like 20 lanes of cars trying to get onto a 4-lane bridge. If there isn’t any traffic everyone moves quickly, but we all know what rush hour is like.
So all this data was trying to transfer, and at the same time something went wrong with their backup, which meant it had been running for 6 days or so. SIX DAYS! Eventually it caused such a bottleneck that, though the server was still running, the bridge was effectively closed to all cars. It took me hours to convince the data centre that the problem was at their end. Eventually someone found the backup process and killed it, and within minutes the cars were streaming across the bridge again; they have been ever since.
Since I do my own backups I asked them to turn that facility off, but it seems they can’t do that. I asked “so what do you suggest I do?” and the unbelievable answer was “wait until it happens again, then call us and we will kill the backup”. I laughed at the person who suggested such an idiotic idea; I won’t call it a solution.
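For the technically curious: since their “solution” amounts to waiting for it to happen again, about the best that can be done on our side is to spot the symptom early and ring them before the bridge jams completely. Below is a minimal sketch, purely illustrative and not what actually runs at 123host, of a small Python watchdog that samples the server’s network throughput and complains when traffic stays unusually high for a sustained stretch. The threshold and timing numbers are made-up assumptions and would need tuning against real traffic.

```python
import time
import psutil  # third-party system stats library (pip install psutil)

# Assumed, illustrative numbers only.
THRESHOLD_BYTES_PER_SEC = 50 * 1024 * 1024  # ~50 MB/s counts as "bridge is jammed"
CHECK_INTERVAL_SECS = 60                    # sample once a minute
SUSTAINED_CHECKS = 10                       # only complain after 10 high samples in a row

def total_bytes():
    """Total bytes sent and received since boot, across all interfaces."""
    counters = psutil.net_io_counters()
    return counters.bytes_sent + counters.bytes_recv

def watch():
    high_count = 0
    previous = total_bytes()
    while True:
        time.sleep(CHECK_INTERVAL_SECS)
        current = total_bytes()
        rate = (current - previous) / CHECK_INTERVAL_SECS  # bytes per second
        previous = current
        high_count = high_count + 1 if rate > THRESHOLD_BYTES_PER_SEC else 0
        if high_count >= SUSTAINED_CHECKS:
            # In reality this would email or text me rather than print.
            print(f"Traffic has stayed above {THRESHOLD_BYTES_PER_SEC} bytes/sec for "
                  f"{SUSTAINED_CHECKS} checks in a row; time to ring the data centre.")
            high_count = 0

if __name__ == "__main__":
    watch()
```

In practice the alert would go to email or SMS, and the “jammed” threshold would come from watching what normal traffic looks like on our connection, not from a number plucked out of the air.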
Right now I am looking at the options for 123host, and it is likely to be a move to a co-located server in partnership with an old friend of mine who also has a web hosting business. This means that we would own (rather than rent) our own server and only be renting the space in the data centre. It also requires a much higher level of skill to manage, but that is what my server administrators get paid for.
Moving accounts to a new server is not a task to be undertaken lightly, but if we can work it out, it will be the last move for a long time.
Thanks for listening.