Dear AVR Customers,
This email is a follow-up to the Online system outage that
occurred on Thursday, August 16, and affected a small, but certainly important,
portion of AVR's Online customer base.
At approximately 10:06 a.m. on August 16, a fatal
hardware fault occurred in one of the backend database server systems that
drive the Online product. This backend server system houses your Online
database and those of several other customers. This system is configured with
redundant hardware intended to prevent most outages of this kind. However, the
timing and severity of this particular fault were enough to cause the system
to halt.
The redundancy in place had already effectively PREVENTED
an outage due to an earlier fault, but a second fault, which occurred while the
system was running in degraded mode, overcame the remaining protections.
We worked as rapidly as we could to diagnose and
repair the system, and during that process we decided to transfer a
"system image" of the server onto alternate hardware in order to restore
service more rapidly.
We sincerely apologize for the outage. While we're
thankful that the protections we've built in did indeed stave off a much worse
outage, we are nonetheless not satisfied with this interruption, to say the
least.
Our Online road map already included further backend
improvements that would have either prevented this particular interruption
or drastically reduced its length, getting your Online site up and running
again almost instantly. We were already working on implementing those
improvements, and we are now continuing that work with even greater urgency.
On the "good news" side, the many advanced features and
improvements we've deployed over the past three months have prevented other
outages entirely. We know of multiple events that would have caused problems
and outages but did NOT, thanks to the protections now in place.
Thank you for your patience, and please let us know if
you have any questions.
Sincerely,
AVR