Page 7 of 7 FirstFirst ... 4567
Results 151 to 174 of 174
  1. #151
    Join Date
    May 2013
    Location
    Florida
    Posts
    462
    Quote Originally Posted by David-Lava View Post
    Agreed, but the same thing happened with Webhostpython 6 months ago ... the server was down for more than 3 days

    hey @Webhostpython

    Any updates?
    Hello, David-Lava,

    Could you possibly provide us with a ticket reference number so that the team can look into this further? Thank you in advance, and once again, apologies for the inconvenience caused.

  2. #152
    Join Date
    Nov 2011
    Location
    Internet
    Posts
    115
    Quote Originally Posted by Webhostpython View Post
    Hello and apologies for the inconvenience. Can you share the ticket reference number, if you have it, so I can escalate the issue to the team for a focused investigation into your server?
    Thank you, I already sent you a private message.

  3. #153
    Quote Originally Posted by Webhostpython View Post
    Hello, David-Lava,

    Could you possibly provide us with a ticket reference number so that the team can look into this further? Thank you in advance, and once again, apologies for the inconvenience caused.
    @Webhostpython - this is my server ns65v.stableserver () net

    Not sure why you need a ticket # when you already know the reason for the failure. If it will help you investigate and fix the server, here it is: #XYM-951-94967

    Pleeeeeeease, give me an ETA or ... any option to get my backup files. This is the third day and I have no idea when my account will be back. Thanks for understanding

  4. #154
    Join Date
    Feb 2008
    Location
    Austin, Texas
    Posts
    272
    I have yet to see an actual RCA that explains what happened, not just the excuse of storms or a generic statement about power loss.

    True A+B power should have A completely separate from B. A should have its own UPS, generator, ATS, and distribution, and B should as well. I would like to see a proper explanation of how this occurred and what is going to be done to prevent it from happening in the future. My contract is up and now month-to-month, so the answer might impact how I handle that.
    Last edited by hermetek; 05-28-2024 at 05:17 PM.
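The independence argument above can be made concrete with a toy reliability calculation. This is a sketch with made-up illustrative probabilities (not measurements from any real facility): it compares a true A+B design, where the load is lost only if both fully independent chains fail, against a design where both chains fall back on one shared component, such as a single catcher UPS.

```python
# Toy model: probability of losing the critical load during a utility event.
# All failure probabilities are assumed illustrative numbers, not real data.

P_CHAIN_FAIL = 0.02   # assumed chance one full power chain (gen + UPS + ATS) fails
P_SHARED_FAIL = 0.02  # assumed chance the shared fallback component fails

def outage_probability_independent(p_a: float, p_b: float) -> float:
    """True A+B: load is lost only if BOTH independent chains fail."""
    return p_a * p_b

def outage_probability_shared(p_chain: float, p_shared: float) -> float:
    """Both chains depend on one shared component: load is lost if both
    chains fail OR the shared component fails."""
    both_chains = p_chain * p_chain
    return 1 - (1 - both_chains) * (1 - p_shared)

independent = outage_probability_independent(P_CHAIN_FAIL, P_CHAIN_FAIL)
shared = outage_probability_shared(P_CHAIN_FAIL, P_SHARED_FAIL)

print(f"independent A+B:  {independent:.4%}")  # prints 0.0400%
print(f"shared component: {shared:.4%}")       # prints 2.0392%
```

With these assumed numbers, the shared component raises the outage probability by roughly 50x, which is why a catcher UPS feeding both blocks undermines the "A completely separate from B" guarantee.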

  5. #155
    Join Date
    Apr 2014
    Posts
    973
    Quote Originally Posted by hermetek View Post
    I have yet to see an actual RCA that explains what happened, not just the excuse of storms or a generic statement about power loss.

    True A+B power should have A completely separate from B. A should have its own UPS, generator, ATS, and distribution, and B should as well. I would like to see a proper explanation of how this occurred and what is going to be done to prevent it from happening in the future. My contract is up and now month-to-month, so the answer might impact how I handle that.
    I haven't seen any yet. Maybe because the weather is still chaotic, and it may happen again?

  6. #156
    @Webhostpython,

    What is the status of the ns65v server?

    Man, this isn't the right strategy ... you asked for the ticket #, you got it. Please let me know:

    1. when I can get the backup files
    2. when the server can be restored or migrated to another server
    3. any updates?

  7. #157
    Join Date
    May 2013
    Location
    Florida
    Posts
    462
    Quote Originally Posted by David-Lava View Post
    @Webhostpython,

    What is the status of the ns65v server?

    Man, this isn't the right strategy ... you asked for the ticket #, you got it. Please let me know:

    1. when I can get the backup files
    2. when the server can be restored or migrated to another server
    3. any updates?
    Hello, David-Lava,

    Thank you for reaching out. I will send you a private message with some further details. In the meantime, please take a look at the latest update for server ns65v on our status page.

  8. #158
    Join Date
    May 2013
    Location
    Florida
    Posts
    462
    Quote Originally Posted by hermetek View Post
    I have yet to see an actual RCA that explains what happened, not just the excuse of storms or a generic statement about power loss.

    True A+B power should have A completely separate from B. A should have its own UPS, generator, ATS, and distribution, and B should as well. I would like to see a proper explanation of how this occurred and what is going to be done to prevent it from happening in the future. My contract is up and now month-to-month, so the answer might impact how I handle that.
    I'm confident that the team will provide complete information once everything has been resolved and stabilised. Thank you for your patience.

  9. #159
    Join Date
    Mar 2005
    Posts
    2,439
    Quote Originally Posted by Webhostpython View Post
    I'm confident that the team will provide complete information once everything has been resolved and stabilised. Thank you for your patience.
    We are all still waiting on Prime to provide us with their RFO so we can then pass that information along to customers. We are reaching out to our Prime rep shortly asking for an ETA on their RFO. We have a pretty good sense of what happened at this point but we will await official word from Prime before we say much more.

  10. #160
    Quote Originally Posted by David-Lava View Post
    @Webhostpython - this is my server ns65v.stableserver () net

    Not sure why you need a ticket # when you already know the reason for the failure. If it will help you investigate and fix the server, here it is: #XYM-951-94967

    Pleeeeeeease, give me an ETA or ... any option to get my backup files. This is the third day and I have no idea when my account will be back. Thanks for understanding
    Nothing has changed in the last 24 hours.

  11. #161
    I had 2 servers with the Psychz network.
    It took 6 hours to get them up and running.
    It was a Sunday for me, so my clients didn't trouble us.

  12. #162
    Join Date
    May 2013
    Location
    Florida
    Posts
    462
    Quote Originally Posted by David-Lava View Post
    Nothing has changed in the last 24 hours.
    Hello, the team has reached out to you via the support ticket. Please take a look and let us know if everything is operational on your end.

  13. #163
    Quote Originally Posted by Webhostpython View Post
    Hello, the team has reached out to you via the support ticket. Please take a look and let us know if everything is operational on your end.
    24 hours ago I reported that my account was down; this morning I got a response that my only website is live (I even got a screenshot :-) ) ... the only thing is that after being offline for 4-5 days and getting no response for 10 hours, I decided to move away from Webhostpython. Thanks for your fantastic support. For the last 20 hours my website has been hosted with Rachnerd (temporarily); do you have any suggestions for where I can move my account?
    @Webhostpython, I liked your service (of course, when it was UP), but with 10-24 hour response times and 4-5 days offline ... this isn't for me.

  14. #164
    Join Date
    May 2013
    Location
    Florida
    Posts
    462
    Quote Originally Posted by David-Lava View Post
    24 hours ago I reported that my account was down; this morning I got a response that my only website is live (I even got a screenshot :-) ) ... the only thing is that after being offline for 4-5 days and getting no response for 10 hours, I decided to move away from Webhostpython. Thanks for your fantastic support. For the last 20 hours my website has been hosted with Rachnerd (temporarily); do you have any suggestions for where I can move my account?
    @Webhostpython, I liked your service (of course, when it was UP), but with 10-24 hour response times and 4-5 days offline ... this isn't for me.
    Could we please continue the discussion in the ticket?

  15. #165
    Here's the postmortem report:

    https://billing.tier.net/serverstatus.php

    The truly disturbing bit is that it took somewhere from 45 to 60 minutes before a human being noticed all of these problems occurring with the power distribution. Whatever warning or alert systems they have in place really, really suck. Or they hire really poor technicians...

  16. #166
    Join Date
    Feb 2008
    Location
    Austin, Texas
    Posts
    272
    Anyone have an official RFO?

  17. #167
    Join Date
    Dec 2023
    Location
    Dallas, TX
    Posts
    15
    Quote Originally Posted by hermetek View Post
    Anyone have an official RFO?
    HIVELOCITY:

    Hivelocity - DAL1 outage summary for event Sunday, May 26th


    Below is a copy of the Full Outage Summary sent to Hivelocity from our DAL1 facility operator https://primedatacenters.com/ in Dallas, TX. All times listed are US Central Standard Time. Hivelocity’s monitoring systems indicate power loss to our gear occurred at roughly 6:15am CT with systems beginning to see power restoration around 8:45am CT. Our NOC Support and Networking Engineering Teams in Dallas were on-site throughout the event and began remediation efforts the moment power was restored. It is important to note that we have confirmed with https://primedatacenters.com/ that both generators have been tested rigorously via the ATS (automatic transfer switch) since the event and operated flawlessly. A second email that will serve as our official RFO will be sent in the coming days once Prime has completed their analysis and is able to provide us with further details. While we wait for that, here is the Full Outage Summary we have been supplied.



    Full Outage Summary


    Utility power was lost and both Gen A and Gen B started running. Gen A then tripped off immediately after load transfer for an undetermined reason. Gen A continued trying to restart, completely draining and killing its battery even though it had recently passed its annual PM. UPS 1A and 2A fully discharged their batteries. Once that happened, STS 1A and 2A changed sources to the secondary source, which was the Catcher UPS being fed from Gen B. This overloaded the Catcher UPS and forced it into bypass without fully discharging its batteries. The added load on Gen B caused significant voltage fluctuations, which led UPS 1B and 2B to declare the source unavailable; they stayed on battery until fully discharged, then went offline. The load on blocks 1A and 2A remained powered through the Catcher UPS in bypass, fed from Gen B. When the utility returned, the open transition between Gen B and Utility caused the remaining online equipment to trip offline, resulting in all of the PDU main input breakers tripping. This is why the customer load was not immediately restored with the utility.

    Timeline

    5:57 AM – Utility power to the facility was lost.

    • Gen A starts but eventually fails and drains battery trying to restart.

    • Gen B starts and continues to operate until utility returns with significant voltage and frequency instability. (Due to overloading)

    5:58 AM – Overall System Status

    • UPS 1A / 2A discharging due to no Generator power available.

    • UPS 1B / 2B discharging due to ATS transition.

    • Mechanical fed from Gen B back online (Half capacity)

    5:58 AM – ATS-1B and ATS-2B Load transferred to Generator.

    6:01 AM – UPS-1A Offline, STS-1A transfers to Source 2 (Catcher UPS)

    6:03 AM – UPS-2A Offline, STS-2A transfers to Source 2 (Catcher UPS)

    6:04 AM – Load on Catcher exceeds catcher capacity forcing Catcher UPS to bypass fed from Gen B

    6:05 AM – Gen B starts to experience significant voltage and frequency fluctuations due to overloading.

    6:06 AM – Due to poor power quality from the generator, UPS-1B and UPS-2B declare input power substandard and resume discharging from batteries.

    6:15 AM – UPS-1B, with input power considered bad, fully discharges its batteries and the downstream load is lost. Downstream STS-1B, unable to switch to source 2 (catcher) due to power quality, shuts down. Downstream PDUs open their main input breakers.

    6:21 AM – UPS-2B, with input power considered bad, fully discharges its batteries and the downstream load is lost. Downstream STS-2B, unable to switch to source 2 (catcher) due to power quality, shuts down. Downstream PDUs open their main input breakers.

    6:28 AM – Utility comes back and all CRACs come back. When ATS-B performs the open transition from Generator to Utility, the remaining PDUs operating on Gen B lose power due to having no remaining UPSs with battery capacity, opening the remaining PDU main input breakers.

    7:15 AM – Technician identifies UPS 1A, 1B and 2A are all offline. UPS 2B is online in bypass with no load.

    8:20 AM – All UPSs reset and brought back online in normal operation.

    8:50 AM – All tripped PDU main input breakers reset, and customer load restored.

    If you have any questions regarding this RFO please feel free to reach out to me at REDACTED.

    If you are still experiencing any residual issues from the event please open a ticket via email to REDACTED.

    We apologize for the disruption this has caused to your business, and we will work toward correcting the issues that caused this series of events.

  18. #168
    Join Date
    Mar 2005
    Posts
    2,439
    Prime is load banking the generators today which we believe will uncover some additional information regarding the outage. Additionally, Prime has scheduled a PM (preventive maintenance) in roughly 2 weeks. We will be emailing our customers updated information as it is provided to us. Ideally today or tomorrow but more likely early next week.

  19. #169
    Join Date
    Mar 2014
    Location
    su -
    Posts
    6,859
    Quote Originally Posted by hivelocitygm View Post
    Prime is load banking the generators today which we believe will uncover some additional information regarding the outage. Additionally, Prime has scheduled a PM (preventive maintenance) in roughly 2 weeks. We will be emailing our customers updated information as it is provided to us. Ideally today or tomorrow but more likely early next week.
    Thanks for the updates, even after the thread seemed to have died down. It's good to see someone keeping the community in the loop.

    Do you think there would be any interruption/downtime during the PM?

  20. #170
    Join Date
    Mar 2005
    Posts
    2,439
    Quote Originally Posted by MechanicWeb-shoss View Post
    Thanks for the updates, even after the thread seemed to have died down. It's good to see someone keeping the community in the loop.

    Do you think there would be any interruption/downtime during the PM?
    No interruption during the PM. Typically a PM just involves inspecting the systems: oil changes, belt changes, battery tests, etc. Most data centers do PMs on their generators every 3 to 6 months and customers are none the wiser.

  21. #171
    Has the facility provider troubleshot and fixed this issue?

    Gen A then tripped off immediately after load transfer for an undetermined reason. <--- what is the reason?!?

    Gen A continued to try to restart, completely draining and killing the battery even though it recently passed annual PM.

  22. #172
    Join Date
    Mar 2005
    Posts
    2,439
    Quote Originally Posted by bostongio View Post
    Has the facility provider troubleshot and fixed this issue?
    Here is an update we provided to our customers yesterday.

    While there is further testing to be done and more information to uncover, we wanted to provide you with the latest information we have regarding the outage at the Prime facility in Dallas, which impacted our DAL1 data center just a few weeks ago. Since the outage, Prime engineers have been rigorously inspecting and testing each component associated with the facility’s power redundancy. What we can tell you today is that the UPS, ATS, and generators have each been tested and have all performed as designed and desired during these tests. The batteries have held the critical power load with utility power removed. Initiating the ATS automatically started the generators, verifying that the ATS logic is working, as well as the relays and wiring to the generator sets. Lastly, the generators have been load banked and held a power load equal to that of the facility’s for hours without issue. Next week, each of the diesel generators will get preventive maintenance and further inspection. We will be sure to provide you with updates as further tests and analysis are performed.

  23. #173
    It's super concerning that everything has "performed as designed and desired." If a root cause isn't established for this outage, there's just as much chance it'll happen again in the future. The fact that technicians have uncovered nothing wrong is not good.

  24. #174
    Join Date
    Mar 2014
    Location
    su -
    Posts
    6,859
    Quote Originally Posted by bostongio View Post
    The fact that technicians have uncovered nothing wrong is not good.
    That is true for every troubleshooting scenario unless it is a cover-up or eyewash. No one is safe until the root cause of the failure is identified and addressed.

    The problem is that although some providers charge a hefty price for servers hosted in Prime, there aren't many good alternatives for these providers in terms of data center cost. I guess they are just biting the bullet on this one and hoping for the best.


