Some systems are experiencing issues

Past Incidents

Thursday 1st August 2019

Planned Outage, scheduled 3 weeks ago

The SciNet datacentre will undergo a maintenance shutdown on August 1st, starting at 7 am EDT. There will be no access to any of the SciNet systems (Niagara, P8, SGC, HPSS, Teach cluster, or the file systems) during this time. This is necessary to finish the installation of an emergency power generator and to perform scheduled maintenance. We expect to be able to bring the systems back online the evening of August 1st.

Filesystem problem

The /project filesystem on the Cedar cluster appears to be unresponsive. We are currently investigating.

Wednesday 31st July 2019

No incidents reported

Tuesday 30th July 2019

Planned Outage, scheduled 3 weeks ago

Graham Cloud is being updated and will be unavailable during the outage. Service is expected to return by 5 pm.

COMPLETE at 1600

Arbutus Intermittent Arbutus networking issues

UPDATE 1855 PST: We believe the issues have been resolved.

Please notify us at cloud@computecanada.ca if you have any further problems.


Intermittent network issues on Arbutus cloud are causing connectivity issues for some VMs. Investigation is underway.

Monday 29th July 2019

Arbutus Login problem

UPDATE: Login issue resolved - investigation into underlying cause underway.

UPDATE: Login issue occurring again. Investigation is underway.

UPDATE: Login issues resolved.

Access to login nodes is disrupted, possibly due to a network failure.

Sunday 28th July 2019

No incidents reported

Saturday 27th July 2019

No incidents reported

Friday 26th July 2019

No incidents reported

Thursday 25th July 2019

No incidents reported

Wednesday 24th July 2019

Filesystem problem

The /scratch filesystem on the Beluga cluster appears to be unresponsive. We are currently investigating.

Tuesday 23rd July 2019

No incidents reported

Monday 22nd July 2019

No incidents reported

Sunday 21st July 2019

Cedar Updated: Cedar Planned Outage

Update, July 21: Another controller arrived today and was installed. This one appears to be working. Cedar is available again.

Update, July 21: We replaced the faulty storage controller yesterday, but the new one is also faulty. We are still in discussion with the vendor to evaluate how to proceed, and we hope to be in production later today.

Update, July 20: The verify process that was triggered by defective hardware has finally finished. A few rebuild processes are still running and need to finish in order to bring the storage system into a stable state. We have to let those verify and rebuild processes finish in order to protect the integrity of the /home and /scratch filesystems and to avoid loss of data. Our current estimate is that Cedar will become available late today. We sincerely apologize for the situation and wish that it could have been avoided. Because of the extended downtime, the purge of old files in /scratch will not be done this month.

Update, July 17: The replacement parts for the Cedar storage system that serves the /home and /scratch filesystems arrived in time today (July 17). However, once installed, they were found to be defective as well. Furthermore, the defects triggered a verify process of all disks in the system. This verify process is very slow and is expected to run all of July 18. At this point we hope that the system will become available sometime on July 19. We wish we had better news and apologize for the situation.

Outage extended to end of day July 17. Unfortunately, some extra work is required to complete the outage, so it had to be extended by one day. Apologies for the unexpected extension.

An outage is planned for filesystem maintenance and node upgrades and is expected to last until end of day July 17. The Cedar facility will be unavailable on July 15, 16 and 17 because of necessary upgrades to the /home and /scratch filesystems. These upgrades require a filesystem check, which is estimated to run for about two days. All jobs that are still running on the morning of the 15th will need to be terminated. We apologize for the inconvenience.

Graham Filesystem problem

UPDATE 2019-07-22 11h06 EDT: The problem with /scratch has been resolved.

The /scratch filesystem on the Graham cluster appears to be unresponsive. We are currently investigating.

Saturday 20th July 2019

No incidents reported

Friday 19th July 2019

No incidents reported