[01:00:57] (CR) Huji: [C: -1] "Misunderstanding on my part. Will revise the patch to use $wgNamespaceProtection and only restrict the article namespace." [mediawiki-config] - https://gerrit.wikimedia.org/r/721108 (https://phabricator.wikimedia.org/T291018) (owner: Huji)
[01:01:23] (PS4) Huji: Temporarily disable article editing by anonymous users on fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/721108 (https://phabricator.wikimedia.org/T291018)
[01:07:25] (PS5) Huji: Temporarily disable article editing by anonymous users on fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/721108 (https://phabricator.wikimedia.org/T291018)
[05:25:02] PROBLEM - PHD should be supervising processes on phab1001 is CRITICAL: PROCS CRITICAL: 2 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
[05:26:57] RECOVERY - PHD should be supervising processes on phab1001 is OK: PROCS OK: 4 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
[07:00:04] Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210919T0700)
[09:09:47] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=atlas_exporter site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:11:37] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[11:37:47] PROBLEM - SSH on analytics1069.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[12:38:33] RECOVERY - SSH on analytics1069.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[15:04:47] (CR) Lucas Werkmeister: Perform rolling restarts on kubernetes (1 comment) [software/tools-webservice] - https://gerrit.wikimedia.org/r/721989 (https://phabricator.wikimedia.org/T290833) (owner: Lucas Werkmeister)
[15:23:19] Puppet, Cloud-VPS, Infrastructure-Foundations, User-jbond, cloud-services-team (Kanban): Add more rspec test to the puppet code - https://phabricator.wikimedia.org/T289668 (Majavah)
[15:23:28] Puppet, Cloud-VPS, Infrastructure-Foundations, User-jbond, cloud-services-team (Kanban): Gather a list of puppet modules shared between cloud and production - https://phabricator.wikimedia.org/T289666 (Majavah)
[15:23:35] Puppet, Cloud-VPS, Infrastructure-Foundations, User-jbond, cloud-services-team (Kanban): Normalise hiera default values - https://phabricator.wikimedia.org/T289665 (Majavah)
[15:23:43] Puppet, Cloud-VPS, Infrastructure-Foundations, Patch-For-Review, and 2 others: Audit usages or the realm variable with a view to drop it - https://phabricator.wikimedia.org/T289661 (Majavah)