[04:37:07] PROBLEM - SSH on aqs1008.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:15:28] Data-Engineering-Kanban, Airflow, Patch-For-Review: Update and add copies of projectview hql script to analytics/refinery/hql path - https://phabricator.wikimedia.org/T309023 (Snwachukwu)
[06:19:18] Data-Engineering, Data-Engineering-Kanban, Airflow: Migrate the projectview jobs - https://phabricator.wikimedia.org/T305844 (Snwachukwu)
[06:51:46] !killed oozie wikidata reliability metrics job.
[06:52:25] !restarted oozie wikidata reliability metrics job.
[07:11:38] !log killed Oozie wikidata-articleplaceholder_metrics-coord, wikidata-reliability_metrics-coord, and wikidata-specialentitydata_metrics-coord jobs.
[07:11:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:12:57] !log Started Airflow3 Wikidata metrics jobs (Articleplaceholder, Relia)
[07:12:58] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:14:02] !log Started Airflow 3 Wikidata metrics jobs (Articleplaceholder, Reliability and SpecialEntityData metrics).
[07:14:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:41:01] RECOVERY - SSH on aqs1008.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[11:45:40] PROBLEM - SSH on aqs1008.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[12:46:48] RECOVERY - SSH on aqs1008.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[16:21:57] Data-Engineering, Equity-Landscape: Affiliate Metrics Transformation - https://phabricator.wikimedia.org/T306619 (KCVelaga_WMF) initial roll-up at `kcv.affiliate_output_rank_metrics` pending: - 3-letter to 2-letter country codes - not having all countries causes issues with percent rank
[16:22:44] Data-Engineering, Equity-Landscape: Grants Metrics Transformation - https://phabricator.wikimedia.org/T306620 (KCVelaga_WMF) initial roll-up at `kcv.grants_output_rank_metrics` pending: - 3-letter to 2-letter country codes - not having all countries causes issues with percent rank
[16:25:30] Data-Engineering, Equity-Landscape: Population Metrics Transformation - https://phabricator.wikimedia.org/T306624 (KCVelaga_WMF) blocked by pending issues from T306620 and T306619
[16:26:29] Data-Engineering, Equity-Landscape: Overall Engagement Metric (Transformation) - https://phabricator.wikimedia.org/T306622 (KCVelaga_WMF) blocked by pending issues from T306619 and T306620
[17:47:06] Data-Engineering-Radar, Cassandra, Generated Data Platform, Patch-For-Review: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[18:55:14] Data-Engineering-Radar, Cassandra, Generated Data Platform, Patch-For-Review: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[18:55:50] Data-Engineering-Radar, Cassandra, Generated Data Platform, Patch-For-Review: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[20:13:24] Data-Engineering-Radar, Cassandra, Generated Data Platform, Patch-For-Review: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[21:04:30] PROBLEM - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-clickstream https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:05:12] PROBLEM - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-clickstream https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:06:02] PROBLEM - Check unit status of analytics-dumps-fetch-geoeditors_dumps on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-geoeditors_dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:06:38] PROBLEM - Check unit status of analytics-dumps-fetch-pageview on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-pageview https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:06:50] PROBLEM - Check unit status of analytics-dumps-fetch-mediacounts on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-mediacounts https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:07:18] PROBLEM - Check unit status of analytics-dumps-fetch-geoeditors_dumps on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-geoeditors_dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-clickstream Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-geoeditors_dumps on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-geoeditors_dumps Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-mediacounts on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-mediacounts Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-pageview on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-pageview Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-pageview_complete_dumps on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-pageview_complete_dumps Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:02] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-unique_devices on clouddumps1001 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-unique_devices Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:03] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-clickstream Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:04] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-geoeditors_dumps on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-geoeditors_dumps Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:04] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-mediacounts on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-mediacounts Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:05] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-pageview on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-pageview Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:05] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-pageview_complete_dumps on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-pageview_complete_dumps Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:08:06] ACKNOWLEDGEMENT - Check unit status of analytics-dumps-fetch-unique_devices on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-unique_devices Andrew Bogott Downtime expired, these are a work in progress https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers