[00:26:35] RECOVERY - SSH on an-coord1002.mgmt is OK: SSH OK - OpenSSH_7.4 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:37:12] (VarnishkafkaNoMessages) firing: (3) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[06:42:12] (VarnishkafkaNoMessages) resolved: (4) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[06:46:31] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: Exception caught: Could not enqueue jobs when attempting to 'thank' on Beta Cluster - https://phabricator.wikimedia.org/T322342 (Tgr) Seems to happen for [[https://beta-logs.wmcloud.org/goto/b25ee...
[07:21:41] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime)
[07:22:22] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime)
[07:23:35] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime)
[07:33:48] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime) On attempting to create an account, I logged [[ https://beta-logs.wmcloud.org/app...
[07:43:56] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime) More digging, found [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo...
[08:06:36] Data-Engineering, GitLab, Release-Engineering-Team, serviceops-collab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (hashar) a:hashar→None I have done the first pass investigation back in June to free up disk space but I am otherwise n...
[08:33:47] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime)
[08:34:08] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime) p:Triage→Unbreak! On advisement of @RhinosF1, I agree this may warrant ho...
[08:36:36] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, Thanks: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (RhinosF1) >>! In T322342#8372147, @TheresNoTime wrote: > More digging, found [[ https://gerrit....
[11:48:22] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, and 2 others: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (RhinosF1) Can this be resolved?
[11:51:00] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, and 2 others: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (Ladsgroup) Open→Resolved a:Zabe
[11:52:31] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, and 2 others: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (TheresNoTime) Still getting `Y2ZOP1rG9juoZpSmy0kEugAAAAQ] /w/index.php?returnto=Main+Page&tit...
[14:47:04] joal: https://medium.com/insiderengineering/apache-iceberg-reduced-our-amazon-s3-cost-by-90-997cde5ce931
[15:25:33] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, and 2 others: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (Zabe) Resolved→Open
[16:59:44] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform Value Stream, Growth-Team, and 3 others: JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322342 (Zabe) Open→Resolved