[00:46:07] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[01:22:05] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[01:55:44] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[01:57:18] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[02:11:36] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[02:23:05] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[02:35:05] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues -
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[02:50:05] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[02:54:56] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[02:59:22] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[03:13:24] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[03:19:04] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[03:29:50] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[03:33:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[03:51:49] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[03:58:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:02:04] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:07:04] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses -
https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:07:34] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:17:39] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[04:47:55] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:14:30] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:17:35] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[05:22:26] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:35:51] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:38:02] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:48:17] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:52:16] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[05:59:36] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[06:03:24] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[06:10:15] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47
(https://phabricator.wikimedia.org/T360509)
[06:31:28] FIRING: NfsAlmostFull: The NFS drive is over 85% capacity (currently 86.51%) at host paws-nfs-1 in project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DNfsAlmostFull
[07:26:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance tools-nfs-2 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:31:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance toolsbeta-prometheus-2 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:46:28] FIRING: [2x] PuppetAgentFailure: Puppet agent failure detected on instance tools-nfs-2 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:51:28] FIRING: [3x] PuppetAgentFailure: Puppet agent failure detected on instance tools-nfs-2 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:59:23] (update) taavi: Query logs from Loki [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/180 (https://phabricator.wikimedia.org/T398645)
[08:01:28] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance tools-prometheus-8 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[08:09:49] (merge) taavi: Query logs from Loki [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/180 (https://phabricator.wikimedia.org/T398645)
[08:11:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol1006:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[08:12:00] cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol1006:9100 - https://phabricator.wikimedia.org/T400791
(phaultfinder) NEW
[08:12:47] (open) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.388-20250730081001-aabaef95 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/886 (https://phabricator.wikimedia.org/T398645)
[08:13:11] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:16:48] FIRING: [2x] PuppetFailure: Puppet has failed on cloudcontrol1006:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[08:16:57] cloud-services-team: PuppetFailure - https://phabricator.wikimedia.org/T400792 (phaultfinder) NEW
[08:21:48] FIRING: [2x] PuppetFailure: Puppet has failed on cloudcontrol1006:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[08:22:04] (open) dcaro: d/changelog: bump to 0.103.16 [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/76 (https://phabricator.wikimedia.org/T360488 https://phabricator.wikimedia.org/T380141 https://phabricator.wikimedia.org/T384788)
[08:22:17] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:22:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:22:58] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance toolsbeta-prometheus-2 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[08:23:53] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q4:rack/setup/install clouddb102[2-5] -
https://phabricator.wikimedia.org/T393733#11045957 (fnegri) @VRiley-WMF I //think// that my patch above (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1173974) should fix t...
[08:26:48] RESOLVED: [2x] PuppetFailure: Puppet has failed on cloudcontrol1006:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[08:32:48] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:33:18] (merge) taavi: jobs-api: bump to 0.0.388-20250730081001-aabaef95 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/886 (https://phabricator.wikimedia.org/T398645) (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[08:35:07] (open) taavi: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645)
[08:35:08] (update) taavi: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645)
[08:45:04] (update) taavi: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645)
[08:45:05] (update) taavi: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:45:06] (open) taavi: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:45:09] (update) taavi: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:45:32] (update) taavi: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:45:37] (update) taavi: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645)
[08:47:43] (approved) dcaro: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645) (owner: taavi)
[08:48:18] (approved) dcaro: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645) (owner: taavi)
[08:49:28] (merge) taavi: jobs-api: local: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/887 (https://phabricator.wikimedia.org/T398645)
[08:49:30] (update) taavi: jobs-api: toolsbeta: Add Loki URL setting [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:52:57] (merge) taavi: jobs-api: toolsbeta: Add Loki URL setting
[repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/888 (https://phabricator.wikimedia.org/T398645)
[08:53:19] (close) dcaro: d/changelog: bump to 0.103.16 [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/76 (https://phabricator.wikimedia.org/T360488 https://phabricator.wikimedia.org/T380141 https://phabricator.wikimedia.org/T384788)
[08:53:24] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component logging
[08:53:40] (open) dcaro: d/changelog: bump to 0.103.16 [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/77 (https://phabricator.wikimedia.org/T360488 https://phabricator.wikimedia.org/T384788)
[08:53:51] Change on wikitech.wikimedia.org a page Help:Toolforge/Running jobs was modified, changed by Jon Harald Søby link https://wikitech.wikimedia.org/w/index.php?diff=2328371 edit summary: singular
[09:08:35] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component logging
[09:41:31] FIRING: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 1m 35s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[09:45:31] cloud-services-team, Toolforge (Toolforge iteration 22): [jobs-api] 400 bad request when trying to load a large number of logs from Loki - https://phabricator.wikimedia.org/T400795 (taavi) NEW
[09:45:39] cloud-services-team, Toolforge (Toolforge iteration 22): [jobs-api] 400 bad request when trying to load a large number of logs from Loki - https://phabricator.wikimedia.org/T400795#11046125 (taavi)
p:Triage→High
[09:47:18] cloud-services-team, Toolforge (Toolforge iteration 22): [jobs-api] 400 bad request when trying to load a large number of logs from Loki - https://phabricator.wikimedia.org/T400795#11046127 (taavi) The error from Loki is: `counterexample max entries limit per query exceeded, limit > max_entries_limit_per...
[09:47:57] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796 (fnegri) NEW
[09:51:00] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046149 (fnegri) Tagging the tool maintainers: @Edoardolenzi, @Hjfocs, @Magnus, @MaxFrax96. I will sort this out by applying the slow transaction manually...
[09:53:52] (open) dcaro: roles: remove unused custom component [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/258
[09:54:38] (approved) dcaro: roles: remove unused custom component [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/258
[09:54:53] (merge) dcaro: roles: remove unused custom component [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/258
[09:56:54] (open) taavi: loki_logs: Raise user-friendly error when exceeding line limit [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/188 (https://phabricator.wikimedia.org/T400795)
[09:57:08] cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol1006:9100 - https://phabricator.wikimedia.org/T400791#11046173 (dcaro) Open→Resolved a:dcaro This was due to gitlab maintenance {T399306}
[09:57:15] (update) taavi: loki_logs: Raise user-friendly error when exceeding line limit
[repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/188 (https://phabricator.wikimedia.org/T400795)
[09:57:38] cloud-services-team, Toolforge (Toolforge iteration 22), Patch-For-Review: [jobs-api] 400 bad request when trying to load a large number of logs from Loki - https://phabricator.wikimedia.org/T400795#11046183 (taavi) Open→In progress
[09:57:42] cloud-services-team: PuppetFailure - https://phabricator.wikimedia.org/T400792#11046188 (dcaro) Open→Resolved a:dcaro This was due to gitlab maintenance {T399306}
[09:58:54] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-06-30 - https://phabricator.wikimedia.org/T398170#11046195 (fnegri)
[09:59:36] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-06-30 - https://phabricator.wikimedia.org/T398170#11046207 (fnegri)
[10:00:19] (update) taavi: loki_logs: Raise user-friendly error when exceeding line limit [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/188 (https://phabricator.wikimedia.org/T400795)
[10:01:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance tools-sgebastion-10 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[10:01:31] RESOLVED: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 18m 35s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[10:03:11] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046212 (fnegri) Applied that transaction manually.
Replication restarted and got stuck again on a different query on the same table: ` | 6275807 | syste...
[10:03:25] cloud-services-team: NovafullstackSustainedFailures Novafullstack tests have been failing for more than 5hours in eqiad - https://phabricator.wikimedia.org/T400432#11046216 (dcaro) Open→Resolved a:dcaro This was manually sorted, resolving to clean up queue and allow for new task to be created...
[10:03:31] FIRING: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 19m 0s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[10:08:31] RESOLVED: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 21m 2s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[10:09:18] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046230 (fnegri) Applied the second transaction manually and replication resumed.
[10:10:31] FIRING: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 21m 3s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[10:17:59] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046252 (fnegri) One more query on the same table: ` | 6277543 | system user | | s51434__mixnmatch_p | Slave_worker | 1 | Update_rows...
[10:21:06] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046261 (fnegri) And one more: ` | 6279151 | system user | | s51434__mixnmatch_p | Slave_worker | 19 | Delete_rows_log_event::find_row...
[10:23:58] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046269 (fnegri) There's more, looks like very similar queries were executed multiple times on that table. I will continue fixing this later today.
[10:24:13] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046270 (fnegri) Open→In progress p:Triage→High
[10:55:25] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046348 (Magnus) My bad, I'll throw some indexes on it
[10:59:07] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046360 (Magnus) Done.
[12:29:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[12:31:20] (open) dcaro: webservice: fix command [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/78
[12:39:41] (update) dcaro: webservice: fix command [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/78
[12:49:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[13:19:01] RESOLVED: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-6 is lagging behind the primary, the current lag is 1h 1m 25s - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[13:41:57] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11046893 (fnegri) @Magnus thank you for your quick intervention!
The replication is back in sync: {F65692609}
[13:44:17] (update) dcaro: webservice: fix command [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/78
[13:58:33] VPS-project-Codesearch: Codesearch unreachable - https://phabricator.wikimedia.org/T400815 (SomeRandomDeveloper) NEW
[14:13:12] (update) dcaro: webservice: fix command [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/78
[14:16:28] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance tools-sgebastion-10 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[14:31:29] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[14:32:20] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11047090 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin1002 for host clouddb1022.eqiad.wmnet with...
[14:39:16] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11047138 (fnegri) @Papaul the reimage of clouddb1022 will fail until my patch above is merged, I'm waiting for a review from #data-persis...
[14:42:48] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11047161 (Papaul) @fnegri thanks i was about to ping you also on that.
[14:59:35] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install cloudcephosd10[48-51] - https://phabricator.wikimedia.org/T394333#11047222 (cmooney) >>! In T394333#11044990, @Andrew wrote: > @Jclark-ctr are we waiting on more DACs before we can move ahead with these? We are awaitin...
[15:07:49] Change on wikitech.wikimedia.org a page Help:Toolforge/Building container images was modified, changed by DCaro (WMF) link https://wikitech.wikimedia.org/w/index.php?diff=2328552 edit summary: /* Tutorials for popular languages */ added node.js static site
[15:07:59] Change on wikitech.wikimedia.org a page Help:Toolforge/Building container images was modified, changed by DCaro (WMF) link https://wikitech.wikimedia.org/w/index.php?diff=2328553 edit summary: /* Tutorials for popular languages */
[15:11:31] cloud-services-team (FY2025/26-Q1), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-07-30 - https://phabricator.wikimedia.org/T400796#11047274 (fnegri) In progress→Resolved Side note: while running one of the manual queries above, I forgot to run `SET sql_log_bin = 0;` and end...
[15:32:12] VPS-project-Codesearch: Codesearch unreachable - https://phabricator.wikimedia.org/T400815#11047360 (SomeRandomDeveloper) Open→Resolved a:SomeRandomDeveloper Appears to be fixed, the site is reachable again.
[15:32:16] VPS-project-Codesearch: Codesearch unreachable - https://phabricator.wikimedia.org/T400815#11047363 (SomeRandomDeveloper) a:SomeRandomDeveloper→None
[16:01:59] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11047505 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin1002 for host clouddb1022.eqiad.wmnet with OS bookworm executed with...
[16:18:39] Change on wikitech.wikimedia.org a page Help:Toolforge/My first Django OAuth tool was modified, changed by Zache link https://wikitech.wikimedia.org/w/index.php?diff=2328571 edit summary: /* Configure project for production environment */ use envvars instead of setting values using activate script which doesn't work in toolforge
[16:30:37] Change on wikitech.wikimedia.org a page Help:Toolforge/My first Django OAuth tool was modified, changed by Zache link https://wikitech.wikimedia.org/w/index.php?diff=2328573 edit summary: /* Configure project for production environment */ fix
[16:32:50] (update) dcaro: webservice: fix command [repos/cloud/toolforge/webservice-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/webservice-cli/-/merge_requests/78
[16:37:26] Change on wikitech.wikimedia.org a page Help:Toolforge/My first Django OAuth tool was modified, changed by Zache link https://wikitech.wikimedia.org/w/index.php?diff=2328574 edit summary: /* Login and deploy */ test
[16:38:38] Change on wikitech.wikimedia.org a page Help:Toolforge/My first Django OAuth tool was modified, changed by Zache link https://wikitech.wikimedia.org/w/index.php?diff=2328575 edit summary: /* Login and deploy */ better guide
[16:47:24] Change on wikitech.wikimedia.org a page Help:Toolforge/My first Django OAuth tool was modified, changed by Zache link https://wikitech.wikimedia.org/w/index.php?diff=2328577 edit summary: uppercasing the parameter names
[17:23:10] (open) dcaro: kyverno: upgrade to 3.3.9 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/889
[17:28:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[18:30:35] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11047938 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vriley@cumin1002 for host clouddb1022.eqiad.wmnet with OS bookworm
[18:32:59] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[18:44:37] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[18:50:11] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048010 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vriley@cumin1002 for host clouddb1023.eqiad.wmnet with OS bookworm
[18:58:58] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[19:00:51] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[19:05:08] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048079 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vriley@cumin1002 for host clouddb1022.eqiad.wmnet with OS bookworm completed: - c...
[19:06:20] (update) raymond-ndibe: [maintain-harbor.jobs] manage policies and robot accounts [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/47 (https://phabricator.wikimedia.org/T360509)
[19:06:28] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048082 (VRiley-WMF)
[19:08:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[19:25:21] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048141 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vriley@cumin1002 for host clouddb1023.eqiad.wmnet with OS bookworm completed: - c...
[20:16:05] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[20:38:41] (update) raymond-ndibe: Draft: [maintain-harbor] add tests and configurations for new maintain-harbor jobs [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/881 (https://phabricator.wikimedia.org/T360509)
[21:26:05] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[21:36:05] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[22:22:28] (PS1) Dzahn: add passwords::mysql::zuul with fake password [labs/private] - https://gerrit.wikimedia.org/r/1174567 (https://phabricator.wikimedia.org/T395938)
[22:23:01] (CR) Dzahn: [V:+2 C:+2] add passwords::mysql::zuul with fake password [labs/private] - https://gerrit.wikimedia.org/r/1174567 (https://phabricator.wikimedia.org/T395938) (owner: Dzahn)
[22:56:05] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[22:59:56] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048613 (VRiley-WMF)
[23:02:40] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install clouddb102[2-5] - https://phabricator.wikimedia.org/T393733#11048614 (VRiley-WMF) Was able to run though 2 of these. Running into issues with BMC password. clouddb1022 - Finished, no issues clouddb1023 - Finished, Pass...