[00:45:45] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:50:45] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:46:26] cloud-services-team, Data-Services: Create views for globaljsonlinks tables - https://phabricator.wikimedia.org/T387419 (Bugreporter) NEW
[07:42:31] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10585955 (ops-monitoring-bot) Draining ganeti1029.eqiad.wmnet of running VMs
[07:45:48] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10585970 (ops-monitoring-bot) Draining ganeti1029.eqiad.wmnet of running VMs
[07:47:53] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10585984 (ops-monitoring-bot) Draining ganeti1029.eqiad.wmnet of running VMs
[07:51:00] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10585996 (ops-monitoring-bot) Draining ganeti1030.eqiad.wmnet of running VMs
[09:37:53] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10586311 (fnegri) Resolved→Open The alert fired again a few minutes ago, then went back to normal: {F58511109}
[09:42:30] (CR) FNegri: [C:+1] "LGTM" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1122573 (owner: David Caro)
[09:50:36] (approved) dcaro: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:02:31] (CR) David Caro: [C:+2] run_tests: consider the tests passed even if there's skipped ones [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1122573 (owner: David Caro)
[10:02:42] (approved) dcaro: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:02:48] (update) dcaro: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:03:48] (update) dcaro: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:04:03] (update) dcaro: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:04:15] (update) dcaro: [do-not-merge][lima-kilo] test maintain-harbor robot account [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/231 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[10:04:47] (update) dcaro: [jobs-api] create seperate api.py and move flask things there [repos/cloud/toolforge/jobs-api] (diff_job_runtime_method) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/91 (https://phabricator.wikimedia.org/T359804) (owner: raymond-ndibe)
[10:05:31] (update) dcaro: [jobs-api] save business models in a DB [repos/cloud/toolforge/jobs-api] (split_logic_from_api) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (https://phabricator.wikimedia.org/T359650) (owner: raymond-ndibe)
[10:06:41] (Merged) jenkins-bot: run_tests: consider the tests passed even if there's skipped ones [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1122573 (owner: David Caro)
[10:09:27] (update) dcaro: [toolforge-deploy] add maintain-harbor quota config and tests [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417) (owner: raymond-ndibe)
[10:11:53] (update) dcaro: [toolforge-deploy] add maintain-harbor quota config and tests [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417) (owner: raymond-ndibe)
[10:59:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes
(http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:04:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:09:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:14:06] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:22:19] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/31
[12:44:01] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/31 (owner: l10n-bot)
[12:44:05] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net.
[toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/31 (owner: l10n-bot)
[13:30:37] (update) dcaro: api: make trailing slash optional [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/137 (https://phabricator.wikimedia.org/T383798) (owner: sstefanova)
[13:31:14] (update) dcaro: builds-builder: bump to 0.0.125-20250219235943-91ad05c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/671 (https://phabricator.wikimedia.org/T384327) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[13:54:16] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:59:16] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:00:03] (update) dcaro: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] (refactor_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[14:00:55] (approved) dcaro: [toolforge-deploy] add maintain-harbor quota config and tests [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) -
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417) (owner: raymond-ndibe)
[14:04:43] (unapproved) dcaro: [toolforge-deploy] add maintain-harbor quota config and tests [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417) (owner: raymond-ndibe)
[14:05:36] !log dcaro@urcuchillay toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[14:05:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[14:13:40] !log dcaro@urcuchillay toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[14:13:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[14:14:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:19:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:24:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 -
https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:29:06] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:29:41] cloud-services-team, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission cloudgw100[12] - https://phabricator.wikimedia.org/T386810#10587074 (Papaul) @VRiley-WMF for both nodes in netbox under interfaces , delete "vlan1107" and "vlan1120" after that re-run the script again
[14:33:20] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[14:40:56] (unapproved) dcaro: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[14:41:11] (update) dcaro: [toolforge-deploy] add maintain-harbor quota config and tests [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417) (owner: raymond-ndibe)
[14:41:32] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[14:41:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:43:56] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[14:43:58] Logged the message at
https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:52:25] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[14:52:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:06:39] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:11:39] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:12:00] (approved) dcaro: builds-builder: bump to 0.0.125-20250219235943-91ad05c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/671 (https://phabricator.wikimedia.org/T384327) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[15:12:04] (merge) dcaro: builds-builder: bump to 0.0.125-20250219235943-91ad05c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/671 (https://phabricator.wikimedia.org/T384327) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[15:12:05] (update) dcaro: builds-builder: bump to 0.0.125-20250219235943-91ad05c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/671 (https://phabricator.wikimedia.org/T384327) (owner:
group_203_bot_4866fc124f4b41659f667468a6115cf3)
[15:19:02] (approved) dcaro: [builds-api] use maintain-harbor robot account locally [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/119 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[15:21:39] FIRING: [3x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:26:39] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:31:39] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:35:58] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[15:38:21] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[15:39:18] (approved) dcaro: api: make trailing slash optional [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/137 (https://phabricator.wikimedia.org/T383798) (owner: sstefanova)
[15:39:23] (merge) dcaro: api: make trailing slash optional
[repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/137 (https://phabricator.wikimedia.org/T383798) (owner: sstefanova)
[15:40:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:42:24] (open) group_203_bot_4866fc124f4b41659f667468a6115cf3: jobs-api: bump to 0.0.355-20250227153935-4a76dc7c [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/680 (https://phabricator.wikimedia.org/T383798)
[15:45:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:47:07] !log dcaro@urcuchillay toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[15:47:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[15:55:53] !log dcaro@urcuchillay toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[15:55:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:10:06] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:40:58] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[16:41:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:41:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:46:06] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:49:07] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[16:49:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:57:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:57:53] (update) dcaro: deploy: support deploying source build components (buildpack) [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/53 (https://phabricator.wikimedia.org/T384634)
[16:58:04] (approved) dcaro: jobs-api: bump to 0.0.355-20250227153935-4a76dc7c [repos/cloud/toolforge/toolforge-deploy] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/680 (https://phabricator.wikimedia.org/T383798) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[16:58:11] (merge) dcaro: jobs-api: bump to 0.0.355-20250227153935-4a76dc7c [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/680 (https://phabricator.wikimedia.org/T383798) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[16:58:51] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[17:00:58] cloud-services-team, Toolforge (Toolforge iteration 18), Patch-For-Review: [jobs-api] treat URLs with and without a trailing slash the same - https://phabricator.wikimedia.org/T383798#10588112 (dcaro) In progress→Resolved
[17:01:34] cloud-services-team, Toolforge (Toolforge iteration 18), Patch-For-Review: [jobs-api,jobs-cli] Introduce a way to stop stuck cronjobs - https://phabricator.wikimedia.org/T377420#10588121 (dcaro) In progress→Resolved
[17:02:15] Toolforge (Toolforge iteration 18): [components-cli] Add the `refresh` subcommand to the autocomplete file - https://phabricator.wikimedia.org/T384641#10588130 (dcaro) In progress→Resolved
[17:02:31] cloud-services-team, Toolforge (Toolforge iteration 18), Kubernetes, Patch-For-Review: [jobs-api] Allow Toolforge scheduled jobs to have a maximum runtime - https://phabricator.wikimedia.org/T306391#10588132 (dcaro) In progress→Resolved
[17:07:58] (update) fnegri: Draft: Upgrade Kubernetes to 1.29 [repos/cloud/toolforge/lima-kilo] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/227 (https://phabricator.wikimedia.org/T362868)
[17:08:10] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 18): [infra,k8s] Upgrade Toolforge Kubernetes to version 1.29 - https://phabricator.wikimedia.org/T362868#10588166 (fnegri)
[17:26:04] cloud-services-team, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission cloudgw100[12] - https://phabricator.wikimedia.org/T386810#10588226 (VRiley-WMF) Open→Resolved a:VRiley-WMF That worked, thank you @Papaul
[17:26:33] cloud-services-team, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission cloudgw100[12] - https://phabricator.wikimedia.org/T386810#10588231 (VRiley-WMF)
[17:43:06] Tools, Security-Team, wikimedia-risk-calculator, SecTeam-Processed, Security: https://risk-calculator.toolforge.org/site/ssvc-calc/ loads JS from polyfill.io - https://phabricator.wikimedia.org/T368472#10588326 (sbassett)
[17:43:51] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[17:47:06] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[17:48:06] Tools, Security-Team, wikimedia-risk-calculator, SecTeam-Processed, Security: https://risk-calculator.toolforge.org/site/ssvc-calc/ loads JS from polyfill.io -
https://phabricator.wikimedia.org/T368472#10588345 (AntiCompositeNumber) As far as I can tell, the site is still loading cross-si...
[18:11:39] FIRING: [3x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:16:39] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:27:25] (open) dcaro: deployment: update the deployment state during deploy [repos/cloud/toolforge/components-api] (deploy_source_build_too) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/54
[18:27:58] (update) dcaro: deployment: update the deployment state during deploy [repos/cloud/toolforge/components-api] (deploy_source_build_too) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/54
[18:36:39] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:36:54] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 -
https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:37:19] (open) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[18:39:37] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[18:40:13] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[18:40:35] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[18:41:39] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:51:06] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[19:00:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) -
https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:05:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:11:09] (update) fnegri: Draft: Upgrade Kubernetes to 1.29 [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/227 (https://phabricator.wikimedia.org/T362868)
[19:12:21] (update) fnegri: Upgrade Kubernetes to 1.29 [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/227 (https://phabricator.wikimedia.org/T362868)
[20:00:06] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:06:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:10:21] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) -
https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:46:34] FIRING: DiskSpace: Disk space cloudcontrol2006-dev:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2006-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[21:04:04] RESOLVED: DiskSpace: Disk space cloudcontrol2006-dev:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2006-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[21:21:06] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[21:25:21] VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org should be more clearly marked as a testing instance - https://phabricator.wikimedia.org/T283088#10588989 (Dzahn) I reset my account using the "auth recover" shell command and logged in. Then went to setting and changed the "ui.logo...
[21:25:21] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[21:26:06] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[21:28:38] VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org should be more clearly marked as a testing instance - https://phabricator.wikimedia.org/T283088#10589002 (stjn) Could you maybe also add a panel to the home page like we have here (‘Welcome to Wikimedia Phabricator’ one) with some s...
[22:00:39] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[22:05:39] FIRING: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[22:26:48] VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org should be more clearly marked as a testing instance - https://phabricator.wikimedia.org/T283088#10589197 (Dzahn) Sure, I just don't know yet how to do that and if it will be overwritten on every deploy.
[22:27:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[22:27:55] cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol2006-dev:9100 - https://phabricator.wikimedia.org/T387529 (phaultfinder) NEW