[01:21:29] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:07:21] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[05:21:29] FIRING: CloudVPSDesignateLeaks: Detected 5 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:07:21] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:33:38] Cloud-VPS (Quota-requests): Temporary (1-2 weeks) quota increase for disaster recovery exercise - https://phabricator.wikimedia.org/T375977#10245158 (Benoit74) Just for the record, we are a bit late on this and we've identified more tasks to perform, so will need the increased quota for a bit longer, sorry a...
[07:47:02] Cloud-VPS (Quota-requests): Temporary (1-2 weeks) quota increase for disaster recovery exercise - https://phabricator.wikimedia.org/T375977#10245185 (Slst2020) No concern at all – thank you for the update. :)
[08:13:58] Cloud-VPS (Project-requests): Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10245232 (aborrero) You mentioned 3 different RAM/disk quotas. The 512GB RAM quota is a big ask. Could we start with the lower one and see from there?
[08:35:31] Cloud-VPS (Project-requests): Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10245264 (Physikerwelt) Sure, we can start with 128GB. However, if the experiment is a success and there are many users (on the order of the number of visitors of scholia) this won't be enough.
[08:55:32] Cloud-VPS (Project-requests): Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10245371 (aborrero) Could you please clarify the initial disk quota as well?
[09:21:29] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:30:35] cloud-services-team (Hardware): wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#10245551 (aborrero) I just noticed {T377570} and we can use some of the hardware for that: * one additional cloudgw * one additional cloudnet * one additional cloudservices I'll update the ticket...
[09:32:01] cloud-services-team (Hardware): wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#10245564 (aborrero)
[09:47:02] cloud-services-team: WMCS hardware services: 3-node HA redundancy model - https://phabricator.wikimedia.org/T377570#10245605 (aborrero)
[10:07:21] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:52:24] Tool-openstack-browser: openstack-browser and Horizon show different projects membership - https://phabricator.wikimedia.org/T377710 (hashar) NEW
[12:03:46] Tool-openstack-browser: openstack-browser and Horizon show different projects membership - https://phabricator.wikimedia.org/T377710#10245979 (aborrero) p:Triage→Low
[12:04:03] Tool-openstack-browser: openstack-browser and Horizon show different projects membership - https://phabricator.wikimedia.org/T377710#10245977 (aborrero) I checked using the openstack CLI. Horizon is right, and should be considered the source of truth. The openstack-browser cache is most likely out of sync.
[12:06:51] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/intuition] - https://gerrit.wikimedia.org/r/1081938 (owner: L10n-bot)
[12:06:59] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[12:09:14] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[12:15:12] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, Patch-For-Review: openstack: develop a script to migrate a VM instance from the old network setting (vlan) to the new (vxlan, IPv6) - https://phabricator.wikimedia.org/T377346#10245997 (aborrero) In progress→Stalled waiting for {T377467}
[12:23:46] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/map-of-monuments] - https://gerrit.wikimedia.org/r/1081956 (owner: L10n-bot)
[12:23:47] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1081957 (owner: L10n-bot)
[12:23:49] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1081958 (owner: L10n-bot)
[12:35:16] cloud-services-team, Cloud-VPS, Epic: CloudVPS: introduce tenant networks - https://phabricator.wikimedia.org/T270694#10246060 (aborrero)
[12:36:27] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, Patch-For-Review: openstack: develop a script to migrate a VM instance from the old network setting (vlan) to the new (vxlan, IPv6) - https://phabricator.wikimedia.org/T377346#10246070 (aborrero)
[12:36:28] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: Migrate Cloud VPS instances to VXLAN based networks - https://phabricator.wikimedia.org/T364725#10246056 (aborrero) In progress→Stalled blocked on {T377467}
[12:36:44] cloud-services-team: alerting: detect if a kernel had a panic - https://phabricator.wikimedia.org/T376719#10246071 (aborrero) In progress→Resolved
[12:37:31] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: openstack: create some automation to migrate VMs from VLAN to VXLAN networks - https://phabricator.wikimedia.org/T374822#10246068 (aborrero) →Duplicate dup:T377346
[12:40:50] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10246081 (aborrero)
[12:44:43] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10246085 (aborrero) Open→In progress Let me check what is left to be done here.
[13:18:13] (CR) D3r1ck01: [C:+1] "LGTM! I'll like Krinkle to have a look at this but" [labs/codesearch] - https://gerrit.wikimedia.org/r/1080836 (https://phabricator.wikimedia.org/T377168) (owner: Brian Wolff)
[13:19:48] cloud-services-team, Toolforge (Toolforge iteration 16), Patch-For-Review: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.28 - https://phabricator.wikimedia.org/T362867#10246270 (Raymond_Ndibe)
[13:21:29] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:33:45] Tool-video-answer-tool, Future-Audiences: [Bug] Article title attribution for Prawo Jazdy - https://phabricator.wikimedia.org/T377735 (Maryana) NEW
[14:36:22] Tool-video-answer-tool, Future-Audiences: [Styling fix] Article attribution box - https://phabricator.wikimedia.org/T377736 (Maryana) NEW
[14:36:59] Tool-video-answer-tool, Future-Audiences: [Styling fix] Article attribution box - https://phabricator.wikimedia.org/T377736#10246665 (Maryana)
[14:40:05] cloud-services-team, Toolforge: toolsdb: review alerting - https://phabricator.wikimedia.org/T306453#10246677 (taavi)
[14:40:08] cloud-services-team, Toolforge, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#10246676 (taavi)
[14:40:15] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge, Goal: [toolsdb] Migrate quickstatements db to Trove - https://phabricator.wikimedia.org/T369177#10246679 (taavi)
[14:40:19] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge: Ensure all ToolsDB databases comply with current naming conventions - https://phabricator.wikimedia.org/T269609#10246678 (taavi)
[14:40:31] cloud-services-team, Toolforge: Tools violating the connection handling policy - https://phabricator.wikimedia.org/T353551#10246682 (taavi)
[14:40:35] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge, Goal: [toolsdb] Migrate mixnmatch db to Trove - https://phabricator.wikimedia.org/T350862#10246683 (taavi)
[14:40:37] cloud-services-team, Toolforge, Tools: s52421__commonsdelinquent_p.event needs index on done column? - https://phabricator.wikimedia.org/T178327#10246668 (taavi)
[14:40:40] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge, Goal: [toolsdb] test failover procedure - https://phabricator.wikimedia.org/T344719#10246684 (taavi)
[14:40:49] cloud-services-team, Cloud-VPS, Toolforge: [wmcs-cookbooks] Write cookbook for restarting ToolsDB - https://phabricator.wikimedia.org/T328282#10246687 (taavi)
[14:40:50] cloud-services-team, Toolforge: ToolsDB: discard obsolete GTID domains - https://phabricator.wikimedia.org/T334947#10246686 (taavi)
[14:40:52] cloud-services-team, Toolforge, Epic: [toolsdb] convert myisam tables to innodb - https://phabricator.wikimedia.org/T306455#10246688 (taavi)
[14:41:01] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge: [toolsdb] set gtid_domain_id to 0 - https://phabricator.wikimedia.org/T357341#10246691 (taavi)
[14:41:02] cloud-services-team, Toolforge, Epic: Migrate largest ToolsDB users to Trove - https://phabricator.wikimedia.org/T291782#10246689 (taavi)
[14:41:10] cloud-services-team, Toolforge: ToolsDB: simplify volume chain - https://phabricator.wikimedia.org/T335593#10246694 (taavi)
[14:41:19] cloud-services-team, Toolforge: ToolsDB: setup pt-heartbeat replication monitor - https://phabricator.wikimedia.org/T334925#10246696 (taavi)
[14:41:33] cloud-services-team, Toolforge: Configure `report_host` on ToolsDB - https://phabricator.wikimedia.org/T355761#10246693 (taavi)
[14:41:40] cloud-services-team, Toolforge: [toolsdb] Replica is frequently lagging behind the primary - https://phabricator.wikimedia.org/T357624#10246680 (taavi)
[14:41:43] cloud-services-team, Toolforge: toolsdb: evaluate storage usage by some tools - https://phabricator.wikimedia.org/T301967#10246701 (taavi)
[14:41:51] cloud-services-team, Toolforge, Tools: Determine if templatetiger is abandoned - https://phabricator.wikimedia.org/T253424#10246702 (taavi)
[14:44:06] cloud-services-team, Toolforge: [toolsdb] Clean up users and manage as code - https://phabricator.wikimedia.org/T367772#10246690 (taavi)
[14:47:35] cloud-services-team, Data-Services, Toolforge: Allow self-serve database credential and permissions management for Toolforge projects - https://phabricator.wikimedia.org/T136335#10246739 (taavi)
[14:49:43] cloud-services-team, Cloud-VPS: neutron: clarify why DNS extension is not enabled - https://phabricator.wikimedia.org/T377740 (aborrero) NEW
[14:49:53] cloud-services-team, Cloud-VPS: neutron: clarify why DNS extension is not enabled - https://phabricator.wikimedia.org/T377740#10246782 (aborrero) p:Triage→Low
[15:03:18] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:41:03] (reopen) bd808: This is a test of the gitlab irc reporter [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/33 (https://phabricator.wikimedia.org/T90594)
[16:41:41] (close) bd808: This is a test of the gitlab irc reporter [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/33 (https://phabricator.wikimedia.org/T90594)
[17:04:29] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1081958 (owner: L10n-bot)
[17:04:49] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1081957 (owner: L10n-bot)
[17:06:26] Tool-video-answer-tool, Future-Audiences: Improvements to video server-side rendering - https://phabricator.wikimedia.org/T375408#10247502 (Maryana) Open→Resolved a:Maryana
[17:06:46] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/map-of-monuments] - https://gerrit.wikimedia.org/r/1081956 (owner: L10n-bot)
[17:06:50] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/intuition] - https://gerrit.wikimedia.org/r/1081938 (owner: L10n-bot)
[17:07:51] Tool-video-answer-tool, Future-Audiences: [Bug] Article title attribution for Prawo Jazdy - https://phabricator.wikimedia.org/T377735#10247520 (Maryana) Make consistent w/title of article at time of video generation
[17:21:29] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:21:57] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[17:22:01] cloud-services-team (Hardware): wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#10247623 (RobH)
[17:32:12] cloud-services-team (Hardware): wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#10247706 (RobH) Answered inline, but 5 hosts all sat unused is a bit alarming to me, as each server all been using budget/power/space since purchase without being leveraged. > cloudcontrol20...
[17:33:19] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[17:42:15] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[17:47:10] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[17:54:50] (update) raymond-ndibe: Draft: [toolforge-deploy] deploy maintain-harbor [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/563 (https://phabricator.wikimedia.org/T358225)
[17:54:51] Tool-query-chest: Expand list of allowed domains in Query Chest to include the split Wikidata graph endpoints - https://phabricator.wikimedia.org/T377675#10247840 (Ladsgroup) Open→Resolved a:Ladsgroup Done in https://gitlab.wikimedia.org/toolforge-repos/query-chest/-/commit/91c8fa035f00631d64...
[17:55:39] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[18:11:57] Tool-video-answer-tool, Future-Audiences: [Styling fix] Article attribution box - https://phabricator.wikimedia.org/T377736#10247877 (derenrich) resolved by https://gitlab.wikimedia.org/repos/future-audiences/video-answer-tool/-/merge_requests/52/diffs
[18:12:23] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[18:13:24] cloud-services-team (Hardware): wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#10247882 (RobH) >>! In T377568#10247706, @RobH wrote: >> cloudcontrol2006-dev: increase memory in-place, or replace with another server with higher memory > This host was purchased on 2023-07-...
[18:13:30] Horizon, Cloud-Services-Origin-User: Horizon nova generated ssh key undeletable - https://phabricator.wikimedia.org/T373082#10247883 (taavi)
[18:13:33] Tool-video-answer-tool, Future-Audiences: [Bug] Article title attribution for Prawo Jazdy - https://phabricator.wikimedia.org/T377735#10247879 (derenrich) resolved by https://gitlab.wikimedia.org/repos/future-audiences/video-answer-tool/-/merge_requests/52/diffs
[18:19:24] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[18:31:45] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[18:49:51] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[18:54:54] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[19:09:32] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[19:16:38] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[19:34:48] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[19:36:48] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[19:52:39] Cloud-VPS (Project-requests): Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10248226 (bd808) > We need at least 128 GB RAM and SSDs. There are no direct access SSD devices available to normal Cloud VPS projects. Cloud VPS uses Ceph volumes for storage which generally...
[19:59:32] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:00:13] Cloud-VPS (Project-requests), Data-Platform-SRE, Wikidata-Query-Service: Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10248280 (bking)
[20:06:19] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:10:51] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:17:01] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:17:56] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10248327 (rook) It is possible that you were encountering the three hour time limit for analytics searches. If there was some lag it could have increased your query time from what looks like an hour to later. I'm unsure of h...
[20:18:06] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10248328 (rook) Open→Declined
[20:26:21] Cloud-VPS (Project-requests), Data-Platform-SRE, Wikidata-Query-Service: Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10248359 (bking) Hello @Physikerwelt ! I am an SRE on the Search Platform team, and my responsibilities include the current WDQS infrastuctur...
[20:33:13] (CR) Volkanurl: "check experimental" [labs/private] - https://gerrit.wikimedia.org/r/1072655 (https://phabricator.wikimedia.org/T353788) (owner: Stevemunene)
[20:33:21] (CR) Volkanurl: "check experimental" [labs/private] - https://gerrit.wikimedia.org/r/1072655 (https://phabricator.wikimedia.org/T353788) (owner: Stevemunene)
[20:33:36] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:46:49] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:47:55] cloud-services-team, Toolforge: Add support for replacing a running scheduled job when an overlapping schedule fires (`concurrencyPolicy: Replace`) - https://phabricator.wikimedia.org/T377781 (bd808) NEW
[20:51:17] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:54:21] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[20:55:21] cloud-services-team, Toolforge: Add --timeout to toolforge jobs - https://phabricator.wikimedia.org/T377782 (Multichill) NEW
[20:58:23] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[21:01:41] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[21:08:38] cloud-services-team, Toolforge: Add --timeout to toolforge jobs - https://phabricator.wikimedia.org/T377782#10248441 (bd808) Related: * {T377420} * {T377781}
[21:21:29] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:24:02] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[21:32:55] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[22:13:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-23 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[22:47:36] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate mwv-builder-03.mediawiki-vagrant.eqiad.wmflabs is about to expire in 3d 23h 58m 34s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire
[22:51:50] (update) raymond-ndibe: Draft: [maintain-harbor] Move to become a toolforge component [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/34 (https://phabricator.wikimedia.org/T358225)
[23:18:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-23 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses