[00:07:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[00:22:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=codfw%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[00:27:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=codfw%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[00:34:00] FIRING: OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[00:42:41] RESOLVED: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=codfw%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[00:47:30] Quarry: Set query result retention time - https://phabricator.wikimedia.org/T360041#10216064 (Base) If you do do this, it would be good to only remove the older runs results, but leave the most recent run result for each query, or as a less desirable alternative to keep those for only published queries (but...
[00:47:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[01:04:55] Tool-ldap, Phabricator: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10216109 (Legoktm) Deployed https://gitlab.wikimedia.org/toolforge-repos/ldap/-/commit/99594edfa46508acc8d3f286472e6f2dbd8c08e9 ` $ curl -I 'https://ldap...
[01:07:41] RESOLVED: PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=eqiad%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[01:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:34:37] (PS3) Pppery: Don't trigger on reopens [labs/tools/github-pr-closer] - https://gerrit.wikimedia.org/r/1079048 (https://phabricator.wikimedia.org/T374157)
[01:34:37] (CR) Pppery: "Untested, but sufficiently trivial it should work." [labs/tools/github-pr-closer] - https://gerrit.wikimedia.org/r/1079048 (https://phabricator.wikimedia.org/T374157) (owner: Pppery)
[02:07:18] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[04:34:01] FIRING: OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[05:08:40] (open) agarwalmahima: T376371 Added support for URL parameters to enable sharable links [toolforge-repos/yearinreview] - https://gitlab.wikimedia.org/toolforge-repos/yearinreview/-/merge_requests/7
[05:12:19] Tool-yearinreview: Add Support for URL Parameters to Enable Sharable Links - https://phabricator.wikimedia.org/T376371#10216215 (MahimaSinghal) @Gopavasanth I have created a merge request for the above task : https://gitlab.wikimedia.org/toolforge-repos/yearinreview/-/merge_requests/7 This update introduce...
[05:21:27] FIRING: CloudVPSDesignateLeaks: Detected 5 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:18:38] VPS-project-Codesearch, VPS-project-Extdist, collaboration-services, Gerrit, Patch-For-Review: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710#10216523 (hashar) >>! In T336710#10215974, @Ladsgroup wrote: > BTW, last tim...
[07:38:15] Cloud-VPS (Quota-requests), Continuous-Integration-Infrastructure: Quota increase for Integration project (Jenkins CI runners) - https://phabricator.wikimedia.org/T376847#10216566 (dcaro) Is this going to be freed after the test? Also, it seems you are requesting more cpu and ram than 12 instances of th...
[07:44:43] cloud-services-team, Toolforge: Deleting an envvar breaks ReplicaSet driven automatic restarts of a Pod (CreateContainerConfigError) - https://phabricator.wikimedia.org/T365048#10216586 (dcaro) That's weird, the envvars are injected at the pod creation time, not in the deployment/replicaset specs, looking
[07:48:35] cloud-services-team, Toolforge: Deleting an envvar breaks ReplicaSet driven automatic restarts of a Pod (CreateContainerConfigError) - https://phabricator.wikimedia.org/T365048#10216589 (dcaro) yep, only pods, looking ` toolsbeta.test@toolsbeta-bastion-6:~$ kubectl get deployment -o json | jq '.items[].s...
[08:07:42] cloud-services-team, Toolforge: Deleting an envvar breaks ReplicaSet driven automatic restarts of a Pod (CreateContainerConfigError) - https://phabricator.wikimedia.org/T365048#10216624 (dcaro) I can't reproduce :/ Tried both with webservice and continuous job: ` toolsbeta.test@toolsbeta-bastion-6:~$ t...
[08:18:21] cloud-services-team, Toolforge: Allow to remap/rename secrets from envvars when running specific toolforge job - https://phabricator.wikimedia.org/T376849#10216639 (dcaro) Have you tried doing something like: ` --- - name: some_random_script_with_admin_permissions command: env PYWIKIBOT_DIR=$PWB_ADMIN_...
[08:19:25] (PS5) David Caro: toolforge.component.deploy: handle non-mr package deploys [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1078676
[08:19:45] !log dcaro@urcuchillay toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component toolforge-weld
[08:19:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:20:54] !log dcaro@urcuchillay toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component toolforge-weld
[08:20:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:21:23] (CR) David Caro: "Tested with toolforge-weld in toolsbeta:" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1078676 (owner: David Caro)
[08:34:00] FIRING: OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[08:38:44] Toolforge-standards-committee, WMF-NDA-Requests: Volunteer NDA for SD0001 - https://phabricator.wikimedia.org/T374998#10216697 (SD0001) I have emailed the details requested in T374993#10215543 to @KFrancis.
[08:52:13] (update) raymond-ndibe: Draft: [lima-kilo] cache container images [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/196
[09:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:31:05] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879 (aborrero) NEW
[10:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[10:20:48] cloud-services-team, Toolforge: Allow to remap/rename secrets from envvars when running specific toolforge job - https://phabricator.wikimedia.org/T376849#10217017 (Edgars2007) >>! In T376849#10215977, @bd808 wrote: > remapping feature like this one could possibly also be used to limit which secrets are...
[10:25:31] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, Patch-For-Review: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217030 (aborrero) Open→Resolved
[10:32:20] (CR) FNegri: [C:+1] "LGTM, left an optional suggestion." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1078676 (owner: David Caro)
[10:51:04] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217086 (aborrero) Resolved→In progress p:Triage→Medium I have detected there is no V...
[11:00:31] cloud-services-team, wikitech.wikimedia.org, MW-on-K8s, serviceops: Review/update wikitech-static syncing after wikitech moves to Kubernetes - https://phabricator.wikimedia.org/T374114#10217102 (fnegri) I noticed there's an alert firing, probably related to this work: > MWVERSION WARNING - wikit...
[11:10:15] cloud-services-team, Cloud-VPS: tofuinfratest creates many entries at - https://phabricator.wikimedia.org/T376888 (fnegri) NEW
[11:10:38] cloud-services-team, Cloud-VPS: tofuinfratest creates many pages in wikitech - https://phabricator.wikimedia.org/T376888#10217176 (fnegri)
[11:28:55] cloud-services-team, Cloud-VPS: Delete project tf-infra-test - https://phabricator.wikimedia.org/T376890 (fnegri) NEW
[11:32:04] cloud-services-team, Cloud-VPS: Delete project tf-infra-test - https://phabricator.wikimedia.org/T376890#10217228 (fnegri) p:Triage→Low
[12:16:04] (open) dcaro: ansible.toolforge-deploy: allow passing any kind of ref [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/197
[12:20:53] (update) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/20
[12:29:03] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, Patch-For-Review: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217375 (aborrero) still not working. I saw this weird tcpdump capture on cloud...
[12:30:57] cloud-services-team, wikitech.wikimedia.org, MW-on-K8s, serviceops: Review/update wikitech-static syncing after wikitech moves to Kubernetes - https://phabricator.wikimedia.org/T374114#10217393 (Reedy) It's because there were MW releases last week and no one has updated wikitech-static yet :)
[12:31:41] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypothesis] WE6.3.1 Consulting Toolforge roots/maintainers - https://phabricator.wikimedia.org/T368601#10217399 (Slst2020) Final document with all the gathered user stories and next steps: https://docs.google.com/docu...
[12:31:49] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypothesis] WE6.3.1 Consulting Toolforge roots/maintainers - https://phabricator.wikimedia.org/T368601#10217401 (Slst2020) In progress→Resolved
[12:34:01] RESOLVED: OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[12:34:34] (update) dcaro: deployment: Add the deployment endpoints and mock storage [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/18 (https://phabricator.wikimedia.org/T362069)
[12:34:59] (merge) dcaro: deployment: Add the deployment endpoints and mock storage [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/18 (https://phabricator.wikimedia.org/T362069)
[12:37:26] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: DONOTMERGE components-api: bump to 0.0.29-20241002095441-cd2060f1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/544 (https://phabricator.wikimedia.org/T362069)
[12:38:29] (open) dcaro: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22
[12:38:58] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge, Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896 (Slst2020) NEW
[12:39:48] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge, Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217449 (Slst2020)
[12:42:37] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217456 (Slst2020) a:dcaro→Slst2020
[12:42:42] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217458 (Slst2020) p:Triage→High
[12:42:50] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217465 (Slst2020)
[12:42:54] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217466 (Slst2020) Open→In progress
[12:44:32] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypotesis] 6.3.5 Develop the sustainability score - https://phabricator.wikimedia.org/T376896#10217464 (Slst2020)
[12:46:51] (PS2) Eugene233: Add Commons base file path to config file [labs/tools/Isa] - https://gerrit.wikimedia.org/r/954926 (https://phabricator.wikimedia.org/T312178)
[12:50:28] (approved) sstefanova: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22 (owner: dcaro)
[12:50:33] (update) sstefanova: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22 (owner: dcaro)
[12:56:22] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217541 (cmooney) One thing I might be messing you up is the "authentication" section in /etc/keepali...
[13:02:08] (open) dcaro: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553
[13:05:31] (update) dcaro: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553
[13:10:04] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217599 (aborrero) >>! In T376879#10217541, @cmooney wrote: > One thing I might be messing you up is...
[13:10:20] (update) dcaro: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553
[13:10:26] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10217600 (aborrero) there is also this warning in the logs: Oct 10 13:07:05 cloudgw2002-dev Keepalive...
[13:12:59] (update) sstefanova: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553 (owner: dcaro)
[13:13:00] (approved) sstefanova: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553 (owner: dcaro)
[13:13:31] (merge) dcaro: components-api.local: deploy from tools repo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/553
[13:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:39:43] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10217752 (cmooney)
[13:42:32] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10217761 (cmooney)
[13:43:11] Tool-inteGraality, Documentation, good first task: Add information to InteGraality user guide from Wikipedia Workbook for Cultural Institutions - https://phabricator.wikimedia.org/T376902 (TBurmeister) NEW Thank you for tagging this task with #good_first_task for Wikimedia newcomers! Newcomers o...
[13:43:41] Tool-inteGraality, Documentation, good first task: Add information to InteGraality user guide from Wikipedia Workbook for Cultural Institutions - https://phabricator.wikimedia.org/T376902#10217777 (TBurmeister) p:Triage→Low
[13:46:53] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: cloudsw: codfw: enable IPv6 - https://phabricator.wikimedia.org/T374713#10217784 (cmooney) Open→Resolved This is now complete, the cloudsw is set up to route the networks are required and announcing them upst...
[13:50:16] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: initial IPv6 support in neutron - https://phabricator.wikimedia.org/T375847#10217807 (cmooney) >>! In T375847#10195673, @aborrero wrote: > `lang=shell-session > root@ipv6-test-1:~# ip -br a > lo...
[13:51:36] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: tf-infra-test fails creating dbs and k8s cluster - https://phabricator.wikimedia.org/T376802#10217812 (fnegri) Restarting RabbitMQ did not fix the issue, but I discovered something: adding the "ssh-from-anywhere" SG to a broken instance makes it move fr...
[13:54:11] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10217836 (cmooney) The edge (cloudsw/cr) networking is now complete, elements in the range are reachable externally. ` cathal@officepc:~$ mtr -z -b...
[14:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[14:37:43] (PS1) Brouberol: idp: add dummy client secret for aitflow_analytics_test [labs/private] - https://gerrit.wikimedia.org/r/1079296 (https://phabricator.wikimedia.org/T374948)
[14:39:33] (CR) Bking: [C:+2] idp: add dummy client secret for aitflow_analytics_test [labs/private] - https://gerrit.wikimedia.org/r/1079296 (https://phabricator.wikimedia.org/T374948) (owner: Brouberol)
[14:39:38] (CR) Bking: [V:+2 C:+2] idp: add dummy client secret for aitflow_analytics_test [labs/private] - https://gerrit.wikimedia.org/r/1079296 (https://phabricator.wikimedia.org/T374948) (owner: Brouberol)
[15:03:25] ToolforgeBundle, AhoCorasick, at-ease, base_convert, and 24 others: Make new releases of all Wikimedia-authored PHP libraries, and bump their usages (mid-2021) - https://phabricator.wikimedia.org/T287972#10218098 (Reedy)
[15:36:16] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: tf-infra-test fails creating dbs and k8s cluster - https://phabricator.wikimedia.org/T376802#10218256 (fnegri) In progress→Resolved This issue was caused by the removal of [default security group rules](https://docs.openstack.org/python-open...
[15:55:28] cloud-services-team, Cloud-VPS: tofuinfratest creates many pages in wikitech - https://phabricator.wikimedia.org/T376888#10218442 (bd808) I thought there might already be an off switch for this, but I am not finding one in wmfkeystonehooks.py. My suggestion would be to add a check in `KeystoneHooks._on_p...
[15:58:31] (CR) Dzahn: "shouldn't this have been submitted automatically by now?" [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: Hashar)
[15:59:26] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10218489 (aborrero) >>! In T376879#10217600, @aborrero wrote: > there is also this warning in the logs...
[16:02:32] (Merged) jenkins-bot: Revert "switch to main gerrit server instead of using the replica" [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: Hashar)
[16:04:36] cloud-services-team, Cloud-VPS: tofuinfratest creates many pages in wikitech - https://phabricator.wikimedia.org/T376888#10218510 (fnegri) > Is "tofuinfratest" the only one for now or do the older full stack tests also create new projects? Judging by https://wikitech.wikimedia.org/wiki/Special:Contribu...
[16:07:48] cloud-services-team, Toolforge: Allow to remap/rename secrets from envvars when running specific toolforge job - https://phabricator.wikimedia.org/T376849#10218533 (dcaro) > I was mentioning fiddling only with one variable, which seems ok-ish, but what if i need to fiddle with 5 variables? Using the same...
[16:17:23] VPS-project-Codesearch, VPS-project-Extdist, collaboration-services, Gerrit, Patch-For-Review: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710#10218571 (Dzahn) I had already clicked +2 on that change yesterday and it ha...
[16:40:02] (open) dcaro: kubernetes_config: use the default namespace if non in kubeconfig [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/62
[16:45:43] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Epic: [Hypothesis] WE6.3.4 If we enable the automatic deployment of a minimal tool, we will be able to evaluate the end to end flow and set the groundwork for adding support f... - https://phabricator.wikimedia.org/T375199#10218664
[16:57:42] (open) dcaro: api_client: when loading cert data from kubeconfig try base64 too [repos/cloud/toolforge/toolforge-weld] (add_default_namespace) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/63
[17:02:21] (update) dcaro: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22
[17:09:28] (update) dcaro: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22
[17:09:41] (approved) dcaro: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22
[17:09:44] (merge) dcaro: global: use fastapi.status instead of http [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/22
[17:10:37] (open) dcaro: Draft: Add the creation of a single continuous job [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/23
[17:12:12] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: DONOTMERGE components-api: bump to 0.0.29-20241002095441-cd2060f1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/544 (https://phabricator.wikimedia.org/T362069)
[17:16:54] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, and 2 others: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10218766 (cmooney) Reverse delegation is now working for the ranges we've assigned to OpenStack. I've not gotten an ans...
[17:18:48] (update) dcaro: Draft: Add the creation of a single continuous job [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/23 (https://phabricator.wikimedia.org/T362066)
[17:19:15] (open) dcaro: Add superuser support [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/43 (https://phabricator.wikimedia.org/T362066)
[17:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:26:40] cloud-services-team, Toolforge: Deleting an envvar breaks ReplicaSet driven automatic restarts of a Pod (CreateContainerConfigError) - https://phabricator.wikimedia.org/T365048#10218850 (bd808) >>! In T365048#10216624, @dcaro wrote: > Anything else you were doing when it failed? Can you still reproduce?...
[17:37:23] (update) dcaro: Draft: Add the creation of a single continuous job [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/23 (https://phabricator.wikimedia.org/T362066)
[17:38:40] (update) dcaro: Draft: Add the creation of a single continuous job [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/23 (https://phabricator.wikimedia.org/T362066)
[17:58:33] (update) dcaro: Draft: Add the creation of a single continuous job [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/23 (https://phabricator.wikimedia.org/T362066)
[18:07:18] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[18:08:06] Tool-inteGraality, Documentation, good first task: Add information to InteGraality user guide from Wikipedia Workbook for Cultural Institutions - https://phabricator.wikimedia.org/T376902#10219038 (varadshete) Creating and Customizing Dashboards with InteGraality Generating a Dashboard To create a da...
[18:28:04] Tool-toolwatch: Implementing alert system to notify maintainers of downtime - https://phabricator.wikimedia.org/T368816#10219107 (MahimaSinghal) >>! In T368816#10210400, @Tacsipacsi wrote: > Who will receive the notification emails? The //Author// column contains the first (why only the first?) author of the...
[19:02:50] Tool-nfp: Poor grammar on NFP disclaimer header (with patch) - https://phabricator.wikimedia.org/T376935 (JayCubby) NEW
[19:22:42] Tool-quickcategories: QuickCategories background runner sometimes hangs for no apparent reason - https://phabricator.wikimedia.org/T374152#10219283 (Theklan) Resolved→Open This happened again today, so I don't know if the restart system is working or it will take a while.
[19:52:03] cloud-services-team, Toolforge, User-notice: Copyvios tool: investigate/block suspicious web traffic - https://phabricator.wikimedia.org/T285450#10219438 (MusikAnimal) This is amazing! Look at the difference it's making: {F57604960} I think running out of our daily quota is officially a thing of th...
[19:55:44] Tool-toolwatch: Implementing alert system to notify maintainers of downtime - https://phabricator.wikimedia.org/T368816#10219466 (bd808) >>! In T368816#10219107, @MahimaSinghal wrote: > In that case one more column could be created which lists maintainers of the tools, and then the mail could be sent to main...
[20:06:18] Cloud-VPS (Quota-requests), Continuous-Integration-Infrastructure: Quota increase for Integration project (Jenkins CI runners) - https://phabricator.wikimedia.org/T376847#10219529 (bd808) >>! In T376847#10216566, @dcaro wrote: > Is this going to be freed after the test? I would expect the quota to stay...
[20:15:43] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10219564 (Multichill) Ipv6 vrrp is all link-local if I recall correctly. Did you configure it like that?
[20:24:37] Tool-toolwatch: Implementing alert system to notify maintainers of downtime - https://phabricator.wikimedia.org/T368816#10219579 (Multichill) How are you going to handle planned work on underlying infrastructure? Will you send out alarms or will you correlate it to the planned work so people know what is goi...
[20:30:03] (CR) Dzahn: [C:+2] "This is now deployed. And I did run these commands. but: stopped all hound services, moved /srv/hound, started all hound services, starte" [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: Hashar)
[20:31:46] VPS-project-Codesearch, VPS-project-Extdist, collaboration-services, Gerrit, Patch-For-Review: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710#10219597 (Dzahn) This is now deployed. I did run these commands. but after...
[21:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:43:07] wikitech.wikimedia.org: ☂ Wikitech account linking and SUL error reporting - https://phabricator.wikimedia.org/T376267#10219873 (Maurusian) My problem was solved. I was able to do the password reset with my SUL username (maybe that's the username I had on Wikitech in any case, I'm not sure).
[21:45:12] Tool-quickcategories: QuickCategories background runner sometimes hangs for no apparent reason - https://phabricator.wikimedia.org/T374152#10219874 (LucasWerkmeister) Open→Resolved That turned out to be a different problem (the “bad content format” pages at the bottom of https://quickcategories.t...
[22:07:18] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources [22:43:50] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Waldir Pimenta (Waldyrious) - https://phabricator.wikimedia.org/T375110#10220050 (10waldyrious) For the record, I have emailed @KFrancis my details, got the NDA document from her and have signed it. Is there a confirmation step needed, of s... [22:47:33] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate mwv-builder-03.mediawiki-vagrant.eqiad.wmflabs is about to expire in 14d 23h 58m 34s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire [22:54:38] 06cloud-services-team, 10Cloud-VPS, 10Continuous-Integration-Infrastructure, 10ci-test-error (WMF-deployed Build Failure), 10Release-Engineering-Team (Seen): Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org - https://phabricator.wikimedia.org/T374830#10220077 (10Umherirrender) F... [23:06:42] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Waldir Pimenta (Waldyrious) - https://phabricator.wikimedia.org/T375110#10220167 (10KFrancis) Hi all, I am confirming the NDA is complete. Thanks! [23:19:57] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Waldir Pimenta (Waldyrious) - https://phabricator.wikimedia.org/T375110#10220203 (10bd808) 05Open→03Resolved a:03bd808 >>! In T375110#10220167, @KFrancis wrote: > Hi all, I am confirming the NDA is complete. Thanks! Thanks @KFra... 
[23:20:04] 06Toolforge-standards-committee: Facilitate Volunteer NDA application process for 2024 Toolforge standards committee appointees - https://phabricator.wikimedia.org/T374993#10220209 (10bd808) [23:24:20] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Lucas Werkmeister - https://phabricator.wikimedia.org/T375001#10220214 (10bd808) >>! In T374993#10215584, @LucasWerkmeister wrote: > In my case, it turned out that I already signed the NDA but wasn’t added to the #wmf-nda project (see T3750... [23:26:15] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Lucas Werkmeister - https://phabricator.wikimedia.org/T375001#10220221 (10bd808) 05Open→03Resolved a:03bd808 I added @LucasWerkmeister to #WMF-NDA per comments by @LucasWerkmeister and @KFrancis quoted in T375001#10220212 [23:26:37] 06Toolforge-standards-committee: Facilitate Volunteer NDA application process for 2024 Toolforge standards committee appointees - https://phabricator.wikimedia.org/T374993#10220226 (10bd808) [23:26:58] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Lucas Werkmeister - https://phabricator.wikimedia.org/T375001#10220227 (10LucasWerkmeister) Yay, thank you! [23:27:28] 06Toolforge-standards-committee: Facilitate Volunteer NDA application process for 2024 Toolforge standards committee appointees - https://phabricator.wikimedia.org/T374993#10220232 (10bd808) [23:34:09] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for JJMC89 - https://phabricator.wikimedia.org/T375041#10220245 (10bd808) @JJMC89 If you have not yet contacted @kfrancis via email, please see T374993#10215543 for the information she needs from you to kick off the signing process.
[23:34:22] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for TheProtonade - https://phabricator.wikimedia.org/T375007#10220252 (10bd808) @theprotonade If you have not yet contacted @kfrancis via email, please see T374993#10215543 for the information she needs from you to kick off the signing process. [23:34:52] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for Antonin Delpeuch (Pintoch) - https://phabricator.wikimedia.org/T374995#10220248 (10bd808) @Pintoch If you have not yet contacted @kfrancis via email, please see T374993#10215543 for the information she needs from you to kick off the sign... [23:35:46] 06Toolforge-standards-committee, 06WMF-NDA-Requests: Volunteer NDA for JJMC89 - https://phabricator.wikimedia.org/T375041#10220254 (10JJMC89) I did. She has my info from another NDA.