[00:07:09] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [00:07:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [00:12:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [00:16:58] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [00:20:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-test-k8s-etcd-29 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [00:32:26] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [00:41:13] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [00:42:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance toolsbeta-test-k8s-etcd-28 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [00:42:58] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-28 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [00:47:28] (InstanceDown) firing: Project toolsbeta instance toolsbeta-test-k8s-etcd-28 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [00:52:28] (InstanceDown) resolved: Project toolsbeta instance toolsbeta-test-k8s-etcd-28 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [02:25:56] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [02:26:32] !log andrew@cloudcumin1001 toolsbeta END (FAIL) 
- Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [02:48:20] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [02:56:32] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [03:02:28] (InstanceDown) firing: Project toolsbeta instance toolsbeta-test-k8s-etcd-27 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [03:07:28] (InstanceDown) resolved: Project toolsbeta instance toolsbeta-test-k8s-etcd-27 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [04:07:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance gitlab-runners-puppetserver-01 on project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:11:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance cvn-app10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:12:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:13:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance extdist-06 on project extdist - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:13:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance metricsinfra-puppetserver-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:14:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance tf-bastion on project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:14:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance etcd-discovery-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources 
[04:16:28] (PuppetAgentNoResources) firing: (3) No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:17:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance project-proxy-puppetserver-1 on project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:20:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance clouddb-services-puppetserver-1 on project clouddb-services - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:22:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:24:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:29:28] (PuppetAgentNoResources) firing: (3) No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:34:28] (PuppetAgentNoResources) firing: (4) No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:36:28] (PuppetAgentNoResources) firing: (4) No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:39:28] (PuppetAgentNoResources) firing: (4) No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:41:28] (PuppetAgentNoResources) firing: (4) No Puppet resources found on instance cvn-apache10 on project 
cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:51:28] (PuppetAgentNoResources) firing: (4) No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:52:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance gitlab-runners-puppetserver-01 on project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:53:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance extdist-06 on project extdist - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:54:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance tf-bastion on project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:54:28] (PuppetAgentNoResources) resolved: (4) No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:56:28] (PuppetAgentNoResources) resolved: (4) No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:57:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [04:58:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance metricsinfra-puppetserver-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [05:02:28] (PuppetAgentNoResources) resolved: (2) No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [05:02:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance 
project-proxy-puppetserver-1 on project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [05:05:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance clouddb-services-puppetserver-1 on project clouddb-services - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [06:13:34] Quarry: Error 500 when clicking "stop query" - https://phabricator.wikimedia.org/T362213 (Novem_Linguae) NEW [07:24:58] Toolforge (Toolforge iteration 08), Patch-For-Review, Software-Licensing: [builds-api] builds-api is missing a software license - https://phabricator.wikimedia.org/T361007#9703035 (CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/83 dev:... [07:31:15] Toolforge (Toolforge iteration 08), Software-Licensing: [builds-api] builds-api is missing a software license - https://phabricator.wikimedia.org/T361007#9703040 (Slst2020) In progress→Resolved [07:33:41] Toolforge (Toolforge iteration 08), Patch-For-Review, Software-Licensing: [builds-api] builds-api is missing a software license - https://phabricator.wikimedia.org/T361007#9703043 (CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cl... 
[07:37:26] Quarry: Error 500 when clicking "stop query" - https://phabricator.wikimedia.org/T362213#9703050 (Novem_Linguae) [07:44:40] Quarry: Error 500 when clicking "stop query" - https://phabricator.wikimedia.org/T362213#9703054 (Novem_Linguae) [08:08:00] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge, Goal: [harbor] Deploy with Helm - https://phabricator.wikimedia.org/T356301#9703085 (dcaro) [08:15:05] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge, Goal: [harbor] Deploy with Helm - https://phabricator.wikimedia.org/T356301#9703101 (Slst2020) [08:16:35] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge, Goal: [harbor] Deploy with Helm - https://phabricator.wikimedia.org/T356301#9703102 (dcaro) [08:17:31] Toolforge (Toolforge iteration 08), Patch-For-Review, Software-Licensing: [builds-api] builds-api is missing a software license - https://phabricator.wikimedia.org/T361007#9703103 (CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_... [08:22:20] cloud-services-team, Cloud-VPS, Infrastructure Security, Puppet (Puppet 7.0): cloudcumin: support reimage and other operations - https://phabricator.wikimedia.org/T344412#9703109 (taavi) [08:25:00] Data-Services, Transcriber: Table 'banwikisource_p.user_properties_anon' doesn't exist - https://phabricator.wikimedia.org/T330731#9703128 (taavi) Open→Invalid [08:28:46] Data-Services: View 'altwiki_p.page' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them - https://phabricator.wikimedia.org/T310488#9703130 (taavi) Open→Resolved ` MariaDB [altwiki_p]> select * from page limit 1; +---------+----------... 
[08:29:54] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Epic, Goal: openstack eqiad1: introduce cloud-private and cloudlb - https://phabricator.wikimedia.org/T341060#9703132 (taavi) In progress→Resolved Let's call this done. [08:37:24] cloud-services-team, Data-Services: Investigate and/or deploy LACP to NFS servers for Cloud Services - https://phabricator.wikimedia.org/T204359#9703150 (taavi) Open→Declined Declining since this doesn't seem realistic with our NFS servers now in Cloud VPS. [08:39:13] cloud-services-team, Data-Services, Infrastructure-Foundations, SRE: Switch labstore servers to default SSH configuration - https://phabricator.wikimedia.org/T177914#9703154 (taavi) Open→Invalid Closing as we've moved the NFS servers to Cloud VPS VMs and I'm pretty sure we did... [08:52:24] (PS1) Majavah: tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 [08:55:18] (CR) CI reject: [V:-1] tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 (owner: Majavah) [09:00:34] (CR) FNegri: [C:+1] "LGTM!" 
[cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1018241 (owner: Majavah) [09:09:48] Toolforge, Epic: [component-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#9703185 (dcaro) [09:13:09] Toolforge, Epic: [component-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#9703192 (dcaro) [09:14:27] (PS2) Majavah: tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 [09:15:38] (CR) Majavah: [C:+2] openstack: cloudgw: Migrate to spicerack logging and alerting [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1018241 (owner: Majavah) [09:15:52] (CR) CI reject: [V:-1] tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 (owner: Majavah) [09:18:51] (Merged) jenkins-bot: openstack: cloudgw: Migrate to spicerack logging and alerting [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1018241 (owner: Majavah) [09:25:12] cloud-services-team, Striker, DBA, Patch-For-Review: Create a database for Striker test instance - https://phabricator.wikimedia.org/T360149#9703202 (ABran-WMF) Open→In progress p:Triage→Medium a:ABran-WMF Database has been created, account and admin account SQL passwords are... 
[09:46:32] Toolforge: [component-api] Get a skeleton of API webservice and implement `/tool//deploy` with build-only features - https://phabricator.wikimedia.org/T362069#9703233 (dcaro) [09:46:48] Toolforge: [component-api] Get a skeleton of API webservice - https://phabricator.wikimedia.org/T362069#9703228 (dcaro) [09:47:54] (PS3) Majavah: tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 [09:47:54] (PS1) Majavah: contrib: More fixes to setup guide [labs/striker] - https://gerrit.wikimedia.org/r/1018651 [09:52:02] (PS1) Majavah: Update example Striker hiera [labs/private] - https://gerrit.wikimedia.org/r/1018652 [09:54:33] (CR) Majavah: [V:+2 C:+2] Update example Striker hiera [labs/private] - https://gerrit.wikimedia.org/r/1018652 (owner: Majavah) [10:28:57] (PS1) Muehlenhoff: puppetboard: Remove obsolete cert [labs/private] - https://gerrit.wikimedia.org/r/1018663 [10:33:39] (CR) Muehlenhoff: [V:+2 C:+2] puppetboard: Remove obsolete cert [labs/private] - https://gerrit.wikimedia.org/r/1018663 (owner: Muehlenhoff) [10:41:02] (CR) Majavah: [C:+2] tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 (owner: Majavah) [10:41:03] (CR) Majavah: [C:+2] contrib: More fixes to setup guide [labs/striker] - https://gerrit.wikimedia.org/r/1018651 (owner: Majavah) [10:42:26] (Merged) jenkins-bot: tools: Also don't notify admins about membership comments [labs/striker] - https://gerrit.wikimedia.org/r/1018645 (owner: Majavah) [10:43:59] (Merged) jenkins-bot: contrib: More fixes to setup guide [labs/striker] - https://gerrit.wikimedia.org/r/1018651 (owner: Majavah) [10:49:33] Toolforge: [component-api] Get a skeleton of API webservice and implement `/tool//deploy` with build-only features - https://phabricator.wikimedia.org/T362069#9703335 (dcaro) [10:53:24] 
Toolforge: [component-api] Get a minimal version of the config with build-only data - https://phabricator.wikimedia.org/T362070#9703345 (dcaro) p:Triage→High [10:53:37] Toolforge: [component-api] Get a skeleton of API webservice and implement `/tool//deploy` with build-only features - https://phabricator.wikimedia.org/T362069#9703347 (dcaro) p:Triage→High [10:59:24] Toolforge: [component-api] Get a minimal version of the config with build-only data - https://phabricator.wikimedia.org/T362070#9703353 (dcaro) [11:01:26] Toolforge: [components-api] Add minimal cli with build-only features - https://phabricator.wikimedia.org/T362082#9703355 (dcaro) [11:05:38] Toolforge: [component-api] Get a skeleton of API webservice and implement `/tool//deploy` with build-only features - https://phabricator.wikimedia.org/T362069#9703364 (dcaro) [11:11:04] (PS1) Majavah: build: Allow users to set the port to bind on [labs/striker] - https://gerrit.wikimedia.org/r/1018674 (https://phabricator.wikimedia.org/T360025) [11:12:18] (CR) Majavah: [C:+2] build: Allow users to set the port to bind on [labs/striker] - https://gerrit.wikimedia.org/r/1018674 (https://phabricator.wikimedia.org/T360025) (owner: Majavah) [11:13:39] (Merged) jenkins-bot: build: Allow users to set the port to bind on [labs/striker] - https://gerrit.wikimedia.org/r/1018674 (https://phabricator.wikimedia.org/T360025) (owner: Majavah) [11:19:37] (PS1) Muehlenhoff: Remove obsolete dummy certs for docker-registry and testreduce [labs/private] - https://gerrit.wikimedia.org/r/1018678 (https://phabricator.wikimedia.org/T360636) [11:20:07] Toolforge: [component-api] Develop the webhook mechanism to trigger a deploment - https://phabricator.wikimedia.org/T362066#9703394 (dcaro) p:Triage→High [11:20:41] (CR) Clément Goubert: [C:+1] Remove obsolete dummy certs for docker-registry and testreduce [labs/private] - https://gerrit.wikimedia.org/r/1018678 
(https://phabricator.wikimedia.org/T360636) (owner: Muehlenhoff) [11:21:39] (CR) Muehlenhoff: [V:+2 C:+2] Remove obsolete dummy certs for docker-registry and testreduce [labs/private] - https://gerrit.wikimedia.org/r/1018678 (https://phabricator.wikimedia.org/T360636) (owner: Muehlenhoff) [11:42:41] (CloudVPSDesignateLeaks) firing: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [11:47:41] (CloudVPSDesignateLeaks) firing: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [12:09:48] Toolforge: Fix toolsbeta.test5 account name (or delete it) - https://phabricator.wikimedia.org/T362223 (taavi) NEW [12:10:44] Toolforge: Fix toolsbeta.test5 account name (or delete it) - https://phabricator.wikimedia.org/T362223#9703562 (taavi) ldapvi's rename mode segfaults for some reason: `lang=shell-session taavi@mwmaint1002 ~ $ sudo ldapvi --rename -b ou=servicegroups cn=toolsbeta.toolsbeta.test5,ou=servicegroups,dc=wikimedia,... [12:18:55] Toolforge: Fix toolsbeta.test5 account name (or delete it) - https://phabricator.wikimedia.org/T362223#9703589 (taavi) `lang=shell-session taavi@mwmaint1002 ~ $ sudo ldapvi --delete -b ou=servicegroups,dc=wikimedia,dc=org cn=toolsbeta.toolsbeta.test5,ou=servicegroups,dc=wikimedia,dc=org add: 0, rename: 0, mo... 
[12:29:15] Cloud Services Proposals: Decision request - What to use for toolforge components api task execution - https://phabricator.wikimedia.org/T362224 (dcaro) NEW [12:29:37] Cloud Services Proposals: Decision request - What to use for toolforge components api task execution - https://phabricator.wikimedia.org/T362224#9703614 (dcaro) [12:37:27] Cloud Services Proposals: Decision request - What to use for toolforge components api task execution - https://phabricator.wikimedia.org/T362224#9703625 (dcaro) [12:37:28] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api (unrefined) - https://phabricator.wikimedia.org/T362075#9703626 (dcaro) [12:37:30] Cloud Services Proposals: Decision request - What to use for toolforge components api task execution - https://phabricator.wikimedia.org/T362224#9703627 (dcaro) p:Triage→High [12:52:58] Toolforge, Epic: [component-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#9703671 (dcaro) [12:53:35] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api (to refine) - https://phabricator.wikimedia.org/T362075#9703667 (dcaro) [12:56:30] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#9703677 (dcaro) [12:56:35] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#9703678 (dcaro) [12:56:43] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#9703679 (dcaro) p:Triage→High [12:56:56] Toolforge: [component-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#9703680 (dcaro) [12:58:12] Toolforge: [components-api] Add minimal cli with 
build-only features - https://phabricator.wikimedia.org/T362082#9703684 (dcaro) p:Triage→High [12:58:49] Toolforge (Toolforge iteration 08), Patch-For-Review: [builds-api,jobs-api,envvars-api,api-gateway] Figure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9703685 (dcaro) [13:08:08] cloud-services-team, Striker, DBA, Patch-For-Review: Create a database for Striker test instance - https://phabricator.wikimedia.org/T360149#9703729 (taavi) Thank you @ABran-WMF! Both of the credentials work fine. Would it be possible to change the default character set and collation to `utf8mb4... [13:28:34] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools, Patch-For-Review: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9703767 (Volans) a:Volans→None De-assigning it from me as B... [13:30:56] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools, and 2 others: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9703798 (bking) a:RKemper [13:32:09] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools, and 2 others: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9703819 (bking) Assigning to @RKemper /adding DPE SRE tags. 
[13:32:21] Wikibugs: Skip black color for wikibugs task updates - https://phabricator.wikimedia.org/T362230 (fgiunchedi) NEW [13:46:31] (PS1) Majavah: Add option to show a environment-specific banner [labs/striker] - https://gerrit.wikimedia.org/r/1018720 (https://phabricator.wikimedia.org/T254598) [13:47:03] cloud-services-team, Striker, Patch-For-Review: Deploy Striker instance for toolsbeta - https://phabricator.wikimedia.org/T360025#9703888 (taavi) a:taavi [13:47:23] Toolforge (Toolforge iteration 08): [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231 (dcaro) NEW [13:56:50] Toolforge (Toolforge iteration 08): [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231#9703926 (dcaro) p:Triage→High [13:59:26] cloud-services-team, Cloud-VPS, Patch-For-Review: Use cloudbackup100[12]-dev for cinder backup test/dev - https://phabricator.wikimedia.org/T358855#9703930 (Andrew) Open→Resolved [14:03:33] (PS1) Andrea Denisse: ssl: Delete dummy TLS key for the Prometheus hosts [labs/private] - https://gerrit.wikimedia.org/r/1018724 (https://phabricator.wikimedia.org/T360414) [14:05:33] Toolforge: [jobs-api] separate jobs-framework k8s object templates from code - https://phabricator.wikimedia.org/T358815#9703945 (aborrero) [14:11:28] cloud-services-team, Striker, DBA, Patch-For-Review: Create a database for Striker test instance - https://phabricator.wikimedia.org/T360149#9703951 (taavi) Never mind, the admin user had enough permissions so I could do `ALTER DATABASE striker_toolsbeta CHARACTER SET utf8mb4;` myself. 
[14:13:52] cloud-services-team, Striker, DBA, Patch-For-Review: Create a database for Striker test instance - https://phabricator.wikimedia.org/T360149#9703954 (taavi) In progress→Resolved [14:16:03] Toolforge (Toolforge iteration 08): [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231#9703962 (dcaro) Open→In progress [14:23:54] Toolforge (Toolforge iteration 08), Patch-For-Review: [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231#9703981 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/27 d/changelog: bump to 16.0.7 [14:35:13] PAWS: jupyterlab to 4.1.6 - https://phabricator.wikimedia.org/T362232 (rook) NEW [14:36:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance metricsinfra-controller-2 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:36:55] PAWS: jupyterlab to 4.1.6 - https://phabricator.wikimedia.org/T362232#9704021 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/paws/pull/398 [14:37:00] vivian-rook opened https://github.com/toolforge/paws/pull/398 [14:40:04] cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233 (aborrero) NEW [14:41:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance metricsinfra-controller-2 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:51:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance metricsinfra-controller-2 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:56:28] (PuppetAgentNoResources) resolved: (2) No Puppet resources found on instance metricsinfra-controller-2 on project metricsinfra - 
https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:56:51] (CR) Majavah: [C:+2] Add option to show a environment-specific banner [labs/striker] - https://gerrit.wikimedia.org/r/1018720 (https://phabricator.wikimedia.org/T254598) (owner: Majavah) [14:57:17] Toolforge (Toolforge iteration 08), Patch-For-Review: [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231#9704094 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/27 d/changelog: bump to 16.0.7 [14:58:19] (Merged) jenkins-bot: Add option to show a environment-specific banner [labs/striker] - https://gerrit.wikimedia.org/r/1018720 (https://phabricator.wikimedia.org/T254598) (owner: Majavah) [14:58:28] Toolforge (Toolforge iteration 08), Patch-For-Review: [jobs-cli] dump show `no-filelog` always set as `true` - https://phabricator.wikimedia.org/T362231#9704096 (dcaro) In progress→Resolved [14:59:01] cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9704101 (dcaro) /me really interested on the option for dropping the 3rd party component on the 1.26 upgrade [14:59:07] Quarry: Error 500 when clicking "stop query" - https://phabricator.wikimedia.org/T362213#9704103 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/quarry/pull/38 [14:59:27] vivian-rook opened https://github.com/toolforge/quarry/pull/38 [15:00:22] Striker: Make it easier to apply Striker schema changes in production - https://phabricator.wikimedia.org/T362237 (taavi) NEW [15:02:58] (PuppetAgentNoResources) firing: (3) No Puppet resources found on instance metricsinfra-controller-2 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [15:03:34] Quarry: Error 500 when clicking "stop query" - 
https://phabricator.wikimedia.org/T362213#9704136 (rook) I'm guessing this issue is from the threads being in separate pods. The attached PR removes the feature which may be the way to go with it. [15:07:33] Striker: Make it easier to apply Striker schema changes in production - https://phabricator.wikimedia.org/T362237#9704166 (bd808) I have a hacky script for this kind of stuff at cloudweb1003:/home/bd808/projects/striker/debug-striker.sh. It was inspired by the hacky script I had to make to manage #toolhub. `... [15:17:09] Wikibugs: Skip black color for wikibugs task updates - https://phabricator.wikimedia.org/T362230#9704238 (bd808) →Duplicate dup:T360353 [15:17:14] Wikibugs: Hashar does not like grey foreground color for distinguishing closed status events - https://phabricator.wikimedia.org/T360353#9704235 (bd808) [15:42:50] cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9704409 (aborrero) [15:47:41] (CloudVPSDesignateLeaks) firing: (3) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [15:52:41] (CloudVPSDesignateLeaks) firing: (3) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [15:52:58] (PuppetAgentNoResources) resolved: No Puppet resources found on instance metricsinfra-meta-monitor-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [15:55:22] PAWS: jupyterlab to 4.1.6 - https://phabricator.wikimedia.org/T362232#9704458 (github-toolforge-bot) vivian-rook 
closed https://github.com/toolforge/paws/pull/398 [15:55:35] (PS1) Btullis: Add dummy data for the new matomo service. [labs/private] - https://gerrit.wikimedia.org/r/1018739 (https://phabricator.wikimedia.org/T351552) [15:55:38] vivian-rook closed https://github.com/toolforge/paws/pull/398 [15:55:55] PAWS: jupyterlab to 4.1.6 - https://phabricator.wikimedia.org/T362232#9704459 (rook) Open→Resolved a:rook [15:56:13] (CR) Btullis: [V:+2 C:+2] Add dummy data for the new matomo service. [labs/private] - https://gerrit.wikimedia.org/r/1018739 (https://phabricator.wikimedia.org/T351552) (owner: Btullis) [15:57:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:39:26] (CR) Muehlenhoff: [C:+1] "LGTM" [labs/private] - https://gerrit.wikimedia.org/r/1018724 (https://phabricator.wikimedia.org/T360414) (owner: Andrea Denisse) [16:40:14] (CR) Dzahn: [C:+1] ssl: Delete dummy TLS key for the Prometheus hosts [labs/private] - https://gerrit.wikimedia.org/r/1018724 (https://phabricator.wikimedia.org/T360414) (owner: Andrea Denisse) [17:09:21] (CR) Andrea Denisse: [V:+2 C:+2] ssl: Delete dummy TLS key for the Prometheus hosts [labs/private] - https://gerrit.wikimedia.org/r/1018724 (https://phabricator.wikimedia.org/T360414) (owner: Andrea Denisse) [17:36:35] cloud-services-team, wikitech.wikimedia.org, Epic: Set up a bitu instance for codfw1dev - https://phabricator.wikimedia.org/T360795#9704885 (MoritzMuehlenhoff) What is the intended use case here? 
We already have a staging instance for Bitu [17:47:08] 06cloud-services-team, 10wikitech.wikimedia.org, 07Epic: Set up a bitu instance for codfw1dev - https://phabricator.wikimedia.org/T360795#9704931 (10bd808) >>! In T360795#9704885, @MoritzMuehlenhoff wrote: > What is the intended use case here? We already have a staging instance for Bitu Management of the LD... [18:01:23] 06cloud-services-team, 10wikitech.wikimedia.org, 07Epic: Set up a bitu instance for codfw1dev - https://phabricator.wikimedia.org/T360795#9705004 (10MoritzMuehlenhoff) Ah, right. Forgot about the local LDAP data base on it. [18:13:13] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node [18:16:26] !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99) [18:22:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-26 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [18:24:06] (ProbeDown) firing: (2) Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [18:29:06] (ProbeDown) resolved: (2) Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [18:36:26] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [18:43:24] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node 
(exit_code=0) [18:45:43] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node [18:47:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-26 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [18:48:55] !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99) [18:54:10] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [18:57:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-26 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [19:01:45] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [19:07:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-26 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [19:15:17] 06cloud-services-team, 10Cloud-VPS, 07Documentation, 10Puppet (Puppet 7.0): 14Update Wikitech documentation on per-project Puppet servers - 14https://phabricator.wikimedia.org/T351509#9705123 (10Andrew) 05Open→03Resolved a:03Andrew [20:05:34] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack [20:10:15] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) [21:14:13] 06cloud-services-team, 10Cloud-VPS, 10Puppet (Puppet 7.0): 14Build new Bullseye and Bookworm base images with Puppet 7 - 14https://phabricator.wikimedia.org/T351510#9705389 (10Andrew) 05Open→03Resolved a:03Andrew [21:15:28] 06cloud-services-team, 10Cloud-VPS, 10Puppet (Puppet 7.0): 14Update designate-sink cert cleaning hook to work with Puppet 7 CA changes - 14https://phabricator.wikimedia.org/T351455#9705392 
(10Andrew) 05Open→03Resolved a:03Andrew [23:04:52] 10Toolforge (Toolforge iteration 08): [builds-api, envvars-api] add oapi-codegen installation to makefile - https://phabricator.wikimedia.org/T362290 (10Raymond_Ndibe) 03NEW [23:04:55] 10Toolforge (Toolforge iteration 08): [builds-api, envvars-api] add oapi-codegen installation to makefile - https://phabricator.wikimedia.org/T362290#9705554 (10Raymond_Ndibe) a:03Raymond_Ndibe [23:13:51] 10Toolforge (Toolforge iteration 08), 13Patch-For-Review: [builds-api, envvars-api] add oapi-codegen installation to makefile - https://phabricator.wikimedia.org/T362290#9705563 (10CodeReviewBot) raymond-ndibe opened https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/86 [builds-api... [23:15:33] 10Toolforge (Toolforge iteration 08), 13Patch-For-Review: [builds-api, envvars-api] add oapi-codegen installation to makefile - https://phabricator.wikimedia.org/T362290#9705568 (10Raymond_Ndibe) 05Open→03In progress