[00:16:28] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:21:28] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:30:37] (CR) BryanDavis: [C: -2] "Done" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[00:33:54] (CR) BryanDavis: [C: -2] "test" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[01:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[02:46:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[04:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[06:38:58] Toolforge (Quota-requests): Request increased quota for pm20-* Toolforge tool - https://phabricator.wikimedia.org/T359785 (Jneubert) NEW
[06:39:25] Toolforge (Quota-requests): Request increased quota for pm20-* Toolforge tool - https://phabricator.wikimedia.org/T359785#9618891 (Jneubert)
[06:46:56] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[09:11:04] Grid-Engine-to-K8s-Migration: Migrate ganfilter from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T357554#9619091 (dcaro) >>! In T357554#9617905, @coldchrist wrote: > The bot is now running under the toolforge command suggested at the VPT thread. I'll work on some other c...
[09:31:15] Toolforge: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T359676#9619164 (dcaro) a:dcaro
[09:31:29] Toolforge (Toolforge iteration 07): New upstream release for Pywikibot - https://phabricator.wikimedia.org/T359676#9619166 (dcaro)
[09:32:23] Toolforge (Toolforge iteration 07): New upstream release for Pywikibot - https://phabricator.wikimedia.org/T359676#9619167 (dcaro) Open→In progress
[10:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[10:29:07] Toolforge (Toolforge iteration 07): New upstream release for Pywikibot - https://phabricator.wikimedia.org/T359676#9619276 (dcaro) Done: ` [step-results] 2024-03-11T10:14:38.043100355Z Built image tools-harbor.wmcloud.org/tool-pywikibot/pywikibot-scripts-stable:latest@sha256:ac606c940e9488670af9899b986d00d97...
[10:29:56] Toolforge (Toolforge iteration 07): New upstream release for Pywikibot - https://phabricator.wikimedia.org/T359676#9619278 (dcaro) In progress→Resolved
[10:30:09] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: refresh kube-state-metrics - https://phabricator.wikimedia.org/T359798 (aborrero) NEW
[10:30:13] Toolforge (Quota-requests): Request increased memory quota for wd-shex-infer Toolforge tool - https://phabricator.wikimedia.org/T357209#9619293 (dcaro)
[10:30:22] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651#9619296 (dcaro)
[10:30:46] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, User-dcaro: [maintain-kubeusers] Allow setting the requests cpu and mem quota - https://phabricator.wikimedia.org/T357881#9619292 (dcaro) Declined→R...
[10:30:51] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: toolforge: upgrade k8s etcd nodes to debian bookworm - https://phabricator.wikimedia.org/T359620#9619295 (dcaro) duplicate→Resolved
[10:30:55] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, User-dcaro: [maintain-kubeusers] Allow setting the requests cpu and mem quota - https://phabricator.wikimedia.org/T357881#9619304 (taavi) Resolved→D...
[10:30:57] Toolforge (Quota-requests): Request increased memory quota for wd-shex-infer Toolforge tool - https://phabricator.wikimedia.org/T357209#9619305 (taavi)
[10:32:58] Toolforge (Toolforge iteration 07), Patch-For-Review: [jobs-cli,jobs-api] Allow using file logs with build service images - https://phabricator.wikimedia.org/T353537#9619301 (dcaro) In progress→Resolved
[10:35:45] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: refresh kube-state-metrics - https://phabricator.wikimedia.org/T359798#9619328 (aborrero)
[10:36:08] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: refresh kube-state-metrics version for k8s 1.24 - https://phabricator.wikimedia.org/T359798#9619329 (aborrero)
[10:36:54] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651#9619334 (aborrero)
[10:38:25] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: refresh kube-state-metrics version for k8s 1.24 - https://phabricator.wikimedia.org/T359798#9619332 (aborrero) Open→In progress p:Triage→Medium
[10:46:32] Toolforge (Toolforge iteration 07), cloud-services-team, Patch-For-Review, User-aborrero: refresh kube-state-metrics version for k8s 1.24 - https://phabricator.wikimedia.org/T359798#9619337 (CodeReviewBot) aborrero opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge...
[10:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:52:18] Toolforge (Toolforge iteration 07): [harbor, maintain-harbor] We seem to be cleaning up image tags that should not be cleaned up for the toolforge project - https://phabricator.wikimedia.org/T359052#9619350 (dcaro) In progress→Resolved
[11:03:04] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9619368 (taavi)
[11:09:12] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9619387 (dcaro) > I built projects again, but after that both tools return the same error Could not find file '/workspace/pages-wo-iwiki.html' After several tries, I stopped...
[11:16:27] Toolforge: [jobs-api] Store user specified command in a label or similar - https://phabricator.wikimedia.org/T359650#9619392 (dcaro) I remember crossplane, we don't really need a whole new framewrok, just a database to store the actual abstractions we have (that's what the custom resource on k8s would become...
[11:18:42] Toolforge, Epic, User-Raymond_Ndibe: Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619395 (dcaro)
[11:18:46] Toolforge: [jobs-api] Store user specified command in a label or similar - https://phabricator.wikimedia.org/T359650#9619396 (dcaro)
[11:21:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:25:16] Toolforge, Epic, User-Raymond_Ndibe: Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619399 (dcaro)
[11:26:42] (CloudVPSDesignateLeaks) resolved: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:41:45] Toolforge, User-Raymond_Ndibe: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804 (dcaro) NEW p:Triage→High
[11:42:08] Toolforge, Epic, User-Raymond_Ndibe: seperate jobs-framework k8s object templates from code - https://phabricator.wikimedia.org/T358815#9619469 (dcaro)
[11:42:12] Toolforge, User-Raymond_Ndibe: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804#9619470 (dcaro)
[11:42:17] Toolforge, Epic, User-Raymond_Ndibe: Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619471 (dcaro)
[11:43:17] Toolforge, Epic, User-Raymond_Ndibe: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619475 (dcaro)
[11:43:18] Toolforge: [jobs-api] Store user specified command in a label
or similar - https://phabricator.wikimedia.org/T359650#9619476 (dcaro)
[11:43:20] Toolforge, User-Raymond_Ndibe: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804#9619477 (dcaro)
[11:43:28] Toolforge, Epic, User-Raymond_Ndibe: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619478 (dcaro)
[11:43:47] Toolforge, User-Raymond_Ndibe: [jobs-api] seperate jobs-framework k8s object templates from code - https://phabricator.wikimedia.org/T358815#9619472 (dcaro) p:Triage→High
[11:45:33] Toolforge: [jobs-api] Save business models in a DB - https://phabricator.wikimedia.org/T359650#9619489 (dcaro)
[11:45:37] Toolforge: [jobs-api] Save business models in a DB - https://phabricator.wikimedia.org/T359650#9619485 (dcaro)
[11:48:09] Toolforge (Toolforge iteration 07), Toolforge Jobs framework, Patch-For-Review: [jobs-api,jobs-cli] Support job health checks - https://phabricator.wikimedia.org/T335592#9619504 (dcaro)
[11:48:33] Toolforge (Toolforge iteration 07), Patch-For-Review: [jobs-api,jobs-cli] Support job health checks - https://phabricator.wikimedia.org/T335592#9619505 (dcaro)
[11:50:15] Toolforge: [jobs-api] Remove flask-restful - https://phabricator.wikimedia.org/T359806 (dcaro) NEW
[11:57:56] Toolforge, User-Raymond_Ndibe: [jobs-api] Split the API, business, and k8s models - https://phabricator.wikimedia.org/T359808 (dcaro) NEW
[11:58:32] Toolforge, User-Raymond_Ndibe: [jobs-api] Split the API, business, and k8s models - https://phabricator.wikimedia.org/T359808#9619544 (dcaro) a:Raymond_Ndibe→None
[11:59:18] Toolforge Jobs framework, User-aborrero: [jobs-api,jobs-cli] Support services in jobs - https://phabricator.wikimedia.org/T348758#9619545 (dcaro)
[11:59:22] Toolforge, User-Raymond_Ndibe: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804#9619546
(dcaro)
[12:01:15] (CR) Majavah: [C: +2] Require SUL/Phab links before applying for access [labs/striker] - https://gerrit.wikimedia.org/r/1008960 (https://phabricator.wikimedia.org/T172899) (owner: Majavah)
[12:03:04] Toolforge: [jobs-api,jobs-cli] Support multiple replicas of continuous jobs - https://phabricator.wikimedia.org/T341066#9619556 (dcaro) p:Medium→High
[12:03:52] (Merged) jenkins-bot: Require SUL/Phab links before applying for access [labs/striker] - https://gerrit.wikimedia.org/r/1008960 (https://phabricator.wikimedia.org/T172899) (owner: Majavah)
[12:04:07] Toolforge: [jobs-api,jobs-cli] Support multiple replicas of continuous jobs - https://phabricator.wikimedia.org/T341066#9619550 (dcaro) >>! In T341066#9474531, @Tacsipacsi wrote: > I committed https://github.com/toollabs/Rotatebot/commit/8938d15165acc2c8cd4689da48a61dbeb84b1d80 on the assumption that this wo...
[12:04:17] Toolforge: [jobs-api,jobs-cli] Support multiple replicas of continuous jobs - https://phabricator.wikimedia.org/T341066#9619551 (dcaro) p:Triage→Medium
[12:06:04] Toolforge: [jobs-api] Save business models in a DB - https://phabricator.wikimedia.org/T359650#9619562 (aborrero) I would be happy to talk about this re-architecture idea. I can share a bit more info about what I tested in the past, and what architecture I had in mind when I first created this, although the...
[12:07:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:11:56] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Are clouddb-wikireplicas-query-1 and the cloudb-services project still useful?
- https://phabricator.wikimedia.org/T359810 (Andrew) NEW
[12:12:19] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Are clouddb-wikireplicas-query-1 and the cloudb-services project still useful? - https://phabricator.wikimedia.org/T359810#9619594 (taavi) a:taavi→None
[12:12:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:14:27] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Are clouddb-wikireplicas-query-1 and the cloudb-services project still useful? - https://phabricator.wikimedia.org/T359810#9619596 (Andrew) Pinging @dr0ptp4kt as I heard a rumor that he might know something.
[12:14:52] Toolforge: [jobs-api] Save business models in a DB - https://phabricator.wikimedia.org/T359650#9619597 (dcaro) >>! In T359650#9619562, @aborrero wrote: > I would be happy to talk about this re-architecture idea. I can share a bit more info about what I tested in the past, and what architecture I had in mind...
[12:19:20] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9619600 (Andrew)
[12:21:25] Striker, Patch-For-Review: Require a Phabricator account as a prerequisite to getting Toolforge access - https://phabricator.wikimedia.org/T172899#9619607 (taavi) Open→Resolved
[12:35:55] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9619673 (Andrew)
[12:36:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance paws-puppetserver-1 in project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:52:17] Toolforge, Epic, User-Raymond_Ndibe: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#9619707 (dcaro)
[12:52:19] Toolforge: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version - https://phabricator.wikimedia.org/T359649#9619708 (dcaro)
[12:52:48] Toolforge, User-Raymond_Ndibe: [jobs-api] Split the API, business, and k8s models - https://phabricator.wikimedia.org/T359808#9619709 (dcaro) p:Triage→High
[12:52:57] Toolforge: [jobs-api] Remove flask-restful - https://phabricator.wikimedia.org/T359806#9619711 (dcaro) p:Triage→High
[12:53:38] Toolforge, User-aborrero: [jobs-api,jobs-cli] Support services in jobs - https://phabricator.wikimedia.org/T348758#9619714 (dcaro)
[12:58:58] dcaro: The session I followed today was courtesy of a link in slack.
i don't think it was the same as on the calendar because they used the same google session for multiple conference sessions
[12:59:20] Hm, now I'm in the wrong channel too
[13:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[13:01:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance paws-puppetserver-1 in project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[13:10:48] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9619744 (Andrew)
[13:10:53] Toolforge, cloud-services-team, User-aborrero: toolforge: automate docker image caching workflow - https://phabricator.wikimedia.org/T359816 (aborrero) NEW
[13:13:19] Cloud Services Proposals, Toolforge, User-aborrero: Decision request - Toolforge external infrastructure domain usage - https://phabricator.wikimedia.org/T306039#9619765 (taavi) Ok, so it seems like we agree that we want to go for the something `[something].toolforge.org` route, and `svc.toolforge.or...
[13:13:42] Toolforge, cloud-services-team, User-aborrero: toolforge: automate docker image caching workflow - https://phabricator.wikimedia.org/T359816#9619758 (aborrero) p:Triage→Low
[13:20:37] Toolforge: [jobs-api,jobs-cli] Show a job status when a job is being deleted - https://phabricator.wikimedia.org/T348242#9619788 (dcaro) p:Triage→Low
[13:20:41] Toolforge: [jobs-api,jobs-cli] Show a job status when a job is being deleted - https://phabricator.wikimedia.org/T348242#9619790 (dcaro)
[13:20:48] Toolforge Jobs framework, cloud-services-team, User-aborrero: toolforge lima-kilo: toolforge-jobs fails to run because it can't create log files - https://phabricator.wikimedia.org/T338153#9619793 (dcaro) This seems not to be an issue anymore, just tested: ` local.tf-test@lima-lima-kilo:~$ toolforge...
[13:21:02] Toolforge, cloud-services-team, Patch-For-Review: Toolforge: improve local kubernetes development setup - https://phabricator.wikimedia.org/T326789#9619797 (dcaro)
[13:21:42] Toolforge Jobs framework, cloud-services-team, User-aborrero: toolforge lima-kilo: toolforge-jobs fails to run because it can't create log files - https://phabricator.wikimedia.org/T338153#9619795 (dcaro) Open→Resolved a:dcaro
[13:24:25] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9619832 (Andrew)
[13:30:20] Toolforge Jobs framework, cloud-services-team, User-aborrero: toolforge lima-kilo: toolforge-jobs fails to run because it can't create log files - https://phabricator.wikimedia.org/T338153#9619844 (aborrero) >>! In T338153#9619793, @dcaro wrote: > > It's interesting that they are created as root tho...
[13:35:40] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9619879 (taavi)
[13:40:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:41:26] Toolforge, cloud-services-team, User-aborrero: [lima-kilo,jobs-api,infra] tool jobs run as root user in lima-kilo environment - https://phabricator.wikimedia.org/T346738#9619906 (dcaro)
[13:41:46] Toolforge, cloud-services-team, User-aborrero: [lima-kilo,jobs-api,infra] tool jobs run as root user in lima-kilo environment - https://phabricator.wikimedia.org/T346738#9619910 (dcaro) p:Triage→Low
[13:45:41] (CloudVPSDesignateLeaks) firing: (4) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:51:46] VPS-Projects, cloud-services-team, Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9619987 (Andrew)
[14:06:27] Toolforge, User-aborrero: Allow exporting jobs list in YAML format - https://phabricator.wikimedia.org/T320575#9620078 (dcaro) This should be relatively easy to implement, we will move to a different format (that will include other
things like webservices, components, and such), but might take a while to...
[14:06:32] Toolforge, User-aborrero: [jobs-cli] Allow exporting jobs list in YAML format - https://phabricator.wikimedia.org/T320575#9620080 (dcaro)
[14:07:34] Toolforge, User-aborrero: [jobs-cli] Allow exporting jobs list in YAML format - https://phabricator.wikimedia.org/T320575#9620081 (dcaro)
[14:16:10] Toolforge: Use a higher `startingDeadlineSeconds` for less frequent jobs - https://phabricator.wikimedia.org/T338134#9620105 (dcaro) p:Triage→Medium
[14:20:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:20:56] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:21:30] Toolforge, Kubernetes: Transient cronjob scheduling failures on Toolforge k8s - https://phabricator.wikimedia.org/T338006#9620120 (dcaro)
[14:21:37] Toolforge, Kubernetes: Transient cronjob scheduling failures on Toolforge k8s - https://phabricator.wikimedia.org/T338006#9620117 (dcaro) p:Triage→Medium
[14:21:52] Toolforge, Kubernetes: [jobs-api,jobs-cli,infra] Transient cronjob scheduling failures on Toolforge k8s - https://phabricator.wikimedia.org/T338006#9620122 (dcaro)
[14:23:39] Toolforge: [jobs-api] load restarting jobs even with no change - https://phabricator.wikimedia.org/T335664#9620123 (dcaro)
[14:25:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records -
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:26:22] Toolforge: [jobs-api] load restarting jobs even with no change - https://phabricator.wikimedia.org/T335664#9620140 (dcaro) Open→Resolved a:dcaro As I understand, this has been fixed right? Please reopen if it's not the case, thanks!
[14:26:35] Toolforge: Expose hidden quota errors more clearly - https://phabricator.wikimedia.org/T333976#9620143 (dcaro) p:Triage→Medium
[14:26:46] Toolforge: [jobs-api,jobs-cli] Expose hidden quota errors more clearly - https://phabricator.wikimedia.org/T333976#9620145 (dcaro)
[14:29:39] Toolforge, cloud-services-team: [jobs-cli,jobs-api] make API and CLI key/values coherent - https://phabricator.wikimedia.org/T327280#9620147 (dcaro)
[14:30:41] (CloudVPSDesignateLeaks) resolved: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:31:50] Toolforge, cloud-services-team: [jobs-cli,jobs-api] make API and CLI key/values coherent - https://phabricator.wikimedia.org/T327280#9620158 (dcaro) p:Triage→Low
[14:33:07] Toolforge, cloud-services-team: [jobs-api,jobs-emailer] Prometheus monitoring toolforge-jobs server side components - https://phabricator.wikimedia.org/T320284#9620163 (dcaro) p:Triage→High
[14:34:10] Toolforge (Toolforge iteration 07): [Toolforge CLI consolidation] Explore OpenAPI SDK tooling - https://phabricator.wikimedia.org/T356261#9620171 (Slst2020) @dcaro What is our plan for creating the Toolforge CLI from the autogenerated SDK? Manually? Automatically with a post-generation script?
[14:37:54] Toolforge Jobs framework, cloud-services-team, Pywikibot, User-Raymond_Ndibe: Add config sub-parser to toolforge-jobs command - https://phabricator.wikimedia.org/T316166#9620177 (dcaro) Open→Resolved I think that nowadays the buildservice pywikibot image replaces this, please reopen if I'...
[14:42:25] Toolforge Jobs framework, cloud-services-team, Pywikibot, User-Raymond_Ndibe: Add config sub-parser to toolforge-jobs command - https://phabricator.wikimedia.org/T316166#9620188 (taavi) Resolved→Declined
[14:45:21] Toolforge (Toolforge iteration 07): [Toolforge CLI consolidation] Explore OpenAPI SDK tooling - https://phabricator.wikimedia.org/T356261#9620196 (dcaro) >>! In T356261#9620170, @Slst2020 wrote: > @dcaro What is our plan for creating the Toolforge CLI from the autogenerated SDK? Manually? Automatically with...
[14:48:28] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Design: [Design] Update wireframes with user testing learnings - https://phabricator.wikimedia.org/T359827 (KColeman-WMF) NEW
[14:55:10] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, and 2 others: [Design] Synthesise user testing results - https://phabricator.wikimedia.org/T358098#9620253 (KColeman-WMF)
[14:55:25] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Design: [Design EPIC] Global User Contributions - https://phabricator.wikimedia.org/T349901#9620243 (KColeman-WMF)
[14:56:10] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, and 2 others: [Design] Update wireframes with user testing learnings - https://phabricator.wikimedia.org/T359827#9620261 (KColeman-WMF)
[14:57:15] Toolforge (Toolforge iteration 07): [Toolforge CLI consolidation] Explore OpenAPI SDK tooling - https://phabricator.wikimedia.org/T356261#9620268 (Slst2020) >>!
In T356261#9620196, @dcaro wrote: >>>! In T356261#9620170, @Slst2020 wrote: >> @dcaro What is our plan for creating the Toolforge CLI from the autog...
[15:05:13] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Cloud-Services-Origin-User, Cloud-Services-Worktype-Maintenance, User-dcaro: [webservice] Error shown when restarting buildpack-based tool - https://phabricator.wikimedia.org/T348312#9620347 (dcaro) Open→Inval...
[15:06:01] Toolforge (Toolforge iteration 07): [Toolforge CLI consolidation] Explore OpenAPI SDK tooling - https://phabricator.wikimedia.org/T356261#9620395 (dcaro) > Even assuming that the generated SDK does not need any manual changes, we would still have to create/modify the CLI for each change to the API. So if the...
[15:06:30] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Cloud-Services-Origin-User, Cloud-Services-Worktype-Maintenance, User-dcaro: [webservice] Error shown when restarting buildpack-based tool - https://phabricator.wikimedia.org/T348312#9620350 (dcaro) Invalid→Re...
[15:15:23] wikitech.wikimedia.org, Content-Transform-Team-WIP, DiscussionTools, Parsoid-Read-Views (Phase 1 - DiscussionTools support): Use Parsoid for DiscussionTools on wikitech - https://phabricator.wikimedia.org/T355374#9620584 (ssastry)
[15:22:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-sgecron-02 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[15:27:28] (PuppetAgentNoResources) firing: (7) No Puppet resources found on instance toolsbeta-acme-chief-01 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[16:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[16:02:41] Wikibugs, User-bd808: GitLab CI tests fail for MRs from forks because of missing secrets - https://phabricator.wikimedia.org/T358775#9620900 (bd808) https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/16 gave me a test case for the https://docs.gitlab.com/ee/ci/pipelines/merge_request...
[16:05:48] wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866 (TBurmeister) NEW
[16:06:47] cloud-services-team, wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866#9620923 (TBurmeister) p:Triage→Low
[16:08:31] cloud-services-team, wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866#9620927 (taavi)
[16:09:05] Striker: Tune gitlab container in dev environment for lower resource deployments - https://phabricator.wikimedia.org/T359867 (bd808) NEW
[16:09:46] cloud-services-team, wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866#9620929 (taavi) Open→Stalled
[16:09:51] cloud-services-team, wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866#9620941 (taavi)
[16:14:49] cloud-services-team, wikitech.wikimedia.org: Install Extension:DynamicPageList on wikitech.wikimedia.org - https://phabricator.wikimedia.org/T359866#9620955 (JJMC89)
[16:15:38] wikitech.wikimedia.org, Wikimedia-Extension-setup, Wikimedia-Site-requests: Enable dpl on https://wikitech.wikimedia.org/ - https://phabricator.wikimedia.org/T284813#9620957 (JJMC89)
[16:27:13] Toolforge (Toolforge iteration 07), User-aborrero: [toolforge] several tools get periods of connection refused (104) when connecting to wikis - https://phabricator.wikimedia.org/T356164#9621019 (dcaro)
[16:29:39] Toolforge (Toolforge iteration 07), User-aborrero: [toolforge] several tools get periods of connection refused (104) when connecting to wikis - https://phabricator.wikimedia.org/T356164#9621026 (dcaro) Open→Resolved >>!
In T356164#9559316, @aborrero wrote: > Maybe an idea: have a per-tool network... [16:29:43] 10Toolforge (Toolforge iteration 07), 15User-aborrero: [toolforge] several tools get periods of connection refused (104) when connecting to wikis - https://phabricator.wikimedia.org/T356164#9621028 (10dcaro) [16:40:41] (CloudVPSDesignateLeaks) firing: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:45:41] (CloudVPSDesignateLeaks) firing: (4) Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:45:58] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics [16:46:10] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics [16:46:17] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics [16:46:28] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics [16:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:55:30] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics [16:55:33] !log aborrero@cloudcumin1001 
toolsbeta END (ERROR) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=97) for component wmcs-k8s-metrics [16:55:44] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics [16:55:57] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics [16:56:39] 10Toolforge (Toolforge iteration 07), 06cloud-services-team, 13Patch-For-Review, 15User-aborrero: refresh kube-state-metrics version for k8s 1.24 - https://phabricator.wikimedia.org/T359798#9621088 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_re... [16:58:11] 14Toolforge Build Service: webservice buildservice not starting Rust webservice - https://phabricator.wikimedia.org/T359870 (10Magnus) 03NEW [17:00:52] 10Toolforge (Toolforge iteration 07), 06cloud-services-team, 15User-aborrero: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651#9621109 (10aborrero) [17:02:08] 10Toolforge: webservice buildservice not starting Rust webservice - https://phabricator.wikimedia.org/T359870#9621116 (10dcaro) [17:02:37] 10Toolforge (Toolforge iteration 07), 06cloud-services-team, 13Patch-For-Review, 15User-aborrero: refresh kube-state-metrics version for k8s 1.24 - https://phabricator.wikimedia.org/T359798#9621107 (10aborrero) 05In progress→03Resolved This has been deployed in toolsbeta, should be ready for tools. [17:06:21] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9621128 (10MBH) OK, that's what I know about all of this now. * After building, webservice should be stopped and started again, otherwise old compiled version of tools will wo... 
[17:07:28] (PuppetAgentNoResources) firing: (7) No Puppet resources found on instance toolsbeta-acme-chief-01 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[17:08:17] Toolforge: webservice buildservice not starting Rust webservice - https://phabricator.wikimedia.org/T359870#9621130 (taavi) Open→Resolved a:taavi `kubectl describe pod` was showing this: ` Warning Failed 7s (x3 over 28s) kubelet Error: failed to create containerd task: failed t...
[17:11:37] Toolforge: webservice buildservice not starting Rust webservice - https://phabricator.wikimedia.org/T359870#9621142 (dcaro) Just tried starting it on a test tool and it worked for me: https://dcaro-test11.toolforge.org/ ` tools.dcaro-test11@tools-sgebastion-10:~$ toolforge webservice buildservice start --mo...
[17:12:28] (PuppetAgentNoResources) firing: (7) No Puppet resources found on instance toolsbeta-acme-chief-01 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[17:18:47] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9621181 (dcaro) >>! In T319883#9621128, @MBH wrote: > OK, that's what I know about all of this now. > * After building, webservice should be stopped and started again, other...
[17:30:25] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9621213 (dcaro) Found the issue with the compilation, the logs are a bit hidden in the pile of logs as the colors get stripped out too :/ ` [step-build] 2024-03-11T17:26:31...
[17:40:21] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9621234 (dcaro) >>! In T319883#9621213, @dcaro wrote: > Found the issue with the compilation, the logs are a bit hidden in the pile of logs as the colors get stripped out to...
[17:50:32] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9621255 (bd808) >>! In T161859#9233732, @Tgr wrote: > I'm skeptical about the cost/benefit ratio of making Wikitech a CentralAuth SUL wiki. It had multiple unfortunate naming polic...
[18:04:45] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9621336 (MBH) Thanks. But I added clusters5 and test to all.sln, replaced .csproj of both projects with code from your last message, runned building and it failed with the s...
[19:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[19:01:39] (ProbeDown) firing: Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:06:39] (ProbeDown) resolved: Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:06:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[20:12:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[20:14:44] Wikibugs, User-bd808: wikibugs having a hard time staying connected to libera.chat IRC network - https://phabricator.wikimedia.org/T357729#9621722 (bd808) Stalled→Resolved Let's try calling this "fixed", or at least remediated, for the moment. We didn't have strong criteria for what "better" look...
[20:19:18] Cloud-VPS: [trove] define process for updating docker images - https://phabricator.wikimedia.org/T359534#9621732 (Andrew) The technical bits are mostly tracked here: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Trove#guest_containers. I wrote those docs assuming someone would be reading the up...
[20:33:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[20:38:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[20:42:18] Wikibugs: Convert `wikibugs2 phorge` to asyncio - https://phabricator.wikimedia.org/T359883 (bd808) NEW
[20:42:47] Wikibugs: Convert `wikibugs2 phorge` to asyncio - https://phabricator.wikimedia.org/T359883#9621813 (bd808) p:Triage→Medium
[20:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:03:01] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9621853 (Tgr) >>! In T161859#9621255, @bd808 wrote: > In my mind it just needs an account migration pretty much exactly like SUL unification did back in the day. The `~labswiki` su...
[21:23:22] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9621929 (bd808) >>! In T161859#9621853, @Tgr wrote: >>>! In T161859#9621255, @bd808 wrote: >> In my mind it just needs an account migration pretty much exactly like SUL unification...
[21:25:32] cloud-services-team, wikitech.wikimedia.org, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#9621932 (Ameisenigel) foundationwiki has been SUL-ified in T205347 and that was not such a long time ago. But Wikitech might be a bit more difficult.
[21:41:28] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[22:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[22:03:43] Wikibugs, User-bd808: Hide User-XX projects from wikibugs output - https://phabricator.wikimedia.org/T180293#9622050 (bd808) Open→In progress a:bd808
[23:12:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:18:04] Wikibugs, User-bd808: Wikibugs should not accidentally ping SREs by sending text "# page" - https://phabricator.wikimedia.org/T281105#9622154 (bd808) Open→In progress a:bd808
[23:31:40] Wikibugs, User-bd808: Use case-insensitive sort for tags added to the irc log - https://phabricator.wikimedia.org/T90339#9622162 (bd808) Open→In progress a:bd808