[00:16:28] FIRING: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:21:28] RESOLVED: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:33:28] FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-64 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:38:28] RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-64 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:05:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:15:47] Cloud-VPS (Project-requests), wikimedia-irc-libera: Request creation of ircwebchat VPS project - https://phabricator.wikimedia.org/T283791#10101609 (Gryllida) Hi All @Frostly @Tgr @bd808 @Andrew @stwalkerster @Legoktm @Bstorm I was in #wikipedia-en-help today and was frustrated, kiwiirc.com button...
[02:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:00:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:08:28] FIRING: InstanceDown: Project tools instance tools-prometheus-7 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[03:10:28] Cloud-VPS (Project-requests), wikimedia-irc-libera: Request creation of ircwebchat VPS project - https://phabricator.wikimedia.org/T283791#10101627 (Andrew) > How could I re-request the vm now? You can open a new project request ticket (with a different project name, please) at https://phabricator.w...
[03:16:21] FIRING: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[03:18:28] RESOLVED: InstanceDown: Project tools instance tools-prometheus-7 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[03:21:13] Cloud-VPS (Project-requests), wikimedia-irc-libera: Request creation of ircwebchat VPS project - https://phabricator.wikimedia.org/T283791#10101628 (Gryllida) Hi @Andrew thank you for your reply. Wouldn't that start a new phab ticket and become disconnected from this ticket? Can it be reactivated fr...
[03:21:21] RESOLVED: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[04:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:00:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:09:28] FIRING: InstanceDown: Project tools instance tools-prometheus-7 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[05:15:00] FIRING: HarborProbeUnknown: Harbor might be down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborProbeUnknown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborProbeUnknown
[05:15:00] FIRING: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[05:20:00] RESOLVED: HarborProbeUnknown: Harbor might be down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborProbeUnknown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborProbeUnknown
[05:20:00] RESOLVED: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[06:14:28] RESOLVED: InstanceDown: Project tools instance tools-prometheus-7 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[07:59:19] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component ingress-nginx
[08:00:33] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component ingress-nginx
[08:17:53] Cloud-VPS (Project-requests), wikimedia-irc-libera: Request creation of ircwebchat VPS project - https://phabricator.wikimedia.org/T283791#10101950 (Aklapper) New requests require new Phab tasks. You can mention and link a task in/from another task.
[08:20:08] (CR) David Caro: [C:+2] openstack: security and server group list [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1067227 (owner: David Caro)
[08:23:45] (Merged) jenkins-bot: openstack: security and server group list [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1067227 (owner: David Caro)
[08:26:52] cloud-services-team, Toolforge (Toolforge iteration 14): [infra,k8s] Upgrade Tools to k8s version 1.26 - https://phabricator.wikimedia.org/T370249#10101967 (Slst2020)
[08:40:10] Toolforge: Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10102002 (dcaro) I think that's because there's no `Procfile` there. The ffmpeg fix is bound to the buildpack that parses the Procfile right now (I'll see if I can change it to use the Aptfil...
[08:43:41] Toolforge (Toolforge iteration 14): Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10102031 (dcaro) p:Triage→High
[08:43:43] Toolforge (Toolforge iteration 14): Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10102034 (dcaro) a:dcaro
[08:43:56] (open) dcaro: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565)
[08:44:01] (update) sstefanova: utils: add components to get_versions [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/492
[08:44:03] (merge) sstefanova: utils: add components to get_versions [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/492
[08:46:55] (update) sstefanova: bump ingress-nginx to v1.11.2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/491 (https://phabricator.wikimedia.org/T373043)
[08:49:36] Grid-Engine-to-K8s-Migration, Wiki-Loves-Monuments-Database, Patch-For-Review: Migrate heritage from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319787#10102043 (dcaro) >>! In T319787#10101032, @JeanFred wrote: >>>! In T319787#10091157, @dcaro wrote: >> * You can...
[08:49:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[08:54:01] (update) sstefanova: bump ingress-nginx to v1.11.2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/491 (https://phabricator.wikimedia.org/T373043)
[09:04:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:39:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance cloudinfra-idp-1 in project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[10:47:07] Toolforge (Toolforge iteration 14), Patch-For-Review: Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10102358 (dcaro) That's not the *only* issue though, looking, I suspect that the other packages you have in the Aptfile might be colliding with...
[10:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:00:20] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "etytree" project Buster deprecation - https://phabricator.wikimedia.org/T367529#10102406 (Epantaleo) Hello, how can I access the etytree-a VM that was shut down? Can I just launch it from horizon? I need to copy folder /srv/datasets/dbnary/20170920/ Thanks, EP
[11:00:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:15:06] (approved) sstefanova: bump ingress-nginx to v1.11.2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/491 (https://phabricator.wikimedia.org/T373043)
[12:15:09] (merge) sstefanova: bump ingress-nginx to v1.11.2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/491 (https://phabricator.wikimedia.org/T373043)
[12:15:10] (update) sstefanova: bump ingress-nginx to v1.11.2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/491 (https://phabricator.wikimedia.org/T373043)
[13:21:34] Toolforge: [k8s, kube-proxy] "udpIdleTimeout" KubeProxyConfiguration deprecation - https://phabricator.wikimedia.org/T373537#10102687 (Slst2020)
[13:33:57] Toolforge: [k8s, cookbooks] Transient error during Toolsbeta k8s 1.25 -> 1.26 upgrade - https://phabricator.wikimedia.org/T373533#10102712 (Slst2020)
[13:37:42] (update) dcaro: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565)
[14:07:59] (update) sstefanova: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565) (owner: dcaro)
[14:07:59] (approved) sstefanova: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565) (owner: dcaro)
[14:19:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:29:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:35:25] cloud-services-team: PuppetFailure Puppet failure on cloudcontrol2004-dev:9100 - https://phabricator.wikimedia.org/T373547#10103055 (Andrew) Open→Resolved a:Andrew
[14:43:01] (approved) dcaro: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565)
[14:43:05] (merge) dcaro: fix_imagemagick: bind to the apt buildpack [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/60 (https://phabricator.wikimedia.org/T373565)
[14:44:37] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: builds-builder: bump to 0.0.116-20240829144318-0543cb47 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/493 (https://phabricator.wikimedia.org/T373565)
[14:45:46] cloud-services-team, wikitech.wikimedia.org, LDAP: Striker: use idm for 2fa validation instead of wikitech - https://phabricator.wikimedia.org/T373461#10103093 (Andrew) p:Triage→Low I'm not quite ready to close this as invalid but I'm dropping the priority since we are probably not doing it!
[14:45:59] cloud-services-team, wikitech.wikimedia.org, LDAP, Patch-For-Review: Horizon: use idm for 2fa validation instead of wikitech - https://phabricator.wikimedia.org/T373462#10103090 (Andrew) p:Triage→High a:Andrew
[14:46:58] cloud-services-team, wikitech.wikimedia.org, LDAP, Patch-For-Review: Horizon: use idm for 2fa validation instead of wikitech - https://phabricator.wikimedia.org/T373462#10103095 (Andrew) p:High→Low We are probably skipping ahead to idp auth.
[14:46:59] cloud-services-team: SystemdUnitDown Unit backup_vms.service on node cloudbackup1003 has been down for long. - https://phabricator.wikimedia.org/T373292#10103100 (Andrew) Open→Resolved a:Andrew I believe this was caused by some cinder images in an inconsistent state (they were there and not...
[14:52:05] cloud-services-team: NovafullstackSustainedFailures The automated tests were unable to create, provision and decommission a VM in the last 5h - https://phabricator.wikimedia.org/T373155#10103114 (Andrew) Open→Resolved a:Andrew This was a side-effect of an upgrade in progress, now complete an...
[14:52:34] cloud-services-team: SystemdUnitDown Unit neutron-openvswitch-agent.service on node cloudvirt1062 has been down for long. - https://phabricator.wikimedia.org/T373214#10103118 (Andrew) Open→Resolved a:Andrew Side effect of an overly-zealous upgrade cookbook, now resolved. That host doesn't a...
[14:53:13] cloud-services-team, Cloud-VPS: wmcs.ceph.osd.bootstrap_and_add cookbook should add fewer osds at once - https://phabricator.wikimedia.org/T372821#10103121 (Andrew) a:Andrew
[15:02:29] wikitech.wikimedia.org, MW-on-K8s, serviceops: Memcached: Migrate wikitech to main memcached - https://phabricator.wikimedia.org/T371608#10103142 (jijiki)
[15:09:17] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: MVP: Privately serve wikitech via mwdebug1001 - https://phabricator.wikimedia.org/T371537#10103156 (jijiki)
[15:10:11] FIRING: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[15:13:43] cloud-services-team, Cloud-VPS, Beta-Cluster-Infrastructure, Patch-For-Review: Provisioning of Kubernetes cluster via Magnum stopped working around time of OpenStack upgrade - https://phabricator.wikimedia.org/T373227#10103203 (Andrew) Open→Resolved
[15:15:11] RESOLVED: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[15:17:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tools-prometheus-6 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[15:20:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:30:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:32:06] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: MVP: Privately serve wikitech via mwdebug1001 - https://phabricator.wikimedia.org/T371537#10103278 (jijiki)
[15:32:24] wikitech.wikimedia.org, MW-on-K8s, serviceops: Memcached: Migrate wikitech to main memcached - https://phabricator.wikimedia.org/T371608#10103286 (jijiki)
[15:32:28] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: mediawiki-config: consolidate labswiki - https://phabricator.wikimedia.org/T371374#10103287 (jijiki)
[15:32:38] wikitech.wikimedia.org, MW-on-K8s, serviceops: ☂ Migrate Wikitech to Kubernetes - https://phabricator.wikimedia.org/T292707#10103288 (jijiki)
[15:32:53] cloud-services-team, wikitech.wikimedia.org, Infrastructure-Foundations, serviceops, Patch-For-Review: LdapAuthentication: Disable extension from Wikitech - https://phabricator.wikimedia.org/T371592#10103309 (jijiki)
[15:32:59] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: mediawiki-config: consolidate labswiki - https://phabricator.wikimedia.org/T371374#10103310 (jijiki)
[15:33:43] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: mediawiki-config: consolidate labswiki - https://phabricator.wikimedia.org/T371374#10103316 (jijiki)
[15:33:50] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: MVP: Privately serve wikitech via mwdebug1001 - https://phabricator.wikimedia.org/T371537#10103273 (jijiki)
[15:38:12] wikitech.wikimedia.org, MW-on-K8s, serviceops: ☂ Migrate Wikitech to Kubernetes - https://phabricator.wikimedia.org/T292707#10103341 (jijiki)
[15:46:34] wikitech.wikimedia.org, MW-on-K8s, serviceops: Add a banner on wikitech regarding upcoming authentication changes - https://phabricator.wikimedia.org/T373615 (jijiki) NEW
[16:08:25] wikitech.wikimedia.org, MW-on-K8s, serviceops: Add a banner on wikitech regarding upcoming authentication changes - https://phabricator.wikimedia.org/T373615#10103490 (bd808) Happy to help with this as needed. https://wikitech.wikimedia.org/wiki/MediaWiki:Sitenotice is the common method of running th...
[16:19:11] Toolforge (Toolforge iteration 14), Patch-For-Review: Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10103515 (dcaro) >>! In T373565#10102358, @dcaro wrote: > That's not the *only* issue though, looking, I suspect that the other packages you hav...
[16:19:46] !log dcaro@urcuchillay toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[16:19:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:24:38] !log dcaro@urcuchillay toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[16:24:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:26:18] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[16:26:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:29:12] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure, Patch-For-Review: Remove or replace deployment-restbase04.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation) - https://phabricator.wikimedia.org/T370460#10103567 (Jdlrobson) @Jgiannelos https://e...
[16:30:10] vivian-rook closed https://github.com/toolforge/paws/pull/451
[16:31:23] PAWS: Upgrade to k8s 1.27 - https://phabricator.wikimedia.org/T373372#10103575 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/451
[16:31:58] PAWS: Upgrade to k8s 1.27 - https://phabricator.wikimedia.org/T373372#10103576 (rook) Open→Resolved a:rook
[16:32:34] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[16:32:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:43:55] (approved) dcaro: builds-builder: bump to 0.0.116-20240829144318-0543cb47 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/493 (https://phabricator.wikimedia.org/T373565) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[16:43:57] (update) dcaro: builds-builder: bump to 0.0.116-20240829144318-0543cb47 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/493 (https://phabricator.wikimedia.org/T373565) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[16:43:58] (merge) dcaro: builds-builder: bump to 0.0.116-20240829144318-0543cb47 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/493 (https://phabricator.wikimedia.org/T373565) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[16:45:10] Toolforge (Toolforge iteration 14), Patch-For-Review: Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10103643 (dcaro) Deployed, can you try building again?
[17:12:15] (open) dcaro: local: remove requests/limits from k8s [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/494
[17:15:06] (update) dcaro: local: remove requests/limits from k8s [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/494
[17:17:14] (open) dcaro: lima-vm: use 16 cpus and remove workers [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/186
[17:20:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:30:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:48:23] Toolforge-standards-committee: Adoption request for Yapperbot - https://phabricator.wikimedia.org/T361426#10103883 (DavidTornheim) Feedback Request Service is running again. Sohom_Datta fix the problem: https://en.wikipedia.org/w/index.php?title=User_talk%3ANovem_Linguae&diff=1242910382&oldid=1242903226 I...
[17:48:46] Toolforge-standards-committee: Adoption request for Yapperbot - https://phabricator.wikimedia.org/T361426#10103893 (DavidTornheim) p:Triage→Low
[17:54:14] Toolforge (Toolforge iteration 14), Patch-For-Review: Toolforge Aptfile still not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T373565#10103917 (derenrich) i just build again with the code at i'm getting the error (when using kubectl to get into the pod) ` I have no name!@vid...
[18:15:45] (PS1) Andrew Bogott: ceph.osd.bootstrap_and_add: default to only adding 2 osds at once [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1068839 (https://phabricator.wikimedia.org/T372821)
[18:19:27] (CR) Andrew Bogott: [C:+2] ceph.osd.bootstrap_and_add: default to only adding 2 osds at once [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1068839 (https://phabricator.wikimedia.org/T372821) (owner: Andrew Bogott)
[18:23:06] (Merged) jenkins-bot: ceph.osd.bootstrap_and_add: default to only adding 2 osds at once [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1068839 (https://phabricator.wikimedia.org/T372821) (owner: Andrew Bogott)
[18:33:13] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[18:35:44] cloud-services-team, Cloud-VPS, Patch-For-Review: wmcs.ceph.osd.bootstrap_and_add cookbook should add fewer osds at once - https://phabricator.wikimedia.org/T372821#10104034 (Andrew) Open→Resolved
[18:40:07] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99)
[18:40:24] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[18:47:20] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[18:56:21] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "etytree" project Buster deprecation - https://phabricator.wikimedia.org/T367529#10104095 (Andrew) Yes, you can just restart it on horizon.
[18:57:15] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "etytree" project Buster deprecation - https://phabricator.wikimedia.org/T367529#10104096 (Andrew) btw, you might want to move that data into a cinder volume and then mount it on the new VM. That will make the copying easier and also give you plenty of space....
[19:02:24] FIRING: CephSlowOps: Ceph cluster in eqiad has 7 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[19:02:36] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T373632 (phaultfinder) NEW
[19:10:28] FIRING: [2x] InstanceDown: Project tools instance tools-k8s-worker-105 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[19:15:28] RESOLVED: [2x] InstanceDown: Project tools instance tools-k8s-worker-105 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[19:17:24] RESOLVED: CephSlowOps: Ceph cluster in eqiad has 72 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[19:31:54] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[19:57:38] Quarry, superset.wmcloud.org: Analysis and metrics collection for quarry and superset adoption - https://phabricator.wikimedia.org/T369150#10104242 (rook) Since 2024-04-02 it would appear that superset has had 103 unique users and quarry has had 917 unique users. Between the two there is an overlap of 70...
[19:59:01] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#10104246 (wiki_willy) Hi @dcaro - just following up on this to see if you were ok with shipping these WMCS drives...
[20:44:52] Toolforge-standards-committee, User-notice: Refresh membership of Toolforge standards committee - https://phabricator.wikimedia.org/T370474#10104411 (bd808)
[21:01:46] Toolforge-standards-committee, User-notice: Refresh membership of Toolforge standards committee - https://phabricator.wikimedia.org/T370474#10104443 (bd808) I have just emailed the [[https://toolsadmin.wikimedia.org/tools/id/admin|members of the admin tool]] to solicit their feedback on the candidates as...
[21:07:32] Toolforge-standards-committee, User-notice: Refresh membership of Toolforge standards committee - https://phabricator.wikimedia.org/T370474#10104453 (bd808)
[21:11:10] wikitech.wikimedia.org, MW-on-K8s, serviceops: Add a banner on wikitech regarding upcoming authentication changes - https://phabricator.wikimedia.org/T373615#10104462 (Bugreporter) Most users do not active edit Wikitech wiki. Only those who want to edit Wikitech in this short transitional period need...
[21:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:00:56] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:50:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks