[00:07:55] FIRING: MaxConntrack: Max conntrack at 81.81% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:10:50] RESOLVED: TfInfraTestDestroyFailed: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:57:55] RESOLVED: MaxConntrack: Max conntrack at 80.77% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[01:45:25] Tools, VPS-Projects: Investigate why Wikifactmine ElasticSearch has stopped - https://phabricator.wikimedia.org/T198690#10037368 (Bugreporter) Open→Declined Project no longer active.
[01:48:20] VPS-Projects, Wikidata, Wikidata-primary-sources: Request to return 405 on POST calls to SPARQL endpoint, Wikidata primary sources tool VPS project - https://phabricator.wikimedia.org/T192292#10037373 (Bugreporter) Open→Declined VPS project no longer active.
[01:51:31] VPS-Projects: Migrate packagist-mirror and php-security-checker to Debian Buster - https://phabricator.wikimedia.org/T277250#10037377 (Bugreporter) Open→Invalid Buster is deprecated, please file a new task if you want to update them to newer versions.
[02:33:46] Tools: Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10037394 (Htriedman) @Dzahn Great questions! I'm planning on rejoining the Foundation as a contractor under WME in early October. I'll be work on data products in and around the analytics inf...
[03:04:22] Tools: Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10037425 (Dzahn) Hey @Htriedman If you would like to keep ssh access to analytics infra for that transitional period you can do that as volunteer. You could start that process by emailing @KF...
[05:16:27] Tool-Kuvaton: Never ending webservice shell processes - https://phabricator.wikimedia.org/T371660 (Zache) NEW
[05:33:21] Tool-Kuvaton: Never ending webservice shell processes - https://phabricator.wikimedia.org/T371660#10037541 (JJMC89) You should be able to delete them using `kubectl delete pod`, e.g. `kubectl delete pod shell-1722481685`.
[05:37:16] Tool-Kuvaton, Toolforge: Never ending webservice shell processes - https://phabricator.wikimedia.org/T371660#10037544 (taavi)
[05:42:45] Tool-Kuvaton, Toolforge: Never ending webservice shell processes - https://phabricator.wikimedia.org/T371660#10037547 (Zache) @JJMC89 Thank you for your answer, problem solved!
[05:43:56] Tool-Kuvaton, Toolforge: Never ending webservice shell processes - https://phabricator.wikimedia.org/T371660#10037550 (Zache) Open→Resolved
[05:48:45] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Remove or replace poolcounter06.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation) - https://phabricator.wikimedia.org/T370458#10037556 (akosiaris) Found this from T332015 and a simple question on my sid...
[05:49:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:29:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:11:56] FIRING: SystemdUnitDown: The service unit rsync_enterprise_htmldumps.service is in failed status on host clouddumps1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:38:34] FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.996% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:06:56] FIRING: SystemdUnitDown: The systemd unit rsync_enterprise_htmldumps.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[12:07:01] cloud-services-team: SystemdUnitDown Unit rsync_enterprise_htmldumps.service on node clouddumps1001 has been down for long. - https://phabricator.wikimedia.org/T371690 (phaultfinder) NEW
[12:47:37] cloud-services-team (FY2023/2024-Q3-Q4), Quarry: Allow Quarry to query its own database - https://phabricator.wikimedia.org/T367415#10038538 (fnegri) @bd808 I'm interested in your opinion on this one. I created a pull request, but I'm also wondering if anybody still wants it and maybe we should just igno...
[12:51:28] RESOLVED: PuppetSyncFailure: Failed to update Puppet repository /srv/git/operations/puppet on instance gitlab-runners-puppetserver-01 in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[13:11:57] cloud-services-team, Wikidata, Wikidata Analytics (Kanban): Delete the wmdeanalytics Cloud VPS machine - https://phabricator.wikimedia.org/T371696 (AndrewTavis_WMDE) NEW
[13:14:28] FIRING: PuppetSyncFailure: Failed to update Puppet repository /srv/git/operations/puppet on instance gitlab-runners-puppetserver-01 in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[13:38:56] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: MVP: Privately serve wikitech via mw-on-k8s - https://phabricator.wikimedia.org/T371537#10038647 (jijiki)
[14:38:49] FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.587% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[14:44:25] cloud-services-team, Wikidata, Wikidata Analytics: Delete the wmdeanalytics Cloud VPS machine - https://phabricator.wikimedia.org/T371696#10038869 (AndrewTavis_WMDE)
[15:21:41] (PS2) Krinkle: frontend: Enable php-opcache, Debian 11 Bullseye to 12 Bookworm, PHP 8.1 to 8.3 [labs/codesearch] - https://gerrit.wikimedia.org/r/1058203
[15:32:10] Tools: Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10038962 (Htriedman) Just sent @KFrancis an email, and I can tag @CDanis here (picked a random engineer from the infra foundations team page), just to bring this to his attention!
[15:43:13] Tools, Infrastructure-Foundations: Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10039024 (CDanis)
[15:43:30] Tools, Infrastructure-Foundations: Requested offboarding-to-volunteer of HTriedman // Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10039029 (CDanis)
[15:46:56] RESOLVED: SystemdUnitDown: The service unit rsync_enterprise_htmldumps.service is in failed status on host clouddumps1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[15:46:56] RESOLVED: SystemdUnitDown: The systemd unit rsync_enterprise_htmldumps.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[16:00:55] cloud-services-team, Cloud-VPS: Update designate sink plugins to work with caracal - https://phabricator.wikimedia.org/T371707 (Andrew) NEW
[16:01:36] cloud-services-team, Cloud-VPS: Update designate sink plugins to work with caracal - https://phabricator.wikimedia.org/T371707#10039072 (Andrew)
[16:15:34] cloud-services-team, Cloud-VPS: Update designate sink plugins to work with caracal - https://phabricator.wikimedia.org/T371707#10039088 (Andrew) Oh, good news, I already did this all but one (possibly tricky) case.
[16:18:22] cloud-services-team, wikitech.wikimedia.org, Trust-and-Safety: Account recovery help needed for Developer account tjones - https://phabricator.wikimedia.org/T371709 (TJones) NEW
[16:19:34] cloud-services-team, wikitech.wikimedia.org, Trust-and-Safety: Account recovery help needed for Developer account tjones - https://phabricator.wikimedia.org/T371709#10039113 (TJones)
[16:29:04] cloud-services-team, wikitech.wikimedia.org, Trust-and-Safety: Account recovery help needed for Developer account tjones - https://phabricator.wikimedia.org/T371709#10039120 (Andrew) Open→Resolved a:Andrew OK, done.
[16:31:44] cloud-services-team, wikitech.wikimedia.org, Trust-and-Safety: Account recovery help needed for Developer account tjones - https://phabricator.wikimedia.org/T371709#10039132 (TJones) Thanks!
[16:53:34] RESOLVED: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.984% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:05:35] Tool-yearinreview, New-Engagement-Experiments: Improve the year in review tool for next year - https://phabricator.wikimedia.org/T362898#10039200 (Jdlrobson) >>! In T362898#10020315, @Reputation22 wrote: > Hi @Jdlrobson Currently there are multiple copies of the same repo! > Should we merge them all in o...
[17:40:40] Tools: [qrcode-generator] QR code generation image download failing - https://phabricator.wikimedia.org/T371715 (ttaylor) NEW
[17:49:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:59:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:09:48] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), video2commons: Replace or remove Debian Buster VMs in 'video' cloud-vps project - https://phabricator.wikimedia.org/T360711#10039324 (Don-vip) In progress→Resolved Update completed!
[18:11:23] Cloud-VPS (Quota-requests), video2commons: Request: add +16 cpu / +16 Gb ram to video project quota - https://phabricator.wikimedia.org/T371047#10039326 (Don-vip) Thank you! I need this quota even after the migration to fix T365154: one of the many errors causing the permanent outages was too many ce...
[18:30:13] Cloud-VPS (Debian Buster Deprecation), Infrastructure-Foundations, Puppet CI, Patch-For-Review: Cloud VPS "puppet-diffs" project Buster deprecation - https://phabricator.wikimedia.org/T367547#10039347 (Dzahn) >>! In T367547#10015192, @jhathaway wrote: > I think this code creates a chicken and egg...
[19:19:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:59:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks