[01:26:35] Release-Engineering-Team (They Live 🕶️🧟), Anti-Harassment, Security-Team, iPoid-Service: Use Gitlab Security Pipeline for ipoid - https://phabricator.wikimedia.org/T338238 (Mstyles) @kostajh alternatively we can continue to use the nodejs osv security template and allow it to fail until T309772 i...
[05:04:47] Project-Admins: Create project tag for HotCat - https://phabricator.wikimedia.org/T340711 (Unite_together)
[07:48:26] Release-Engineering-Team (Radar), Developer-Advocacy, User-AKlapper: Develop a workflow to recommend / propose developers who should have +2 rights in Gerrit - https://phabricator.wikimedia.org/T199385 (Aklapper) Adding #releng-radar as #Developer-Advocacy ceased to exist.
[07:49:43] Phabricator, Developer-Advocacy, Documentation: Create second revision of Phab tutorial videos (smaller improvements etc based on feedback) - https://phabricator.wikimedia.org/T263480 (Aklapper)
[07:51:33] Release-Engineering-Team, Developer-Advocacy: [January 2024] Publish some Phabricator (and/or Gerrit) end-of-year stats for 2023 to wikitech-l@ - https://phabricator.wikimedia.org/T326562 (Aklapper) p:Lowest→Low
[07:51:44] Release-Engineering-Team, Developer-Advocacy: [January 2024] Publish some Phabricator (and/or Gerrit) end-of-year stats for 2023 to wikitech-l@ - https://phabricator.wikimedia.org/T326562 (Aklapper) Adding #Release-Engineering-Team as #Developer-Advocacy ceased to exist.
[07:59:00] Release-Engineering-Team (They Live 🕶️🧟), Anti-Harassment, Security-Team, iPoid-Service: Use Gitlab Security Pipeline for ipoid - https://phabricator.wikimedia.org/T338238 (kostajh) >>! In T338238#8975051, @Mstyles wrote: > @kostajh alternatively we can continue to use the nodejs osv security tem...
[08:09:22] hashar: hey, I won't make it to our meeting today. Sorry. Talk to you next week!
[08:11:37] duesen: sure :] have a good day!
[08:21:13] Project-Admins: Create project tag for HotCat - https://phabricator.wikimedia.org/T340711 (Peachey88) @Unite_together Just confirming this has been discussed and agreed to by the HotCat development team?
[08:23:48] Project-Admins: Create project tag for HotCat - https://phabricator.wikimedia.org/T340711 (Unite_together) @Peachey88 Not yet. I thought it's sensible because this gadget has been used widely.
[08:51:25] Project-Admins: Create project tag for HotCat - https://phabricator.wikimedia.org/T340711 (Aklapper) Open→Declined Hi, this must have agreement by the development team. Creating another place that a team is supposed to watch and where a team is supposed to track and plan its work without the team kno...
[10:13:49] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert)
[10:21:18] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (elukey) @hashar is there a clean up command that we (ML SREs) can run to help the clean up while we get bigger partition...
[10:43:57] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[10:44:34] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[10:44:44] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert) In progress→Resolved
[10:46:29] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate group1 to Kubernetes - https://phabricator.wikimedia.org/T340549 (Clement_Goubert)
[10:46:39] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[10:46:51] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate group1 to Kubernetes - https://phabricator.wikimedia.org/T340549 (Clement_Goubert) Open→In progress
[10:50:49] maintenance-disconnect-full-disks build 504275 integration-agent-docker-1038 (/: 29%, /srv: 17%, /var/lib/docker: 95%): OFFLINE due to disk space
[10:55:34] maintenance-disconnect-full-disks build 504276 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 95%): RECOVERY disk space OK
[11:06:14] maintenance-disconnect-full-disks build 504278 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 98%): OFFLINE due to disk space
[11:15:35] maintenance-disconnect-full-disks build 504280 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[11:40:32] maintenance-disconnect-full-disks build 504285 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[11:43:42] 12:34:59 1) ForeignResourceStructureTest::testVerifyIntegrity
[11:43:42] 12:34:59 Exception: Failed to download resource at https://raw.githubusercontent.com/harvesthq/chosen/v1.8.2/LICENSE.md
[11:43:44] booo
[11:48:41] URL loads for me
[11:48:44] did github have a unicorn moment?
[11:50:21] github/ci/labs/whatever
[11:50:22] yeah
[11:50:25] just +2 again :P
[12:05:34] maintenance-disconnect-full-disks build 504290 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[12:22:06] Gerrit, SRE: setup/install gerrit1001 - https://phabricator.wikimedia.org/T231046 (Jclark-ctr)
[12:31:16] maintenance-disconnect-full-disks build 504295 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[12:38:58] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (Reedy)
[13:01:30] maintenance-disconnect-full-disks build 504300 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[13:21:18] maintenance-disconnect-full-disks build 504305 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[13:39:11] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (hashar) I more or less fire fought some of them by heading to the instance and issuing `sudo docker buildx prune --force...
[13:41:29] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (hashar) >>! In T338317#8971453, @isarantopoulos wrote: > There isn't any cache in the image since only the specific file...
[13:45:42] maintenance-disconnect-full-disks build 504310 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[13:45:58] GitLab (Project Migration), Release-Engineering-Team (Priority Backlog 📥), API Platform, Anti-Harassment, and 18 others: Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (JArguello-WMF)
[13:51:37] Release-Engineering-Team, Scap, Voice & Tone: scap: inconsistent use of dashes and underscores in step names - https://phabricator.wikimedia.org/T340745 (taavi) p:Triage→Low
[13:55:12] Continuous-Integration-Infrastructure, Release-Engineering-Team, Cloud-VPS (Quota-requests): Rebuild WMCS integration instances to larger flavor - https://phabricator.wikimedia.org/T340070 (hashar) >>! In T340070#8974504, @rook wrote: > Oh I see, sorry I did not understand that you were seeking a new...
[14:00:39] GitLab (Pipeline Services Migration🐤), serviceops-collab, Patch-For-Review: Move micro sites from Ganeti to Kubernetes and from Gerrit to GitLab - https://phabricator.wikimedia.org/T300171 (LSobanski)
[14:06:31] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (hashar)
[14:06:58] Continuous-Integration-Infrastructure, Release-Engineering-Team, Cloud-VPS (Quota-requests): Rebuild WMCS integration instances to larger flavor - https://phabricator.wikimedia.org/T340070 (hashar)
[14:07:51] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (hashar) The root cause is `pytorch` creating a 14GB image layer: {T338317} The fix is to have a larger disk space on the instances: {T340070}
[14:09:12] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (hashar) I reopened it merely to investigate whether maybe pytorch layer could be shrunk somehow but that does not seem t...
[14:09:19] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (hashar) Open→Resolved
[14:10:15] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (matmarex)
[14:10:59] maintenance-disconnect-full-disks build 504315 integration-agent-docker-1030 (/: 29%, /srv: 52%, /var/lib/docker: 98%): OFFLINE due to disk space
[14:10:59] maintenance-disconnect-full-disks build 504315 integration-agent-docker-1031 (/: 29%, /srv: 33%, /var/lib/docker: 99%): OFFLINE due to disk space
[14:10:59] maintenance-disconnect-full-disks build 504315 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[14:12:11] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (Jdlrobson) Thanks for fixing!
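For anyone tallying which agents flapped, the `maintenance-disconnect-full-disks` lines in this log follow a regular shape. Below is a minimal sketch of a parser for them, assuming the format stays exactly as it appears here; the function name `parse_disk_line` and the dictionary keys are hypothetical labels, not part of the Jenkins job.

```python
import re

# Matches the bot lines as they appear in this log. The group names are
# illustrative labels chosen for this sketch, not names used by the job.
DISK_LINE_RE = re.compile(
    r"maintenance-disconnect-full-disks build (?P<build>\d+) "
    r"(?P<host>\S+) \((?P<mounts>[^)]*)\): (?P<status>.+)$"
)


def parse_disk_line(message):
    """Return build, host, per-mount usage and offline flag, or None."""
    match = DISK_LINE_RE.search(message)
    if match is None:
        return None
    usage = {}
    # Mount entries look like "/var/lib/docker: 98%".
    for part in match.group("mounts").split(", "):
        mount, _, percent = part.rpartition(": ")
        usage[mount] = int(percent.rstrip("%"))
    return {
        "build": int(match.group("build")),
        "host": match.group("host"),
        "usage": usage,
        # "OFFLINE" and "still OFFLINE" both count as offline.
        "offline": "OFFLINE" in match.group("status"),
    }
```

The offline flag is read from the status text only. The log shows the same percentage reported as both OFFLINE and RECOVERY, so the sketch does not try to infer a threshold.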
[14:15:51] maintenance-disconnect-full-disks build 504316 integration-agent-docker-1030 (/: 29%, /srv: 39%, /var/lib/docker: 97%): RECOVERY disk space OK
[14:25:55] maintenance-disconnect-full-disks build 504318 integration-agent-docker-1031 (/: 29%, /srv: 19%, /var/lib/docker: 97%): RECOVERY disk space OK
[14:30:44] maintenance-disconnect-full-disks build 504319 integration-agent-docker-1031 (/: 29%, /srv: 30%, /var/lib/docker: 97%): OFFLINE due to disk space
[14:35:46] maintenance-disconnect-full-disks build 504320 integration-agent-docker-1031 (/: 29%, /srv: 19%, /var/lib/docker: 97%): RECOVERY disk space OK
[14:35:46] maintenance-disconnect-full-disks build 504320 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[14:45:52] maintenance-disconnect-full-disks build 504322 integration-agent-docker-1030 (/: 29%, /srv: 56%, /var/lib/docker: 100%): OFFLINE due to disk space
[14:51:12] maintenance-disconnect-full-disks build 504323 integration-agent-docker-1030 (/: 29%, /srv: 53%, /var/lib/docker: 96%): RECOVERY disk space OK
[14:55:53] maintenance-disconnect-full-disks build 504324 integration-agent-docker-1031 (/: 29%, /srv: 29%, /var/lib/docker: 97%): OFFLINE due to disk space
[15:00:48] maintenance-disconnect-full-disks build 504325 integration-agent-docker-1031 (/: 29%, /srv: 29%, /var/lib/docker: 97%): still OFFLINE due to disk space
[15:00:48] maintenance-disconnect-full-disks build 504325 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 100%): still OFFLINE due to disk space
[15:05:56] maintenance-disconnect-full-disks build 504326 integration-agent-docker-1031 (/: 29%, /srv: 19%, /var/lib/docker: 97%): RECOVERY disk space OK
[15:09:30] !log integration: sudo cumin --force 'name:docker' 'docker buildx prune --force'
[15:09:32] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:10:31] maintenance-disconnect-full-disks build 504327 integration-agent-docker-1038 (/: 29%, /srv: 12%, /var/lib/docker: 3%): RECOVERY disk space OK
[15:18:44] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (matmarex)
[15:20:00] !log Adding new columns for the CampaignEvents extension in beta wikishared # T340694
[15:20:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:20:02] T340694: Create new participant questions columns in beta DB - https://phabricator.wikimedia.org/T340694
[15:22:44] !log Not adding new columns because the table definition is wrong
[15:22:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:31:53] Beta-Cluster-Infrastructure, CampaignEvents, Campaign-Registration, Campaign-Tools (Campaign-Tools-Current-Sprint): Create new participant questions columns in beta DB - https://phabricator.wikimedia.org/T340694 (Daimona) The table definition is actually wrong -- the index should not be on `cep_a...
[16:10:46] Beta-Cluster-Infrastructure, CampaignEvents, Campaign-Registration, Campaign-Tools (Campaign-Tools-Current-Sprint): Create new participant questions columns in beta DB - https://phabricator.wikimedia.org/T340694 (Daimona)
[16:27:20] (PS1) Ssingh: Zuul: [operations/debs/dnsdist] Add debian-glue CI [integration/config] - https://gerrit.wikimedia.org/r/934379
[20:06:36] NOOOOOOOOOOOOOOOOOOOOOOOOOOOO
[20:06:56] * hashar flees at the view of `Exec['apt-get update']`
[20:07:48] that Exec was always there :) taavi's just trying to make it fire at a better moment
[20:13:01] TIL that there are automatic tags based on the class where a resource is declared
[20:13:38] from /var/log/cloud-init-output.log: very early on: /usr/bin/cloud-init-per: 63: puppet: not found
[20:14:03] though maybe that is the run from when the image got created
[20:14:34] Continuous-Integration-Infrastructure, serviceops-collab: allow mwmaint/cumin hosts to connect to http on contint - https://phabricator.wikimedia.org/T340788 (Dzahn)
[20:14:43] yeah June 8th ..
[20:15:03] Continuous-Integration-Infrastructure, serviceops-collab: allow mwmaint/cumin hosts to connect to http on contint - https://phabricator.wikimedia.org/T340788 (Dzahn)
[20:18:46] https://phabricator.wikimedia.org/P49494 from June 8th
[20:18:54] +# -updates, previously known as 'volatile'
[20:18:54] +deb http://mirrors.wikimedia.org/debian/ bullseye-updates main contrib non-free
[20:18:54] +deb-src http://mirrors.wikimedia.org/debian/ bullseye-updates main contrib non-free
[20:19:02] which comes from
[20:19:05] Notice: /Stage[main]/Apt/File[/etc/apt/sources.list]/content:
[20:20:36] maintenance-disconnect-full-disks build 504389 integration-agent-docker-1026 (/: 29%, /srv: 20%, /var/lib/docker: 97%): OFFLINE due to disk space
[20:25:32] maintenance-disconnect-full-disks build 504390 integration-agent-docker-1026 (/: 29%, /srv: 15%, /var/lib/docker: 95%): RECOVERY disk space OK
[20:28:12] taavi: /etc/cloud/templates/sources.list.debian.tmpl has: `deb {{security}} {{codename}}/updates main` which is from January 15 2021
[20:28:30] and comes from the cloud-init package
[20:29:08] looks like the template hasn't been updated when they did the switch
[20:30:21] there is code in our puppet to fix that by overwriting the file. there was just some drift in when that overwrite happens that needs to be fixed so that the apt-get update will work.
[20:33:11] yeah I can see the fix
[20:33:49] then when the instance boots for the first time, cloud init kicks in and regenerates the /etc/apt/sources.list from the faulty template
[20:34:12] Cloud-init v. 20.4.1 running 'modules:config' at Thu, 29 Jun 2023 18:36:26 +0000. Up 16.85 seconds.
[20:34:13] Err:11 http://security.debian.org bullseye/updates Release
[20:50:56] I dumped my finding on https://gerrit.wikimedia.org/r/c/operations/puppet/+/934409 :]
[20:53:14] * hashar sleeps
[21:10:34] maintenance-disconnect-full-disks build 504399 integration-agent-docker-1026 (/: 29%, /srv: 19%, /var/lib/docker: 100%): OFFLINE due to disk space
[21:15:32] maintenance-disconnect-full-disks build 504400 integration-agent-docker-1026 (/: 29%, /srv: 19%, /var/lib/docker: 100%): still OFFLINE due to disk space
[21:20:29] maintenance-disconnect-full-disks build 504401 integration-agent-docker-1026 (/: 29%, /srv: 7%, /var/lib/docker: 9%): RECOVERY disk space OK
[21:23:13] (CR) Hashar: [C: +2] Zuul: [operations/debs/dnsdist] Add debian-glue CI [integration/config] - https://gerrit.wikimedia.org/r/934379 (owner: Ssingh)
[21:25:00] (Merged) jenkins-bot: Zuul: [operations/debs/dnsdist] Add debian-glue CI [integration/config] - https://gerrit.wikimedia.org/r/934379 (owner: Ssingh)
[21:25:41] (CR) Hashar: [C: +2] "Deployed" [integration/config] - https://gerrit.wikimedia.org/r/934379 (owner: Ssingh)
[21:48:52] Beta-Cluster-Infrastructure, Data-Engineering, Event-Platform Value Stream, MW-1.41-notes (1.41.0-wmf.12; 2023-06-06): cirrusSearchCheckerJob JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322491 (JArguello-WMF)
[21:50:15] Continuous-Integration-Config, Release-Engineering-Team (Radar), ChangeProp, Data-Engineering, and 5 others: Run EventBus tests in MediaWiki core CI - https://phabricator.wikimedia.org/T257583 (JArguello-WMF)
[21:51:53] Release-Engineering-Team (Radar), Data-Engineering, Event-Platform Value Stream: Stop using puppet + git pull for auto deployment of schema repos - https://phabricator.wikimedia.org/T274901 (JArguello-WMF)
[22:03:52] GitLab (CI & Job Runners), Performance Issue: Improve speed of Gitlab CI - https://phabricator.wikimedia.org/T311111 (JArguello-WMF)
[22:47:17] phabricator maintenance bot: Make it possible to add reviewers automatically to patches uploaded by Maintenance bot - https://phabricator.wikimedia.org/T340796 (Urbanecm)
[22:49:51] phabricator maintenance bot: Make it possible to add reviewers automatically to patches uploaded by Maintenance bot - https://phabricator.wikimedia.org/T340796 (Dzahn) It might be an option to resolve this by just editing https://www.mediawiki.org/wiki/Git/Reviewers and adding the right regex.
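On the cloud-init template issue discussed around 20:28: Debian 11 renamed its security suite from `<codename>/updates` to `<codename>-security`, which is why `bullseye/updates` fails with `Err:11`. The sketch below only illustrates that rename on a single sources.list line. It is not the puppet fix referenced in the log, the function name `fix_security_suite` is hypothetical, and it assumes simple lines without bracketed options such as `[arch=amd64]`.

```python
def fix_security_suite(line, codename="bullseye"):
    """Rewrite '<codename>/updates' to '<codename>-security' on one line.

    Assumes a plain 'deb URL SUITE COMPONENTS...' layout with no
    bracketed options; any other line is returned unchanged.
    """
    fields = line.split()
    is_source_line = len(fields) >= 3 and fields[0] in ("deb", "deb-src")
    if is_source_line and fields[2] == f"{codename}/updates":
        fields[2] = f"{codename}-security"
        return " ".join(fields)
    return line


# The failing entry from the log, and an entry that must stay untouched.
legacy = "deb http://security.debian.org bullseye/updates main"
updates = "deb http://mirrors.wikimedia.org/debian/ bullseye-updates main contrib non-free"
print(fix_security_suite(legacy))
print(fix_security_suite(updates))
```

Only the suite field is changed. Whether the mirror URL also needs adjusting depends on the setup and is left out of the sketch.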