[05:53:53] 10Release-Engineering-Team (They Live 🕶️🧟), 10Patch-For-Review, 10Release, 10Train Deployments: 1.41.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T337525 (10Aklapper) [06:35:48] 10Project-Admins: Add puppet-core, puppet-infra tags - https://phabricator.wikimedia.org/T336153 (10Aklapper) @joanna_borun ping [06:36:40] 10Project-Admins: Create project tag for User-aborrero - https://phabricator.wikimedia.org/T337737 (10Aklapper) 05Open→03Resolved a:03Aklapper Requested public project #User-aborrero has been created: https://phabricator.wikimedia.org/project/view/6583/ (In case you need to edit the project or project wor... [07:11:25] hashar: hey, C.Scott commented on https://gerrit.wikimedia.org/r/c/integration/config/+/923562 [07:12:13] are you ok with trying to have parsoid as a dependency for the Flow? We are currently blocked on this [07:15:11] 10Release-Engineering-Team (They Live 🕶️🧟), 10Patch-For-Review, 10Release, 10Train Deployments: 1.41.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T337525 (10Legoktm) [07:22:19] duesen: what ever C.Scott says :] [07:22:22] he is the expert! [07:25:33] (03CR) 10Hashar: [C: 03+2] "+2 per C. Scott" [integration/config] - 10https://gerrit.wikimedia.org/r/923562 (owner: 10Daniel Kinzler) [07:25:41] hashar: ty! [07:25:58] duesen: sorry for the delay, but I really wanted C.Scott to review that one [07:26:41] 10Project-Admins: Create a #wmcs-superset tag - https://phabricator.wikimedia.org/T336940 (10Aklapper) 05Open→03Resolved a:03Aklapper Thanks. I have updated Data Engineering's https://phabricator.wikimedia.org/project/profile/5683/ accordingly and decided to differentiate by URL name (as that does not requ... 
[07:26:43] 10Project-Admins: Create project tag for wmcs-superset - https://phabricator.wikimedia.org/T337024 (10Aklapper) [07:27:15] (03Merged) 10jenkins-bot: Enable the parsoid extension when testing Flow [integration/config] - 10https://gerrit.wikimedia.org/r/923562 (owner: 10Daniel Kinzler) [07:27:23] 10Project-Admins: Create a #wmcs-superset tag - https://phabricator.wikimedia.org/T336940 (10Aklapper) [07:27:27] 10Project-Admins: Create project tag for wmcs-superset - https://phabricator.wikimedia.org/T337024 (10Aklapper) [07:29:08] 10Project-Admins: Create a #wmcs-superset tag - https://phabricator.wikimedia.org/T336940 (10Aklapper) [07:31:31] duesen: deployed! [07:31:54] 10Release-Engineering-Team (They Live 🕶️🧟), 10Patch-For-Review, 10Release, 10Train Deployments: 1.41.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T337525 (10Legoktm) [08:01:59] 10GitLab: Pressing rebase on MR gives "Something went wrong. Please try again." - https://phabricator.wikimedia.org/T337816 (10Legoktm) [08:04:23] 10Release-Engineering-Team (Seen), 10Data-Persistence (work done), 10Security-Team, 10iPoid-Service, and 2 others: Determine CI best practices for service which connects to MySQL - https://phabricator.wikimedia.org/T308789 (10kostajh) >>! In T308789#8890752, @thcipriani wrote: >>>! In T308789#8890456, @kos... [08:10:05] 10Project-Admins: Create project tag for google-chrome-user-agent-deprecation - https://phabricator.wikimedia.org/T337819 (10kostajh) [08:12:54] 10Project-Admins: Create project tag for User-aborrero - https://phabricator.wikimedia.org/T337737 (10aborrero) Thanks! 
[08:13:10] 10Project-Admins: Create project tag for client-hints - https://phabricator.wikimedia.org/T337821 (10kostajh) [08:17:54] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Seen), 10Zuul: Zuul jenkins-bot user holding open SSH sessions - https://phabricator.wikimedia.org/T309376 (10hashar) If my theory about a race condition between the source and reporter stands true, we might workaround it by using two diffe... [08:26:04] hashar: big <3 for still trying to figure out T309376 :D not to jinx it but it doesn't seem to have happened for a while.. [08:26:05] T309376: Zuul jenkins-bot user holding open SSH sessions - https://phabricator.wikimedia.org/T309376 [08:33:05] 10GitLab (Project Migration), 10Release-Engineering-Team (Priority Backlog 📥), 10API Platform, 10Anti-Harassment, and 19 others: Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (10jnuche) When you're ready to publish your docs/coverage with [[ https://gitlab.wikimedia.org/re... [08:36:25] FYI, I'm going to decommission doc1002.eqiad and doc2001.codfw in the next 30 minutes. These are the old buster machines which have been removed from the group of updated servers since last Wednesday's change (https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/5601f68b493bbc782f9682a0e97139696899d137). [08:36:27] TheresNoTime: yeah I am pretty convinced that is a race condition between two threads using the same instance/object [08:37:15] TheresNoTime: which would happen when Zuul tries to reestablish a connection to receive events while at the same time it tries to send a report. 
Both would establish an SSH connection but one of them is lost/replaced (while still being connected to the server)
[08:37:25] I have a theory to fix it :]
[08:37:32] eoghan: wonderful thank you :]
[08:38:06] Well I'm glad you know what's going on 😅
[08:38:58] eoghan: I had a quick look at the home directories on doc1002 and there is nothing in them so yeah I think it is all good to drop them
[08:40:20] TheresNoTime: it is something like: `client = open_ssh()` being run twice in parallel. That establishes two connections but since the `client` variable is on the same instance only one is stored and used later. The other connection stays established but is never used again
[08:40:38] TheresNoTime: but our Gerrit imposes a max of 4 connections per user, so that ghost connection counts against the quota
[08:40:59] the four connections are: 1 for receiving events, 1 for emitting reviews back to Gerrit changes, 2 for the zuul-merger
[08:41:24] Will it take much to "fix"?
[08:41:45] with the ghost connection established, the zuul-merger lacks a slot to do its merge which results in "This change failed to merge" because it got an SSH denial when doing a git fetch
[08:42:18] I have sent a puppet patch to add an extra connection with a different name to avoid the race condition
[08:42:32] then the Zuul layout would need to be adjusted to send reports over that new connection
[08:42:36] and I think it might fix it :]
[08:43:03] fun thing is the race condition has been around for years and years and I think it is still present in the modern upstream Zuul
[08:43:18] but they use different connection names instead of the legacy shared `gerrit` so are probably not affected
[08:43:26] anyway. That was a large brain dump. I think I finally got the fix
[08:49:55] ~~is the fix "uninstall Gerrit and move to GitLab"?~~
[08:50:35] Are we still going to be using Zuul/Jenkins when we move to GitLab by the way?
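[editor's note] The race described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Zuul's actual code: the `Connection` class, the `connect` method, and the `gerrit-report` connection name are all hypothetical, and a `threading.Barrier` is used purely to force the unlucky interleaving so the outcome is reproducible.

```python
import threading


class Connection:
    """Hypothetical stand-in for a named connection that keeps one SSH client."""

    def __init__(self, name):
        self.name = name
        self.client = None   # the only session the instance remembers
        self.opened = []     # every session that was ever established

    def connect(self, barrier):
        session = object()            # pretend this is an established SSH session
        self.opened.append(session)
        barrier.wait()                # both threads have opened before either stores
        self.client = session         # the later store overwrites the earlier one

    def ghosts(self):
        """Sessions that are still open but no longer referenced by the instance."""
        return [s for s in self.opened if s is not self.client]


def connect_in_parallel(conn_a, conn_b):
    """Run connect() on both connections at the same time."""
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(target=conn.connect, args=(barrier,))
        for conn in (conn_a, conn_b)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


# Legacy setup: event listener and reporter share one `gerrit` connection.
shared = Connection("gerrit")
connect_in_parallel(shared, shared)
shared_ghosts = len(shared.ghosts())

# Proposed setup: reports go over a second, separately named connection.
events = Connection("gerrit")
reports = Connection("gerrit-report")  # hypothetical name
connect_in_parallel(events, reports)
separate_ghosts = len(events.ghosts()) + len(reports.ghosts())

print(shared_ghosts, separate_ghosts)
```

In the shared case two sessions are opened but only one is kept, so one ghost session keeps occupying a slot of the per-user limit of 4. With two separately named connections each instance stores its own session and nothing is orphaned.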
[09:19:17] 10GitLab (Project Migration), 10Release-Engineering-Team (They Live 🕶️🧟), 10serviceops-collab: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (10jnuche) Thanks a lot for that @Dzahn! [09:20:19] 10GitLab (Project Migration), 10Release-Engineering-Team (They Live 🕶️🧟), 10serviceops-collab: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (10jnuche) 05Open→03Resolved @hashar Deleting the old docs is part of the steps of migrating a repo t... [09:20:53] 10GitLab (Project Migration), 10Release-Engineering-Team (Priority Backlog 📥), 10API Platform, 10Anti-Harassment, and 19 others: Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (10jnuche) [09:21:05] 10GitLab (Project Migration), 10Release-Engineering-Team (They Live 🕶️🧟), 10serviceops-collab: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (10jnuche) As part of teams migrating to use docpub, from the team side they should update any references... 
[09:35:18] (03PS1) 10Hashar: zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) [09:36:33] (03CR) 10CI reject: [V: 04-1] zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) (owner: 10Hashar) [09:39:11] those bots are never happy [09:39:59] (03PS2) 10Hashar: zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) [09:41:11] (03CR) 10CI reject: [V: 04-1] zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) (owner: 10Hashar) [10:10:26] (03PS3) 10Hashar: zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) [10:11:14] 10Continuous-Integration-Infrastructure, 10serviceops-collab, 10Patch-For-Review: Migrate doc hosts to Bullseye - https://phabricator.wikimedia.org/T319477 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by eoghan@cumin1001 for hosts: `doc1002.eqiad.wmnet` - doc1002.eqiad.wmnet (**PASS**)... 
[10:11:58] (03CR) 10CI reject: [V: 04-1] zuul: report using a different connection [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) (owner: 10Hashar) [10:12:25] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Seen), 10Zuul, 10Patch-For-Review: Zuul jenkins-bot user holding open SSH sessions - https://phabricator.wikimedia.org/T309376 (10hashar) a:03hashar [10:13:59] that is never ending [10:22:17] (03CR) 10Hashar: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) (owner: 10Hashar) [10:26:52] (03CR) 10Hashar: "That failed previously because I forgot to add the connection in the test suite (which is done via ConfigParser). I have also adjusted the" [integration/config] - 10https://gerrit.wikimedia.org/r/924900 (https://phabricator.wikimedia.org/T309376) (owner: 10Hashar) [10:27:00] I think that will do it [10:27:01] lunch & [10:42:32] 10GitLab (Integrations), 10Release-Engineering-Team (Priority Backlog 📥), 10translatewiki.net: Set up TranslateWiki.net exports to push (and merge) to Wikimedia GitLab - https://phabricator.wikimedia.org/T334419 (10Nikerabbit) I'm curious about the efficiency of the proposed manual review workflow. How many... [10:55:52] PROBLEM - Check systemd state on doc1003 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-doc1002.eqiad.wmnet.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [11:03:01] 10Project-Admins: Create project tag for google-chrome-user-agent-deprecation - https://phabricator.wikimedia.org/T337819 (10kostajh) Thinking about it some more, maybe a regular "Component" or "Tag" would work as well. There's been a fair amount of discussion in {T295073} and subtasks about how the Google Chro... 
[11:03:36] Project-Admins: Create project tag for google-chrome-user-agent-deprecation - https://phabricator.wikimedia.org/T337819 (kostajh)
[11:14:19] Continuous-Integration-Infrastructure, serviceops-collab, Patch-For-Review: Migrate doc hosts to Bullseye - https://phabricator.wikimedia.org/T319477 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by eoghan@cumin1001 for hosts: `doc2001.codfw.wmnet` - doc2001.codfw.wmnet (**PASS**)...
[11:37:37] RECOVERY - Check systemd state on doc1003 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:47:32] Continuous-Integration-Infrastructure, serviceops-collab, Patch-For-Review: Migrate doc hosts to Bullseye - https://phabricator.wikimedia.org/T319477 (eoghan) Open→Resolved The two remaining buster hosts were decommissioned and old rsync timer jobs have been removed.
[13:19:33] Continuous-Integration-Infrastructure, serviceops-collab, Patch-For-Review: Migrate doc hosts to Bullseye - https://phabricator.wikimedia.org/T319477 (hashar) Thank you @eoghan!
[13:28:33] hashar: it worked, thank you! the patch is passing now: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/923354
[13:36:28] duesen: yeah, we'd need some logic to have Parsoid enabled automagically for the api testing
[13:36:35] but meanwhile that hack works
[13:37:04] until mediawiki/services/parsoid.git@master receives some change not compatible with whatever version is used by mediawiki/core in which case the jobs will be broken
[13:38:09] hashar: we can easily enable the extension by default, using the code in the vendor directly. Composer already pulls it in, there is no reason to clone it.
[13:38:24] But really, it should simply not be in the same repo.
[13:38:32] But really really, it shouldn't exist...
[13:59:54] 10Release-Engineering-Team (Deployment Training Requests): Deployment training request for **YOUR USERNAME** - https://phabricator.wikimedia.org/T337861 (10isarantopoulos) [14:02:27] `git blame` yielding broken stuff from 2009 :-( [14:06:25] 10Release-Engineering-Team (Deployment Training Requests): Deployment training request for Ilias Sarantopoulos (isaranto) - https://phabricator.wikimedia.org/T337861 (10isarantopoulos) [14:12:02] 10Release-Engineering-Team (They Live 🕶️🧟), 10Patch-For-Review, 10Release, 10Train Deployments: 1.41.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T337525 (10Func) [14:30:39] maintenance-disconnect-full-disks build 495967 integration-agent-docker-1032 (/: 29%, /srv: 95%, /var/lib/docker: 45%): OFFLINE due to disk space [14:35:35] maintenance-disconnect-full-disks build 495968 integration-agent-docker-1032 (/: 29%, /srv: 14%, /var/lib/docker: 42%): RECOVERY disk space OK [14:47:44] TheresNoTime: We will not be using Jenkins with GitLab. [14:49:10] 10Release-Engineering-Team (Deployment Training Requests): Deployment training request for Ilias Sarantopoulos (isaranto) - https://phabricator.wikimedia.org/T337861 (10isarantopoulos) [15:02:14] 10GitLab (Integrations), 10Release-Engineering-Team (Priority Backlog 📥), 10translatewiki.net: Set up TranslateWiki.net exports to push (and merge) to Wikimedia GitLab - https://phabricator.wikimedia.org/T334419 (10dancy) @Nikerabbit Using https://gitlab.wikimedia.org/search?scope=merge_requests&search=Local... 
[15:37:21] 10Release-Engineering-Team (Priority Backlog 📥), 10Scap, 10Research, 10Patch-For-Review: article-recommender: clean up git-fat removal - https://phabricator.wikimedia.org/T317212 (10demon) I need someone to review the linked patch :) [16:00:20] 10Release-Engineering-Team (They Live 🕶️🧟), 10Patch-For-Review, 10Release, 10Train Deployments: 1.41.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T330216 (10demon) 05Open→03Resolved [16:09:03] 10Release-Engineering-Team (They Live 🕶️🧟), 10Scap: `scap backport` dependency check for cherry-picked mediawiki commits - https://phabricator.wikimedia.org/T334984 (10jeena) 05Open→03In progress a:03jeena [16:11:58] 10Release-Engineering-Team, 10SRE, 10Security-Team, 10Wikimedia-GitHub, and 3 others: Add github.com/wikimedia as an SCM for Semgrep Cloud - https://phabricator.wikimedia.org/T337561 (10Kappakayala) @LSobanski will connect on this to gather more information and discuss on next steps. [16:25:22] 10GitLab (Integrations), 10Release-Engineering-Team (Priority Backlog 📥), 10translatewiki.net: Set up TranslateWiki.net exports to push (and merge) to Wikimedia GitLab - https://phabricator.wikimedia.org/T334419 (10Nikerabbit) I would prefer to review the output from the CI which is usually only a few lines... [16:30:19] dduvall, ^demon: significant issues with at least editing unrelated to the train - https://phabricator.wikimedia.org/T337700 [16:30:49] RhinosF1: ah, thank you [16:31:40] dduvall: if you can prod people with knowledge, that would be useful. Discussion in -tech but at least 1 wiki is uneditable for people using its own language. 
[16:35:03] GitLab (Account Approval), Release-Engineering-Team, Abstract Wikipedia team: Requesting GitLab account activation for Nik.xyz.in - https://phabricator.wikimedia.org/T337534 (brennen) Open→Resolved a:brennen
[16:35:15] GitLab (Account Approval), Release-Engineering-Team, Abstract Wikipedia team: Requesting GitLab account activation for Allan Jeremy - https://phabricator.wikimedia.org/T337770 (brennen) Open→Resolved a:brennen
[16:37:05] dduvall: I'm minded to suggest holding the train even though it's unrelated. The impact seems fairly major as users are seeing errors even logging in.
[16:37:08] GitLab (Account Approval), Release-Engineering-Team, Abstract Wikipedia team: Requesting GitLab account activation for Nik.xyz.in - https://phabricator.wikimedia.org/T337534 (brennen)
[16:46:18] Phabricator: Remove "Prototype" suffix from "Reports" menu item on Project pages - https://phabricator.wikimedia.org/T337876 (Aklapper) p:Triage→Low
[16:48:26] Phabricator, Patch-For-Review: Remove "Prototype" suffix from "Reports" menu item on Project pages - https://phabricator.wikimedia.org/T337876 (CodeReviewBot) aklapper opened https://gitlab.wikimedia.org/repos/phabricator/deployment/-/merge_requests/14 Remove "Prototype" suffix from "Reports" menu item...
[16:48:38] Phabricator, Patch-For-Review: Remove "Prototype" suffix from "Reports" menu item on Project pages - https://phabricator.wikimedia.org/T337876 (CodeReviewBot)
[16:59:33] GitLab (Integrations), Release-Engineering-Team (Priority Backlog 📥), translatewiki.net: Set up TranslateWiki.net exports to push (and merge) to Wikimedia GitLab - https://phabricator.wikimedia.org/T334419 (dancy) That seems doable.
[17:15:53] 10Release-Engineering-Team, 10SRE, 10Security-Team, 10Wikimedia-GitHub, and 3 others: Add github.com/wikimedia as an SCM for Semgrep Cloud - https://phabricator.wikimedia.org/T337561 (10Dzahn) Maybe the admins listed here would be able to help: https://wikitech.wikimedia.org/wiki/Techblog.wikimedia.org#Git... [17:37:02] 10Release-Engineering-Team, 10SRE, 10Security-Team, 10Wikimedia-GitHub, and 3 others: Add github.com/wikimedia as an SCM for Semgrep Cloud - https://phabricator.wikimedia.org/T337561 (10Aklapper) > because there doesn't seem to be a list of who has access in Github For the records, https://github.com/orgs... [17:45:30] 10Project-Admins: Create project MediaWiki-extensions-WarnNotRecentlyUpdated for extension WarnNotRecentlyUpdated - https://phabricator.wikimedia.org/T337879 (10Dereckson) [17:52:45] 10Project-Admins: Create project MediaWiki-extensions-WarnNotRecentlyUpdated for extension WarnNotRecentlyUpdated - https://phabricator.wikimedia.org/T337879 (10Aklapper) 05Open→03Resolved a:03Aklapper Requested public project #WarnNotRecentlyUpdated has been created: https://phabricator.wikimedia.org/proj... [18:06:39] 10Project-Admins: Create project MediaWiki-extensions-WarnNotRecentlyUpdated for extension WarnNotRecentlyUpdated - https://phabricator.wikimedia.org/T337879 (10Dereckson) Thanks :) [18:28:18] RhinosF1: i'm not sure holding the train will help solve it, as it seems to be affecting all deployed versions, but i will see about prodding folks [18:28:59] looks to be getting some traction now [18:30:04] dduvall: I'm more concerned about something else affecting debugging by changing the situation [18:30:16] We have no idea why it suddenly started happening yesterday [18:30:28] And so far no one has any clue to that [18:31:06] i see. ok, do you happen to know when the first instance of the error was? 
[18:31:45] dduvall: around 9:20am yesterday
[18:32:11] https://logstash.wikimedia.org/goto/0f7536ab4b6e4ed1e843c3f929ff8cb2
[18:32:16] https://phabricator.wikimedia.org/T337700#8893339
[18:32:25] i saw it prior to rolling group0 and it was already filtered out on the new-errors dashboard, so i took it to be a known error
[18:32:44] is it a dupe of https://phabricator.wikimedia.org/T321234 ?
[18:33:02] that's the task mentioned on the new-errors dashboard filter
[18:33:52] dduvall: it's the same error but something's dramatically changed in the rate
[18:34:01] That was affecting the odd page
[18:34:58] Numerous site messages were impacted which is causing a mess and taking quite a few critical features (login/Preferences/editing) down
[18:37:18] RhinosF1: this is the backport deployment that occurred closest to that time https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/924361/1/includes/Util.php
[18:37:47] i lack the knowledge to say whether that in any way could have interacted with the bug, however :)
[18:38:23] dduvall: most definitely not
[18:39:35] well, i can try rolling back wmf.11 completely if you think that would give any clues to this issue
[18:39:55] I doubt it
[18:54:13] RhinosF1: so the first instance i see of the error is from May 29, 2023 @ 09:22:45. not yesterday but Monday. given that, i don't think this week's train has anything to do with it
[18:55:18] sadly i don't see much in the SAL around that time either
[18:55:54] dduvall: yesterday wasn't Monday, was it. Stupid bank holiday.
[18:56:17] :D
[19:52:26] brennen: did that idea you had to make a dashboard of all the extensions and where they are active go anywhere yet? I'm reading some stuff from Birgit and I think she and others would like to have that data to base some other work on (workflow analysis).
[19:53:01] * bd808 must be stuck on something if he's looking for more side projects to poke at...
[19:53:51] bd808: notes here - https://gitlab.wikimedia.org/brennen/extloc [19:54:26] i created the tool and as i recall there's a little code somewhere, have it on my "when there's an afternoon here..." [19:54:40] one sec while i login and see if i actually got anywhere with it [19:57:51] brennen: cool. I think I saw this stub README at some point but had forgotten where it was. :) [19:58:27] looks like i got as far as a versions.json that has the data, i'll push that tiny bit of hackery to the repo [20:03:41] re-reading the stuff that led me to ask and it looks like the rough plans would want the data by October, so no rush at all but maybe an interesting hack day project for sure. [20:04:52] * bd808 goes back to hunting for why a python tool is behaving differently in two deploys of what is supposed to be the same code [20:10:25] Good luck! [20:30:33] I feel like I'm looking for the "enable non-deterministic behavior" switch buried in the code. It really is maddening at this point. I guess I'm going to try a 3rd deploy next to see if I get a 3rd behavior or one of the prior two... [21:15:31] 10GitLab (Project Migration), 10Release-Engineering-Team (They Live 🕶️🧟), 10serviceops-collab: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (10bd808) @jnuche I looked quickly for a [[Docpub]] page on mw.o and wikitech, but found neither. I'm not... [22:11:25] kostajh: apologies if you're getting gitlab permission spam, figuring some stuff out with the mediawiki group [22:15:43] 10GitLab (Project Migration), 10Release-Engineering-Team (They Live 🕶️🧟), 10User-brennen: Define a permissions model for the /repos/mediawiki/ namespace on GitLab - https://phabricator.wikimedia.org/T336807 (10brennen) > I've altered the permissions model on wiki here to accurately reflect my experience of w... 
[22:16:50] PROBLEM - Check systemd state on doc1003 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-doc2002.codfw.wmnet.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [22:19:54] RECOVERY - Check systemd state on doc1003 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [22:25:36] bd808: https://gitlab.wikimedia.org/brennen/extloc/-/commit/ff62ffbe1867ac81382c2b0f0fa4eb502f457cff [22:31:29] added you to https://toolsadmin.wikimedia.org/tools/id/extloc in case you want to glance at it in context. anyway, yeah, i will pick this back up... one of these days. [22:34:43] that infinite free time must be just around the corner, right? [23:29:49] usr/bin/git clone https://gerrit.wikimedia.org/r/repos/releng/release /srv/mediawiki/release-tools' returned 128 [23:29:58] ^ broken on releases machines [23:30:06] because it was moved to gitlab.. right [23:30:36] heh [23:30:52] yea..releases* servers need adjustment if release-tools are moved [23:31:21] I dunno if the command always clones new... or whether it also does git pull/similar if it exists... [23:31:44] As the gerrit repo has a couple of commits that mean it doesn't share a completely common history with the version now on gitlab [23:32:20] that depends if it's set to "ensure latest" or only "ensure present" [23:32:33] yeah [23:32:56] not sure I understand why it doesnt share history [23:33:12] repo import should import history [23:33:35] have both been committed to after migration? [23:33:41] yeah [23:33:41] https://gerrit.wikimedia.org/g/mediawiki/tools/release [23:33:52] ... damn! 
why
[23:33:58] at least https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/tools/release/+/cc4d33fd4d8ef365f426c1eb3cf833706074b08c exists only on gerrit
[23:34:11] oh man
[23:34:49] oh, and I think master becomes main
[23:34:57] I just wanted to merge a harmless unrelated change on those servers
[23:35:28] presumably if you update the clone repo (in puppet?).. delete the current repos on disk and let puppet check out a new one... that should work?
[23:36:05] yea, it's just that famous version of "should"
[23:36:23] you can move the old checkout rather than delete if you want :D
[23:37:46] seems a bad idea to start doing that at 5pm without coordination with others working on the hosts
[23:38:32] either way I have to finish first what I was _actually_ doing
[23:40:11] Looks like it was done in one place... https://github.com/wikimedia/operations-puppet/commit/59e869968b45060e14e627519b29c45d0eaf3b6d
[23:40:32] Ah, hang on
[23:40:36] Yeah, that patch actually broke it
[23:40:51] it didn't set "source => 'gitlab'," in modules/releases/manifests/init.pp
[23:42:06] wait..what.. on Jan 18
[23:50:21] GitLab (Project Migration), Release-Engineering-Team (GitLab V: Event Horizon 🌄), Patch-For-Review, User-brennen: Migrate mediawiki/tools/release/ to GitLab - https://phabricator.wikimedia.org/T290260 (Dzahn) noticed on releases machines when doing unrelated deploy: ` Error: '/usr/bin/git clon...
[23:53:30] so.. who is actually deploying release-tools
[23:53:39] because puppet is not told to "ensure latest"
[23:53:50] and nobody has manually pulled
[23:54:24] wonders how changes ever got onto releases* hosts before then
[23:55:17] (PS3) Reedy: Support Follow-Up footer [integration/commit-message-validator] - https://gerrit.wikimedia.org/r/925034
[23:55:41] wait, none of my comments make sense now.
the patch does have a "latest" in it
[23:55:50] which is bad though, and something we are not supposed to do
[23:56:15] I +2ed it but I don't get why I did :)
[23:56:27] (PS4) Reedy: Support Follow-Up footer [integration/commit-message-validator] - https://gerrit.wikimedia.org/r/925034
[23:57:49] (PS3) Reedy: GerritMessageValidator: Alphasort CORRECT_FOOTERS [integration/commit-message-validator] - https://gerrit.wikimedia.org/r/925036
[23:57:57] (PS5) Reedy: Support Follow-Up footer [integration/commit-message-validator] - https://gerrit.wikimedia.org/r/925034
[23:58:03] (PS6) Reedy: GerritMessageValidator: Support Follow-Up footer [integration/commit-message-validator] - https://gerrit.wikimedia.org/r/925034
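[editor's note] The "ensure latest" versus "ensure present" distinction from the release-tools discussion above can be modelled roughly as follows. This is a simplified sketch of the general idea only, not Puppet's or the `releases` module's real logic; the function name `plan_git_action` and its parameters are hypothetical.

```python
def plan_git_action(ensure, checkout_exists, local_rev=None, remote_rev=None):
    """Decide what a config-management run would do with a git checkout.

    ensure: "present" (clone once, then leave it alone) or
            "latest"  (also follow the remote branch on every run).
    """
    if ensure not in ("present", "latest"):
        raise ValueError(f"unknown ensure value: {ensure!r}")
    if not checkout_exists:
        return "clone"      # both modes need an initial clone
    if ensure == "latest" and local_rev != remote_rev:
        return "update"     # only "latest" moves an existing checkout
    return "noop"


# An existing checkout that is behind its remote:
behind_present = plan_git_action("present", True, "abc123", "def456")
behind_latest = plan_git_action("latest", True, "abc123", "def456")
# A host with no checkout yet:
fresh_present = plan_git_action("present", False)

print(behind_present, behind_latest, fresh_present)
```

Under this model an "ensure present" checkout never changes unless someone pulls by hand, while "ensure latest" keeps contacting the remote, which is presumably where a repository move surfaces as a failure.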