[00:01:32] (PS1) Jeena Huneidi: scap backport: deploy to mwdebug [tools/scap] - https://gerrit.wikimedia.org/r/803370 (https://phabricator.wikimedia.org/T308476)
[07:29:58] Release-Engineering-Team, Gerrit-Privilege-Requests: Request for Gerrit Managers permissions for karapayneWMDE - https://phabricator.wikimedia.org/T302262 (WMDE-leszek) Thanks @thcipriani, much appreciated!
[09:20:51] (CR) Jaime Nuche: [C: +1] "LGTM. Just one question, what error(s) are you getting exactly on restart without this change?" [tools/train-dev] - https://gerrit.wikimedia.org/r/803352 (owner: Ahmon Dancy)
[09:31:09] (CR) Aklapper: "This is ready to merge" [integration/docroot] - https://gerrit.wikimedia.org/r/791111 (https://phabricator.wikimedia.org/T302809) (owner: Aklapper)
[09:39:22] Phabricator, Release-Engineering-Team: Months of history missing from https://phabricator.wikimedia.org/source/phabricator-translations.git - https://phabricator.wikimedia.org/T309910 (hashar) Open→Resolved a:Nikerabbit Looks like the repository has received a notification today rPHTR5f8cadd4...
[10:17:12] Continuous-Integration-Infrastructure, Release-Engineering-Team: Relocate Jenkins agents root directory to /srv/jenkins - https://phabricator.wikimedia.org/T309698 (hashar) Done for most hosts. The remaining ones are: | deployment-deploy03 | /srv/jenkins/home/jenkins-deploy | pcc-worker1001.puppet-diffs...
[10:18:38] Good morning, I'm seeking some help with a Gerrit permissions issue please, when convenient. I'm trying to get +2 access on https://gerrit.wikimedia.org/r/admin/repos/eventgate-wikimedia
[10:21:34] I thought I could just use the UI to suggest the new permissions (submit for Analytics group) and then get this approved: https://gerrit.wikimedia.org/r/c/eventgate-wikimedia/+/802950
[10:22:00] That was merged, but I still don't have +2 rights. What should I do? Many thanks.
[10:31:52] Continuous-Integration-Infrastructure, Release-Engineering-Team: Relocate Jenkins agents root directory to /srv/jenkins - https://phabricator.wikimedia.org/T309698 (hashar) a:hashar
[10:32:02] btullis: I am checking, but I'd say that this is because Analytics is not the owner of the repo, nor has +/-2 rights over there
[10:32:17] you can only submit once a change has +2
[10:32:38] so submit without +2 is likely not going to work
[10:34:01] Ah, right. Thanks. I had assumed that requesting the Submit permission *was* requesting +2 rights. What permission should I have requested?
[10:34:02] I am no longer a Gerrit Manager so I cannot adjust the group permissions, but I'd say adding Analytics as the Owner of the group will fix the issue
[10:34:22] Owner -> Allow -> Analytics
[10:34:33] I'd leave Gerrit Managers too just in case
[10:34:52] OK, thanks. I'll ask a Gerrit Manager or Gerrit Administrator to do that. 👍
[10:35:07] and probably Analytics can be removed from the Submit group as it's a dupe function of the Owners
[10:35:20] NP :)
[10:39:46] It might be possible to switch Rights Inherit From "All-Projects" to "Analytics" too if wished or convenient. An admin would know better.
[10:40:47] Perfect, thanks hauskatze.
[10:52:59] Release-Engineering-Team, Scap, SRE, serviceops: Deploy Scap version 4.8.2 - https://phabricator.wikimedia.org/T309116 (JMeybohm) Open→Resolved 4.8.2 deployed fleet wide
[10:58:59] Release-Engineering-Team (Radar), Scap, Patch-For-Review, User-jijiki: Update Scap to perform rolling restart for all MW deploy - https://phabricator.wikimedia.org/T266055 (JMeybohm) >>! In T266055#7967750, @Joe wrote: > Current status is: > - Each deployment will restart php-fpm > - api and apps...
[11:23:33] (PS1) Kosta Harlan: Remove hardcoded references to suite.xml and phpunit.php [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875)
[11:31:45] btullis: if you file a task tagged with Gerrit-Permission-Requests and Releng, someone will work out what needs to be changed and action it
[11:32:17] p858snake: OK, will do. Thanks.
[11:48:04] Project beta-scap-sync-world build #54458: FAILURE in 3 min 11 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/54458/
[11:58:19] Project beta-scap-sync-world build #54459: STILL FAILING in 3 min 22 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/54459/
[12:12:35] Project beta-scap-sync-world build #54460: STILL FAILING in 7 min 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/54460/
[12:16:09] Project beta-scap-sync-world build #54461: STILL FAILING in 1 min 5 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/54461/
[12:19:13] Yippee, build fixed!
[12:19:13] Project beta-scap-sync-world build #54462: FIXED in 1 min 24 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/54462/
[12:19:19] hmm
[12:19:26] 13:58:16 11:58:16 sudo -u mwdeploy -n -- /usr/bin/scap cdb-rebuild (ran as mwdeploy@deployment-parsoid12.deployment-prep.eqiad1.wikimedia.cloud) returned [255]: Connection closed by 172.16.4.125 port 22
[12:19:38] I have logged on deployment-parsoid12.deployment-prep.eqiad1.wikimedia.cloud just fine
[12:19:41] Continuous-Integration-Config, Toolhub, Documentation, User-Slst2020: Set up publish-docs pipeline job - https://phabricator.wikimedia.org/T308632 (Slst2020) Open→In progress
[12:19:43] Continuous-Integration-Config, Toolhub, Documentation, User-Slst2020: Publish docs on doc.wikimedia.org - https://phabricator.wikimedia.org/T305914 (Slst2020)
[12:20:20] and on /var/log/auth.log there is:
[12:20:23] Jun 7 12:19:09 deployment-parsoid12 sshd[24964]: pam_unix(sshd:session): session opened for user mwdeploy by (uid=0)
[12:20:23] Jun 7 12:19:09 deployment-parsoid12 sshd[24964]: pam_systemd(sshd:session): Failed to create session: Maximum number of sessions (8192) reached, refusing further sessions.
[12:20:23] Jun 7 12:19:09 deployment-parsoid12 sshd[24964]: User child is on pid 24988
[12:26:04] Jun 7 11:41:33 deployment-parsoid12 diamond[16966]: Took too long to run! Killed!
[12:31:48] and oom-killer got invoked: [Tue Jun 7 11:21:20 2022] Killed process 22133 (php-fpm7.2) total-vm:2254756kB, anon-rss:814080kB, file-rss:0kB, shmem-rss:37932kB
[12:34:06] yeah it had a lot of memory usage starting at 11:36 until ~ 12:15 https://grafana-labs.wikimedia.org/d/000000059/cloud-vps-project-board?orgId=1&var-project=deployment-prep&var-server=deployment-parsoid12&from=now-3h&to=now
[12:51:02] filed as https://phabricator.wikimedia.org/T310069
[13:14:08] Continuous-Integration-Config, Toolhub, Documentation, User-Slst2020: Set up publish-docs pipeline job - https://phabricator.wikimedia.org/T308632 (Slst2020)
[13:15:33] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing): Relocate Jenkins agents root directory to /srv/jenkins - https://phabricator.wikimedia.org/T309698 (hashar)
[13:16:36] Release-Engineering-Team, NFDI: Do mediawiki docker images from wmf branch exist - https://phabricator.wikimedia.org/T309458 (hashar) a:dancy
[13:16:43] Release-Engineering-Team (Doing), NFDI: Do mediawiki docker images from wmf branch exist - https://phabricator.wikimedia.org/T309458 (hashar)
[13:17:11] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), SRE: deployment-deploy03 ran out of memory twice while trying to perform a WikiLambda db migration - https://phabricator.wikimedia.org/T309413 (hashar)
[13:19:09] (CR) Jforrester: "This patch represents about three hours of careful deployment work; I won't have time to do this this week, sorry.
If this is a blocker to" [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875) (owner: Kosta Harlan)
[13:19:20] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), SRE: deployment-deploy03 ran out of memory twice while trying to perform a WikiLambda db migration - https://phabricator.wikimedia.org/T309413 (hashar) deployment-parsoid12 went out of memory this morning which I have filed as T310069. It...
[13:51:14] (CR) Kosta Harlan: Remove hardcoded references to suite.xml and phpunit.php (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875) (owner: Kosta Harlan)
[13:58:20] Beta-Cluster-Infrastructure, Wikidata, Wikidata-Termbox, wdwb-tech, and 4 others: Move Termbox SSR for Beta Wikidata into deployment-prep project - https://phabricator.wikimedia.org/T304328 (ItamarWMDE) @Lucas_Werkmeister_WMDE Thank you for all the patches, but I am currently assigned to this tas...
[14:01:47] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), SRE: deployment-deploy03 ran out of memory twice while trying to perform a WikiLambda db migration - https://phabricator.wikimedia.org/T309413 (Zabe) >>! In T309413#7985917, @hashar wrote: > deployment-parsoid12 went out of memory this mo...
[14:07:52] Gerrit: The UI "rebase" feature is working incorrectly since the last gerrit update for chained patches - https://phabricator.wikimedia.org/T310077 (Daimona)
[14:17:22] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), SRE: deployment-deploy03 ran out of memory twice while trying to perform a WikiLambda db migration - https://phabricator.wikimedia.org/T309413 (TheresNoTime) >>! In T309413#7986067, @Zabe wrote: > Could you (or someone else) add me to tha...
[14:58:03] (PS2) Kosta Harlan: Remove hardcoded references to suite.xml and phpunit.php [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875)
[14:59:36] (PS3) Kosta Harlan: dockerfiles: Remove hardcoded references to suite.xml and phpunit.php [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875)
[15:01:36] (PS1) Kosta Harlan: jjb: Use composer phpunit:entrypoint [integration/config] - https://gerrit.wikimedia.org/r/803525 (https://phabricator.wikimedia.org/T90875)
[15:02:23] (CR) Kosta Harlan: dockerfiles: Remove hardcoded references to suite.xml and phpunit.php (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/803487 (https://phabricator.wikimedia.org/T90875) (owner: Kosta Harlan)
[15:03:54] (CR) Ahmon Dancy: Mount tmpfs at /run for db,deploy,www containers (1 comment) [tools/train-dev] - https://gerrit.wikimedia.org/r/803352 (owner: Ahmon Dancy)
[15:05:05] Release-Engineering-Team, Scap, SRE, serviceops: Deploy Scap version 4.8.2 - https://phabricator.wikimedia.org/T309116 (dancy) Thanks @JMeybohm !
[15:24:53] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (TheresNoTime)
[15:34:58] Continuous-Integration-Infrastructure, Release-Engineering-Team: Create list of performance-related improvements for Jenkins jobs - https://phabricator.wikimedia.org/T423 (Krinkle)
[15:37:28] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (dancy) [[ https://gerrit.wikimedia.org/r/c/mediawiki/core/+/803371 | 803371 ]] is a mediawiki/core change but you said you ran `git-review -d 803371` in a clone of the mediawiki/extensions/CheckUser repo. T...
[15:38:07] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (Zabe) Open→Invalid
[15:38:09] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (Zabe) I guess you mixed up the patch number, 803371 is a core patch.
[15:42:47] (PS1) Jaime Nuche: startup: warn user if not running from a Python virtual environment [tools/scap] - https://gerrit.wikimedia.org/r/803556 (https://phabricator.wikimedia.org/T303559)
[15:49:30] (CR) Ahmon Dancy: "I assume this commit should wait until we've fully stopped using the deb packaged scap." [tools/scap] - https://gerrit.wikimedia.org/r/803556 (https://phabricator.wikimedia.org/T303559) (owner: Jaime Nuche)
[15:51:18] I'd like to merge this patch against rpc/RunJobs.php: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/793837
[15:51:21] Any objections?
[15:58:29] None from me. :-)
[15:59:37] (PS2) Jaime Nuche: startup: warn user if not running from a Python virtual environment [tools/scap] - https://gerrit.wikimedia.org/r/803556 (https://phabricator.wikimedia.org/T303559)
[16:00:19] (PS3) Jaime Nuche: startup: warn user if not running from a Python virtual environment [tools/scap] - https://gerrit.wikimedia.org/r/803556 (https://phabricator.wikimedia.org/T303559)
[16:00:53] (CR) Jaime Nuche: startup: warn user if not running from a Python virtual environment (2 comments) [tools/scap] - https://gerrit.wikimedia.org/r/803556 (https://phabricator.wikimedia.org/T303559) (owner: Jaime Nuche)
[16:01:12] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (TheresNoTime) {meme, src="person-facepalming", below=oh.}
[16:04:22] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (dancy) I've made this mistake myself and been confused by it many times.
[16:09:10] so is now a Puppet request window?
[16:09:57] how does this work? it says I should be available on IRC, but who will be going over gerrit changes? jbond? rzl?
[16:11:05] You'll want to be on #wikimedia-operations. Over there rzl said he's stuck in a meeting for 30 minutes so things will have to wait until that's finished, or you can probe jbond for help.
[16:11:49] oh, I misread the channel name, thanks!
[16:12:03] oh, it is +r
[16:13:11] Mitar: sorry was working on a different CR let me look now
[16:15:42] I am now getting my nick to register to get to that channel
[16:15:59] jbond: thanks
[16:18:28] Mitar: how critical is this patch, i would like to get apergos to take a look at it first
[16:18:37] as im not too familiar with the service
[16:18:42] or impact
[16:19:02] we did but comment that you want our +1 (Hannah/me) and we'll do that tomorrow
[16:20:31] it is not too critical, I just want to have things moving so it does not get stuck, Lucas_WMDE tested it (https://phabricator.wikimedia.org/T301104#7985175) and it seems it works fine, but we will not really know until it runs and does the whole dump
[16:21:02] apergos: ack sounds good to me, Mitar happy to merge this through once apergos has given the +1 (dont need to wait for the next window)
[16:21:25] perfect!
[16:21:34] (this is my work I did during the hackathon)
[16:27:45] Thanks Mitar looks like we should be able to get this merged tomorrow
[16:52:36] Release-Engineering-Team (Doing), Scap: The `logstash_checker.py` script should accept a float for `--delay` - https://phabricator.wikimedia.org/T310089 (dduvall)
[16:54:12] Release-Engineering-Team, Scap: The `logstash_checker.py` script should accept a float for `--delay` - https://phabricator.wikimedia.org/T310089 (dduvall) p:Triage→High a:dduvall
[16:54:49] Release-Engineering-Team (Doing), Scap: The `logstash_checker.py` script should accept a float for `--delay` - https://phabricator.wikimedia.org/T310089 (dduvall)
[16:55:19] Release-Engineering-Team (Doing), Scap: The `logstash_checker.py` script should accept a float for `--delay` - https://phabricator.wikimedia.org/T310089 (dduvall)
[16:55:23] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (dduvall)
[17:31:24] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (dduvall)
[17:31:26] Release-Engineering-Team (Doing), Scap: The `logstash_checker.py` script should accept a float for `--delay` - https://phabricator.wikimedia.org/T310089 (dduvall) Open→Resolved https://gerrit.wikimedia.org/r/c/operations/puppet/+/803583 was merged and `scap stage-train` now successfully executes...
[17:36:24] Beta-Cluster-Infrastructure, service-runner: Provide a means of shipping logs from Docker-run services in Beta to logstash - https://phabricator.wikimedia.org/T309319 (ori) Open→Resolved a:ori
[18:05:09] Beta-Cluster-Infrastructure, Abstract Wikipedia team, Patch-For-Review: Create a Beta Cluster version of Wikifunctions.org - https://phabricator.wikimedia.org/T284162 (ori) (Some) documentation at https://wikitech.wikimedia.org/wiki/Wikifunctions/Beta_Cluster.
[18:42:57] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (dduvall)
[18:53:02] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (Zabe)
[19:29:19] Gerrit: fatal: couldn't find remote ref on git-review -d - https://phabricator.wikimedia.org/T310080 (hashar) We had T38170: {T38170}. I once wrote a patch https://review.opendev.org/#/c/222166/ to `git-review` in order to make it a bit more helpful when using the wrong change. The patch was straightforwar...
[19:59:13] Continuous-Integration-Config, WMF-General-or-Unknown, Patch-For-Review: Tidy up references to REL1_36 now it is EOL - https://phabricator.wikimedia.org/T309864 (Jdforrester-WMF) Open→Resolved a:Jdforrester-WMF
[19:59:24] Beta-Cluster-Infrastructure, Inuka-Team (Kanban), Wikistories (MVP): Call to undefined method ForeignDBFile::getExtendedMetadata() - https://phabricator.wikimedia.org/T309668 (SBisson) p:Triage→Medium a:SBisson
[20:18:17] (CR) Jeena Huneidi: [C: +1] "LGTM" [tools/train-dev] - https://gerrit.wikimedia.org/r/803352 (owner: Ahmon Dancy)
[20:22:14] (PS1) Ahmon Dancy: updated help text for --max-layers [tools/release] - https://gerrit.wikimedia.org/r/803599
[20:22:41] (CR) Ahmon Dancy: [C: +2] updated help text for --max-layers [tools/release] - https://gerrit.wikimedia.org/r/803599 (owner: Ahmon Dancy)
[20:23:26] (Merged) jenkins-bot: updated help text for --max-layers [tools/release] - https://gerrit.wikimedia.org/r/803599 (owner: Ahmon Dancy)
[20:23:35] (PS1) Ahmon Dancy: build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600
[20:24:31] (PS2) Ahmon Dancy: build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600
[20:25:14] (PS3) Ahmon Dancy: build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600
[20:25:47] (PS4) Ahmon Dancy: build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600
[20:26:30] (CR) Ahmon Dancy: [C: +2] build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600 (owner: Ahmon Dancy)
[20:27:13] (Merged) jenkins-bot: build-mv-image: Restore new-train-version logic [tools/release] - https://gerrit.wikimedia.org/r/803600 (owner: Ahmon Dancy)
[21:10:18] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (matmarex)
[21:28:56] Release-Engineering-Team, Data-Persistence (Consultation), Security-API-Service, Security-Team, and 3 others: Determine CI best practices for service which connects to MySQL - https://phabricator.wikimedia.org/T308789 (sbassett)
[21:29:51] Release-Engineering-Team, Data-Persistence (Consultation), Security-API-Service, Security-Team, and 3 others: Determine CI best practices for service which connects to MySQL - https://phabricator.wikimedia.org/T308789 (sbassett) The [[ https://gerrit.wikimedia.org/r/plugins/gitiles/integration/co...
[21:31:32] Hey releng folks - was hoping to get a bit of guidance on the best practices for the above bug ^
[21:32:27] We basically need a node12-ish image that can run mysql/mariadb, a node setup script and then npm test in ci.
[22:01:50] sbassett: Hi, I was looking at your abandoned patch and it looks like the tests passed so I'm a bit confused about why it's not a good solution?
[22:10:23] Project-Admins: Create project tag for DSE-Kubernetes-Cluster (DSE-K8S) - https://phabricator.wikimedia.org/T309095 (JArguello-WMF) Hi @Aklapper, I tried to create the board for this project, but a legend appears that says, "Unable to Create Workboard The workboard for this project has not been created yet,...