[00:06:06] editing via gerrit ui is a pretty terrible user experience
[00:09:54] s/editing via//g
[00:15:43] bd808: Apologies. When you edit a patch via gerrit, if you don't save the changes and hit cancel, if you try to edit the file again gerrit will restore your previous edit
[00:16:13] so to avoid this you have to save it without publishing and in the main window hit 'delete edit'
[00:16:22] it's not very intuitive but now I know :)
[00:16:55] disagree with bawolff though; in my case gerrit editing has been perfectly pleasant
[00:17:02] including creating patches from scratch
[01:00:34] maintenance-disconnect-full-disks build 474212 integration-agent-docker-1038 (/: 29%, /srv: 98%, /var/lib/docker: 60%): OFFLINE due to disk space
[01:10:30] maintenance-disconnect-full-disks build 474214 integration-agent-docker-1038 (/: 29%, /srv: 11%, /var/lib/docker: 56%): RECOVERY disk space OK
[10:28:52] GitLab (Administration, Settings & Policy): cloudsub123@gmail.com - https://phabricator.wikimedia.org/T332382 (Cloudduck1233)
[10:36:59] !log disabled phab user 'Cloudduck1233'
[10:37:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:38:33] GitLab (Administration, Settings & Policy): cloudsub123@gmail.com - https://phabricator.wikimedia.org/T332382 (Dzahn) Open→Invalid
[11:16:17] Continuous-Integration-Config, Quibble: Quibble: Update nodeJS to at least v14.18.0 - https://phabricator.wikimedia.org/T332387 (kostajh)
[11:16:54] Continuous-Integration-Config, Release-Engineering-Team, Quibble: Quibble: Update nodeJS to at least v14.18.0 - https://phabricator.wikimedia.org/T332387 (kostajh)
[11:17:09] Continuous-Integration-Config, Release-Engineering-Team, Quibble: Quibble: Update nodeJS to at least v14.18.0 - https://phabricator.wikimedia.org/T332387 (kostajh) From the other task: >>! In T332386#8704881, @hashar wrote: > NodeJS 16 breaks the `fibers` node module which is a dependency of WebDriv...
[11:20:34] Continuous-Integration-Config, Release-Engineering-Team (Seen), User-zeljkofilipin: Upgrade all CI jobs for WMF-deployed projects from Node 12 to Node 14 LTS - https://phabricator.wikimedia.org/T267890 (hashar) As far as CI is concerned, the node12 images have been removed in February 2023 by 9253595...
[12:05:36] jnuche: I had to revert https://gerrit.wikimedia.org/r/q/f50c75e6fd2154411353236fcaf0cb9dff6867bd because there's a bug in docker-gc
[12:05:41] Reason for revert: docker-gc errors out ValueError: time data '2023-03-17T11:5800Z' does not match format '%Y-%m-%dT%H:%M:%S%z'
[12:08:16] Release-Engineering-Team (GitLab V: Event Horizon 🌄), Patch-For-Review: Run docker-gc on deploy servers - https://phabricator.wikimedia.org/T329678 (Clement_Goubert) ` cgoubert@deploy2002:~$ sudo docker images | wc -l 320 cgoubert@deploy2002:~$ sudo systemctl start docker-image-prune-old.service cgouber...
[12:14:41] claime: thanks, I'll look into that
[12:16:17] ack thx :)
[13:19:09] hashar: do you have any idea why https://gerrit.wikimedia.org/r/c/mediawiki/core/+/866386 would cause both selenium jobs to time out after 60 minutes?!
[13:19:37] I've run the patch locally in a fresh container using both node v14 and v16, it worked fine in both.
[13:20:11] no idea!
[13:20:18] but it can be filed under #ci-test-error
[13:20:27] will do, thanks
[13:20:29] or look at the captured video maybe?
[13:20:40] just looking
[13:20:40] https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php74-docker/26821/artifact/log/
[13:20:55] but there's nothing, looks like the tests didn't even start
[13:22:17] maybe wdio can be run in debug/verbose mode to give more details?
[13:22:28] good idea, will do
[13:22:55] I've created T332393, I'll be adding details
[13:22:56] T332393: wmf-quibble-selenium-php74-docker and mediawiki-quibble-selenium-vendor-mysql-php74-docker fail after updating wdio to v8 - https://phabricator.wikimedia.org/T332393
[13:41:50] claime: I fixed the bug in docker-gc and tested it on one of the gitlab runners, I think it's good to go now
[13:41:58] please take another look when you get the chance: https://gerrit.wikimedia.org/r/c/operations/puppet/+/900651
[13:43:12] jnuche: taking a look
[13:43:34] Let's try it then
[13:44:56] 👍
[13:50:46] Continuous-Integration-Config, Browser-Tests, Quality-and-Test-Engineering-Team (QTE) (Test engineering): Drop Selenium tests from gate-and-submit-wmf - https://phabricator.wikimedia.org/T307180 (hashar) Open→Declined Originally we had the Selenium tests solely running on a daily basis and th...
[13:50:49] Deployed, looks good (at least the systemd service doesn't fail)
[13:51:04] Mar 17 13:48:30 gitlab-runner1004 docker[513102]: Status: Downloaded newer image for docker-registry.wikimedia.org/repos/releng/docker-gc/docker-gc:1.1.2
[13:51:06] Mar 17 13:48:31 gitlab-runner1004 docker[513102]: [2023-03-17 13:48:31,303] Running df()
[13:51:08] Mar 17 13:48:31 gitlab-runner1004 docker[513102]: [2023-03-17 13:48:31,359] df() returned 31 images, 1 volumes
[13:51:10] Mar 17 13:48:31 gitlab-runner1004 docker[513102]: [2023-03-17 13:48:31,360] volumes usage is 2.68 KB, below high water mark of 135 GB.
[13:51:12] Mar 17 13:48:31 gitlab-runner1004 docker[513102]: [2023-03-17 13:48:31,418] images usage is 5.06 GB, below high water mark of 135 GB.
[13:51:14] Mar 17 13:48:31 gitlab-runner1004 systemd[1]: docker-gc.service: Succeeded.
[13:51:33] great, thank you!
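For readers following the docker-gc revert (the `ValueError` quoted at 12:05:41): the failure can be reproduced with Python's standard `datetime.strptime`. The snippet below is only a minimal sketch and not the actual docker-gc fix, which is not shown in this log. `parse_image_timestamp` is a hypothetical helper, and the well-formed timestamp `2023-03-17T11:58:00Z` is an assumed example for comparison. The idea is simply to treat an unparseable timestamp as "unknown" instead of letting the exception abort the whole run.

```python
from datetime import datetime
from typing import Optional

# Format string quoted in the ValueError from the log.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_image_timestamp(raw: str) -> Optional[datetime]:
    """Hypothetical helper: parse a timestamp, or return None if malformed."""
    try:
        return datetime.strptime(raw, TIMESTAMP_FORMAT)
    except ValueError:
        # A malformed value such as '2023-03-17T11:5800Z' lands here
        # rather than crashing the caller.
        return None


if __name__ == "__main__":
    # First value is an assumed well-formed example; second is from the log.
    for raw in ("2023-03-17T11:58:00Z", "2023-03-17T11:5800Z"):
        print(raw, "->", parse_image_timestamp(raw))
```

Whether skipping such entries is acceptable depends on what the garbage collector does with the timestamp afterwards; a caller would need to decide how to handle the `None` case.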
[13:52:37] Continuous-Integration-Config, Wikidata, wdwb-tech: Investigate moving some Wikibase tests to @group Standalone - https://phabricator.wikimedia.org/T285950 (hashar) Related is {T287582} which is to run the Wikibase Selenium tests to a dedicated job in order to stop triggering them for every other rep...
[13:54:54] yw
[13:58:52] Continuous-Integration-Infrastructure, Release-Engineering-Team (Yak Shaving 🐃🪒), SRE: Have linters/tests results show up as comments in files on gerrit - https://phabricator.wikimedia.org/T209149 (hashar) Gerrit is going to remove the robot comments entirely: https://groups.google.com/g/repo-discuss...
[13:59:14] Release-Engineering-Team (GitLab V: Event Horizon 🌄), Patch-For-Review: Run docker-gc on deploy servers - https://phabricator.wikimedia.org/T329678 (jnuche) In the end I opted to automate the cleanup of images using a systemd unit calling `docker image prune`. The outcome of the task then is: * Sys...
[13:59:37] Release-Engineering-Team (GitLab V: Event Horizon 🌄), Patch-For-Review: Run docker-gc on deploy servers - https://phabricator.wikimedia.org/T329678 (jnuche) Open→Resolved
[14:00:42] oh joy my PHP is broken :(
[14:04:00] somehow `readfile($_SERVER["SCRIPT_FILENAME"]);` does not work anymore :)
[15:04:55] hashar: no luck, increasing wdio logging to 11 didn't provide more data :/ https://gerrit.wikimedia.org/r/c/mediawiki/core/+/866386/15..16
[15:05:57] looks like the tests start and never finish 🤷
[15:10:59] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (Seen), Zuul, Patch-For-Review: Display Zuul status of jobs for a change on Gerrit UI - https://phabricator.wikimedia.org/T214068 (hashar) (this one is slowly progressing here and there, I am revisiting the code when time...
[17:25:07] Phabricator: Request to add JMcLeod_WMF to acl*phabricator - https://phabricator.wikimedia.org/T332418 (JMcLeod_WMF)
[18:16:02] hi releng folks, where is the mapping of which LDAP groups are allowed to act against the Jenkins API configured? specifically asking about the case of puppet-compiler being triggered by a script, and also, this URL
[18:16:04] requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://integration.wikimedia.org/ci/api/python?tree=jobs%5Bname%2Ccolor%2Curl%5D
[18:34:19] cdanis: there's some documentation here that may be relevant: https://wikitech.wikimedia.org/wiki/SRE/LDAP/Groups
[18:34:54] but tbh i have no idea. i'm afk for a bit but can probably dig a bit when i'm back.
[19:02:02] it's configured inside jenkins itself IIRC
[19:03:32] https://integration.wikimedia.org/ci/manage/configureSecurity/ is what I was thinking of, but it doesn't say anything about the API, hm
[19:10:28] legoktm: brennen: false alarm, turns out the user in question didn't have their local environment set up right
[19:10:31] thanks for looking :)
[19:10:44] :))
[19:19:39] oh good. :)
[22:03:41] Beta-Cluster-Infrastructure, Discovery-Search (Current work): [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455 (Etonkovidova)
[22:21:49] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (Seen), Zuul, Patch-For-Review: Display Zuul status of jobs for a change on Gerrit UI - https://phabricator.wikimedia.org/T214068 (hashar) Progress from last time, querying `https://integration.wikimedia.org/zuul/status/c...
[22:28:47] Beta-Cluster-Infrastructure, Discovery-Search (Current work): [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455 (Tgr) Apparently the SSL cert for some beta ElasticSearch server is invalid.
[22:29:42] Beta-Cluster-Infrastructure, Discovery-Search (Current work): [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455 (EBernhardson) @bking has dealt with these issues before, might have ideas
[22:46:41] Beta-Cluster-Infrastructure, Discovery-Search (Current work): [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455 (Tgr) ` tgr@deployment-mwmaint02:~$ curl 'https://deployment-elastic09.deployment-prep.eqiad1.wikimedia.cloud:9243' curl: (60) SSL cert...
[22:58:58] !log deployment-prep: Reboot deployment-elastic09 (T332455)
[22:59:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:59:01] T332455: [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455
[23:00:46] !log deployment-prep: Reboot deployment-elastic10, deployment-elastic11 (T332455)
[23:00:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[23:05:06] Beta-Cluster-Infrastructure, Discovery-Search (Current work): [beta cluster] Search - "An error has occurred while searching" - https://phabricator.wikimedia.org/T332455 (Urbanecm_WMF) Open→Resolved a:Urbanecm_WMF I rebooted it and newer cert was picked up. Seems to be working now.