[00:38:06] GitLab, serviceops-collab: ensure Gitlab logs end up in logstash - https://phabricator.wikimedia.org/T322261 (Dzahn) I made https://wikitech.wikimedia.org/wiki/Logstash#Getting_logs_from_misc_systems_into_logstash to help with this.
[05:51:32] !log shutdown deployment-ms-fe03 - T322554
[05:51:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[05:51:34] T322554: Create new deployment-ms-fe instance running Debian Bullseye - https://phabricator.wikimedia.org/T322554
[06:12:17] Beta-Cluster-Infrastructure, Cloud-VPS (Debian Stretch Deprecation): Cloud VPS "deployment-prep" project Stretch deprecation - https://phabricator.wikimedia.org/T306068 (Vgutierrez)
[06:12:45] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar): Migrate deployment-prep away from Debian Stretch to Buster/Bullseye - https://phabricator.wikimedia.org/T278641 (Vgutierrez)
[06:12:47] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), Patch-For-Review: Create new deployment-ms-fe instance running Debian Bullseye - https://phabricator.wikimedia.org/T322554 (Vgutierrez) Open→Resolved
[06:16:10] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar): Migrate deployment-prep away from Debian Stretch to Buster/Bullseye - https://phabricator.wikimedia.org/T278641 (Vgutierrez)
[06:30:53] !log downgrade to firejail 0.9.44.8-2 on deployment-imagescaler03
[06:30:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[06:35:19] taavi: I'm failing to find an open task for deployment-imagescaler03 issues, but it seems like T312722 was also impacting it
[06:35:20] T312722: Thumbor units failing / service general slowness - https://phabricator.wikimedia.org/T312722
[06:36:42] !log delete deployment-ms-fe03 - T322554
[06:36:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[06:36:44] T322554: Create new deployment-ms-fe instance running Debian Bullseye - https://phabricator.wikimedia.org/T322554
[06:37:31] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar): Migrate deployment-prep away from Debian Stretch to Buster/Bullseye - https://phabricator.wikimedia.org/T278641 (Vgutierrez)
[06:39:41] !log delete deployment-ms-be05 - T322231
[06:39:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[06:39:43] T322231: Create new deployment-ms-be instances running Debian Bullseye - https://phabricator.wikimedia.org/T322231
[07:42:21] Release-Engineering-Team (Radar), serviceops, serviceops-collab: give releng access to logs to debug buildkit-to-wmf-registry publishing - https://phabricator.wikimedia.org/T322579 (Joe) >>! In T322579#8377279, @dduvall wrote: > Thanks for filing this! > > This is what would be helpful for us in deb...
[07:42:49] Release-Engineering-Team (Radar), serviceops-collab, serviceops-radar: give releng access to logs to debug buildkit-to-wmf-registry publishing - https://phabricator.wikimedia.org/T322579 (Joe)
[07:59:42] (PS1) Robert Vogel: Add CI depenency to `skins/BlueSpiceDiscovery` [integration/config] - https://gerrit.wikimedia.org/r/854470
[08:37:28] Gerrit: org.apache.http.client.protocol.ResponseProcessCookies : Invalid cookie header for WMF-Last-Access - https://phabricator.wikimedia.org/T273605 (hashar)
[08:40:05] Gerrit: org.apache.http.client.protocol.ResponseProcessCookies : Invalid cookie header for WMF-Last-Access - https://phabricator.wikimedia.org/T273605 (hashar) Open→Resolved a:hashar After T262996, the log spam has vanished. The last entry was received at Nov 3, 2022 @ 17:03:05.892 UTC I have ed...
[08:41:51] Release-Engineering-Team (Radar), serviceops-collab, serviceops-radar: give releng access to logs to debug buildkit-to-wmf-registry publishing - https://phabricator.wikimedia.org/T322579 (JMeybohm) >>! In T322579#8378397, @Joe wrote: > [...] > Also: both the registry and nginx keep access logs, so I...
[08:44:16] (PS2) Hashar: Add CI depenency to `skins/BlueSpiceDiscovery` [integration/config] - https://gerrit.wikimedia.org/r/854470 (owner: Robert Vogel)
[08:44:48] (CR) Hashar: [C: +2] "I have amended the commit message to remove:" [integration/config] - https://gerrit.wikimedia.org/r/854470 (owner: Robert Vogel)
[08:45:12] (PS3) Hashar: Add CI dependency to `skins/BlueSpiceDiscovery` [integration/config] - https://gerrit.wikimedia.org/r/854470 (owner: Robert Vogel)
[08:45:21] (CR) Hashar: [C: +2] Add CI dependency to `skins/BlueSpiceDiscovery` [integration/config] - https://gerrit.wikimedia.org/r/854470 (owner: Robert Vogel)
[08:47:02] (Merged) jenkins-bot: Add CI dependency to `skins/BlueSpiceDiscovery` [integration/config] - https://gerrit.wikimedia.org/r/854470 (owner: Robert Vogel)
[09:36:17] I would like to retry a gitlab CI job (https://gitlab.wikimedia.org/repos/releng/blubber/-/pipelines/7235/failures) but I think I might not have the required permission (or I just can't find the button to push). Can someone with permissions take a look?
[09:36:58] hashar: maybe (sorry for the ping 😇)
[09:38:11] I've one of the docker registry nodes depooled for testing - it can't stay this way for long unfortunately
[09:39:52] jayme: I did it :)
[09:39:52] nice, thanks :-D
[09:40:04] no idea how the permission works though
[09:40:28] me neither...but you were listed as "Owner" of that repo so I thought it might be worth a shot :-)
[09:40:44] yup and I am an admin as well
[09:58:04] Release-Engineering-Team (Priority Backlog 📥), serviceops, Release Pipeline (Blubber): WMF container registry does not accept a manifest list (aka OCI manifest index, or "fat" manifest) - https://phabricator.wikimedia.org/T322453 (JMeybohm) I took a quick look and AIUI our registry does support `appl...
[10:03:45] hashar: I think it's my fault that scribunto 1.39 tests are now failing
[10:04:56] I backported a patch, which I think was fine for the actual patch, but the tests relied on another patch which wasn't backported
[10:14:07] bawolff: no worries, thanks for taking the time to file that task about @group Standalone tests not running
[10:14:35] Once https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/854075 is merged, it should be working again
[10:15:40] was that one needed in master?
[10:15:48] ah yeah
[10:15:53] it is a backport
[10:16:42] bawolff: I think you can now mark https://phabricator.wikimedia.org/T322506 solved :-] congratulations
[10:17:40] thanks
[10:18:06] Continuous-Integration-Config, MediaWiki-extensions-Scribunto, Patch-For-Review, ci-test-error: Scribunto does not run unit tests on REL branches - https://phabricator.wikimedia.org/T322506 (Bawolff) Open→Resolved
[10:24:31] \o/
[10:52:48] Continuous-Integration-Config, Wikidata, wdwb-tech, wmde-wikidata-tech: Run CI tests daily on master for ungated extensions - https://phabricator.wikimedia.org/T285049 (Manuel)
[11:07:52] Continuous-Integration-Config, Wikidata, wdwb-tech, Browser-Tests, User-awight: Migrate Wikibase selenium tests to Quibble+Apache, enable concurrency - https://phabricator.wikimedia.org/T291476 (Manuel)
[11:08:23] Continuous-Integration-Config, Release-Engineering-Team (Priority Backlog 📥), Wikibase (3rd party installations), Wikidata, and 2 others: Move some Wikibase selenium tests to a standalone job - https://phabricator.wikimedia.org/T287582 (Manuel)
[13:18:07] Gerrit, Release-Engineering-Team (Seen), Machine-Learning-Team, Patch-For-Review: gerrit: scoring/ores/editquality takes a long time to git gc - https://phabricator.wikimedia.org/T237807 (hashar) a:hashar
[14:20:53] hashar: is this an appropriate channel to ask about the mediawiki chart in operations/deployment-charts ?
[14:21:34] kindrobot: hi, it is a shared repo to capture the state of our k8s deployment using `helm`
[14:22:17] I'm trying to run the production mediawiki helm chart locally on minikube. I'm currently stuck insofar as when I go to http://CLUSTER_IP:8080, I simply get "File not found." I'm not sure where to get more detailed logs. Any tips?
[14:23:11] Here's my configuration: https://gitlab.wikimedia.org/kindrobot/aw-test-env
[14:23:59] For context, I'm trying to make a "production like" testing environment for abstract wikipedia (including all of its services) to run e2e tests against.
[14:24:13] oh well I have absolutely no idea how that works unfortunately :-\
[14:25:23] No worries. Am I asking in the right channel?
[14:26:06] I would assume the containers write to syslog and minikube captures them somehow
[14:26:24] you might try #wikimedia-serviceops, that is the channel for the SRE team in charge of Kubernetes
[14:26:43] Ah, OK. Thank you!
[14:27:22] there might be a channel for running mediawiki on kubernetes
[14:27:49] then serviceops should be a good one
[14:29:01] kindrobot: besides that channel, you can also try #wikimedia-mw-on-k8s which would have a subset of people from -releng and -serviceops
[14:33:55] Oh, that sounds perfect.
[15:21:00] (PS1) Majavah: Configure CI for kube-container-updater [integration/config] - https://gerrit.wikimedia.org/r/854535
[15:21:37] !log shutdown deployment-ms-be06 - T322231
[15:21:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:21:39] T322231: Create new deployment-ms-be instances running Debian Bullseye - https://phabricator.wikimedia.org/T322231
[15:26:15] !log delete deployment-ms-be06 - T322231
[15:26:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:27:29] Beta-Cluster-Infrastructure, serviceops, Beta-Cluster-reproducible: Thumbnails on beta cluster return 503 Service Unavailable - https://phabricator.wikimedia.org/T321654 (Vgutierrez)
[15:27:33] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar): Migrate deployment-prep away from Debian Stretch to Buster/Bullseye - https://phabricator.wikimedia.org/T278641 (Vgutierrez)
[15:27:37] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar): Migrate deployment-prep away from Debian Stretch to Buster/Bullseye - https://phabricator.wikimedia.org/T278641 (Vgutierrez)
[15:27:47] Beta-Cluster-Infrastructure, Patch-For-Review: Create new deployment-ms-be instances running Debian Bullseye - https://phabricator.wikimedia.org/T322231 (Vgutierrez) Stalled→Resolved a:Vgutierrez
[15:28:47] Beta-Cluster-Infrastructure, Cloud-VPS (Debian Stretch Deprecation): Cloud VPS "deployment-prep" project Stretch deprecation - https://phabricator.wikimedia.org/T306068 (Vgutierrez)
[15:32:09] vgutierrez: taavi: congratulations on migrating Swift on beta ;-]
[15:32:33] cheers
[15:34:25] hashar: I don't think I did anything, you need to thank vgutierrez and Emperor
[15:34:50] ah then thank you Emperor as well! :-]
[15:43:23] (PS1) Hashar: dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540
[15:44:39] (CR) Hashar: [C: +2] dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540 (owner: Hashar)
[15:45:14] (CR) CI reject: [V: -1] dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540 (owner: Hashar)
[15:46:07] (PS2) Hashar: dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540
[15:46:25] (CR) Hashar: [C: +2] dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540 (owner: Hashar)
[15:47:12] (PS1) Hashar: jjb: update jobs for phpmetrics 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854543
[15:48:13] (Merged) jenkins-bot: dockerfiles: update phpmetrics 2.7.3 > 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854540 (owner: Hashar)
[15:52:41] (CR) Hashar: [C: +2] "Jobs updated" [integration/config] - https://gerrit.wikimedia.org/r/854543 (owner: Hashar)
[15:54:54] (Merged) jenkins-bot: jjb: update jobs for phpmetrics 2.8.1 [integration/config] - https://gerrit.wikimedia.org/r/854543 (owner: Hashar)
[15:56:28] (CR) Hashar: [C: +2] "INFO:jenkins_jobs.builder:Number of jobs generated: 2" [integration/config] - https://gerrit.wikimedia.org/r/854535 (owner: Majavah)
[15:58:08] (Merged) jenkins-bot: Configure CI for kube-container-updater [integration/config] - https://gerrit.wikimedia.org/r/854535 (owner: Majavah)
[15:58:40] !log Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/854535
[15:58:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:58:59] (CR) Hashar: [C: +2] "Deployed" [integration/config] - https://gerrit.wikimedia.org/r/854535 (owner: Majavah)
[16:27:37] Phabricator, SRE: Access request to run bulk operations in phabricator for user lmata - https://phabricator.wikimedia.org/T322638 (RhinosF1)
[16:29:17] Phabricator, SRE: Access request to run bulk operations in phabricator for user lmata - https://phabricator.wikimedia.org/T322638 (lmata) This was a dialogue/modal out of phabricator’s confirmation page for bulk change. I can find one and take a screengrab if that’s helpful.
[16:30:37] Phabricator, SRE: Access request to run bulk operations in phabricator for user lmata - https://phabricator.wikimedia.org/T322638 (RhinosF1) >>! In T322638#8380193, @lmata wrote: > This was a dialogue/modal out of phabricator’s confirmation page for bulk change. I can find one and take a screengrab if th...
[16:55:13] GitLab (CI & Job Runners), Release-Engineering-Team (Priority Backlog 📥), serviceops-collab: Move cloud runner CI jobs to trusted runners - https://phabricator.wikimedia.org/T322344 (dduvall) >>! In T322344#8377390, @Dzahn wrote: >` > ... > ERROR: Condition 'DA418C88A3219F7B' not fulfilled for '/srv/...
[17:21:08] Release-Engineering-Team (Priority Backlog 📥), serviceops, Release Pipeline (Blubber): WMF container registry does not accept a manifest list (aka OCI manifest index, or "fat" manifest) - https://phabricator.wikimedia.org/T322453 (dduvall) Thanks for debugging this further, @JMeybohm and @hashar. In...
[17:28:13] Release-Engineering-Team (Priority Backlog 📥), serviceops, Release Pipeline (Blubber): Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push - https://phabricator.wikimedia.org/T322453 (dduvall)
[17:50:20] Continuous-Integration-Infrastructure, Release-Engineering-Team (Priority Backlog 📥), serviceops, Datacenter-Switchover: Create a runbook for switching CI master - https://phabricator.wikimedia.org/T256396 (jijiki)
[17:56:00] Release-Engineering-Team-TODO (2020-04 to 2020-06 (Q4)), Parsoid: Undeploy ParsoidBatchAPI from the Wikimedia cluster - https://phabricator.wikimedia.org/T242430 (Dzahn)
[17:58:53] Release-Engineering-Team (Radar), serviceops-collab, serviceops-radar: give releng access to logs to debug buildkit-to-wmf-registry publishing - https://phabricator.wikimedia.org/T322579 (Dzahn) I just added this small section to the Wikitech Logstash page how I got logs from "misc" systems into logs...
[18:03:52] Continuous-Integration-Config, MediaWiki-extensions-WikibaseRepository, Wikidata, wdwb-tech, and 2 others: Wikibase test failures on REL1_39 - https://phabricator.wikimedia.org/T322467 (Reedy) p:Triage→High
[18:06:09] Continuous-Integration-Config, MediaWiki-extensions-WikibaseRepository, Wikidata, wdwb-tech, and 2 others: Wikibase test failures on REL1_39 - https://phabricator.wikimedia.org/T322467 (Reedy) Should we just remove Wikibase (et al) from the Math dependancy tree on REL1_39? Or is there a trivial b...
[18:46:49] GitLab (Infrastructure), serviceops, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (jijiki) p:High→Unbreak!
[18:47:03] GitLab (Infrastructure), serviceops, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (jijiki) p:Unbreak!→High
[18:47:53] GitLab (Infrastructure), serviceops-collab, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (jijiki)
[19:40:13] Release-Engineering-Team (Radar), serviceops-collab, serviceops-radar: give releng access to logs to debug buildkit-to-wmf-registry publishing - https://phabricator.wikimedia.org/T322579 (dduvall) >>! In T322579#8378513, @JMeybohm wrote: >>>! In T322579#8378397, @Joe wrote: >> [...] >> Also: both the...
[19:42:46] Release-Engineering-Team (Priority Backlog 📥), serviceops, Release Pipeline (Blubber): Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push - https://phabricator.wikimedia.org/T322453 (dduvall) @JMeybohm can you provide the nginx access log entries from that ti...
[20:01:22] !log temporarily enabling buildkitd debug logging on gitlab-runner hosts (T322453)
[20:01:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:01:24] T322453: Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push - https://phabricator.wikimedia.org/T322453
[20:15:52] Release-Engineering-Team (Priority Backlog 📥), serviceops, Release Pipeline (Blubber): Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push - https://phabricator.wikimedia.org/T322453 (dduvall) I enabled debug logging for buildkitd on the gitlab-runner hosts an...
[20:17:01] !log puppet re-enabled on gitlab-runner hosts (T322453) normal log level will be restored on next puppet run
[20:17:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:17:03] T322453: Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push - https://phabricator.wikimedia.org/T322453
[20:38:57] GitLab, Release-Engineering-Team (Priority Backlog 📥), serviceops: Build and import new release of jwt-authorizer (1.1.0) - https://phabricator.wikimedia.org/T322691 (dduvall)
[20:42:56] vgutierrez, could T322667 be related to your deployment-prep swift upgrade
[20:42:57] T322667: Beta: Create an account pops up with an Internal Error - https://phabricator.wikimedia.org/T322667
[20:42:58] ?
[20:44:58] Hmm yeah
[20:45:19] If MediaWiki has the swift host in its configuration it needs to be updated for sure
[20:46:49] https://noc.wikimedia.org/conf/highlight.php?file=LabsServices.php
[20:47:12] deployment-ms-fe03 is coded
[20:47:36] s/03/04/ ?
[20:47:49] Almost
[20:48:15] deployment-ms-fe04.deployment-prep.eqiad1.wikimedia.cloud
[20:48:23] * Reedy fixe
[20:48:33] Dns suffix got updated as well
[20:48:41] Thx Reedy <3
[20:49:53] https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/854611
[20:52:52] nice
[21:04:29] that fixed it right?
[21:07:27] https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:CreateAccount&returnto=Main+Page
[21:07:34] the page isn't completely broken at least :P
[21:07:43] and we see a craptcha
[21:08:42] cheers