[00:01:36] (DatasourceError) firing: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[00:11:36] (DatasourceError) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[02:17:04] (PS13) Krinkle: secnpm: initial commit [fresh] - https://gerrit.wikimedia.org/r/675346
[02:17:21] (CR) Krinkle: "As first version, the install is opt-in." [fresh] - https://gerrit.wikimedia.org/r/675346 (owner: Krinkle)
[03:21:17] Phabricator, LDAP-Access-Requests, SRE, SRE-Access-Requests: Migrate dev user accounts for bvibber - https://phabricator.wikimedia.org/T358044#9562208 (Bugreporter) We need to move the task subscribation and assignments. After to prevent confusion we may consider disabling the brion account.
[03:24:18] Phabricator, LDAP-Access-Requests, SRE, SRE-Access-Requests: Migrate dev user accounts for bvibber - https://phabricator.wikimedia.org/T358044#9562210 (Bugreporter) Alternatively we can keep the bvibber Phab account as the personal account and rename brion to something like bvibber-wmf so much le...
[03:46:53] GitLab (Pipeline Services Migration🐤), Shellbox: Migrate shellbox to GitLab - https://phabricator.wikimedia.org/T344745#9562214 (tstarling) It's unclear to me how that would work. Is there any documentation about moving libraries to GitLab? At [[https://www.mediawiki.org/wiki/Manual:Developing_libraries...
[05:05:50] (CR) DannyS712: secnpm: initial commit (2 comments) [fresh] - https://gerrit.wikimedia.org/r/675346 (owner: Krinkle)
[07:47:03] Gerrit: Users with a different name in the cn field compared to uid field cannot use http auth - https://phabricator.wikimedia.org/T225308#9562401 (hashar)
[09:47:02] Phabricator, LDAP-Access-Requests, SRE, SRE-Access-Requests, Patch-For-Review: Migrate dev user accounts for bvibber - https://phabricator.wikimedia.org/T358044#9562598 (MoritzMuehlenhoff) @bvibber Renaming the user name for SSH access will leave files in the old home inacessible (we don't ne...
[11:25:32] hi releng, I can't create branches in https://gerrit.wikimedia.org/r/admin/repos/operations/debs/prometheus-rsyslog-exporter - gerrit says: Error 503 (Service Unavailable): Lock failure
[11:25:51] git itself gives "failed to lock" - is there something I'm missing?
[11:26:03] permission wise I should be fine AIUI
[12:04:03] jayme: you broke it ! :)
[12:05:23] Gerrit errors end up in elasticsearch / Kibana and can be found via https://logstash.wikimedia.org/app/dashboards#/view/AW1f-0k0ZKA7RpirlnKV
[12:05:37] com.google.gerrit.git.LockFailureException: Failed to create refs/heads/upstream
[12:07:06] and before that:
[12:07:06] com.google.gerrit.exceptions.StorageException: com.google.gerrit.server.update.UpdateException: com.google.gerrit.git.LockFailureException: Update aborted with one or more lock failures: PackedBatchRefUpdate[
[12:07:07] CREATE: 0000000000000000000000000000000000000000 3784041f4bdd1553f58d37104ade09ec00968205 refs/heads/upstream (LOCK_FAILURE)
[12:07:07] ]
[12:21:16] OH OF COURSE
[12:21:19] that one is fun
[12:21:26] that is a bug in jgit
[12:21:37] or cgit
[12:21:38] or whatever
[12:21:48] and it is missing the actual cause
[12:22:02] we need to file a task
[12:23:02] if I try with cgit: fatal: cannot lock ref 'refs/heads/upstream': 'refs/heads/upstream/0.0.0+git20201008' exists; cannot create 'refs/heads/upstream'
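
The cgit error quoted above describes a naming conflict: a ref cannot be created when its name is a path prefix of an existing ref (or the other way round), because the same path would have to be both a file and a directory. A minimal sketch of that check in Python follows. `conflicting_refs` is a hypothetical helper written for illustration, not a function of git, jgit or Gerrit, and `refs/heads/master` is a made-up entry added only to show a non-conflicting ref.

```python
def conflicting_refs(existing_refs, new_ref):
    """Return the existing refs that block the creation of new_ref.

    Illustrative assumption: two refs conflict when one name, followed
    by "/", is a prefix of the other, since that path would need to be
    a file and a directory at the same time.
    """
    conflicts = []
    for ref in existing_refs:
        if ref.startswith(new_ref + "/") or new_ref.startswith(ref + "/"):
            conflicts.append(ref)
    return conflicts


# Ref names taken from the log, plus one hypothetical unrelated branch.
existing = [
    "refs/heads/master",  # hypothetical, not from the log
    "refs/heads/upstream/0.0.0+git20201008",
]

print(conflicting_refs(existing, "refs/heads/upstream"))
```

With the names from the log, the only blocker reported is `refs/heads/upstream/0.0.0+git20201008`, which matches the ref named in the error message.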
[12:24:13] which is even misleading
[12:24:17] the cause is the reflog
[12:24:36] so yeah you can't overlap :/
[12:25:22] cause there is already a `logs/refs/heads/upstream` directory present (which holds a `0.0.0+git20201008` file)
[12:25:35] and thus the filesystem is unable to create a file `logs/refs/heads/upstream`
[12:25:52] cause it already exists but as a directory
[12:26:16] my bet is the branch should probably not exist in the first place since there is already a tag by that name
[12:26:25] my guess is the branch got created by mistake
[12:26:33] (via a git push)
[12:26:35] so
[12:27:05] jayme: delete the `upstream/0.0.0+git20201008` branch
[12:27:16] that will let git create the `upstream` branch
[12:27:51] and there is a tag by the same name `upstream/0.0.0+git20201008` which also points to d721280f81fca98a1c3960a1864b9a454f7699cc
[12:27:53] solved
[12:31:50] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120#9563252 (Clement_Goubert)
[12:37:31] Release-Engineering-Team, Temporary accounts, Trust and Safety Product Team, Patch-For-Review: Create job that runs PHPUnit unit and integration tests with temp account feature flag is enabled - https://phabricator.wikimedia.org/T355879#9563270 (Dreamy_Jazz)
[12:42:58] (PS1) Jaime Nuche: jjb: simplify Zuul deploy jobs [integration/config] - https://gerrit.wikimedia.org/r/1005499 (https://phabricator.wikimedia.org/T342346)
[12:50:59] (PS13) Jaime Nuche: generate Python2 wheels for bullseye targets [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/1002466 (https://phabricator.wikimedia.org/T342346)
[12:51:01] (PS2) Jaime Nuche: jjb: simplify Zuul deploy jobs [integration/config] - https://gerrit.wikimedia.org/r/1005499 (https://phabricator.wikimedia.org/T342346)
[12:51:09] (CR) Jaime Nuche: "recheck" [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/1002466 (https://phabricator.wikimedia.org/T342346) (owner: Jaime Nuche)
[13:02:07] oh, wow - thanks hashar
[13:20:49] (CR) Hashar: [C: +2] jjb: simplify Zuul deploy jobs [integration/config] - https://gerrit.wikimedia.org/r/1005499 (https://phabricator.wikimedia.org/T342346) (owner: Jaime Nuche)
[13:21:59] (Merged) jenkins-bot: jjb: simplify Zuul deploy jobs [integration/config] - https://gerrit.wikimedia.org/r/1005499 (https://phabricator.wikimedia.org/T342346) (owner: Jaime Nuche)
[13:30:56] jayme: to be fair it is git bug / off by one
[13:33:28] (PS14) Jaime Nuche: generate Python2 wheels for bullseye targets [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/1002466 (https://phabricator.wikimedia.org/T342346)
[13:39:25] (CR) Hashar: [C: +2] generate Python2 wheels for bullseye targets [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/1002466 (https://phabricator.wikimedia.org/T342346) (owner: Jaime Nuche)
[13:39:59] (Merged) jenkins-bot: generate Python2 wheels for bullseye targets [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/1002466 (https://phabricator.wikimedia.org/T342346) (owner: Jaime Nuche)
[14:23:34] Release-Engineering-Team (Radar), collaboration-services: upgrade contint servers to bullseye - https://phabricator.wikimedia.org/T334517#9563552 (jnuche)
[14:25:13] Continuous-Integration-Infrastructure, Release-Engineering-Team (Priority Backlog 📥), Zuul, Patch-For-Review: Refresh integration/zuul/deploy to work on Debian Bullseye - https://phabricator.wikimedia.org/T342346#9563550 (jnuche) In progress→Resolved
[14:51:23] Release-Engineering-Team (Seen), Content-Transform-Team, MW-on-K8s, SRE, and 2 others: Create parsoid mediawiki deployment - https://phabricator.wikimedia.org/T357392#9563708 (akosiaris) Open→In progress p:Triage→Medium
[14:51:31] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, serviceops: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9563710 (akosiaris)
[15:27:21] Release-Engineering-Team (Now this 🫠), Release, Train Deployments: 1.42.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T354437#9563791 (hoo)
[15:57:05] Nice debugging hashar!
[16:00:25] GitLab (Pipeline Services Migration🐤), Shellbox: Migrate shellbox to GitLab - https://phabricator.wikimedia.org/T344745#9563918 (Jdforrester-WMF) >>! In T344745#9562214, @tstarling wrote: > It's unclear to me how that would work. Is there any documentation about moving libraries to GitLab? At [[https://w...
[16:00:31] (DatasourceError) firing: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[16:05:31] (DatasourceError) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[16:05:46] (DatasourceError) firing: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[16:10:46] (DatasourceError) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[16:22:41] Release-Engineering-Team (Now this 🫠), Scap, MW-on-K8s, SRE, serviceops: Find a way to address canary releases directly - https://phabricator.wikimedia.org/T358117#9564123 (Clement_Goubert)
[16:23:05] dancy: merged and ran puppet on deploy2002, the new version of the script is live there
[16:23:17] Thanks!
[16:24:07] Release-Engineering-Team (Now this 🫠), Scap, MW-on-K8s, SRE, serviceops: Find a way to address canary releases directly - https://phabricator.wikimedia.org/T358117#9564137 (Clement_Goubert) p:Triage→Medium
[16:29:29] Release-Engineering-Team (Now this 🫠), Scap, MW-on-K8s, SRE, serviceops: Find a way to address canary releases directly - https://phabricator.wikimedia.org/T358117#9564163 (Clement_Goubert)
[16:42:25] Release-Engineering-Team, collaboration-services: No email received for latest train presync - https://phabricator.wikimedia.org/T358009#9564204 (thcipriani) Ah, that's why I missed it in the archives. Systemd says "PASSED" but it passed with failure? Anyway, I guess we got the email. Anything else wort...
[16:46:28] Release-Engineering-Team, collaboration-services: No email received for latest train presync - https://phabricator.wikimedia.org/T358009#9564223 (dancy) Though the message shows up in the archives, it didn't arrive in our mailboxes.
[16:47:18] Phabricator, Release-Engineering-Team: Remove policy selections from production error creation form - https://phabricator.wikimedia.org/T357210#9564243 (thcipriani) Open→Resolved a:thcipriani Thanks for the cleanup @taavi I went ahead and re-hid those fields. Looks like we opened them up a...
[16:55:25] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120#9564307 (Clement_Goubert)
[18:01:37] dancy: I kind of enjoy debugging "low" level stuff such as git files :D
[18:01:47] as to how it can be fixed ... I have no clue :)
[18:02:09] Beta-Cluster-Infrastructure, MediaWiki-Platform-Team: Set up some beta cluster wikis with different registrable domain - https://phabricator.wikimedia.org/T355281#9564701 (pmiazga) I spoke with @Urbanecm_WMF about creating new wikis and I learned that we want to keep the beta cluster wikis similar to pro...
[18:08:14] (PS1) Arlolra: Make Parsoid depend on ParserFunctions explicitly [integration/config] - https://gerrit.wikimedia.org/r/1005564
[18:09:39] (CR) Subramanya Sastry: [C: +1] Make Parsoid depend on ParserFunctions explicitly [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:10:24] (CR) Hashar: [C: +2] Make Parsoid depend on ParserFunctions explicitly [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:10:58] (CR) Hashar: [C: +2] "Note: we should give you the permissions to deploy those changes cause you are literally the expert when it comes to how Parsoid is tested" [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:11:43] (Merged) jenkins-bot: Make Parsoid depend on ParserFunctions explicitly [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:12:19] !log Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/1005564
[18:12:20] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:12:26] (CR) Hashar: [C: +2] "Deployed!" [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:14:24] (CR) Arlolra: "Is there a process to initiate that?" [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:17:38] (CR) Hashar: [C: +2] "Granting permissions is straightforward, well maybe we would need a new user group to tune the privileges granted. Then I guess it is a bi" [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:21:14] Beta-Cluster-Infrastructure, MediaWiki-Platform-Team: Set up some beta cluster wikis with different registrable domain - https://phabricator.wikimedia.org/T355281#9564744 (ArielGlenn) I'd prefer the beta be kept in the name, making it clear that these are wikis on the deployment cluster. I'm not sure of...
[18:40:59] (CR) Arlolra: "There will another patch here to remove the Poem dependency in a few weeks time. Maybe we can use that as an opportunity to learn." [integration/config] - https://gerrit.wikimedia.org/r/1005564 (owner: Arlolra)
[18:53:11] hmm, looks like Phabricator is interpreting some gerrit change numbers as git commit ids: https://phabricator.wikimedia.org/T357759#9564838
[19:00:23] dancy: letting people create their own project in Gerrit was once filed as https://phabricator.wikimedia.org/T38937
[19:00:58] and jhathaway recently commented about a plugin which might make it possible to delegate the repo creation https://gerrit.googlesource.com/plugins/project-group-structure/+/refs/heads/master/src/main/resources/Documentation/about.md
[19:06:51] I have reopened the task
[19:06:54] I am off for dinner
[19:07:31] Gerrit, Upstream: Allow for group admins to create project repositories inside a parent project - https://phabricator.wikimedia.org/T38937#9564883 (hashar) Declined→Open >>! In T38937#9509998, @jhathaway wrote: > Has anyone ever looked at the [[ https://gerrit.googlesource.com/plugins/project-gro...
[19:10:15] taavi: oh that's fun. I assume there is a 7 character minimum threshold on attempting to lookup a hex string as a Diffusion commit id and Gerrit ids recently hit that magic size.
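
The guess in the last message can be sketched in a few lines of Python. Everything here is an assumption for illustration: the 7-character threshold is only the speculation voiced above, and `looks_like_commit_id` is a hypothetical helper, not Phabricator's actual matching rule. The point is that decimal digits are also valid hexadecimal digits, so a 7-digit Gerrit change number passes a "hex string of at least 7 characters" test while a 6-digit one does not.

```python
import re

# Assumed threshold, taken from the guess in the log, not from Phabricator's source.
MIN_HASH_LENGTH = 7

# A candidate commit id: lowercase hex, between the threshold and a full 40-char SHA-1.
HASH_CANDIDATE = re.compile(r"[0-9a-f]{%d,40}" % MIN_HASH_LENGTH)


def looks_like_commit_id(token):
    """Return True if token could be mistaken for an abbreviated commit id."""
    return HASH_CANDIDATE.fullmatch(token) is not None


# Change numbers and the commit hash below all appear earlier in this log.
for token in ("675346", "1005564", "d721280f81fca98a1c3960a1864b9a454f7699cc"):
    print(token, looks_like_commit_id(token))
```

Under these assumptions the older 6-digit change number 675346 is rejected, while 1005564 is accepted just like a real commit hash.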
[19:21:32] Release-Engineering-Team (Now this 🫠), Patch-For-Review: gitlab-cloud-runner: Roll back pending helm releases before running terraform apply - https://phabricator.wikimedia.org/T354787#9564935 (CodeReviewBot) sandeeps opened https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/3...
[20:07:43] Release-Engineering-Team (Now this 🫠), Scap, MW-on-K8s, SRE, and 2 others: Scap should check errors coming from mw-on-k8s canaries during deployments - https://phabricator.wikimedia.org/T357402#9565070 (CodeReviewBot) dancy opened https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/21...
[20:51:28] Thinking about the magic link behavior that t.aavi noticed in T357759#9564838, I think I would be in favor of Phorge dropping magic links entirely and instead needing explicit `{}` around an identifier to create a link. This would also stop the annoying linking behavior for things like mentions of "P123" and "R321"
[20:51:28] T357759: Deploy night mode on the minerva skin on test wiki - https://phabricator.wikimedia.org/T357759
[21:07:49] Release-Engineering-Team, Temporary accounts, Trust and Safety Product Team, Patch-For-Review: Create job that runs PHPUnit unit and integration tests with temp account feature flag is enabled - https://phabricator.wikimedia.org/T355879#9565247 (Dreamy_Jazz)
[21:07:58] Release-Engineering-Team, Temporary accounts, Trust and Safety Product Team, Patch-For-Review: Create job that runs PHPUnit unit and integration tests with temp account feature flag is enabled - https://phabricator.wikimedia.org/T355879#9488262 (Dreamy_Jazz)
[21:13:38] Deployments, Release-Engineering-Team (Now this 🫠): handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#9565270 (dancy)
[21:14:59] Deployments, Release-Engineering-Team (Now this 🫠): handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#9565272 (jeena) stack trace of timout errors ` 20:16:43 Executing check 'Logstash Error rate for mw2374.codfw.wmnet' 20:16:43 Exe...
[21:16:15] Release-Engineering-Team (Now this 🫠), Release, Train Deployments: 1.42.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T354437#9565281 (matmarex)
[21:26:19] Deployments, Release-Engineering-Team (Now this 🫠): handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#9565300 (dancy)
[21:27:41] Deployments, Release-Engineering-Team (Now this 🫠): handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#2586774 (dancy)
[21:40:19] GitLab (Pipeline Services Migration🐤), Shellbox: Migrate shellbox to GitLab - https://phabricator.wikimedia.org/T344745#9565343 (bd808) >>! In T344745#9563918, @Jdforrester-WMF wrote: > I believe that no PHP libraries have yet moved. Happy to help Shellbox (or whomsoever) be the first. Let's not rush to...
[23:59:11] Deployments, Release-Engineering-Team (Now this 🫠), Patch-For-Review: handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#9565801 (dancy)