[00:34:53] (CR) Thcipriani: [C: +2] /srv/patches -> config["patch_path"] (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/752029 (owner: Ahmon Dancy)
[00:36:16] (Merged) jenkins-bot: /srv/patches -> config["patch_path"] [tools/scap] - https://gerrit.wikimedia.org/r/752029 (owner: Ahmon Dancy)
[04:50:11] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958 (tstarling)
[09:58:22] (CR) Hashar: "recheck" [tools/train-dev] - https://gerrit.wikimedia.org/r/736443 (owner: Hashar)
[10:06:08] (CR) Hashar: "I had some blank spaces related issues in a previous patch this will enforce consistency ;)" [tools/train-dev] - https://gerrit.wikimedia.org/r/736443 (owner: Hashar)
[10:09:15] There's a bug (T298760) that's been introduced by wmf.16. It's not a deployment blocker, so I don't want to add it as a subtask, but is there some other task or tag I should use to indicate that it's introduced by wmf.16?
[10:09:15] T298760: Flow\Exception\FlowException: A required post has not been loaded: tn9fp3z7fq89497j - https://phabricator.wikimedia.org/T298760
[10:11:53] it could at least be tagged as a #regression, but I’m not aware of other tags to mark when a bug was introduced
[10:30:39] k
[10:34:34] kostajh: https://phabricator.wikimedia.org/tag/wikimedia-production-error/ probably and move it to the January 2022 column
[10:35:05] thanks, done
[10:36:07] that is where we fill anything that ends up generating logs in logstash for mw
[10:36:31] ah, right
[10:54:36] kostajh: https://phabricator.wikimedia.org/T298763 sounds related too and a potential train blocker
[11:04:39] taavi: flow is used a lot on mw.org so that would mean rolling back to 0 right?
[11:05:18] That would be sad on a friday
[11:05:57] I'd prefer if we had a patch that we could revert
[11:06:23] Or that
[11:06:46] I'll read changelog
[11:06:51] it doesn't seem to be breaking all category pages, but I have no clue which are affected
[11:07:35] also why is flow even trying to load posts on a category page in the first place
[11:07:42] https://www.mediawiki.org/wiki/MediaWiki_1.38/wmf.16 doesn't show anything from flow
[11:07:46] hashar: fyi above, Flow seems to be breaking some things
[11:07:54] what about .15/.14?
[11:08:08] Don't exist
[11:08:22] https://www.mediawiki.org/wiki/MediaWiki_1.38/wmf.14/Changelog
[11:09:16] oh I see what's going on
[11:09:47] it's https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/667637, in CategoryViewerQuery the raw NS from the database will be a string
[11:09:53] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/667637 is the only patch
[11:10:08] https://github.com/wikimedia/mediawiki-extensions-Flow/compare/wmf/1.36.0-wmf.13...wmf/1.35.0-wmf.16 doesn't show it which is weird
[11:10:21] No it isn't
[11:10:33] 11 commits https://github.com/wikimedia/mediawiki-extensions-Flow/compare/wmf/1.36.0-wmf.13...wmf/1.36.0-wmf.16
[11:10:47] you're comparing 1.36 wmf branches, not 1.38
[11:11:24] I should wake up
[11:11:30] This is occurring infrequently enough that a full rollback seems kind of extreme. Let’s make a patch to fix it instead?
[11:11:34] https://github.com/wikimedia/mediawiki-extensions-Flow/compare/wmf/1.38.0-wmf.13...wmf/1.38.0-wmf.16
[11:11:44] kostajh: we can revert the patch causing it
[11:11:56] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/667637 has to be it
[11:12:17] sure that works too
[11:12:35] I need to disappear for a bit, I can deploy reverts after that
[11:13:13] kostajh: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/752013
[11:13:22] I can cherry pick too
[11:13:30] If you are happy and can +2
[11:13:44] taavi: will make sure it's ready but I'm gone in 20 minutes
[11:14:50] Cherry picked too
[11:15:00] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/752014
[11:20:10] thanks both
[11:20:35] I'm fine with the revert patch, but was wondering if we should just do a one-line patch to fix that specific issue in CategoryViewerQuery, as there don't seem to be other issues caused by the patch
[11:23:14] i.e. `(int)$row->page_namespace !== NS_TOPIC`
[11:23:37] If it'll work
[11:25:14] I'll make a patch for that, and we can try it in the backport window
[11:25:46] There is no window today, it'll be an emergency deploy
[11:25:58] oh right
[11:27:23] kostajh: yeah if you can identify all those spots I'd prefer that too
[11:28:50] I'm happy to deploy as long as we get releng+sre approval, the breakage is severe enough (I think) to not leave it like that for the weekend
[11:29:20] looking at logstash, I think that's the only thing that's broken
[11:29:31] well, the only *new* thing that's broken with Flow :)
[11:30:06] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/752139
[11:31:40] I think all places where that patch directly compares database rows are affected, at least the CU stuff in includes/Hooks.php and includes/FlowSetUserIp.php
[11:33:35] hm, right. OK, let's go with the simpler option and revert the whole thing.
[11:34:10] so that'd be https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/752014?
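(Editorial note on the string-vs-int issue discussed above. This is a minimal sketch, not the actual Flow code: the log does not show the comparison the original patch used, only the proposed `(int)` cast. The constant NS_TOPIC_EXAMPLE, its value 2600, and the $row object are stand-ins assumed for illustration. It shows why a strict comparison between a raw database value, which may arrive as a string, and an integer namespace constant never matches, and how casting first changes the result.)

```php
<?php
declare(strict_types=1);

// Stand-in for Flow's NS_TOPIC constant; the value 2600 is an assumption for this sketch.
const NS_TOPIC_EXAMPLE = 2600;

// Hypothetical result row: database drivers may return column values as strings.
$row = (object)[ 'page_namespace' => '2600' ];

// Strict comparison: the string '2600' is never identical to the int 2600,
// so a topic row is wrongly treated as "not a topic".
$strictSaysNotTopic = $row->page_namespace !== NS_TOPIC_EXAMPLE;

// Casting to int first compares int with int, as in the one-line fix proposed in the log.
$castSaysNotTopic = (int)$row->page_namespace !== NS_TOPIC_EXAMPLE;

var_dump($strictSaysNotTopic); // bool(true)
var_dump($castSaysNotTopic);   // bool(false)
```

(The cast only repairs one call site; as noted in the discussion, every place that compares raw row values the same way would need the same treatment, which is why reverting was considered the simpler option.)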
[11:34:59] yeah
[11:35:06] hashar: can we deploy https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Flow/+/752014, please?
[11:35:19] yes
[11:35:36] +2 ed
[11:36:29] I thought it needed sre approval too
[11:41:25] https://wikitech.wikimedia.org/wiki/Deployments/Emergencies says so, but tbh I don't think I ever explicitly saw that
[11:46:38] Continuous-Integration-Config, MediaWiki-Core-Tests, Code-Health, Developer Productivity, and 2 others: Run api-testing tests in parallel - https://phabricator.wikimedia.org/T298735 (kostajh)
[11:46:42] hashar: could you please approve deploying https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/751843/ too? thanks in advance
[14:56:39] (PS1) Hashar: jjb: ensure docker macro options is always a string [integration/config] - https://gerrit.wikimedia.org/r/752155
[15:04:18] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), Move-Files-To-Commons, WMDE-TechWish, and 3 others: Test WikiTextEditor class with browser tests. - https://phabricator.wikimedia.org/T190829 (thiemowmde)
[15:11:12] RelEng-Archive-FY201718-Q1, Revision-Slider, Patch-For-Review, Wikimedia-production-error: Catchable fatal error: Argument 2 passed to RevisionSliderHooks::onDiffViewHeader() must be an instance of Revision, null given - https://phabricator.wikimedia.org/T167359 (thiemowmde)
[15:15:33] (PS1) Hashar: jjb: move docker macro entrypoint out of "options" [integration/config] - https://gerrit.wikimedia.org/r/752156
[15:15:49] Continuous-Integration-Config, Revision-Slider, Patch-For-Review: Apparent random failing of RevisionSlider qunit tests - https://phabricator.wikimedia.org/T153121 (thiemowmde)
[15:21:42] RelEng-Archive-FY201718-Q1, Revision-Slider, VisualEditor, User-Ryasmeen, Wikimedia-production-error: Argument 2 passed to VisualEditorHooks::onDiffViewHeader() must be an instance of Revision, null given - https://phabricator.wikimedia.org/T169132 (thiemowmde)
[15:25:42] PROBLEM - Check systemd state on doc1001 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-doc2001.codfw.wmnet.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[15:28:58] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T293957 (Krinkle)
[16:00:28] PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[16:09:42] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T293957 (Urbanecm)
[16:13:18] (CR) Ahmon Dancy: [C: +2] Trim trailing whitepsace [tools/train-dev] - https://gerrit.wikimedia.org/r/736443 (owner: Hashar)
[16:14:12] (Merged) jenkins-bot: Trim trailing whitepsace [tools/train-dev] - https://gerrit.wikimedia.org/r/736443 (owner: Hashar)
[16:15:45] (PS1) Ahmon Dancy: Add additional aliases for the 'www' container [tools/train-dev] - https://gerrit.wikimedia.org/r/752160
[16:16:12] (CR) Ahmon Dancy: [C: +2] Add additional aliases for the 'www' container [tools/train-dev] - https://gerrit.wikimedia.org/r/752160 (owner: Ahmon Dancy)
[16:16:36] (Merged) jenkins-bot: Add additional aliases for the 'www' container [tools/train-dev] - https://gerrit.wikimedia.org/r/752160 (owner: Ahmon Dancy)
[16:21:10] RECOVERY - Check systemd state on doc1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[16:30:46] (PS1) Isabelle Hurbain-Palatin: Add 'parsoid' to 'wikihiero' extension dependencies [integration/config] - https://gerrit.wikimedia.org/r/752164 (https://phabricator.wikimedia.org/T272931)
[16:47:32] (PS1) Majavah: zuul: [openstack/horizon/wmf-proxy-dashboard] Add tox-docker [integration/config] - https://gerrit.wikimedia.org/r/752168
[17:10:21] (CR) Jforrester: [C: +1] "Neat." [integration/config] - https://gerrit.wikimedia.org/r/752155 (owner: Hashar)
[17:10:38] (CR) Jforrester: [C: +1] "Just a minor deploy of every single definition? ;-)" [integration/config] - https://gerrit.wikimedia.org/r/752156 (owner: Hashar)
[17:49:43] (PS1) Ahmon Dancy: Redirect git.remote_exists() output to /dev/null [tools/scap] - https://gerrit.wikimedia.org/r/752179
[17:50:42] (CR) Ahmon Dancy: [C: +2] Redirect git.remote_exists() output to /dev/null [tools/scap] - https://gerrit.wikimedia.org/r/752179 (owner: Ahmon Dancy)
[17:51:25] (Merged) jenkins-bot: Redirect git.remote_exists() output to /dev/null [tools/scap] - https://gerrit.wikimedia.org/r/752179 (owner: Ahmon Dancy)
[19:02:58] (PS1) Ahmon Dancy: scap prep: Reduce submodule noise [tools/scap] - https://gerrit.wikimedia.org/r/752188
[19:08:30] (PS2) Ahmon Dancy: scap prep: Reduce submodule noise [tools/scap] - https://gerrit.wikimedia.org/r/752188
[19:08:55] (CR) Ahmon Dancy: [C: +1] scap prep: Reduce submodule noise [tools/scap] - https://gerrit.wikimedia.org/r/752188 (owner: Ahmon Dancy)
[19:11:46] (PS3) Ahmon Dancy: scap prep: Reduce submodule noise [tools/scap] - https://gerrit.wikimedia.org/r/752188
[19:11:48] (PS1) Ahmon Dancy: Make scap prep idempotent [tools/scap] - https://gerrit.wikimedia.org/r/752189
[19:12:13] RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:14:13] (PS2) Ahmon Dancy: Make scap prep idempotent [tools/scap] - https://gerrit.wikimedia.org/r/752189
[19:14:40] (PS3) Ahmon Dancy: Make scap prep idempotent [tools/scap] - https://gerrit.wikimedia.org/r/752189
[19:29:58] Continuous-Integration-Infrastructure, DC-Ops, netops, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Dzahn) db2063 and db2068 were affected today
[19:32:28] Continuous-Integration-Infrastructure, DC-Ops, netops, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Dzahn) for the record: I have absolutely no idea why contint2001.mgmt disappeared...
[19:33:20] Continuous-Integration-Infrastructure, DC-Ops, netops, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Dzahn) a:Dzahn→None
[19:36:32] Continuous-Integration-Infrastructure, DC-Ops, netops, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Dzahn) @Papaul Do you know about contint2001.mgmt status?
[19:37:37] Continuous-Integration-Infrastructure, DC-Ops, netops, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Papaul) @Dzahn no
[19:46:38] Continuous-Integration-Infrastructure, DC-Ops, netops, observability, ops-codfw: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (Dzahn)
[19:46:59] Continuous-Integration-Infrastructure, DC-Ops, netops, observability, ops-codfw: contint2001.mgmt disappeared from Icinga (was: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL ))) - https://phabricator.wikimedia.org/T283582 (Dzahn)
[19:59:21] Project-Admins: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706 (ldelench_wmf) Hi there, could @MShilova_WMF and @JMcLeod_WMF both be added to #acl*Project-Admins group? They are technical program managers who will need to create projects/milesto...
[20:08:32] MediaWiki-Releasing, MediaWiki-Installer, MediaWiki-Stakeholders-Group, Epic, MW-1.38-release: Expand the set of bundled extensions and skins in MediaWiki 1.38 - https://phabricator.wikimedia.org/T290934 (Jdlrobson)
[20:08:35] MediaWiki-Releasing, MediaWiki-Installer, MediaWiki-Stakeholders-Group, Epic, MW-1.37-release: Expand the set of bundled extensions and skins in MediaWiki 1.37 - https://phabricator.wikimedia.org/T279842 (Jdlrobson)
[20:08:52] MediaWiki-Releasing, MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), MW-1.38-release, MinervaNeue (Tracking): Bundle MinervaNeue skin with MediaWiki - https://phabricator.wikimedia.org/T191743 (Jdlrobson) Open→Resolved a:Jdlrobson Done!
[20:27:40] Phabricator: Project-Admins can't set Source Repos on Projects - https://phabricator.wikimedia.org/T279284 (diegodlh) Hi all! I'm getting the same issue. I'm not being able to set source repo for subproject #web2cit-core. On the other hand, I was able to set the source repo for subproject #web2cit-server up...
[20:32:22] (CR) Thcipriani: [C: +2] "I wonder if the silence will freak anyone out" [tools/scap] - https://gerrit.wikimedia.org/r/752188 (owner: Ahmon Dancy)
[20:33:02] (Merged) jenkins-bot: scap prep: Reduce submodule noise [tools/scap] - https://gerrit.wikimedia.org/r/752188 (owner: Ahmon Dancy)
[20:41:53] Phabricator: Update Herald (H260) to include milestones 5687, 5688, 5690, 5691, 5692 - https://phabricator.wikimedia.org/T298809 (ldelench_wmf)
[20:48:16] Phabricator: Update Herald (H260) to include milestones 5687, 5688, 5690, 5691, 5692 - https://phabricator.wikimedia.org/T298809 (ldelench_wmf) @JMcLeod_WMF FYI; this should carry us through the end of March for CommTech (if you indeed want to keep following a 2-week sprint structure with a new milestone boa...
[21:06:18] Phabricator: Update Herald (H260) to include milestones 5687, 5688, 5690, 5691, 5692 - https://phabricator.wikimedia.org/T298809 (MBinder_WMF) Open→Resolved {meme, src="seal-of-approval", above=SEAL, below="OF APPROVAL"}
[21:07:37] ^ phab memes still being used in real life :)
[22:16:07] PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:17:05] RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:18:39] Phabricator: H394 is far too broad - https://phabricator.wikimedia.org/T298818 (RhinosF1)