[05:02:28] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Tsevener) Thanks for clarifying @Dmantena! I hadn't noticed that we can't gene...
[08:31:57] hello folks
[08:32:06] I have just built and uploaded scap 4.7.1
[08:32:24] it contains a new feature to make git-lfs work on buster nodes
[08:38:33] I think that we could skip a complete rollout, or if possible I'd wait until the ORES nodes are on Buster
[08:38:36] would it be ok?
[08:38:52] (we are planning to do it during the next couple of weeks)
[08:39:21] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (jnuche) hi @elukey, thank you for taking care of this task. This release also has a fix for an issue in scap unrelated to `git-lfs`, so we would like to have it rolled out to eve...
[08:40:33] nevermind, releng asked if we could deploy it everywhere
[08:43:49] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (elukey) @jnuche ack perfect! I'll coordinate with Service Ops for the rollout. I'll try to skip the ORES nodes for the moment since they are still on stretch.
[09:00:45] serviceops, Wikimedia-Developer-Portal, Goal, Patch-For-Review, Service-deployment-requests: New Service Request: developer-portal - https://phabricator.wikimedia.org/T297140 (JMeybohm) >>! In T297140#7886240, @Dzahn wrote: > It says to test if everything is ok, when adding a new namespace, w...
[09:02:44] elukey: AIUI that means we have to exclude ORES nodes from future scap rollouts?
[09:03:13] jayme: nono until we move to Buster, just a couple of weeks
[09:04:01] yeah, but with the current rate of scap releases that's something like 7-14 rollouts :D
[09:04:07] ahahahah
[09:04:23] how did you folks roll it out the last time? Debdeploy?
[09:04:27] I can take care of this one
[09:05:36] yeah, basically what's in https://wikitech.wikimedia.org/wiki/Scap/Release
[09:06:15] if we need to make exceptions for ORES now, please add that to the page for the next poor soul
[09:06:50] <_joe_> btw the solution is to NOT upload the new scap to stretch
[09:06:58] <_joe_> if ores is the only thing using scap still on stretch
[09:07:02] +1
[09:08:48] <_joe_> let's check that
[09:09:39] <_joe_> aqs[1004-1009].eqiad.wmnet,doc1001.eqiad.wmnet,ores[2001-2009].codfw.wmnet,ores[1001-1009].eqiad.wmnet,restbase-dev[1004-1006].eqiad.wmnet,sessionstore[2001-2003].codfw.wmnet,sessionstore[1001-1003].eqiad.wmnet,thumbor[2003-2006].codfw.wmnet,thumbor[1001-1002,1005-1006].eqiad.wmnet,webperf[2001-2002].codfw.wmnet,webperf[1001-1002].eqiad.wmnet
[09:09:44] <_joe_> sigh paste fail
[09:09:50] <_joe_> but yeah, no, more stuff :P
[09:11:59] yep
[09:28:30] serviceops, Infrastructure-Foundations, Scap, Release-Engineering-Team (Priority Backlog 📥): New scap install-world command for self-install - https://phabricator.wikimedia.org/T307081 (jnuche)
[09:35:32] serviceops, Wikimedia-Developer-Portal, Goal, Patch-For-Review, Service-deployment-requests: New Service Request: developer-portal - https://phabricator.wikimedia.org/T297140 (akosiaris) >>! In T297140#7886240, @Dzahn wrote: > Hi @akosiaris @JMeybohm > So it seems like https://gerrit.wikime...
[09:44:18] serviceops, Release-Engineering-Team: helm-linter started failing on operations/deployment-charts today - https://phabricator.wikimedia.org/T307043 (JMeybohm) I can reproduce that when running `rake run_locally['default']` on https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/787058 but not...
[09:51:01] <_joe_> jayme: need me to take a look?
[09:51:41] _joe_: I was hoping my comment would summon you, actually :-p
[09:52:01] <_joe_> I'm quite busy
[09:52:07] <_joe_> what's the change failing?
[09:52:15] <_joe_> I can take a quick look in a few
[09:52:21] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/787058
[09:52:30] but it's not really the change I guess
[09:53:49] still trying to figure that out...but this kind of thing does need beefier hardware :D
[09:54:05] <_joe_> wdym it's not the change?
[09:54:15] <_joe_> the change does trigger a bug in the rakefile I guess
[09:54:52] <_joe_> else, if it's unrelated to this change, I guess who changed anything
[09:55:31] <_joe_> uhm if I had to guess, this is a failure scenario we didn't handle properly
[09:55:43] <_joe_> ah!
[09:56:03] <_joe_> I guess "helmfile build" fails with the current version of the helmfile? that would explain the issue
[09:56:11] then I'm too ignorant to see the failure
[09:56:13] checking that
[09:56:34] <_joe_> 00:03:31 /src/.rake_modules/tester/asset.rb:397:in `templates': undefined method `each' for true:TrueClass (NoMethodError)
[09:56:36] <_joe_> 00:03:31 from /src/.rake_modules/tester/asset.rb:127:in `diff'
[09:56:38] <_joe_> this is the error
[09:56:59] <_joe_> not sure about the rest of the errors
[09:57:26] ah okay
[09:59:03] so it does not fail in current master because new and old version are both failing for mwdebug helmfile.yaml
[09:59:22] which probably is not exactly ideal either :)
[09:59:33] <_joe_> yep
[10:01:37] <_joe_> ok so
[10:01:53] <_joe_> look at collect_fixtures at line 358 of asset.rb
[10:02:42] <_joe_> this means that the asset is marked "bad"
[10:03:07] <_joe_> so we need an additional check in templates I guess.
[10:03:24] <_joe_> this is a very special case where the original helmfile fails to build
[10:04:26] well...actually this change (https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/787054) should have failed CI as the new helmfile fails to build
[10:05:27] <_joe_> yes
[10:05:34] <_joe_> not sure why it didn't
[10:05:38] <_joe_> let me check the CI output
[10:05:50] looks clean to me
[10:05:53] https://integration.wikimedia.org/ci/job/helm-lint/7252/console
[10:06:31] <_joe_> no it's not :P
[10:06:33] <_joe_> as in
[10:06:40] <_joe_> the mwdebug output is missing
[10:06:49] <_joe_> I must have missed something in the handling of @bad
[10:07:16] ah, indeed. Easy to miss things that are not there :)
[10:12:25] <_joe_> ok, fixing that
[10:12:33] <_joe_> then we can force-merge that change heh
[10:14:10] * jayme curious for the fix
[10:15:03] <_joe_> ahhh found the issue. damn
[10:19:47] jayme: one question - I am testing `scap pull` on mwdebug* nodes, and in codfw the scap-cdb-rebuild took around 3 mins. Is it something known or maybe due to the hosts being less used?
[10:20:09] elukey: I've absolutely no clue
[10:20:33] it does not take that long usually, that's what I can say
[10:21:04] now it takes no time, like in eqiad
[10:21:20] I'll add a note to the task so releng will be able to comment
[10:21:23] thanks :)
[10:21:45] always happy to "help" :p
[10:22:40] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (elukey) I tested `scap pull` on mwdebug* nodes, and only in codfw I noticed a very long step: ` 10:15:41 Started scap-cdb-rebuild 10:18:51 Finished scap-cdb-rebuild (duration: 0...
[10:25:30] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (elukey) Deployed also to `restbase-dev` as highlighted on wikitech, all good. I'll wait some hours and then I'll roll it out to the rest of the nodes, excluding the Ores ones.
[10:32:28] <_joe_> jayme: so yeah, we need to force-merge my change first
[10:32:55] <_joe_> and I might add a second if guard so that the revert actually passes validation
[10:33:19] _joe_: I fail to understand why your change fixes this
[10:33:47] <_joe_> jayme: look at runner.rb, line 102
[10:34:05] <_joe_> there I want to select only tests that were selected by the task
[10:34:18] <_joe_> and instead I was selecting the ones selected by the task and not broken
[10:35:15] ah, I see
[10:40:49] do you want to add the if guard for the "current broken"-"new working" case?
[10:47:50] <_joe_> yeah but it's harder than I hoped
[11:06:31] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (jnuche) @elukey my guess is the long `scap-cdb-rebuild` step was caused by a change in the l10n files. It's most likely not an issue. @dancy maybe you can give some more insight.
[11:13:09] <_joe_> jayme: yeah my patch needs some more work
[11:13:31] <_joe_> but it's almost ready, it should be done by the afternoon
[11:13:47] _joe_: ok, cool
[11:14:05] thanks!
[11:21:00] <_joe_> (it's now gtg IMHO)
[11:21:20] <_joe_> (if only git review worked rn)
[11:46:02] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (hashar) From the scap logs https://logstash.wikimedia.org/goto/d454018867fa2a63300115ecaf60227c spanning the last 24 hours: | mwdebug2002 | 10:20:59 | Updated 0 CDB files(s) in...
[11:46:35] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (hashar) Which leads me to the question: why mwdebug2001 and mwdebug2002 did not trigger a cdb rebuild when we ran the train yesterday. Maybe they are not part of the mediawiki in...
[11:59:41] for ingress on the generic cassandra http service - should there be one cassandra-http.discovery.wmnet service with routes for each different service, or a dedicated service hostname per service? I assume the latter but just want to be sure
[12:07:23] serviceops, SRE: Provide node14 images for running production node-based services - https://phabricator.wikimedia.org/T306996 (fgiunchedi) p:Triage→Medium
[12:07:37] serviceops, SRE: Migrate node-based services in production to node14 - https://phabricator.wikimedia.org/T306995 (fgiunchedi) p:Triage→Medium
[12:08:58] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (jnuche) @hashar thanks for all those details. mwdebug2001 and mwdebug2002 are indeed part of the targets, so that's a good question: why didn't the cdb rebuild trigger yesterda...
[12:41:14] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (jnuche) Interestingly, the only cdb rebuild happening on mwdebug2001 before today was on the 7th of Feb: https://logstash.wikimedia.org/goto/2504b0b3cbc067990f7197efda5381c2 Als...
[12:45:32] serviceops, Release-Engineering-Team, Patch-For-Review: helm-linter started failing on operations/deployment-charts today - https://phabricator.wikimedia.org/T307043 (Joe) Open→Resolved a:Joe
[13:19:43] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (elukey) new version rolled out everywhere :)
[13:22:43] serviceops, Scap, Release-Engineering-Team (Radar): Deploy Scap version 4.7.1 - https://phabricator.wikimedia.org/T306998 (jnuche) @elukey thanks!
[14:39:30] serviceops, SRE: Provide node14 images for running production node-based services - https://phabricator.wikimedia.org/T306996 (MoritzMuehlenhoff) While Debian has e.g. 14 and 16 in various development branches none of those are going to be continuously updated (e.g. 14 will be replaced by 16 in testing so...
[14:50:27] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Jgiannelos) Just to echo what @Dmantena said, Apple and APNS only have one act...
[14:55:58] During maintenance on eqiad appserver I noticed three api_appserver hosts mw[1339,1340,1410] are not pooled. Some machines were reimaged by mutante and pooled in SAL. mw1340 was depooled by krinkle for performance testing. Should these hosts stay depooled?
[14:56:00] (I'll pm krinkle because not in this channel)
[15:05:49] serviceops, SRE: Provide node14 images for running production node-based services - https://phabricator.wikimedia.org/T306996 (Jdforrester-WMF) >>! In T306996#7888124, @MoritzMuehlenhoff wrote: > We can import the nodesource packages into separate repository components, e.g. thirdparty/node14 and thirdpa...
[15:49:09] serviceops, Patch-For-Review: Test running php7.2 and php7.4 in parallel on the beta cluster - https://phabricator.wikimedia.org/T295578 (hnowlan) There were some issues with the make_beta_config.py script for helm3, there's a CR open to fix it. In general is the plan to create an entirely independent se...
[16:23:07] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (Cmjohnson)
[16:46:59] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1001.eqiad.wmnet with OS buster
[16:54:26] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1002.eqiad.wmnet with OS buster
[16:55:24] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1004.eqiad.wmnet with OS buster
[16:55:56] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1005.eqiad.wmnet with OS buster
[16:56:29] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1006.eqiad.wmnet with OS buster
[17:02:30] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1003.eqiad.wmnet with OS buster
[17:03:08] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1007.eqiad.wmnet with OS buster
[17:03:49] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1008.eqiad.wmnet with OS buster
[17:12:10] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1001.eqiad.wmnet with OS buster complet...
[17:13:18] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1009.eqiad.wmnet with OS buster
[17:19:25] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1002.eqiad.wmnet with OS buster complet...
[17:19:42] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1010.eqiad.wmnet with OS buster
[17:24:02] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1006.eqiad.wmnet with OS buster complet...
[17:26:01] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1004.eqiad.wmnet with OS buster complet...
[17:27:00] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1005.eqiad.wmnet with OS buster complet...
[17:27:18] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1011.eqiad.wmnet with OS buster
[17:27:33] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1012.eqiad.wmnet with OS buster
[17:27:43] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1013.eqiad.wmnet with OS buster
[17:29:46] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1008.eqiad.wmnet with OS buster complet...
[17:29:53] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1003.eqiad.wmnet with OS buster complet...
[17:30:22] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1014.eqiad.wmnet with OS buster
[17:30:40] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1015.eqiad.wmnet with OS buster
[17:30:56] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1007.eqiad.wmnet with OS buster complet...
[17:31:08] serviceops, DC-Ops, SRE, ops-eqiad, Patch-For-Review: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster
[17:35:59] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster
[17:36:05] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster executed with errors: - parse...
[17:36:47] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster
[17:36:52] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster executed with errors: - parse...
[17:40:15] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster
[17:40:21] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster executed with errors: - parse...
[17:42:06] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1017.eqiad.wmnet with OS buster
[17:45:30] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1018.eqiad.wmnet with OS buster
[17:45:45] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1009.eqiad.wmnet with OS buster completed: - parse1009 (**PAS...
[17:46:02] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1019.eqiad.wmnet with OS buster
[17:47:31] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1010.eqiad.wmnet with OS buster completed: - parse1010 (**PAS...
[17:47:43] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1020.eqiad.wmnet with OS buster
[17:49:10] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1021.eqiad.wmnet with OS buster
[17:52:02] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1022.eqiad.wmnet with OS buster
[17:55:25] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1013.eqiad.wmnet with OS buster completed: - parse1013 (**PAS...
[17:58:52] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1015.eqiad.wmnet with OS buster completed: - parse1015 (**WAR...
[17:59:20] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster completed: - parse1016 (**FAI...
[17:59:24] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1016.eqiad.wmnet with OS buster executed with errors: - parse...
[17:59:36] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1023.eqiad.wmnet with OS buster
[17:59:56] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host parse1024.eqiad.wmnet with OS buster
[18:01:12] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1012.eqiad.wmnet with OS buster completed: - parse1012 (**WAR...
[18:01:22] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1011.eqiad.wmnet with OS buster completed: - parse1011 (**WAR...
[18:01:49] we now have these new aliases for owners:
[18:01:51] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1014.eqiad.wmnet with OS buster completed: - parse1014 (**WAR...
[18:01:51] [cumin2002:~] $ sudo cumin 'A:owner-serviceops' 'uname -r '
[18:01:52] 506 hosts will be targeted:
[18:01:59] [cumin2002:~] $ sudo cumin 'A:owner-core-platform' 'uname -r '
[18:01:59] 66 hosts will be targeted:
[18:08:32] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1017.eqiad.wmnet with OS buster completed: - parse1017 (**PAS...
[18:11:41] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1018.eqiad.wmnet with OS buster completed: - parse1018 (**WAR...
[18:13:23] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1019.eqiad.wmnet with OS buster completed: - parse1019 (**PAS...
[18:17:28] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1020.eqiad.wmnet with OS buster completed: - parse1020 (**PAS...
[18:19:06] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1021.eqiad.wmnet with OS buster completed: - parse1021 (**PAS...
[18:21:13] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1022.eqiad.wmnet with OS buster completed: - parse1022 (**PAS...
[18:25:50] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1023.eqiad.wmnet with OS buster completed: - parse1023 (**PAS...
[18:28:08] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host parse1024.eqiad.wmnet with OS buster completed: - parse1024 (**PAS...
[19:11:32] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (Cmjohnson)
[19:13:47] serviceops, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install parse100[01-24] - https://phabricator.wikimedia.org/T299573 (Cmjohnson) Open→Resolved these have all been installed
[19:22:21] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (JMinor) p:High→Unbreak!
[19:26:47] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Dzahn) @Tsevener Sure, we can keep this task limited to swap the old key with...
[19:56:05] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Dzahn) >>! In T288546#7886711, @Tsevener wrote: > @Dzahn Given @Dmantena's not...
[19:56:57] I replaced the "APNS key" for push-notification as they requested on that ticket that was raised to UBN for reasons I don't get.
[19:57:12] All of it was within _staging_ context, not production.
[19:57:48] and ticket said "before production" it should be rotated. so yea.. did not seem UBN but .. DONE so it's not us blocking
[19:59:05] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (JMinor) p:Unbreak!→High Hey @dzahn I raised the priority because this...
[20:02:39] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Dzahn) Hi @JMinor so I did now exactly what @Tsevener asked for and replaced t...
[20:19:39] serviceops, DC-Ops, SRE, ops-eqiad, GitLab (Infrastructure): Q3:(Need By: TBD) rack/setup/install gitlab100[3|4] and gitlab-runner100[2|3|4] - https://phabricator.wikimedia.org/T301177 (Cmjohnson)
[20:40:10] bd808: developer-portal and image-suggestion have namespaces in all 4 envs/clusters now and Janis fixed the docs
[20:40:48] excellent. thanks for following up mutante
[20:40:51] it's 'kubectl describe ns $SERVICE_NAME' instead of "get ns..."
[20:41:06] and as root
[20:41:09] that makes more sense
[20:44:04] serviceops, Wikimedia-Developer-Portal, Goal, Patch-For-Review, Service-deployment-requests: New Service Request: developer-portal - https://phabricator.wikimedia.org/T297140 (Dzahn) Thank you both! I can confirm I see both new namespaces in all 4 envs/clusters.
[21:11:04] serviceops, GitLab: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (Dzahn)
[21:35:26] serviceops, Generated Data Platform, Image-Suggestions, SRE, and 3 others: New Service Request Generated Datasets: Image Suggestions Service - https://phabricator.wikimedia.org/T304891 (Dzahn) Also see T297140#7886240 where 2 new namespaces were added, one for developer-portal and this over here...
[21:38:28] serviceops, Wikimedia-Developer-Portal, Goal, Patch-For-Review, Service-deployment-requests: New Service Request: developer-portal - https://phabricator.wikimedia.org/T297140 (Dzahn) ` @deploy1002:~# kube_env admin eqiad @deploy1002:~# kubectl describe ns developer-portal Name: deve...
[22:00:54] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Tsevener) Appreciate it @Dzahn! So, unfortunately now we aren't receiving push...
[22:02:00] serviceops, SRE: Q1:(Need By: TBD) rack/setup/install mw241[2-9].codfw.wmnet - https://phabricator.wikimedia.org/T290192 (Dzahn) @Papaul I removed you from the ticket and any tags related to dcops though. Still an issue?
[22:08:15] serviceops, Generated Data Platform, Image-Suggestions, SRE, and 2 others: Blubber setup for Image Suggestions Service - https://phabricator.wikimedia.org/T305155 (Dzahn) namespace image-suggestion has now been created on all 4 clusters, staging and production.
[23:51:49] serviceops, Product-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog, iOS-app-v6.9-Carp-On-A-Zamboni: Rotate APNS key before deploying Push Notifications to Production - https://phabricator.wikimedia.org/T288546 (Dzahn) @Tsevener Please try again one more time now!