[05:52:33] netops, Infrastructure-Foundations, SRE: TATA SKY Broadband (AS134674) issues with connecting to upload.wikimedia.org - https://phabricator.wikimedia.org/T275234 (ayounsi) Open→Resolved a:ayounsi No more news from Tata Sky and nothing we can do at our network layer either. To be reopened...
[07:33:16] wow that was a lot of movement on phab :)
[07:56:42] yup
[07:57:20] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Test dhcp-option 82 - https://phabricator.wikimedia.org/T221388 (Volans) Open→Resolved The test of the option 82 has been successful and we're switching to this system for all physical hosts DHCP settings in T269855. In the final...
[08:04:39] netops, Infrastructure-Foundations, SRE, Traffic-Icebox: externally-hosted NEL report forwarders for more timely report reception - https://phabricator.wikimedia.org/T292870 (ayounsi) I'd be wary of the complexity of the setup. As I'm not quite familiar with NEL setup, is there a downside of puttin...
[08:41:46] Traffic, Infrastructure-Foundations, SRE: Anycast: Add IPv6 support to bird and anycast-healthchecker (Puppet) - https://phabricator.wikimedia.org/T292737 (ayounsi) Thanks that's great! Could you update the doc to reflect the new config knobs? And we need to be sure we don't forget to update https:...
[08:59:59] Traffic, SRE, Patch-For-Review, User-ema: Experiment with single backend CDN nodes - https://phabricator.wikimedia.org/T288106 (ema)
[09:48:24] Traffic, Performance-Team, SRE, User-ema: Package and deploy Varnish 6.0.8 - https://phabricator.wikimedia.org/T292290 (ema) Heads up #performance-team: as with all Varnish upgrades, this may have an impact (positive or negative) on performance. You may want to keep the upgrade process on your ra...
[10:10:41] Traffic, Commons, MediaWiki-Uploading, SRE, and 3 others: Various errors when trying to upload large files (Could not acquire lock, Service Temporarily Unavailable, 503 Backend fetch failed, 502 Next Hop Connection Failed) - https://phabricator.wikimedia.org/T280926 (aborrero)
[13:19:00] cp4027 is running varnish 6.0.8 now, I don't expect anything spectacular to happen but if you want to stare at graphs with me every now and then please do so
[13:19:05] https://grafana.wikimedia.org/d/A__2L7eWz/cache-hosts-comparison?orgId=1&var-site=ulsfo%20prometheus%2Fops&var-instance=cp4027&var-instance_b=cp4028&from=now-1h&to=now
[13:20:16] the cache is slowly refilling, we're now at 325K objects vs 13.6M on cp4028
[13:23:08] volans: hi! re: T201317, I haven't reimaged any host in a long time so I'm not sure if the issue is still there
[13:23:08] T201317: wmf-auto-reimage: 'execution expired' on first puppet run - https://phabricator.wikimedia.org/T201317
[13:23:48] volans: do you think there could be a workaround for it to be implemented in wmf-auto-reimage, potentially?
[13:23:59] if not I agree we can safely drop the SRE-tools tag
[13:26:12] ema: sorry in a meeting, I'll get back to you in a bit
[13:29:50] volans: ack, no rush
[14:09:03] ema: the wmf-auto-reimage is no more, migrated to the cookbook sre.hosts.reimage since this morning (see related announce emails and https://wikitech.wikimedia.org/wiki/Server_Lifecycle/Reimage ) ;)
[14:09:15] the new one does ask the user what to do if the first puppet run fails
[14:09:53] so it should probably catch the above in the sense of asking the user what to do; in that case I guess the user can go to the host and retry the puppet run
[14:10:20] but before coding something in the cookbook it would be useful to see a real case, to make sure we're actually fixing it
[14:10:32] the other approach could be to force puppet on v4 I guess
[14:10:56] unless the change makes all the connections flap
[14:27:51] fair enough, well step one I guess is seeing if the issue is still there :)
[14:31:46] do you have a host that can be reimaged?
[15:03:41] volans: yeah, cp4021 for example
[15:03:58] ok, let's test it whenever you want
[15:05:18] cool, let's give it a spin tomorrow morning!
[15:05:29] great
[15:43:32] bblack: can i get you to review https://gerrit.wikimedia.org/r/c/operations/puppet/+/662688 thanks
[16:01:40] jbond: I'd add a comment about that sleep(1)
[16:02:27] vgutierrez: ack thanks
[16:02:47] also.. apparently f-strings are preferred to .format()
[16:03:43] vgutierrez: this is a script which will run everywhere; as such we can't use f-strings as they were only introduced in 3.6 (stretch is on 3.5)
[16:03:50] oh ok
[16:08:07] yes it's a bit of a PITA
[16:53:13] TIL - What an f-string is. Nice :)
[22:22:10] Traffic, Performance-Team, SRE, User-ema: Package and deploy Varnish 6.0.8 - https://phabricator.wikimedia.org/T292290 (Krinkle) I've made some improvements to the by-host dash that may be of use: