[07:21:39] vgutierrez: https://gerrit.wikimedia.org/r/c/operations/puppet/+/685811 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/685810 look like final cleanup patches after T236120/T238625, right?
[07:21:39] T236120: Get rid of nginx puppetization for cache upload - https://phabricator.wikimedia.org/T236120
[07:58:31] ema: o/
[07:58:34] buongiorno
[07:58:53] elukey: hey!
[07:59:22] a couple of days ago I left a note about cp1087, it went down again and I didn't touch it (only pooled=inactive) due to https://phabricator.wikimedia.org/T278729
[07:59:55] didn't know what was best, maybe we can powercycle it again but it seems a little unstable
[08:01:04] not a very lucky host, heh
[08:02:38] thanks elukey, yeah I agree with leaving it depooled at this point, let's see if dcops has further ideas for other things to try
[08:06:19] Traffic, SRE, ops-eqiad: cp1087 powercycled - https://phabricator.wikimedia.org/T278729 (ema) @Cmjohnson: this is still happening unfortunately. The host is currently down and depooled, please feel free to try anything else that comes to mind. No heads-up needed.
[08:15:16] ema: it seems like you're right as usual (re gerrit 685810 and 685811)
[08:15:50] s/usual/always/ but yeah
[08:17:03] 99.999999% of the time
[08:23:00] I always appreciate modest people working with me
[08:29:41] I think there's actually still some followup work, some of the do_ocsp code paths in tlsproxy::localssl can probably also go away
[13:03:20] Traffic, netops, SRE, User-jbond: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391 (jbond) Nice work :) >>! In T270391#7120197, @cmooney wrote: > There is this script for AWS that @ema pointed me towards: > > https://gerrit.wikimed...
[15:08:49] netops, Data-Persistence-Backup, SRE: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (jcrespo) FYI, cross-dc backups are now in a "normal state" meaning we should only have those a few hours...
[15:18:00] Traffic, netops, SRE, User-jbond: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391 (cmooney) Thanks jbond appreciate the feedback. Your improvements to the script look great. Nice work on the parsing, much cleaner than my shite, and...
[15:19:08] Traffic, SRE, User-jbond: Implement machine-local forwarding DNS caches - https://phabricator.wikimedia.org/T171498 (jbond)
[15:33:45] Traffic, SRE, Patch-For-Review, User-jbond: interface-rps.py should have a flag to avoid CPU0 - https://phabricator.wikimedia.org/T236208 (jbond)
[15:50:36] Traffic, netops, SRE, User-jbond: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391 (jbond) > One thing I do think we should include is some sort of IP aggregation completely agree, its an oversight that it missed > I'm not sure if Ne...
[19:53:57] Traffic, SRE, ops-eqiad: cp1087 powercycled - https://phabricator.wikimedia.org/T278729 (Cmjohnson) @ema Looks to be a DIMM issue, submitted a ticket to Dell You have successfully submitted request SR1061284651.
[21:50:50] Traffic, netops, SRE, User-jbond: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391 (Volans) >>! In T270391#7126325, @cmooney wrote: > I'm not sure if Netbox is the right place to *store* this data, but happy to discuss. You folk know...
[23:54:56] (VarnishTrafficDrop) firing: (2) 67% GET drop in text@esams during the past 30 minutes - https://alerts.wikimedia.org
[23:59:57] (VarnishTrafficDrop) resolved: (2) 67% GET drop in text@esams during the past 30 minutes - https://alerts.wikimedia.org