[09:34:50] Traffic: Enable QoS for upload video files - https://phabricator.wikimedia.org/T412785 (Fabfur) NEW
[09:47:42] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: mr1-codfw: add second uplink to lsw1-a2-codfw - https://phabricator.wikimedia.org/T410717#11463124 (ayounsi) @Jhancock.wm I'll leave it to you and @RobH to procure the needed equipment. If you prefer a fiber run between the two devi...
[10:32:11] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: mr1-codfw: add second uplink to lsw1-a2-codfw - https://phabricator.wikimedia.org/T410717#11463290 (cmooney) >>! In T410717#11463123, @ayounsi wrote: > If a copper run is fine, then it's an SFP-T (that you probably have in stock) on...
[12:53:45] Traffic, Hiddenparma, Patch-For-Review: Add ipblock-source objects and logic - https://phabricator.wikimedia.org/T402014#11463847 (JMeybohm) Open→Resolved This is done. I've created {T412805} for the follow up work.
[13:09:59] netops, Infrastructure-Foundations, netbox, Patch-For-Review: Automatically run Capirca Netbox script regularly - https://phabricator.wikimedia.org/T361549#11463998 (ayounsi) Once the two patches above are deployed, comes the question on how to run it regularly. There are 2 possible options : *...
[13:20:23] netops, Infrastructure-Foundations, SRE: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11464054 (fgiunchedi) >>! In T399180#11432250, @cmooney wrote: >>>! In T399180#11432052, @fgiunchedi wrote: >> I think the easiest would be to: >> >> * Remove the spuri...
[13:47:48] FIRING: PuppetDisabled: Puppet disabled on lvs2011:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[13:50:36] netops, Infrastructure-Foundations, SRE: InboundInterfaceErrors alerts firing for Nokia switches on v25.10.1 - https://phabricator.wikimedia.org/T412733#11464225 (ayounsi) a:Papaul @Papaul would you be ok to work with Nokia's support to figure out what those inbound errors mean ? Thanks
[13:57:48] FIRING: [2x] PuppetDisabled: Puppet disabled on lvs2011:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[14:12:48] FIRING: [4x] PuppetDisabled: Puppet disabled on lvs2011:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[14:14:59] ^^ fabfur / slyngs
[14:19:51] looking
[14:24:58] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Servers exposing incorrect LLDP info - https://phabricator.wikimedia.org/T250367#11464421 (Papaul) @ayounsi what else needs to be done here?
[14:27:15] puppet on lvs2011 is disabled since dec 09
[14:27:25] administratively disabled (Reason: 'adding new service druid-public-coordinator');
[14:27:41] enabling it
[14:30:01] We also have lvs2012
[14:30:42] slyngs: can you take care of that?
I'm looking on why it didn't alerted before
[14:31:00] Already doing
[14:31:36] Same message, enabling
[14:32:17] yep the other ones too
[14:32:19] sigh
[14:33:08] I'll just run puppet to ensure that nothing is broken
[14:33:33] yep, it should be all good but it's no good it's been more than 7d disabled
[14:33:38] and no one noticed
[14:34:05] I think we should adjust the alert, no more than e.g. three days
[14:36:51] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Servers exposing incorrect LLDP info - https://phabricator.wikimedia.org/T250367#11464499 (ayounsi) I was working on that as we speak. As sretest2003 was reclaimed to test hosts I was able to run some more tests. Running the still not m...
[14:37:45] fabfur, slyngs: fwiw lvs[2013-2014] also have puppet disabled for druid-public-coordinator
[14:38:37] yep all lvs hosts in codfw are in this state since Dec 09
[14:41:10] Seems like the people working on it are all out, or no longer work here.
[14:47:19] fabfur: They are all re-enabled
[14:47:31] ack tnx
[14:50:46] The Puppet alerting is "global" so if I just adjust it down it will change for everyone
[14:51:10] I can create a new alert, just for the lvs cluster and have that alert earlier
[14:52:48] FIRING: [4x] PuppetDisabled: Puppet disabled on lvs2011:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[14:52:51] I think in this case would be more important on cp hosts due to keys but also lvs hosts are important because of pybal conf, but I wouldn't apply anything now (on this wonderful time of the year)
[14:53:02] ^^ ??
[14:54:19] It's the only one not on the dashboard still https://alerts.wikimedia.org/?q=%40state%3Dactive&q=%40cluster%3Dwikimedia.org&q=instance%3D~%28cp%7Cncredir%7Cacme%7Clvs%7Cncmonitor%7Cdns%7Cdoh%29
[14:56:07] Oh, no no that's fine.
notice the 4x
[14:56:26] It will clear in a bit, it's just modern monitoring being weird.
[14:57:14] lvs2011 fired first, so its label got used, then it triggered for the other three hosts, but that doesn't change the label.
[14:57:17] tnx
[14:57:41] 2012 clear now
[14:57:48] FIRING: [4x] PuppetDisabled: Puppet disabled on lvs2011:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[14:58:06] We do need a traffic server restart on cp1100 says our dashboard
[15:12:48] FIRING: [3x] PuppetDisabled: Puppet disabled on lvs2012:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[15:17:48] RESOLVED: [2x] PuppetDisabled: Puppet disabled on lvs2013:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=lvs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[15:25:39] There we go
[15:27:37] 👍
[16:11:46] Traffic, DNS, wikimediafoundation.org, IPv6, Patch-For-Review: wikimediafoundation.org does not support IPv6 - https://phabricator.wikimedia.org/T403269#11465032 (BCornwall) In progress→Resolved
[16:15:40] Traffic, MediaWiki-Debug-Logger, MediaWiki-Platform-Team: Pass through information about the client from the CDN to MediaWiki to Logstash - https://phabricator.wikimedia.org/T412396#11465047 (Krinkle)
[16:15:49] Traffic, MediaWiki-Debug-Logger, MediaWiki-Platform-Team (Kanban Board): Pass through information about the client from the CDN to MediaWiki to Logstash - https://phabricator.wikimedia.org/T412396#11465050 (Krinkle)
[16:25:31] Traffic, Patch-For-Review: Enable QoS for upload video files -
https://phabricator.wikimedia.org/T412785#11465075 (Fabfur) Update: we've tested this on cp7009 but apparently this isn't setting the TOS as expected. I've rolled it back with https://gerrit.wikimedia.org/r/c/operations/puppet/+/1218788 to tr...
[16:47:24] Traffic, MediaWiki-Platform-Team (Radar), SecTeam-Processed, Security: SUL Integration for eventyay (Wikimania virtual event platform) - https://phabricator.wikimedia.org/T378157#11465213 (MarioB) We have a few test deployments. Once we consolidate the next version I will provide a list of ol...
[17:04:06] hi traffic -- https://gerrit.wikimedia.org/r/c/operations/puppet/+/1217774 came in for the puppet request window, anyone able to take a look?
[17:04:26] (looks okay to me, I just don't want to be the only SRE reviewer on a VCL patch)
[17:18:23] rzl: +1 from me as long as the VSL tests pass (which the people on the patch so far might not realize isn't part of puppet CI)
[17:18:38] (and, I mean, they almost certainly do, but)
[19:25:49] hi all (no rush on this as it's late in the day and also late in the year)
[19:26:32] we've been thinking of adding another /evt-***/ endpoint because we think we're losing events to ad-blocking when we send to intake-analytics.wikimedia.org
[19:26:39] So I wrote https://gerrit.wikimedia.org/r/c/operations/puppet/+/1218817
[19:26:57] and I'd love to validate with folks here what would need to happen for that to become functional and if there are any concerns
[19:27:56] (cc tchin)
[20:46:57] Traffic, Fundraising-Backlog, Fundraising-Tech-Roadmap, MediaWiki-extensions-CentralNotice, SRE: Set expiry time for GeoIP cookies - https://phabricator.wikimedia.org/T122097#11466380 (AKanji-WMF)
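
Note on the [4x] / [3x] PuppetDisabled lines: the 14:57:14 explanation (one host's label is shown for the whole group while the count changes) can be illustrated with a rough sketch. This is only a toy model of a grouped notification, not the actual alerting or IRC relay code; the `Alert` class and `summarize` function are hypothetical names, and it assumes the summary is built from the first still-firing alert in the group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alert:
    """Hypothetical stand-in for one alert instance in a group."""
    instance: str
    firing: bool = True


def summarize(alertname: str, alerts: List[Alert]) -> Optional[str]:
    """Build a one-line summary for a group of alerts.

    Assumption: the text names only the first firing alert in the group,
    and the [Nx] prefix carries the number of firing alerts.
    """
    firing = [a for a in alerts if a.firing]
    if not firing:
        return None
    # Only show a count when more than one alert is grouped.
    prefix = f"[{len(firing)}x] " if len(firing) > 1 else ""
    return f"FIRING: {prefix}{alertname}: Puppet disabled on {firing[0].instance}"


# Four hosts grouped under one alert name; lvs2011 fired first.
group = [Alert(f"lvs{n}:9100") for n in (2011, 2012, 2013, 2014)]
first_summary = summarize("PuppetDisabled", group)

# Once lvs2011 clears, the next firing host becomes the displayed label.
group[0].firing = False
second_summary = summarize("PuppetDisabled", group)
```

Under that assumption, the summary keeps naming lvs2011 for as long as it is still firing, which is why the line looked stale even after the other hosts were being handled.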