[07:50:48] good bot
[08:31:47] ema: cool :D
[08:36:39] Traffic: Prometheus Varnish exporter unit should depend on Varnish - https://phabricator.wikimedia.org/T283660 (ema)
[08:36:50] Traffic: Prometheus Varnish exporter unit should depend on Varnish - https://phabricator.wikimedia.org/T283660 (ema) p:Triage→Low
[08:38:52] netops, Data-Persistence-Backup, SRE: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (cmooney) Hi Jamie, Thanks for the feedback. I think given the desire to push the WAN links relatively h...
[10:14:05] mmandere: from https://gerrit.wikimedia.org/r/c/operations/puppet/+/692869 I'm assuming that you're mimicking eqsin, please note that it doesn't make sense to have text nodes from cp6001 to cp6006 and then 6013-6014. This happened on eqsin cause we got some extra servers after the initial deploy (same applies to upload)
[10:17:43] per https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions#Name_re-use we don't reuse server names, hence we don't reshuffle them either, that's why on eqsin things look funny :)
[10:21:44] vgutierrez: you are right, we are mimicking eqsin. At the moment, the aim was to have drmrs appear where every instance of eqsin is. Just a start, and once everything else is set up on site we can do the actual cleanup and have drmrs puppet code correctly depict the new site
[10:23:59] ack
[10:33:49] cdanis: I know :) (re: 12:52:19 < cdanis> sukhe: re the above I'm happy to help ofc
[10:33:52] )
[12:37:46] netops, SRE: Detect IP address collisions - https://phabricator.wikimedia.org/T189522 (faidon) a:faidon→None
[12:37:54] netops, SRE: OSPF metrics - https://phabricator.wikimedia.org/T200277 (faidon) a:faidon→None
[12:51:15] Traffic, Advanced Mobile Contributions, SRE, User-Joe: AMC – Opt-in for logged out users - https://phabricator.wikimedia.org/T215624 (phuedx) a:phuedx→None
[14:42:49] Traffic, observability, User-fgiunchedi: Port traffic/netops grafana alerts to AM - https://phabricator.wikimedia.org/T282806 (ema) p:Triage→Medium
[14:58:54] netops, SRE: Netbox has incorrect email address for GTT - https://phabricator.wikimedia.org/T246564 (ayounsi) a:faidon→wiki_willy When trying to login to the Ethervision dashboard I'm getting: > Additional Provisioning Required. Please contact your Customer Success Manager to request access. @wik...
[15:55:57] bblack, ema, vgutierrez, sukhe: FYI I've sent a patch (that actually was just merged) to add wikibugs to the sre-foundations channel with all our tags. After checking with XioNoX the patch was proposing to move all netops phab tasks to the foundations channel instead of here. But ofc we can have wikibugs publish them in both places. What do you think?
[15:56:19] I can send the patch to re-add it if needed. Sorry was merged quicker than I thought :)
[15:58:57] for reference: https://gerrit.wikimedia.org/r/c/labs/tools/wikibugs2/+/695235/3/channels.yaml#b264
[16:23:40] Traffic, Advanced Mobile Contributions, SRE, Readers-Web-Backlog (Tracking), User-Joe: AMC – Opt-in for logged out users - https://phabricator.wikimedia.org/T215624 (Jdlrobson)
[16:30:44] volans: so we used to have netops going here as well as -netops (which no longer exists I think). Historically Arzhel was here in this group, was kind of how it started, but even after he split off, it seemed like keeping an eye on netops tickets is kinda relevant to the traffic team and useful.
[16:31:19] so, I'd say keep both!
[16:31:21] what's this wikibug everybody talks about ;)
[16:32:10] the one you muted XioNoX ;)
[16:34:37] ack, let me re-add it, sorry for the trouble
[16:34:42] np!
[16:36:45] https://gerrit.wikimedia.org/r/c/labs/tools/wikibugs2/+/695388 sent cc legoktm if he's so kind to merge this one too, sorry for the additional work
[16:54:01] Traffic, SRE, Patch-For-Review: Offer Wikidough as an anycasted service - https://phabricator.wikimedia.org/T283027 (ssingh)
[16:54:21] there you go, traffic is back :)
[16:54:35] sorry I mean netops, the patch was merged
[16:55:11] netops, Wikibugs: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594 (Volans)
[16:55:31] ok it works fine :)
[16:59:11] Traffic, netops, SRE: Please configure the routers for Wikidough's anycasted IP - https://phabricator.wikimedia.org/T283503 (cmooney) Merged and pushed with homer to cr1-codfw and cr2-codfw, working ok with the first VM (Bird being enabled on others shortly): ` cmooney@re0.cr2-codfw> show route rece...
[17:27:26] Traffic: RIPE Atlas monitoring of reachability & latency towards anycasted Wikidough IP - https://phabricator.wikimedia.org/T283614 (cmooney) This service is now live in codfw, answering DoH, DoT and plain-old DNS. I believe dnsdist is terminating the DoH/DoT, but regular queries on UDP/TCP 53 go directly...
[17:34:26] Traffic: RIPE Atlas monitoring of reachability & latency towards anycasted Wikidough IP - https://phabricator.wikimedia.org/T283614 (ssingh) >>! In T283614#7116994, @cmooney wrote: > This service is now live in codfw, answering DoH, DoT and plain-old DNS. > > I believe dnsdist is terminating the DoH/DoT,...
[17:37:38] Traffic, SRE: Offer Wikidough as an anycasted service - https://phabricator.wikimedia.org/T283027 (ssingh)
[17:39:09] sukhe: are there plans to have wikimedia-dns.org also answer https?
[17:41:29] there's a minimal server with a doc link already
[17:41:37] $ curl https://wikimedia-dns.org/
[17:41:38] Wikidough

Please visit the Wikitech page for more information.

[17:41:48] ah! that didn't work for me just a minute ago
[17:48:45] sukhe: could you add a mention of DoH to https://wikitech.wikimedia.org/wiki/Anycast#External ?
[17:59:32] cdanis: yeah, so we will use this for /policy and stuff. there is no redirect from 80 to 443 though as we don't listen on 80. maybe we should, at least for the landing page
[17:59:47] XioNoX: will do, thanks for the help!
[18:00:29] yeah I could see either argument re: port 80
[18:00:37] Traffic: RIPE Atlas monitoring of reachability & latency towards anycasted Wikidough IP - https://phabricator.wikimedia.org/T283614 (cmooney) Ok thanks. That actually makes sense, and what I had originally expected. I have fallen into a trap I've hit before, in that my home network here is secretly redirec...
[18:01:25] it's not essential for new services I don't think, and it does add another attackable port to a public IP, but if the implementation is simplistic (just a fixed redirect in the http server config) it's not awful to do it either.
[18:04:03] yeah I think it should be simple enough in dnsdist/h2o but I will check
[18:05:15] sukhe: we can work around that for most browsers with HSTS preload :)
[18:06:33] cdanis: yes :D "strict-transport-security: max-age=106384710; includeSubDomains; preload
[18:06:36] "
[18:09:00] we were covered for it with malmok being on *.wikimedia.org but I will submit it for wikimedia-dns.org
[18:10:02] 👍
[18:28:01] Traffic: RIPE Atlas monitoring of reachability & latency towards anycasted Wikidough IP - https://phabricator.wikimedia.org/T283614 (CDanis) RIPE Atlas probes support sending DoT queries; however, the option is not exposed anywhere in the measurement creation web UI, nor in the official ripe-atlas-tools dist...
[19:39:46] netops, SRE: Netbox has incorrect email address for GTT - https://phabricator.wikimedia.org/T246564 (wiki_willy) Sure, no prob. I just sent out an email, for our rep to loop in the Customer Success Manager. You're copied on it, so feel free to chime in on the reply. Thanks, Willy >>! In T246564#711608...
[22:35:15] netops, DC-Ops: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (RobH) p:Triage→Medium
[22:36:35] netops, DC-Ops: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (RobH) Oh, I added Arzhel so he is aware of any network asks and can potentially point out any hard blockers that are proposed. I plan to bring this up in our next DC ops meeting, j...
[22:46:39] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (Dzahn) As one who worked on the installserver puppet roles in the past I'd say the cons of B aren't so bad. We should be able to reuse existing profiles and just combine th...
[22:49:03] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (RobH) >>! In T283771#7117937, @Dzahn wrote: > As one who worked on the installserver puppet roles in the past I'd say the cons of B aren't so bad. We should be able to reus...
[22:57:31] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (Dzahn) But wouldn't the dedicated ganeti server need hardware then instead?
[23:02:22] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (RobH) >>! In T283771#7117961, @Dzahn wrote: > But wouldn't the dedicated ganeti server need hardware then instead? My initial task description is stating ganeti instance no...
[23:04:16] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (RobH)
[23:05:04] netops, DC-Ops, SRE: allow mgmt network to access tftp servers for firmware updates - https://phabricator.wikimedia.org/T283771 (Dzahn) ACK, sorry, I read " create a ganeti server" as a server running ganeti that then hosts VMs on it. Yea, so then the biggest part of this would be the "make the mgmt...