[09:14:20] Traffic, cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Replace or remove Debian Buster VMs in 'traffic' cloud-vps project - https://phabricator.wikimedia.org/T360710#9968342 (Vgutierrez) Actionable right now: * `traffic-cache-atstext-buster` can be wiped. Things that need to be fixed:...
[09:22:45] Traffic, cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Replace or remove Debian Buster VMs in 'traffic' cloud-vps project - https://phabricator.wikimedia.org/T360710#9968417 (Vgutierrez) and something is definitely wrong with other hosts using the new puppetmaster: ` vgutierrez@traffic-...
[10:49:31] Traffic, cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Replace or remove Debian Buster VMs in 'traffic' cloud-vps project - https://phabricator.wikimedia.org/T360710#9968683 (Vgutierrez) regarding traffic-cache instances: * I've tried to unify their puppet configuration in 3 prefixes:...
[11:18:37] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Consolidate Automation Templates for DC Switches - https://phabricator.wikimedia.org/T312635#9968801 (cmooney) I think the work on this can be done in tandem with the review of the setup in {T367203}. Off my head an OSPF/IBGP design simi...
[11:21:49] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e3-eqiad - https://phabricator.wikimedia.org/T365995#9968809 (Marostegui) @cmooney got to be closed?
[11:37:17] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 22.4R3 - https://phabricator.wikimedia.org/T364092#9968852 (cmooney)
[11:49:38] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Include vlans with defined IRB int in device vlans even if no port present - https://phabricator.wikimedia.org/T366348#9968898 (cmooney) Open→Resolved
[11:51:55] netops, Infrastructure-Foundations, SRE: Adjust IBGP route-reflector spine/leaf automation to support separate client clusters - https://phabricator.wikimedia.org/T364103#9968908 (cmooney) Open→Resolved Closing task - is a duplicate work was completed under T365169
[12:41:55] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e3-eqiad - https://phabricator.wikimedia.org/T365995#9969045 (cmooney) Open→Resolved >>! In T365995#9968809, @Marostegui wrote: > @cmooney got to be closed...
[12:48:34] netops, Traffic, Infrastructure-Foundations, SRE: Support PyBal routes announced with lower priority than "backup" - https://phabricator.wikimedia.org/T354839#9969092 (ssingh) a:ssingh
[13:29:40] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9969298 (ssingh) >>! In T369366#9967398, @Scott_French wrote: > Thanks, @ssingh! > > In short, and I realize this doesn't help much, my understanding is tha...
[13:41:20] Domains, Traffic: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9969324 (dcaro) @Andrew the domain has been already moved to markmonitor: ` dcaro@urcuchillay$ whois toolsbeta.org | grep -i markmonitor Registrar WHOIS Server: http://whois.mar...
[13:58:36] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9969379 (ssingh) Correction: @Joe tells me we can do reason/pooled in one line with: `set/pooled=yes;reason=foo`.
[14:08:56] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9969455 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=9ca0faf1-4b9d-4345-9bb8-9c7153e17163) se...
[14:17:46] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9969518 (CDanis) Some fragmented thoughts below! --- Something worth pointing out: the original, early documentation suggests this usage of Node. > the 'no...
[14:27:43] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9969566 (ssingh) Thanks @CDanis! >>! In T369366#9969518, @CDanis wrote: > Some fragmented thoughts below! > > --- > > Something worth pointing out: the or...
[15:23:39] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9969926 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=5386f05e-734c-49b0-a4c5-1acbef4c187a) se...
[15:24:15] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9969929 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=9475b2b6-bc5f-41f8-97d1-970eb62b38bc) se...
[15:28:34] sudo -i cumin 'A:lvs-high-traffic1' 'ipvsadm -Ln |grep wrr || echo "this LVS is wrr free"'
[15:28:43] (7) lvs2011.codfw.wmnet,lvs6001.drmrs.wmnet,lvs1017.eqiad.wmnet,lvs5004.eqsin.wmnet,lvs3008.esams.wmnet,lvs7001.magru.wmnet,lvs4008.ulsfo.wmnet
[15:28:43] ----- OUTPUT of 'ipvsadm -Ln |gre...LVS is wrr free"' -----
[15:28:43] this LVS is wrr free
[15:28:50] high-traffic1 fully migrated to maglev \o/
[15:28:54] \m/
[15:28:56] love the echo :D
[15:28:58] \o/
[15:29:33] Traffic, Patch-For-Review: migrate all high-traffic1 and high-traffic2 services to maglev - https://phabricator.wikimedia.org/T368083#9969962 (Vgutierrez)
[15:31:26] Traffic, Infrastructure-Foundations, Patch-For-Review: migrate all high-traffic1 and high-traffic2 services to maglev - https://phabricator.wikimedia.org/T368083#9969979 (BCornwall)
[15:34:04] amazing
[15:45:40] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9970099 (cmooney) Switch upgraded successfully and all hosts back online/pinging. Thanks everyone for the assista...
[15:46:55] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9970119 (ABran-WMF) db1190 repooling dbproxy reloaded everything looks OK
[15:52:19] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9970127 (Eevans) ms-fe1012 repooled, and everything looks good.
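Note on the 15:28 wrr check: `grep wrr` matches the string anywhere in the `ipvsadm -Ln` listing, which is fine for a quick "is anything still on wrr" test. If a per-service answer is wanted, a minimal Python sketch along these lines could read the scheduler column instead. It assumes the usual `ipvsadm -Ln` layout (protocol, address:port, scheduler on each virtual-service line) and that maglev services show up with the `mh` scheduler name; the function name and the sample addresses are hypothetical and not taken from any host above.

```python
# Hypothetical sample in the usual `ipvsadm -Ln` layout; addresses are
# documentation-range placeholders, not real service IPs.
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.0.2.1:443 wrr
  -> 198.51.100.10:443            Route   1      0          0
TCP  192.0.2.2:80 mh
  -> 198.51.100.11:80             Route   1      0          0
"""


def services_using_scheduler(ipvsadm_output: str, scheduler: str = "wrr") -> list[str]:
    """Return 'PROTO address' for each virtual service using `scheduler`."""
    matches = []
    for line in ipvsadm_output.splitlines():
        fields = line.split()
        # Virtual-service lines start with a protocol keyword; real-server
        # lines start with "->" and header lines with other words.
        if len(fields) >= 3 and fields[0] in {"TCP", "UDP", "SCTP", "FWM"}:
            if fields[2] == scheduler:
                matches.append(f"{fields[0]} {fields[1]}")
    return matches


if __name__ == "__main__":
    leftover = services_using_scheduler(SAMPLE)
    print(leftover if leftover else "this LVS is wrr free")
```

An empty list corresponds to the "this LVS is wrr free" branch of the one-liner; on a real host the input would be the captured output of `ipvsadm -Ln` rather than the sample string.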
[16:09:30] Traffic, Infrastructure-Foundations, Patch-For-Review: migrate all high-traffic1 and high-traffic2 services to maglev - https://phabricator.wikimedia.org/T368083#9970181 (BCornwall)
[16:09:39] Traffic, Infrastructure-Foundations, Patch-For-Review: migrate all high-traffic1 and high-traffic2 services to maglev - https://phabricator.wikimedia.org/T368083#9970183 (BCornwall) Open→Resolved
[16:21:32] Aand no more wrr on high-traffic2 as well. Thx brett
[16:25:51] Domains, Traffic: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9970321 (Dzahn) I would recommend to add the domain in WMFs actual DNS zones and copy the existing setup for, for example, `wikimedia.cloud`. So operations/dns -> templates ->...
[16:29:05] Traffic, DNS, fundraising-tech-ops, serviceops, and 2 others: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9970327 (Dzahn) Thanks Clément! I was about to make a subtask and rename this again but you already took it on. It's ap...
[17:04:05] Domains, Traffic: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9970576 (dcaro) >>! In T362253#9970321, @Dzahn wrote: > I would recommend to add the domain in WMFs actual DNS zones and copy the existing setup for, for example, `wikimedia.clo...
[17:26:55] Traffic, SRE: Regression: Reading spam blacklists of all projects suddenly returns status 429 on fifth consecutive read - https://phabricator.wikimedia.org/T369414#9970649 (Dzahn) a:Dzahn→None
[17:27:10] Traffic, SRE: Regression: Reading spam blacklists of all projects suddenly returns status 429 on fifth consecutive read - https://phabricator.wikimedia.org/T369414#9970646 (Dzahn) Open→Resolved a:Dzahn Being bold and closing the task because the task creator said so. Please feel free to...
[18:54:11] Not sure if this is the right place for it but I see an alert for `SystemdUnitFailed: geoip_update_ipinfo.service on puppetmaster1001:9100` in operations
[18:54:16] Seems like it's getting a 403 from maxmind:
[18:54:27] Jul 10 04:30:02 puppetmaster1001 geoip_update_ipinfo[22062]: Received an unexpected HTTP status code of 403 from https://updates.maxmind.com/geoip/databases/GeoIP2-Enterprise/update?db_md5=REDACTED
[18:54:27] Jul 10 04:30:02 puppetmaster1001 geoip_update_ipinfo[22062]: Invalid product ID or subscription expired for GeoIP2-Enterprise
[18:58:36] yeah, thanks
[19:00:28] we should probably clean this up given T366272
[19:00:29] T366272: Update puppet configuration to use GeoLite2 (free) instead of GeoIP2-Enterprise data - https://phabricator.wikimedia.org/T366272
[19:21:05] ryankemper: thanks for reporting. I added a comment at https://phabricator.wikimedia.org/T366272#9971036
[21:31:01] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9971465 (Scott_French) Cool, it sounds like the conversation has evolved to using a dedicated schema, and we're on the same page that a multi-value `set` sho...
[22:28:13] Traffic: ncmonitor should test markmonitor API response parsing - https://phabricator.wikimedia.org/T369769 (BCornwall) NEW
[22:29:32] patch to fix the failed unit on the puppetmasters https://gerrit.wikimedia.org/r/c/operations/puppet/+/1053390
[22:33:45] mutante: +1
[22:37:04] :) thanks!
be back soon
[22:38:29] can take care of cleaning up the units in a little while
[22:39:34] Traffic: ncmonitor should test markmonitor API response parsing - https://phabricator.wikimedia.org/T369769#9971673 (BCornwall) p:Triage→Low
[22:43:41] Traffic, Patch-For-Review: [ncmonitor] Detect, ignore, and notify about duplicate domain name entries in MarkMonitor - https://phabricator.wikimedia.org/T368758#9971680 (BCornwall) In progress→Resolved
[23:03:35] done. no more failed units on any of puppetmaster::frontend, puppetmaster::backend or puppetserver. per cumin.
[23:03:49] started that unit on puppetmaster1001 and puppetserver1001 manually once
[23:03:59] enterprise product removed
[23:42:21] Traffic, DNS, fundraising-tech-ops, serviceops, SRE: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9971801 (Dzahn)
[23:46:03] Traffic, DNS, fundraising-tech-ops, serviceops, SRE: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9971806 (Dzahn) Thanks to Clement and Reuven for the redirect change and deploying it. benefactors redirects now. @Pppery...
[23:49:12] Traffic, DNS, fundraising-tech-ops, serviceops, SRE: Cleanup DNS subdomains displaying wikimedia.org homepage when they shouldn't - https://phabricator.wikimedia.org/T367012#9971809 (Pppery) Open→Resolved
[23:51:41] Wikimedia-Apache-configuration, serviceops: Change redirect target of sep11.wikipedia.org - https://phabricator.wikimedia.org/T367014#9971812 (Pppery) Another possibility is https://meta.wikimedia.org/wiki/Sep11wiki. On second thought that's probably better than the Wayback Machine as it explains the con...