[06:29:10] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (RKemper)
[07:45:16] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (elukey)
[07:48:46] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (MoritzMuehlenhoff)
[07:49:33] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (MoritzMuehlenhoff)
[09:40:55] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[09:41:54] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[09:50:32] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (elukey)
[09:54:20] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[10:12:28] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Vgutierrez) @Papaul maybe I'm missing some limitation, but there is any reason to just be using two switches per row on codfw for LVS networking links? Addi...
[10:21:17] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[10:22:20] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[10:58:06] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (ArielGlenn)
[11:17:46] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (MoritzMuehlenhoff)
[11:18:44] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (MoritzMuehlenhoff)
[12:21:13] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[12:59:54] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Papaul) @Vgutierrez the only limitation i cans see is the number of NIC ports on each server. Each server has 4 NIC's each NIC connected to 1 row on 1 swi...
[13:03:32] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Vgutierrez) @Papaul nope.. the idea would be replace some of the current links with new ones to additional switches
[13:17:29] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Papaul) @Vgutierrez if my understanding is right you want for example lvsX NIC 1 to switch asw-a2 NIC 2 to switch asw-a7 (2 switches in ROW A) and NIC 3 to...
[13:30:24] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Vgutierrez) the current problem is that both lvs2007 (primary for high-traffic1) and lvs2010 (secondary) get row A traffic from the very same switch, so if...
[13:52:51] Traffic, DC-Ops, SRE, Sustainability (Incident Followup): Audit eqiad & codfw LVS network links - https://phabricator.wikimedia.org/T286881 (Papaul) Ok understood. Please provide me with the configuration you want in a table like above for each server which NIC connects to which switch and i can...
[14:37:30] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Vgutierrez)
[14:40:08] Traffic, Analytics, SRE, Patch-For-Review: Downloading from Archiva.wikimedia.org seems slower than Maven Central - https://phabricator.wikimedia.org/T273086 (hashar) The performance are currently severely degraded, seems each request made to archiva has a 3-4 seconds delay before starting the tr...
[14:43:58] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[14:45:04] Just a heads up - will be executing T286061 to reconfigure switch buffers on eqiad row B shortly.
[14:45:04] T286061: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061
[14:45:19] topranks: hopefully in 15 mins :)
[14:45:29] we're currently depooling the affected DNS server
[14:45:39] Two elements there marked for action in advance - depool authdns1001 and failover lvs1014 to lvs1016.
[14:45:53] and the 4 cp hosts affected
[14:45:53] haha yep! sounds like you're way ahead of me :)
[14:45:53] :)
[14:46:08] cp hosts & authdns1001 done
[14:46:20] lvs1014 soon :)
[14:47:12] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (ops-monitoring-bot) Icinga downtime set by mmandere@cumin1001 for 1:00:00 4 host(s) and their services with reason: Eqiad row B maintenance ` c...
[14:47:24] nice one
[14:51:34] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (ops-monitoring-bot) Icinga downtime set by mmandere@cumin1001 for 1:00:00 1 host(s) and their services with reason: Eqiad row B maintenance ` a...
[14:52:31] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Vgutierrez)
[14:55:30] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (ops-monitoring-bot) Icinga downtime set by mmandere@cumin1001 for 1:00:00 1 host(s) and their services with reason: Eqiad row B maintenance ` l...
[14:55:58] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Vgutierrez)
[14:57:18] vgutierrez: moritz.m has confirmed he disabled puppet if that has a bearing on the lvs change.
[14:57:33] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Marostegui)
[14:57:52] hmm we also disabled puppet on lvs1014, maybe his puppet disable got first?
[14:58:48] hmmm nope, mmandere CMD got there first :)
[14:58:56] The last Puppet run was at Tue Jul 27 14:33:20 UTC 2021 (25 minutes ago). Puppet is disabled. T286061 - mmandere
[14:58:56] T286061: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061
[14:59:11] so all good :)
[15:00:08] you run a tight operation :)
[15:10:09] Traffic, netops, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (MoritzMuehlenhoff)
[15:12:34] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Marostegui) >>! In T286061#7231980, @Marostegui wrote: > m1-master.eqiad.wmnet switched over to dbproxy1012 which is on row A. Once this row is...
[15:16:06] Traffic, netops, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (Marostegui)
[15:18:18] Traffic, netops, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (Marostegui)
[15:18:37] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Bstorm)
[15:19:14] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (Bstorm)
[15:19:22] netops, Infrastructure-Foundations, SRE: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (cmooney)
[15:20:54] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[16:15:10] Traffic, SRE, serviceops, Patch-For-Review, User-jijiki: Access mwdebug kubernetes deployment via the 'X-Wikimedia-Debug' header - https://phabricator.wikimedia.org/T286491 (Joe) Open→Resolved a:Joe
[16:15:16] Traffic, SRE, WikimediaDebug, Performance-Team (Radar): Allow ATS to route traffic to mwdebug deployment on kubernetes - https://phabricator.wikimedia.org/T286482 (Joe)
[16:15:55] Traffic, SRE, serviceops, Patch-For-Review, User-jijiki: Access mwdebug kubernetes deployment via the 'X-Wikimedia-Debug' header - https://phabricator.wikimedia.org/T286491 (Joe)
[16:40:42] Traffic, Analytics, SRE: Downloading from Archiva.wikimedia.org seems slower than Maven Central - https://phabricator.wikimedia.org/T273086 (hashar) Note that uploading is fast. Here for a file named `service-0.3.78-dist.tar.gz` ` 01:42:07.283 [INFO] [INFO] Uploaded to archiva.releases: https://archi...
[17:09:27] netops, DBA, Infrastructure-Foundations, SRE, Patch-For-Review: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney) Open→Resolved
[17:09:35] netops, Infrastructure-Foundations, SRE: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (cmooney)
[18:09:57] Traffic, SRE, serviceops, Datacenter-Switchover: During DC switch, helm-charts failed verification because it doesn't have a service IP - https://phabricator.wikimedia.org/T285707 (Legoktm) p:Triage→High