[07:52:55] FIRING: MaxConntrack: Max conntrack at 84.82% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[07:57:55] RESOLVED: MaxConntrack: Max conntrack at 85.2% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[10:15:27] moritzm: I'm trying to reimage lvs3010 as bookworm and it's blocked asking for the IP of the gateway in the Debian installer
[10:15:45] could it be related to the bookworm 12.10 update?
[10:18:43] the gateway was there.. I just needed to hit enter
[10:19:26] and after that the cookbook found the reboot as expected (Found reboot since 2025-03-17 09:46:43.138781 for hosts lvs3010.esams.wmnet)
[10:19:43] but yeah.. the automated installation isn't working at the moment
[10:20:07] :(
[10:20:21] good news that at least the network config info is returned correctly
[10:20:22] now it's stuck at the partition disks menu
[10:20:30] hmmh, nothing should have changed substantially with the new installer, it's basically just a rebuild with the new kernel
[10:20:49] when there is a new point release we usually have to rebuild the installer, but that's only for the kernel, it shouldn't affect the network IIRC
[10:20:57] some preseed change?
[10:20:58] or maybe some preseed config broke and this is entirely unrelated?
[10:21:04] vgutierrez: can you give us more info about how "stuck"?
[10:21:14] elukey: asking for user input
[10:21:29] but has it correctly selected a partition scheme etc..?
[10:21:35] d-i blue and gray box
[10:22:04] I think that you hitting "enter" in the previous step may have interfered with how d-i works now
[10:22:07] elukey: no idea :) it's currently asking for a partitioning method, highlighted option is "Guided - use the largest continuous free space"
[10:22:16] ah no ok that is not good then :D
[10:22:29] it's actually caused by a broken preseed config
[10:22:34] I think it didn't get the recipe correctly
[10:22:39] making a patch to fix it
[10:22:42] thx moritzm <3
[10:22:44] super thanks!
[10:22:50] it's caused by a config change for some new elastic nodes
[10:24:42] sigh, preseed broken again due to syntax?
[10:24:53] yeah
[10:28:28] +1ed
[10:31:02] thx, merging
[10:34:46] vgutierrez: the fix is merged and I forced a puppet run on the install servers, it should work now
[10:37:11] moritzm: thx again 🍻
[10:39:24] yw :-)
[10:51:44] confirmed.. it's working as expected now
[11:07:49] vgutierrez, moritzm, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1128372 good idea, bad idea? and is the CI error related to my change?
[11:14:40] the CI error is unrelated and can be ignored, the PCC runs are made for Puppet 5 and Puppet 7 and in Puppet 5 max_files (which is used by the installserver role) does not exist
[11:15:39] and it's certainly a good idea! I'll have a closer look at the patch later in the day
[11:17:18] yeah.. idea looks good, I wonder if we could have some kind of linting in place for the rendered file
[11:22:30] from my PoV ideally at some point the partman config would simply be a drop-down in Netbox, but in the meantime this will catch the error class we ran into so far
[12:50:55] FIRING: MaxConntrack: Max conntrack at 81.43% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[12:55:55] RESOLVED: MaxConntrack: Max conntrack at 80.71% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[13:13:03] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071 (ayounsi) NEW p:Triage→High
[13:14:55] moritzm thanks for catching/fixing that typo
[13:18:54] np! these will soon be caught in CI once https://gerrit.wikimedia.org/r/c/operations/puppet/+/1128372 is merged
[13:23:17] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10641632 (ayounsi)
[14:45:38] hi, as discussed with jhathaway on Friday I'd like to proceed with merging the two lists CRs (https://gerrit.wikimedia.org/r/q/topic:%22T385067%22) that would enable dual stack TLS configuration in both Apache and exim for lists.wm.o
[14:46:48] sounds good
[14:50:08] jhathaway: lists1004/2001 work as active/passive?
[14:51:44] at least lists.wm.o is pointing to lists1004 at the moment, if that's the case I'd disable puppet on lists1004 and double check that everything looks good in 2001
[14:52:00] yes, 2001 is a fallback/passive host only
[14:52:16] sounds good
[14:52:21] ok, proceeding :)
[14:59:20] moritzm, brouberol, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1128438/ oops, that one is the good one
[15:00:16] jhathaway: apache2 in lists2001 is now offering both certs :D
[15:00:21] awesome
[15:00:29] XioNoX: approved! Thanks for providing a failing test case!
[15:05:50] jhathaway: same for exim4, do you wanna double check lists2001 before I reenable puppet on lists1004?
[15:06:40] vgutierrez: sure
[15:07:10] thanks
[15:07:42] vgutierrez: looks good, go for it
[15:07:48] awesome, thanks :D
[15:10:45] lists1004 done as well, thx
[15:11:21] thank you
[15:14:30] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10642215 (RobH) Draft of directions: > Support, > > We just had an optic fail on one of our router to switch links, and need the switch side optic swapped out with spa...
[15:20:39] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10642258 (RobH) Had the option for 'normal work' which must be planned in work hours and 24 hours in advance (with time zone changes that means if I entered it now, it woul...
[15:21:26] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10642262 (RobH) a:RobH
[15:27:01] CAS-SSO, Bitu, Infrastructure-Foundations, Phabricator, Striker: Inconsistent mapping of Developer accounts and SUL accounts across Phabricator, Bitu, and Striker - https://phabricator.wikimedia.org/T388498#10642294 (SLyngshede-WMF) p:Triage→Low
[16:22:55] FIRING: MaxConntrack: Max conntrack at 84.18% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[16:26:08] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10642593 (ayounsi) remote hands replaced the optic, but the issue persists. Looking closer at it it converts the 40G port into 4x10G lanes. This might be because lane 1 is...
[16:27:55] RESOLVED: MaxConntrack: Max conntrack at 85% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[16:38:10] netops, Infrastructure-Foundations, ops-drmrs: cr1-drmrs to asw1-b12-drmrs link down - https://phabricator.wikimedia.org/T389071#10642675 (RobH) IRC update: We asked them to swap both optic and fiber patch to reduce complexity in troubleshooting. > Support, > > Background: For some reason this li...
[19:48:55] FIRING: MaxConntrack: Max conntrack at 85.64% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[19:53:55] RESOLVED: MaxConntrack: Max conntrack at 85.64% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[19:56:36] CAS-SSO, Bitu, Infrastructure-Foundations, Phabricator, Patch-Needs-Improvement: Phabricator should use IDP for developer account logins - https://phabricator.wikimedia.org/T377061#10643719 (Aklapper)