[07:52:11] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9703059 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by fabfur@cumin1002 for host cp3070.esams.wmnet with OS bullseye
[08:42:43] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9703163 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fabfur@cumin1002 for host cp3070.esams.wmnet with OS bullseye completed: - cp3070 (**PASS**)...
[08:58:18] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9703181 (Fabfur)
[11:57:45] sukhe: for when you'll be online, lmk when we want to merge https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1009539 ;)
[12:46:34] volans: hello
[12:46:36] let's do it :)
[12:47:45] sukhe: ack, merging
[12:48:17] volans: I am sure you have a way of doing a test run so I think we can try that as well
[12:48:38] if you want, sure
[12:50:57] running test-cookbook -c 1009539 -d sre.switchdc.mediawiki.09-restore-ttl codfw eqiad
[12:51:02] would have run
[12:51:13] Executing commands ['rm -fv /var/run/confd-template/.discovery-{api-rw,appservers-rw,jobrunner,mwdebug,mw-web,mw-api-ext,mw-api-int,mw-jobrunner,mw-parsoid,mw-wikifunctions,parsoid-php,videoscaler}.state*.err'] on 14 hosts: dns[1004-1006,2004-2006,3003-3004,4003-4004,5003-5004,6001-6002].wikimedia.org
[12:51:20] does that sound right?
[12:51:29] the selection of dns hosts
[12:51:35] yep looks good!
[12:51:48] doh, that's the alias
[12:52:01] let's try something, I want to depool one and see if it gets picked up
[12:52:27] running sre.dns.netbox now
[12:53:16] ok I will wait for you to finish then
[12:54:59] doh this is a tricky one, in dry-run it doesn't get that far, if no changes either
[12:55:31] ah ok
[12:55:37] well it should be fine, just my OCD :)
[12:55:44] I can make a change in netbox, run it for real
[12:55:55] then revert and re-run it
[12:56:16] sure
[12:56:39] let me depool dns6001
[12:56:52] that way it should be excluded and that's all the testing we need
[12:56:56] and then I will update it manually
[12:57:06] done
[13:01:58] removed AAAA for sretest1001.eqiad.wmnet, running cookbook
[13:02:15] ok thanks!
[13:02:40] then you can repool it and I can re-run it
[13:02:45] yep
[13:03:52] Updating the authdns copies of the repository on dns[1004-1006,2004-2006,3003-3004,4003-4004,5003-5004,6002].wikimedia.org
[13:03:55] :)
[13:04:01] Deploying the updated zonefiles on dns[1004-1006,2004-2006,3003-3004,4003-4004,5003-5004,6002].wikimedia.org
[13:04:39] basically done, running the hiera cookbook
[13:04:53] repooling
[13:05:41] netbox restored
[13:06:16] dns6001 repooled
[13:07:04] running
[13:08:03] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9703728 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp4052.ulsfo.wmnet with OS b...
[13:08:43] Updating the authdns copies of the repository on dns[1004-1006,2004-2006,3003-3004,4003-4004,5003-5004,6001-6002].wikimedia.org
[13:08:53] cool!
[13:08:59] output for dns6001:
[13:09:00] Updating 32adb59..3f1422e
[13:09:00] Fast-forward
[13:09:10] the others had diff
[13:09:26] thanks for running the test! good confirmation immediately vs surprises later
[13:09:30] No action needed, zones and config files unchanged
[13:09:39] thanks
[13:09:53] so yeah noop for 6001, gdnsd reloaded for the others
[13:10:00] seems good
[13:10:16] yep, big change, hopefully it serves us well
[13:10:31] merging for real
[13:17:06] Traffic, DC-Ops, ops-eqiad, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244#9703743 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp1112.eqiad.wmnet with OS bullseye
[13:26:55] Traffic, DC-Ops, ops-eqiad, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244#9703759 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp1112.eqiad.wmnet with OS bullseye executed with errors:...
[13:27:19] Traffic, DC-Ops, ops-eqiad, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244#9703762 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp1112.eqiad.wmnet with OS bullseye
[13:55:05] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9703921 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp4052.ulsfo.wmnet with OS bulls...
[14:08:07] Traffic, DC-Ops, ops-eqiad, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244#9703948 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp1112.eqiad.wmnet with OS bullseye completed: - cp1112 (...
[15:28:02] Traffic, Patch-For-Review: Update ncredir HTTPS ciphersuite to conform to WMF standards - https://phabricator.wikimedia.org/T362197#9704286 (BCornwall) In progress→Resolved
[15:52:34] Domains: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9704446 (Dzahn)
[15:57:38] (LVSRealserverMSS) firing: (4) Unexpected MSS value on 208.80.154.232:443 @ ncredir1002 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqiad&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[16:02:38] (LVSRealserverMSS) resolved: (4) Unexpected MSS value on 208.80.154.232:443 @ ncredir1002 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqiad&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[16:19:12] Domains: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9704506 (dcaro) >>! In T362253#9704443, @Dzahn wrote: > @dcaro All I know is that you probably have to contact @CRoslof / legal directly. > > I don't think just tagging it with https://phabr...
[16:35:37] Domains, Traffic: [toolforge] transfer/adopt toolsbeta.org domain to the foundation - https://phabricator.wikimedia.org/T362253#9704641 (Dzahn) ACK, so the rough order I would have expected would be: - talk to legal to transfer domain ownership - once whois data shows MarkMonitor and/or WMF as the owner...
[17:38:06] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9704892 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp1115.eqiad.wmnet with OS b...
[17:48:16] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9704944 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp1115.eqiad.wmnet with OS bulls...
[17:48:32] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9704945 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp1115.eqiad.wmnet with OS b...
[18:26:49] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9705040 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp1115.eqiad.wmnet with OS bulls...
[18:28:42] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9705041 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host cp3071.esams.wmnet with OS bullseye
[18:40:29] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9705080 (ssingh) For `cp1115` that we tried today, I downgraded the BIOS, NIC and iDRAC firmwares, to match what we have in esams, whe...
[19:20:45] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9705132 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host cp3071.esams.wmnet with OS bullseye completed: - cp3071 (**PASS**)...
[19:24:06] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9705133 (ssingh)