[05:46:45] I have fixed (SystemdUnitFailed) firing: (3) prometheus-mysqld-exporter.service Failed on db2194:9100 which has been firing since yesterday apparently
[12:11:14] FYI, I'm setting up a secondary cumin host in eqiad (cumin1002), which uses Puppet 7. Then all cookbooks can run from cumin[12]002 and cumin1001 can be kept around for DB tasks as long as needed
[12:11:33] will send a patch to enable cumin1002 as a DB admin host later
[12:16:20] moritzm: cool, we'll need your input on the tracking task too, and see how we can progress
[12:17:47] ack
[12:22:41] thanks moritzm!
[13:33:02] jynus marostegui
[13:33:14] https://github.com/wikimedia/operations-puppet/blob/production/manifests/site.pp#L917-L920 sorry for the early ping, do you know if I can decommission this host?
[13:33:25] (db1133)
[13:33:30] Not yet
[13:34:09] ack thanks
[13:37:29] the puppet run on cumin1002 fails since dbbackups/transfer.pp expects a modules/profile/templates/dbbackups/cumin1002.cnf.erb
[13:37:43] should that for now simply be identical with the one for cumin1001?
[13:38:46] jynus: ^
[14:22:38] either remove the profile or, if you have to add it, add it empty
[14:23:38] adding data there will cause the backup to run in parallel on both hosts at the same time (duplicate backups), which will likely put down the bs
[14:23:41] *dbs
[14:24:16] ok, I'll add an empty template for now and when cumin1001 gets phased out for good, we can move this over
[14:25:15] you can also set profile::dbbackups::transfer::enabled to false
[14:25:30] ah, that's better, I'll do that instead
[14:25:31] but it defaults to true
[14:26:13] moritzm: context- https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/profile/manifests/dbbackups/transfer.pp
[14:27:01] "If the host has it enabled but the file hasn't been setup, puppet run will fail" - the failing of puppet is on purpose to signal there is an issue with configuration
[14:28:21] perfect! quick sanity check for https://gerrit.wikimedia.org/r/c/operations/puppet/+/983198/ if you have a moment
[14:30:20] wmfbackups was prepared for puppet 7, so they can be migrated at any time- but it is important to setup the same backups in parallel at the same time
[14:30:29] *to not
[15:31:09] has anyone seen these errors on the new R450s? https://phabricator.wikimedia.org/P54440
[15:32:58] I'm not sure what to make of them yet, except that I guess that's the SAS controller, and that it is generally unhappy
[15:35:38] hrmm... that might not have been the best example, either. Most seem to start with `mpt3sas_cm0: sending diag reset !!`
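
Note on the 14:22-14:27 dbbackups exchange: the behaviour described there (transfer enabled by default, puppet failing on purpose when the per-host template is missing, and disabling via profile::dbbackups::transfer::enabled) can be sketched roughly as below. This is only an illustrative Python model, not the actual Puppet code in transfer.pp; `HostConfig`, `TransferConfigError`, `evaluate_transfer` and the `has_template` flag are hypothetical names made up for the sketch.

```python
from dataclasses import dataclass


class TransferConfigError(Exception):
    """Hypothetical error: transfer is enabled but no per-host template exists."""


@dataclass
class HostConfig:
    name: str
    # Mirrors "it defaults to true" from the log.
    transfer_enabled: bool = True
    # Stand-in for "modules/profile/templates/dbbackups/<host>.cnf.erb exists".
    has_template: bool = False


def evaluate_transfer(host: HostConfig) -> str:
    """Return what a (modelled) puppet run would do for this host."""
    if not host.transfer_enabled:
        # Disabled hosts never touch the template, so nothing can fail.
        return "skipped"
    if not host.has_template:
        # Deliberate failure to signal a configuration problem.
        raise TransferConfigError(
            f"{host.name}: transfer enabled but {host.name}.cnf.erb is missing"
        )
    return "configured"


if __name__ == "__main__":
    print(evaluate_transfer(HostConfig("cumin1001", has_template=True)))
    print(evaluate_transfer(HostConfig("cumin1002", transfer_enabled=False)))
    try:
        evaluate_transfer(HostConfig("cumin1002"))
    except TransferConfigError as err:
        print(f"puppet run would fail: {err}")
```

In this model, setting the flag to false on cumin1002 avoids both the failed run and the risk of a second host running the same backups in parallel.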