[10:17:41] some backups for graphite and matomo are failing, godog and btullis hopefully you can have some time in the afternoon or tomorrow so together with me we can have a deeper look at why (not urgent at the moment)
[10:18:09] jynus: Thanks. Will do.
[10:27:44] jynus: SGTM
[10:28:44] just let me know when you are available, but not at the moment, as I am busy with something more urgent
[14:32:32] I have added the backup stuff to the SRE doc, won't be adding x2 or schemas, etc
[14:33:30] as in, I won't add that, but maybe someone else wants to
[14:57:31] Doc revert: https://wikitech.wikimedia.org/w/index.php?title=MariaDB&type=revision&diff=2009575&oldid=2009128
[14:59:52] jynus: do you have a pointer for me to start looking into the backup failures?
[15:00:10] thanks, great moment as I just got out from a meeting
[15:00:21] I should start from the logs and then go from there
[15:00:35] and you can help me with the service/date intended to be backed up
[15:00:52] let me check it right now
[15:01:01] SGTM jynus
[15:02:31] Interesting, incremental: graphite1004.eqiad.wmnet-Monthly-1st-Thu-productionEqiad-srv-carbon-whisper-coal.2022-09-03_04.05.01_32 says it is ok
[15:03:08] but this one is failing: graphite1004.eqiad.wmnet-Weekly-Mon-productionEqiad-srv-carbon-whisper-daily
[15:03:13] what's the difference?
[15:03:22] one is monthly and the other is weekly
[15:03:26] that's weird
[15:04:14] ah, they must be different datasets
[15:04:21] let me search the definition
[15:05:37] one backs up /srv/carbon/whisper/coal
[15:05:44] and the other /srv/carbon/whisper/daily
[15:06:05] but if one works and the other not, that is even more weird
[15:07:20] both dirs seem to exist
[15:11:16] godog: I got it
[15:11:46] it is not in a failed status (f), it is in a "Job waiting on File daemon" state (F)
[15:12:19] I hope you pardon my mistake "f" vs "F" code
[15:12:27] which means there is nothing to do, except wait
[15:12:47] I will check if it is the same case for matomo
[15:13:25] it just happened to be more delayed as it is not running daily, like other backups
[15:13:52] jynus: oh okay, that might line up with the bullseye reboots, which would explain
[15:14:22] no, nothing to do with that - I don't see a failure, just got the code wrong myself
[15:14:58] *nod*
[15:15:10] and the other thing that contributed to that is that if there is a scheduled backup and no backups run on the new system, I am too strict and consider the status as "all failures"
[15:15:29] (older backups are still kept, just ignored for monitoring)
[15:16:18] matomo is also ok, just as a weekly backup it will have to wait until Wednesday for its run
[15:16:48] so sorry for calling you, normally it is something more subtle, in this case it is just (my) monitoring not being super clear
[15:37:07] that's fine no worries jynus, better be safe
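
Note on the "f" vs "F" mix-up discussed above: the two job status codes differ only by case, so any check on them has to be case-sensitive. The following is a minimal sketch, not the actual monitoring code. The function names `classify` and `needs_attention` are hypothetical, and only the two codes mentioned in the chat are mapped; every other code is reported as unknown rather than guessed.

```python
# Hypothetical sketch: classify single-letter job status codes.
# Only the two codes mentioned in the chat are handled; matching is
# deliberately case-sensitive because "f" and "F" mean different things.

def classify(code: str) -> str:
    """Return 'failure', 'pending', or 'unknown' for a job status code."""
    if code == "f":
        # Per the chat: failed status.
        return "failure"
    if code == "F":
        # Per the chat: "Job waiting on File daemon", nothing to do but wait.
        return "pending"
    return "unknown"


def needs_attention(codes) -> bool:
    """True only if at least one job is in a real failure state."""
    return any(classify(code) == "failure" for code in codes)


if __name__ == "__main__":
    # A job still waiting on the File daemon should not raise an alert.
    print(needs_attention(["F"]))
    # A genuinely failed job should.
    print(needs_attention(["F", "f"]))
```

Lower-casing or upper-casing the code before comparing would collapse the two states into one, which is the same confusion described in the chat.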