[05:22:04] I am going to try again to clean gtid_domain_id on db_inventory
[06:11:51] looks like I got it right this time
[06:14:07] I am going to go for m5
[07:03:04] I broke it, then fixed it again, and then broke it again
[07:03:07] This is so painful
[07:03:25] I am going to need to reclone codfw entirely
[07:11:15] you should coordinate with Amir if ever tried in mediawiki, I think the chronology protector makes some assumptions re:gtid
[07:11:32] yeah, there's lots of work there
[07:11:43] I really want to disable the pt-heartbeat on the secondary dc
[07:11:47] I need to talk to amir about it
[07:23:25] ok all fixed and codfw recloned
[07:45:52] I am rebooting dbproxy2004 (m5 codfw proxy) which is in a very weird state after all these issues
[07:45:56] and not detecting any backend
[07:48:37] Ah, I know what it is
[07:50:04] fixed
[13:11:46] it only took me 4 hours to understand what went wrong XD
[13:13:29] fun times :)
[14:25:42] Emperor: if swift returns a 400, does it log the underlying reason somewhere (and if so, where)?
[14:25:50] Emperor: for example: https://logstash.wikimedia.org/app/discover#/doc/0fade920-6712-11eb-8327-370b46f9e7a5/ecs-k8s-1-1.11.0-6-2023.21?id=7LGvTIgBs53OSt3dsZkq
[15:01:19] urandom: swift's logging isn't very helpful. You might find something by grepping for the relevant timestamp in the frontends' proxy-access.log and server.log
[15:01:43] urandom: that's presumably thanos, so at least it's not so many frontends to go grepping in
[15:02:03] it doesn't log remotely?
[15:02:25] I don't think we ship swift logs anywhere, no
[15:05:28] [if you would like more help/advice/heckling do shout]
[15:20:31] urandom re: T330693. I wonder if the issue is that the new bucket name contains '_'.
[15:20:32] T330693: Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693
[15:20:55] might be an invalid character for the s3 protocol https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
[15:21:12] investigating
[15:21:15] ^ ottomata
[15:25:51] urandom ottomata replacing `_` with `-` did the trick, the checkpoints are stored as expected
[15:26:34] gmodena: ha, I was just coming here to suggest the same
[15:28:20] it works for swift (I guess that's how you create the container)
[15:28:27] s/create/created/
[15:30:18] urandom indeed. I created the bucket with `swift post`
[15:30:36] urandom thanks for looking into this, and apologies for the noise.
[15:30:46] s/bucket/container
[15:30:54] gmodena: no worries
[15:33:49] in s3 it's a bucket ;-)
[15:53:42] right, it was a container that he created, one that did not work as a bucket :)
[16:43:17] i guess a bucket is a container but a container is not necessarily a bucket :p
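
Note on the container/bucket mismatch discussed above: a name that swift accepts for a container can still be rejected when accessed through the s3 API, because s3 bucket names do not allow `_`. The sketch below is a rough pre-flight check, not what anyone in the channel ran. It covers only a simplified subset of the AWS bucket naming rules linked above (length, allowed characters, start/end characters, adjacent periods, IP-like names). The function names and the example name `example_checkpoints` are hypothetical.

```python
import re

# Simplified subset of the AWS general-purpose bucket naming rules:
# 3-63 characters, lowercase letters, digits, dots and hyphens only,
# starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address are not allowed either.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_s3_bucket_name(name: str) -> bool:
    """Return True if `name` passes the simplified checks above."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    if ".." in name:  # two adjacent periods are not allowed
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True


def suggest_bucket_name(name: str) -> str:
    """Hypothetical helper: apply the fix used in the chat (`_` -> `-`)."""
    return name.lower().replace("_", "-")


if __name__ == "__main__":
    candidate = "example_checkpoints"  # hypothetical container name
    print(candidate, is_valid_s3_bucket_name(candidate))
    fixed = suggest_bucket_name(candidate)
    print(fixed, is_valid_s3_bucket_name(fixed))
```

Running a check like this before `swift post` would flag the underscore early, instead of surfacing later as a 400 from the s3 endpoint.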