[03:11:21] as much as I love the new automation, can we make them emit fewer !logs in -operations?
[05:39:29] FYI: Last snapshot for s7 at eqiad (db1171) taken on 2022-04-13 01:06:20 is 1029 GiB, but the previous one was 1085 GiB, a change of -5.1 %
[05:39:58] That's the reduction from arwiki.flaggedtemplates
[05:40:05] It goes from 120G to around 30 or so
[05:40:11] nice
[05:40:54] btw, when I send those I am not too worried, they are just warnings I share in case they are not expected
[05:41:12] yeah, you did well
[05:42:00] e.g. not getting an answer would be the norm for me- only if they are weird should we do more research
[06:48:12] jynus: I was wondering if we can have them in grafana
[06:48:41] not on public grafana
[06:49:26] but I have as a goal having something easily searchable by the end of the Q- maybe including a prometheus scraper
[06:49:35] legoktm: I thought about it a lot and honestly couldn't find a way. The only thing is to make downtime emit one log but that's out of my hands
[06:50:24] jynus: it doesn't have to be precise if privacy is the concern
[06:50:36] Amir1: we could maybe negotiate having on public grafana the largest wikis, or a subset of the metrics
[06:50:43] (with security)
[06:51:03] jynus: why not section?
[06:51:14] and the raw full data under nda password
[06:51:23] That's already aggregated
[06:51:36] ah, you mean the total sizes?
[06:51:42] yeah, that should be fine
[06:51:59] I am talking about the per-table sizes, which I also have
[06:52:20] Yeah for that I query
[06:52:28] and on small wikis having it too up to date would hint when an edit is done
[06:55:00] Yup. That'd be a privacy issue but e.g. "s3 was 1085 GB and now it's 1019 GB" would be really valuable
[06:55:48] yes, that is the plan- create a private dashboard with the full data and select a subset of those metrics to export them to prometheus
[06:56:22] after security review
[06:56:33] Cool
[06:56:36] I will also ask for your feedback on what metrics would be useful
[06:56:46] Sure
[06:59:50] now, I wonder if we should ask obs. team if it would be a good idea to have a private (as in, not public) prometheus instance
[07:00:27] that way I could submit the full data there, you could submit production db data and then compare it/create reports
[07:01:21] the mysql exporter has a lot of functionality we cannot currently use because it is not PII but still sensitive
[07:01:30] same for backup data
[07:12:21] prometheus is already private (nda/wmf ldap groups), it's just the grafana dashboards that expose some data from it publicly
[07:13:18] taavi: sure- but every person that has grafana edit access can query all of prometheus arbitrarily
[07:13:49] grafana editing is limited to nda'd users too?
[07:14:24] in any case- you don't have to argue with me, but with security, which were the ones that imposed this limitation
[07:30:23] Amir1: I have finally fixed everything on https://phabricator.wikimedia.org/T304626
[07:30:56] marostegui: thank you so much
[08:33:30] oh, fsck it, the handy "object with a filename that isn't latin1-compatible" I've been using for testing is marked for deletion from commons /o\
[08:34:48] Emperor: which file?
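The snapshot warnings quoted earlier boil down to a relative size change, and the plan above is to export such numbers to Prometheus. A minimal sketch of both, assuming invented metric and function names (this is not the real backup tooling or exporter):

```python
# Hedged sketch: compute the relative size change reported in the
# snapshot warnings, and format a value in Prometheus text exposition
# format as a private dashboard feed might consume it. Names are
# illustrative, not taken from the actual exporter.
def size_change_pct(previous_gib: float, current_gib: float) -> float:
    """Relative size change of the latest snapshot, in percent."""
    return (current_gib - previous_gib) / previous_gib * 100.0

def exposition_line(section: str, site: str, size_gib: float) -> str:
    # Prometheus text format: metric_name{label="value",...} value
    return f'backup_snapshot_size_gib{{section="{section}",site="{site}"}} {size_gib}'

# The s7 example from the log: 1085 GiB -> 1029 GiB
print(round(size_change_pct(1085, 1029), 1))  # -5.2
print(exposition_line("s7", "eqiad", 1029))
```

(The bot rounds slightly differently, reporting -5.1 %; the arithmetic is the same.)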
[08:35:31] https://commons.wikimedia.org/wiki/File:%E5%8F%B0%E7%81%A3%E7%BE%A4%E5%B1%B1%E6%97%97.svg (it may well be that the flagging for deletion for copyright violation is fair; but it was a useful test object, and now I'll have to find another)
[08:36:41] Anyhow, I am moving towards understanding why our wmf rewrite middleware doesn't work on bullseye
[08:40:09] TL;DR> "swob", the WSGI-alike used inside swift, takes the WSGI path and un-url-encodes it; the resulting byte sequence is decoded into a string using latin1, which is fine if all you want to do is pass it around and then turn it back into bytes later. But we do a bunch of string operations on this mojibaked string, and the result can end up being something that, when latin1-encoded again, isn't a valid utf-8 byte sequence
[08:41:46] if you update the path with a string, then "swob" checks that the byte sequence produced by latin1-encoding that string is valid utf-8, and thus the errors
[08:41:54] Emperor: in case it helps- I have a database of commons uploads that is very searchable
[08:42:26] can get you a list of files with non-latin chars in no time
[08:42:55] Just one is fine, I think (I mean, maybe I do want to test every single one at this point?)
[08:43:26] also, I want their URL-encoded paths really
[08:44:44] But I think the fix, therefore, is to take the path we want to work on, turn it into a string properly (i.e. take the original bytes and decode as if utf-8), operate on that, and then we'll know the new string we try and set path to will be correct
[08:48:20] see https://phabricator.wikimedia.org/P24583
[08:49:01] currently we are operating on the .path_info member directly as if it were a string - and you can see clearly there that means we're going to be trying to edit mojibake, which is not going to work
[08:51:03] Emperor: these are from testwiki https://phabricator.wikimedia.org/P24584
[08:51:28] you can find the url encoding on the wiki itself
[08:52:54] let me know if there is any other way in which I can help
[08:53:23] I know nothing of swift, but I've done my hours with mw file backend code et al
[09:03:07] jynus: it's not obvious to me how to get from objects in your list to e.g. an upload.wikimedia.org URL or a page like https://commons.wikimedia.org/wiki/File:%E5%8F%B0%E7%81%A3%E7%BE%A4%E5%B1%B1%E6%97%97.svg
[09:04:16] so the upload name for testwiki ends up on https://test.wikipedia.org/wiki/File:
[09:04:31] you will find the url-encoded file on that page
[09:04:39] or you can url-encode it yourself
[09:05:03] Ah, yes, I was confused about quoting
[09:05:15] the quoting is the name of the file, sadly
[09:05:23] that is test wiki, it is pretty wild
[09:05:57] I chose the first 50 files from test wiki, so ' appears as part of the name :-/
[09:06:02] :)
[09:06:31] I put the location on swift too, in case it helps
[09:07:12] (they may be duplicates, as I didn't filter well by unique files + there are mw issues with that)
[09:07:57] https://test.wikipedia.org/wiki/File:'''Ensemble_Altstadt_Straubing'''_1344014362555_(%E0%A6%95%E0%A7%8D%E0%A6%B7%E0%A7%81%E0%A6%A6%E0%A7%8D%E0%A6%B0_%E0%A6%B8%E0%A6%82%E0%A6%B8%E0%A7%8D%E0%A6%95%E0%A6%B0%E0%A6%A3).jpeg will do for my nefarious purposes
[09:08:03] to be fair, file names like "onerror="alert('XSS')"a=".png make sense for a test wiki
[09:08:32] so I knew you would like testwiki's varied file names :-)
[09:09:42] plus on testwiki you can upload your own custom files without affecting commons users
[09:12:45] oh, because they have their own path on upload.wikimedia.org?
[09:14:17] I mean it in terms of community/logical organization- test is meant for testing, commons is administrated by the commons community
[09:15:02] everything goes to the same place in the end: your hands (even if different containers)
[09:15:32] https://test.wikipedia.org/wiki/Wikipedia:What_we_do_on_this_wiki
[09:15:54] ^"testing code" would fit in what you are doing :-D
[09:17:00] I should change the integration test (such as it is) to use a non-ASCII path too
[09:46:01] jynus: actually, I do have a question - does mw always use UTF-8 for encoding file paths?
[09:46:47] I tested reading every single file on mw and all are valid utf-8 encoded
[09:46:57] let me show you the restrictions
[09:47:06] even if I didn't trust them at first
[09:48:44] there are some files with NULL or empty name on the database, but I don't think you will find that on swift (they are logical errors)
[09:52:12] here: https://www.mediawiki.org/wiki/Manual:Page_title#Invalid_page_titles
[09:52:44] for once, documentation matched reality ("Titles with an invalid UTF-8 sequence" are invalid titles -> and so invalid file names)
[09:53:12] super, thanks.
[09:55:32] jynus: how easy would it be to get me a prod (rather than test) filename or two which aren't latin1-compatible?
[09:55:47] sure
[09:55:50] commons?
[09:56:15] yes
[09:56:41] (basically I want one like my zh one I was using until I noted it was flagged for deletion)
[09:57:34] BTW, I think you will like my media backups dashboard too :-)
[09:58:41] https://commons.wikimedia.org/wiki/File:!!!%E4%B8%89%E7%94%B0%E9%80%9A%E3%82%8A.JPG
[09:58:48] ^been there since 2006
[09:59:39] just the ticket, thank you :)
[09:59:49] Now I "just" need to fix the code...
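The encoding dance Emperor describes above can be sketched in a few lines of Python. This is an illustration of the latin-1 round-trip and of url-encoding a non-latin1 filename, not the actual rewrite.py patch; the path layout and function names are invented:

```python
# Illustration of the round-trip described above, NOT the real
# rewrite.py fix. swob hands middleware a str that is really UTF-8
# bytes decoded as latin-1 ("mojibake"); to edit it safely, undo that
# decoding first, operate on real text, then re-encode the same way.
from urllib.parse import quote

def mojibake_to_text(wsgi_path: str) -> str:
    """Recover real text from a latin-1-mojibaked WSGI path."""
    return wsgi_path.encode("latin-1").decode("utf-8")

def text_to_mojibake(path: str) -> str:
    """Turn edited text back into the latin-1-as-str form swob expects."""
    return path.encode("utf-8").decode("latin-1")

# The zh test file mentioned above; the container path here is invented.
name = "台灣群山旗.svg"
wsgi_form = text_to_mojibake("/v1/AUTH_x/container/" + name)
edited = mojibake_to_text(wsgi_form).replace("container", "thumbs")
new_wsgi = text_to_mojibake(edited)
# swob's check: latin-1-encoding the str must yield valid UTF-8
new_wsgi.encode("latin-1").decode("utf-8")  # does not raise

# URL-encoding the name gives the form seen in the commons URLs above:
print(quote(name))  # %E5%8F%B0%E7%81%A3%E7%BE%A4%E5%B1%B1%E6%97%97.svg
```

Editing the mojibake directly instead (the current behaviour) can splice the underlying UTF-8 byte sequences apart, which is exactly when swob's validity check raises.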
[10:03:08] https://commons.wikimedia.org/wiki/File:!!Abajo_los_solteros!!_-_fantas%C3%ADa_c%C3%B3mico-l%C3%ADrica_gubernamental_en_siete_cuadros_y_un_real_decreto,_en_prosa_(IA_abajolossolteros476riba).pdf
[10:04:24] although that one may be encodable in latin1
[10:05:52] here is another script: https://commons.wikimedia.org/wiki/File:%D0%86%D0%B2%D0%B0%D0%BD%D0%BE-%D0%A4%D1%80%D0%B0%D0%BD%D0%BA%D1%96%D0%B2%D1%81%D1%8C%D0%BA,_%D0%93%D0%BE%D1%82%D0%B5%D0%BB%D1%8C_%D0%90%D0%B2%D1%81%D1%82%D1%80%D1%96%D1%8F,_%D0%B2%D1%83%D0%BB._%D0%A1%D1%96%D1%87%D0%BE%D0%B2%D0%B8%D1%85_%D0%A1%D1%82%D1%80%D1%96%D0%BB%D1%8C%D1%86%D1%96%D0%B2_12.jpg
[10:12:00] I think that's enough to be getting on with, thanks :)
[13:01:23] I just realised that MIXED as binlog format isn't shown in orchestrator; it only shows row or statement, and hosts with MIXED show nothing
[13:01:59] ah no
[13:01:59] it is not shown because they do not have binlogs enabled
[13:02:31] marostegui: https://github.com/openark/orchestrator/issues/853
[13:02:48] ah, haha Simon had the same issue!
[13:02:53] (I worked with Simon for 2 years)
[13:03:06] lol
[13:03:12] small world™
[13:03:33] And in fact, I set up that orchestrator infra with him :p
[13:14:02] marostegui: ok to use the new templatelinks column in s4 now? According to the ticket it should be
[13:14:10] yes, it is all done
[13:14:11] but better safe than sorry
[13:14:15] Awesome
[13:14:25] tl_target_id right?
[13:17:12] I think my rewrite.py fix DTRT :)
[13:23:54] marostegui: yup
[13:24:03] yep, it is everywhere in s4
[13:24:07] marostegui: Is Thu next week taken for switchover?
[13:24:12] I saw Tue is s7
[13:24:19] I don't have anything on thu
[13:24:22] awesome
[13:24:25] Tue I am doing s7
[13:24:28] okay to do s8?
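The orchestrator quirk above (MIXED replicas showing no binlog format) resolves to binary logging being disabled on those hosts. A toy illustration of that display logic, with invented function and field names:

```python
# Toy illustration of the orchestrator behaviour discussed above:
# a replica's binlog format is only meaningful (and displayed) when
# binary logging is enabled, so hosts without binlogs show a blank
# rather than "MIXED". All names here are invented for illustration.
def displayed_binlog_format(log_bin: bool, binlog_format: str) -> str:
    """What a UI like orchestrator can sensibly show for a host."""
    return binlog_format if log_bin else ""  # blank when no binlogs

assert displayed_binlog_format(True, "ROW") == "ROW"
assert displayed_binlog_format(True, "MIXED") == "MIXED"
# Binlogs disabled: nothing shown, whatever the format setting says
assert displayed_binlog_format(False, "MIXED") == ""
```

On the server itself, `SELECT @@log_bin, @@binlog_format;` is the standard MariaDB way to check both settings directly.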
[13:24:32] Sure, up to you
[13:24:37] coool
[13:26:16] I'll make an invite and send it soon
[13:54:50] PROBLEM - MariaDB sustained replica lag on m1 on db1117 is CRITICAL: 44 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[13:55:40] PROBLEM - MariaDB sustained replica lag on m1 on db2078 is CRITICAL: 42.4 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2078&var-port=13321
[13:57:00] etherpad, probably?
[13:57:02] RECOVERY - MariaDB sustained replica lag on m1 on db1117 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[13:57:50] RECOVERY - MariaDB sustained replica lag on m1 on db2078 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2078&var-port=13321
[16:04:00] go.dog is away, anyone else want to look at my rewrite fix? https://gerrit.wikimedia.org/r/c/operations/puppet/+/779900
[16:32:48] Emperor: how urgent is that patch needed?
[16:33:02] I don't have specific context but if you need some python review I can surely have a look
[16:37:57] why did the 2.7 test need a rewrite too?
[16:52:21] volans: the two failing cases needed removing, and I wanted to check that my test cases worked on current-prod
[16:54:29] urgent> I'm not going to put a py3 frontend into prod before Easter now
[16:55:02] that was my next question, is this the first time we use it with py3?
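The alert and recovery messages above encode their thresholds inline ("44 ge 2", "(C)2 ge (W)1 ge 0"). A sketch of those semantics, with the threshold values copied from the log but the function name invented:

```python
# Sketch of the alert semantics in the messages above: MariaDB
# sustained replica lag is CRITICAL when lag >= 2 s and WARNING when
# lag >= 1 s, per the "(C)2 ge (W)1 ge 0" recovery line. The function
# name is illustrative, not the actual check implementation.
CRIT_SECONDS, WARN_SECONDS = 2, 1

def lag_state(lag_seconds: float) -> str:
    if lag_seconds >= CRIT_SECONDS:
        return "CRITICAL"
    if lag_seconds >= WARN_SECONDS:
        return "WARNING"
    return "OK"

assert lag_state(44) == "CRITICAL"    # db1117 alerted at 44 s
assert lag_state(42.4) == "CRITICAL"  # db2078 alerted at 42.4 s
assert lag_state(0) == "OK"           # both recoveries at 0
```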
[16:56:50] it's the first time wmf/rewrite.py has been used on py3
[16:57:11] we have other py3 swift frontends, but they don't use the rewrite stuff
[16:57:23] got it
[17:12:59] Emperor: {done}, with the caveat of lack of context :D
[17:15:55] thanks :)
[20:47:18] legoktm: and another idea is to have specialized SALs. I feel we have out-scaled a central SAL. Like one for edge caches, one for network, one for db maint, etc.