[00:11:26] !log tools.sal Rotated elasticsearch password (T324637)
[00:11:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sal/SAL
[00:44:44] !log toolhub Update demo server to cbc780
[00:44:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolhub/SAL
[03:16:25] !log wm-bot locally applied fixes to stop the apache error log from filling up with spam
[03:16:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wm-bot/SAL
[12:48:51] !log paws bump pywikibot version fea0a1a38c8d3a5e2cd29f8fa854664dd7cb6643 T324404
[12:48:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[12:48:54] T324404: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T324404
[15:56:24] does anyone have the cycles to review https://gerrit.wikimedia.org/r/c/operations/puppet/+/867646 ? It is failing PCC, but I think that's just because labstore1007.wikimedia.org is listed as a host and supposedly it is deprecated? See also https://gerrit.wikimedia.org/r/c/operations/puppet/+/832543, which has been reverted?
[16:04:44] inflatador: I'm in meetings for a while, but feel free to ping me in a couple hours if I don't get to it before then
[16:06:42] andrewbogott: ACK, thanks!
[20:00:59] taavi: are you around to give Southparkfan and me a bit of help with some acme/cert/TLS problems? I'm out of my depth, but spf doesn't have access to the host in question
[20:01:11] (which is syslog-server-audit02.cloudinfra.eqiad1.wikimedia.cloud)
[20:02:28] what's up?
[20:03:07] hi, I found a file called "hieradata/codfw/profile/openstack/codfw1dev/cloudgw.yaml" because it lists the contint servers. Looks to me like that is what allows Cloud VPS to connect to prod CI. So far so good. But is it expected that this is only "codfw1dev" and not eqiad? And that the content has both codfw and eqiad hosts inside?
[20:03:35] I am just wondering from the perspective of someone who is replacing the contint server
[20:04:55] I just found there is also a cloudgw.yaml for eqiad, but that one does not list CI servers. Related change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/867675
[20:05:10] taavi: I'll let Southparkfan elucidate, but in theory every VM should be talking to rsyslogd on that server and instead complains about invalid certs
[20:07:27] Is there a concept of deploy accounts/keys on toolforge? E.g. if I wanted my CI to deploy to toolforge, but I didn't want to put my personal private key in the pipeline configuration.
[20:08:24] kindrobot: not at the moment :/
[20:08:52] the current recommended method is pull-to-deploy: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Auto-update_a_tool_from_GitHub
[20:09:08] mutante: I will try to look in a bit. But generally no codfw1dev config is likely to matter for your purposes
[20:09:28] taavi: one moment, looks like andrew and I have found a fix
[20:09:41] andrewbogott: how about I just add you as gerrit reviewer and there is no rush to it. take your time. thank you
[20:09:47] ok
[20:09:52] thanks, done
[20:10:38] mutante: the eqiad1 file is hieradata/eqiad/profile/openstack/eqiad1/cloudgw.yaml, and it does list contint2001
[20:11:28] Thanks SantiComposite
[20:12:35] Though I imagine that won't work for me, because it's a Rust tool that has to compile and then restart the webservice.
[20:12:47] Hm...
[20:13:03] andrewbogott: at least a manual `openssl s_client` works and is able to verify the certificate.. where are you seeing issues?
[20:13:05] taavi: oh, eqiad1 vs eqiad? thank you!
[20:13:30] I will amend. you focus on your cert issue, thanks guys
[20:13:33] taavi: it only just started working. Confused by the many different certs that acme-chief makes...
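[Editor's note: the manual `openssl s_client` check and the later conclusion that clients can simply use the system trust store (/etc/ssl/certs/ca-certificates.crt) can be sketched in Python. This is an illustrative sketch, not the actual configuration used; the host and port in the commented-out section are placeholders.]

```python
# Sketch: how a syslog client could validate a Let's Encrypt server cert
# against the system trust store, mirroring a manual check like
#   openssl s_client -connect HOST:PORT -CAfile /etc/ssl/certs/ca-certificates.crt
import ssl

# create_default_context() loads the platform trust store
# (ca-certificates on Debian) and enables full verification.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # server cert must validate
print(ctx.check_hostname)                    # SANs are checked too

# To probe a live endpoint (network access required; host/port are
# placeholders, 6514 is the standard syslog-over-TLS port):
# import socket
# with socket.create_connection(("example.org", 6514)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.org") as tls:
#         print(tls.getpeercert()["subject"])
```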
[20:15:23] k8s init containers would be a good solution to "webservice needs a build step", but I couldn't get them to work on Toolforge when I tried a year ago or so
[20:18:24] Oh, interesting. I was also thinking about a smol gitlab runner as a container on toolforge, but I can't quite noodle my way around how to get that to pull, compile, and restart the webservice.
[20:19:28] taavi: any idea if acme-chief distributes the root certificate as a dedicated file, instead of as part of .chained.crt?
[20:20:22] Southparkfan: yes, .chain.crt
[20:20:50] can you paste an example?
[20:21:50] https://phabricator.wikimedia.org/P42689
[20:22:09] thanks
[20:22:12] that's the LE root and intermediate
[20:22:34] yep, R3 + ISRG Root X1 it seems
[20:22:45] whereas .chained.crt contains leaf + R3 + ISRG Root X1
[20:23:01] It distributes the intermediate CAs
[20:23:03] taavi: for context, we're wondering what CA to point clients at to validate LE certs. Currently using /etc/ssl/certs/ca-certificates.crt
[20:23:05] Not the root one
[20:23:17] andrewbogott: ca-certificates should be fine..
[20:23:21] At least, I think Southparkfan is referring to https://gerrit.wikimedia.org/r/c/operations/puppet/+/866628/4/hieradata/cloud/eqiad1.yaml#292
[20:25:37] I cannot find any example of software using acme certificates that has a separate config setting for the CA
[20:25:55] rsyslog is an exception
[20:26:45] and setting .chained.crt as both cafile and certfile 'works', but feels stupid
[20:27:30] what does it need cafile for? that sounds like something that would be used for client cert verification, which we don't do
[20:27:55] vgutierrez: what are the intermediate and root certificates here?
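[Editor's note: the file-naming distinction above (acme-chief's `.chained.crt` = leaf + chain certs, `.chain.crt` = chain certs only) is easy to check by counting PEM blocks in each bundle. A minimal sketch; the sample data below is a placeholder standing in for real PEM contents, not actual certificates.]

```python
# Split a PEM bundle into individual certificate blocks, so you can
# compare e.g. .chained.crt (leaf + chain) with .chain.crt (chain only).
import re

def split_pem(bundle: str) -> list[str]:
    """Return each BEGIN/END CERTIFICATE block found in `bundle`."""
    return re.findall(
        r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
        bundle,
        re.DOTALL,
    )

# Placeholder contents standing in for real base64 certificate data:
def fake_cert(name: str) -> str:
    return "-----BEGIN CERTIFICATE-----\n%s\n-----END CERTIFICATE-----" % name

chained = "\n".join([fake_cert("LEAF"), fake_cert("R3"), fake_cert("ISRG-ROOT-X1")])
chain = "\n".join([fake_cert("R3"), fake_cert("ISRG-ROOT-X1")])

print(len(split_pem(chained)))  # 3 blocks: leaf plus two chain certs
print(len(split_pem(chain)))    # 2 blocks: chain certs only
```

(In the default LE chain, ISRG Root X1 appears as a chain cert even though it is a root, because it is cross-signed by DST Root CA X3, which is what the later part of the discussion untangles.)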
https://letsencrypt.org/certificates/ gives me information, but at some point I fell into the rabbit hole of the X3 expiration thing
[20:27:58] * andrewbogott tries it without
[20:29:05] acme-chief provides the intermediate as .chain.crt
[20:29:23] The alt chain is also available if needed as .alt.chain.crt
[20:29:36] it seems to work with $DefaultNetstreamDriverCAFile /etc/acmecerts/rsyslog/live/ec-prime256v1.chained.crt removed from 10-receiver.conf
[20:29:53] so probably that setting is ignored
[20:30:35] that's good to know
[20:31:12] vgutierrez: both R3 and X1 are considered intermediate certificates?
[20:32:44] X1 is the root certificate
[20:32:46] R3 is the intermediate one
[20:33:02] And ISRG Root X1 is the root CA
[20:33:04] so .chain.crt actually contains intermediate AND root?
[20:33:08] Nope
[20:33:13] Just the intermediate
[20:33:32] A server isn't required to send the root CA cert during the TLS handshake
[20:33:52] exactly, I'd expect the client to verify the intermediate against its own trust store
[20:36:26] that's right
[20:36:40] And it should work unless you have a pretty old client involved
[20:37:36] ok, so finally I have found a minor reference in the terrible rsyslog documentation confirming that DefaultNetstreamDriverCAFile is indeed only used for client verification, supporting andrewbogott's finding
[20:38:19] So for clarity we could configure it with /dev/null or similar
[20:38:53] but we shouldn't forget that syslog servers which also act as syslog clients have it defined twice
[20:41:20] as long as certificate validation won't be used, we can keep it at /etc/ssl/certs/ca-certificates.crt (for server verification) and call it a day, I guess
[20:44:20] yeah, harmless if weird
[20:46:21] Southparkfan: want me to write the patch for certfile or do you have it in progress?
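[Editor's note: pulling the findings above together, a TLS receiver config along these lines would match what was tested. This is a sketch using rsyslog's legacy directives; only the certfile path and the absence of $DefaultNetstreamDriverCAFile come from the discussion, the key file path, port, and remaining directives are illustrative assumptions.]

```
# 10-receiver.conf (sketch)
# TLS-terminating syslog receiver. Note: no $DefaultNetstreamDriverCAFile --
# rsyslog only uses it to verify *client* certificates, which is not done here.
$DefaultNetstreamDriverCertFile /etc/acmecerts/rsyslog/live/ec-prime256v1.chained.crt
$DefaultNetstreamDriverKeyFile  /etc/acmecerts/rsyslog/live/ec-prime256v1.key  # assumed path
$ModLoad imtcp
$InputTCPServerStreamDriverMode 1         # TLS-only mode
$InputTCPServerStreamDriverAuthMode anon  # no client cert verification
$InputTCPServerRun 6514
```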
[20:49:29] vgutierrez: stupid question perhaps - is ISRG Root X1 distributed as an intermediate for compatibility purposes (relying on DST Root CA X3 for the intermediate <-> root signature check)?
[20:52:14] andrewbogott: trying to clear up some chain-of-trust confusion first :)
[20:53:05] ok. When you're ready... https://gerrit.wikimedia.org/r/c/operations/puppet/+/867709
[20:55:40] did your commit message end up in receiver.pp? :P
[20:56:12] apparently? Fixing
[21:06:40] Southparkfan: not on my computer right now.. but yes.. one of the chains includes it for compatibility purposes.. as it is cross-signed by DST X3
[21:07:30] yes, so technically, for modern clients (that have ISRG Root X1 in the trust store), using alt.chained.crt (only using R3 as intermediate instead of using R3 and ISRG Root X1 as intermediates) should work too, right?
[21:08:34] all clients are >= buster systems
[21:10:48] that makes sense, assuming that ISRG Root X1 is on Debian buster's list of trusted CA certs
[21:10:57] my buster client says it is
[21:11:09] ack :)
[21:11:56] now I understand - I thought that chain.crt (non-alt) contained intermediate AND root, but that depends on the chain of trust
[21:12:04] because the root can be used as an intermediate :P
[21:12:25] ack, then alt.chained.crt can be used server-side cc andrewbogott
[21:13:27] ok, patch updated
[21:13:29] I guess that strictly speaking, if you configure TLS versions properly.. only 1.2 and 1.3.. you are effectively restricting the client base enough to use .alt.chained.crt safely
[21:14:22] in this case.. rsyslog servers and supporting >= buster.. you could even go with just TLS 1
[21:14:31] *TLS 1.3
[21:14:43] sounds fair
[21:14:56] even then I'd prioritise validation of the SANs, though
[21:30:32] taavi, vgutierrez: certificate issues have been fixed, syslog clients are logging now. many thanks for your help!
[21:33:36] Nice :)
[21:34:10] alt.chained.crt did the trick
[21:36:01] is that a usenet group?
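[Editor's note: the two chains discussed above differ only in how a client reaches a trust anchor: the default chain carries ISRG Root X1 cross-signed by DST Root CA X3 (for old clients), while the alt chain stops at R3 and relies on ISRG Root X1 already being in the client's trust store. A toy model of that walk, subject/issuer names only, no real cryptography:]

```python
# Toy chain walk: a client accepts a chain if the issuer links are
# consistent and some cert in the chain is, or is issued by, a
# certificate in the client's trust store. No real crypto here --
# certificates are modelled as (subject, issuer) name pairs.

def validates(chain: list[tuple[str, str]], trust_store: set[str]) -> bool:
    # Each cert must be issued by the next one in the chain...
    for (_, issuer), (parent_subject, _) in zip(chain, chain[1:]):
        if issuer != parent_subject:
            return False
    # ...and the walk must hit the trust store at some point.
    return any(s in trust_store or i in trust_store for s, i in chain)

LEAF = ("syslog-server", "R3")
R3 = ("R3", "ISRG Root X1")
X1_CROSS = ("ISRG Root X1", "DST Root CA X3")  # cross-signed variant

default_chain = [LEAF, R3, X1_CROSS]  # .chained.crt
alt_chain = [LEAF, R3]                # .alt.chained.crt

old_client = {"DST Root CA X3"}   # trust store predating ISRG Root X1
modern_client = {"ISRG Root X1"}  # e.g. Debian buster and later

print(validates(default_chain, old_client))     # True: via DST X3 cross-sign
print(validates(default_chain, modern_client))  # True: X1 is already trusted
print(validates(alt_chain, modern_client))      # True: R3 chains to trusted X1
print(validates(alt_chain, old_client))         # False: no path to DST X3
```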
[21:36:20] I have learned that.... wikimedia.cloud is a real domain name, how LE chains work, you don't want to use rsyslog with gnutls, ca_file is not used for the thing you thought it was used for, and that it wasn't actually the fault of puppet
[21:36:23] hahaha
[21:37:03] ;) usenet group :)
[21:37:46] for my own safety, it's probably wise to never talk about certificates and TLS ever again
[21:37:54] Southparkfan: nice :)
[21:56:13] thanks for the code review andrewbogott!
[22:02:51] I didn't understand your nit as far as where to put in 'clouddumps100x', but happy to revisit if need be
[22:19:17] I think in the patch description you say 'labstore'?
[22:19:28] Which confused me, because labstores barely exist and also aren't dumps :)
[22:58:01] If I wanted a mirror of `dumpmirrorsother` (per https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps), about how much disk space would I need?
[23:10:39] >"Other": 31 TB (Dec 2020). Pageview analytics, CirrusSearch indexes, Wikidata entity dumps and other datasets.
[23:33:56] How feasible do you suppose it would be to break that up into multiple streams? Such as one for Wikibase triple dumps and one for analytics dumps
[23:36:38] You can just do that yourself if you wanted
[23:37:36] Just mirror https://dumps.wikimedia.org/other/wikibase/wikidatawiki/ or https://dumps.wikimedia.org/other/analytics etc
[23:38:24] https://github.com/wikimedia/puppet/blob/023568661222f170f5741ced10791cb34dff1063/modules/profile/templates/dumps/distribution/mirrors/rsyncd.conf.dumps_to_public.erb#L63-L67
[23:40:02] Not sure offhand what the term for that "group" is in rsync parlance... But if you file a task, I think it would be trivial enough
[23:42:39] Basically I would have an rsync parameter (or you would, on your end) saying that I want the "other" dataset, but not those particular subsets. So that way we don't have to create a whole new dataset. Do I have that right?
[23:48:56] I think you can do something like `rsync -avRP dumps.wikimedia.org::dumpmirrorsother/analytics/ /dst`
[23:49:19] (your IP would need to be in the allow list for rsyncd on the dumps server)
[23:49:46] Is being added to the allow list the extent of the coordination required? Once I'm in, I run the rsync on my end?
[23:50:08] Yup
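[Editor's note: the per-subset approach above generalizes to mirroring several subdirectories of the `dumpmirrorsother` module. A sketch that just builds the commands; the subset list and `/srv/mirror` destination are illustrative, and actually running the transfers requires your IP on the dumps server's rsyncd allow list.]

```python
# Build rsync commands for mirroring selected subsets of the "other"
# dumps area, one subdirectory of the dumpmirrorsother module each.
import shlex

MODULE = "dumps.wikimedia.org::dumpmirrorsother"
subsets = ["analytics", "wikibase"]  # illustrative pick, not a full list

commands = [
    ["rsync", "-avRP", f"{MODULE}/{s}/", "/srv/mirror"] for s in subsets
]
for cmd in commands:
    print(shlex.join(cmd))
# To actually transfer, run each command, e.g.:
#   import subprocess
#   subprocess.run(cmd, check=True)
```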