
Wikimedia IRC logs browser - #wikimedia-releng

Displaying 154 items:

2022-07-22 00:10:20 <wikibugs> (PS1) Brian Wolff: Make composer-php80 run on gate-and-submit for MW core [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463)
2022-07-22 00:12:59 <wikibugs> (CR) CI reject: [V: -1] Make composer-php80 run on gate-and-submit for MW core [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463) (owner: Brian Wolff)
2022-07-22 00:16:02 <wikibugs> (CR) Reedy: Make composer-php80 run on gate-and-submit for MW core (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463) (owner: Brian Wolff)
2022-07-22 00:17:45 <wikibugs> (PS2) Brian Wolff: Make composer-php80 run on gate-and-submit for MW core [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463)
2022-07-22 00:35:07 <wikibugs> (PS3) Brian Wolff: Make composer-php80 run on gate-and-submit for MW core [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463)
2022-07-22 01:38:16 <wikibugs> Continuous-Integration-Config, PHP 8.0 support, Patch-For-Review: Make PHP 8.0 voting on MW master - https://phabricator.wikimedia.org/T300463 (Bawolff) Well looks like it does not pass on 1.35 yet.
2022-07-22 08:38:42 <wikibugs> (CR) Jaime Nuche: [C: +2] deploy-promote: Terminate line after jenkins has merged the patch [tools/scap] - https://gerrit.wikimedia.org/r/816015 (owner: Ahmon Dancy)
2022-07-22 08:43:09 <wikibugs> (Merged) jenkins-bot: deploy-promote: Terminate line after jenkins has merged the patch [tools/scap] - https://gerrit.wikimedia.org/r/816015 (owner: Ahmon Dancy)
2022-07-22 08:56:55 <wikibugs> Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, MediaWiki-SettingsBuilder, ci-test-error: beta-update-databases-eqiad failing due to invalid MediaWiki configuration parameters - https://phabricator.wikimedia.org/T313128 (daniel) >>! In T313128#8096152, @RhinosF1 wrote: > We sp...
2022-07-22 08:58:55 <wikibugs> Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, MediaWiki-SettingsBuilder, ci-test-error: beta-update-databases-eqiad failing due to invalid MediaWiki configuration parameters - https://phabricator.wikimedia.org/T313128 (RhinosF1) There is a copy of the code somewhere that can...
2022-07-22 09:06:13 <wikibugs> (PS1) Jaime Nuche: deploy-promote: abort process if version check fails [tools/scap] - https://gerrit.wikimedia.org/r/816113
2022-07-22 09:23:13 <wikibugs> (PS1) Hashar: POST events asynchronously [software/gerrit/plugins/events-wikimedia] - https://gerrit.wikimedia.org/r/816115
2022-07-22 09:23:46 <wikibugs> Deployments, Release-Engineering-Team (Doing), SRE, bacula, Parsoid (Tracking): Accidental removal of some files under /srv/deployment on deploy1002 - https://phabricator.wikimedia.org/T307349 (jcrespo) @elukey We didn't receive any bad reports so far, should we be good to close this task or...
2022-07-22 09:32:53 <wikibugs> Deployments, Release-Engineering-Team (Doing), SRE, bacula, Parsoid (Tracking): Accidental removal of some files under /srv/deployment on deploy1002 - https://phabricator.wikimedia.org/T307349 (RhinosF1) T309162 is still actionable from the incident.
2022-07-22 10:02:17 <wikibugs> Release-Engineering-Team (The Decommission Mission 💀), SRE, SRE-Access-Requests, Patch-For-Review: Add dancy to phabricator-roots - https://phabricator.wikimedia.org/T313551 (Vgutierrez) p:Triage→Medium
2022-07-22 12:11:07 <wikibugs> GitLab (CI & Job Runners), serviceops, serviceops-collab, Patch-For-Review: DNS/networking not working on Trusted Runners - https://phabricator.wikimedia.org/T311241 (Jelto) p:High→Medium >>! In T311241#8091812, @dduvall wrote: > > The primary reason for the custom docker network is to h...
2022-07-22 12:34:36 <wikibugs> GitLab (Project Migration), Release-Engineering-Team: Create new GitLab project group: Community Resources Team - https://phabricator.wikimedia.org/T313593 (Osnard)
2022-07-22 12:36:32 <wikibugs> Phabricator (Upstream), Release-Engineering-Team, Upstream, User-brennen: Uploaded files via the drag-and-drop are defaulting to private-access - https://phabricator.wikimedia.org/T310833 (Esanders) This also happens when editing comments.
2022-07-22 13:48:29 <wikibugs> (PS1) Hashar: build: manage dependencies with rules_jvm_external [software/gerrit/plugins/events-wikimedia] - https://gerrit.wikimedia.org/r/816172
2022-07-22 13:48:44 <wikibugs> (PS2) Hashar: build: manage dependencies with rules_jvm_external [software/gerrit/plugins/events-wikimedia] - https://gerrit.wikimedia.org/r/816172
2022-07-22 14:07:38 <wikibugs> (PS3) Hashar: build: manage dependencies with rules_jvm_external [software/gerrit/plugins/events-wikimedia] - https://gerrit.wikimedia.org/r/816172
2022-07-22 14:40:13 <wikibugs> (CR) Ahmon Dancy: [C: +2] deploy-promote: abort process if version check fails [tools/scap] - https://gerrit.wikimedia.org/r/816113 (owner: Jaime Nuche)
2022-07-22 14:46:59 <wikibugs> (Merged) jenkins-bot: deploy-promote: abort process if version check fails [tools/scap] - https://gerrit.wikimedia.org/r/816113 (owner: Jaime Nuche)
2022-07-22 14:51:28 <wikibugs> Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, MediaWiki-SettingsBuilder, ci-test-error: beta-update-databases-eqiad failing due to invalid MediaWiki configuration parameters - https://phabricator.wikimedia.org/T313128 (RhinosF1) > 16:23:04 <hashar> RhinosF1: James_F: we can...
2022-07-22 15:14:54 <wikibugs> Phabricator (Upstream), Release-Engineering-Team, Upstream, User-brennen: Uploaded files via the drag-and-drop are defaulting to private-access - https://phabricator.wikimedia.org/T310833 (DLynch) Granted, my understanding is that "automatically enabling access to files that're in edited-content"...
2022-07-22 15:26:15 <wikibugs> Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, MediaWiki-SettingsBuilder, ci-test-error: beta-update-databases-eqiad failing due to invalid MediaWiki configuration parameters - https://phabricator.wikimedia.org/T313128 (hashar) Validating configuration remembered me of MediaW...
2022-07-22 15:50:53 <wikibugs> (CR) Jforrester: "I don't think it's acceptable for us to have divergent PHP support criteria for vendor and composer jobs for the master branch. Otherwise " [integration/config] - https://gerrit.wikimedia.org/r/816062 (https://phabricator.wikimedia.org/T300463) (owner: Brian Wolff)
2022-07-22 19:38:51 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), serviceops, serviceops-collab: Setup rsync for phab data on disk - https://phabricator.wikimedia.org/T313360 (Dzahn) > (must. save. @mmodell's bash history.) I made a phab1001-home-twentyafterfour.tar.gz so the entire home and...
2022-07-22 19:42:25 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), serviceops, serviceops-collab: Setup rsync for phab data on disk - https://phabricator.wikimedia.org/T313360 (Dzahn) a:Dzahn
2022-07-22 19:45:25 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), serviceops, serviceops-collab: Setup rsync for phab data on disk - https://phabricator.wikimedia.org/T313360 (Dzahn) from syncing data last time back in 2019 https://gerrit.wikimedia.org/r/c/operations/puppet/+/554628
2022-07-22 19:48:35 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (brennen) `scap deploy -v -l 'phab2001.codfw.wmnet'` fails from deploy1002 - ` ... Received disconnect from 10.192.32.147 port 22:2: Too many aut...
2022-07-22 20:11:26 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (Dzahn) 10.64.32.28 is deploy1002 in the logs on phab2001, looking for connections from deploy1002: ` Jul 19 14:23:02 phab2001 sshd[13278]: Con...
2022-07-22 20:15:20 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (Dzahn) root@deploy1002:/home/dzahn# ssh -i /etc/keyholder.d/phabricator scap@phab2001.codfw.wmnet Jul 22 20:12:39 phab2001 sshd[27629]: Failed...
2022-07-22 20:20:59 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (Dzahn) For the scap user it would be: `ssh -i /etc/keyholder.d/scap scap@phab2001.codfw.wmnet`. scap key for scap user. but that one has: Load...
2022-07-22 20:30:32 <mutante> brennen: SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -oIdentitiesOnly=yes -oIdentityFile=/etc/keyholder.d/phabricator phab-deploy@phab2001.codfw.wmnet
2022-07-22 20:30:37 <mutante> this is how it _should_ work
2022-07-22 20:30:53 <mutante> it says so at https://wikitech.wikimedia.org/wiki/Keyholder#Hints
2022-07-22 20:31:01 <mutante> and the user is "phab-deploy"
2022-07-22 20:31:19 <mutante> I only get an 'sign_and_send_pubkey: signing failed: agent refused operation'
2022-07-22 20:31:33 <mutante> but that only happens when the rest is correct, heh
2022-07-22 20:32:55 <mutante> [deploy1002:~] $ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -oIdentitiesOnly=yes -oIdentityFile=/etc/keyholder.d/phabricator phab-deploy@phab2001.codfw.wmnet
2022-07-22 20:32:57 <mutante> Linux phab2001 4.19.0-20-amd64 #1 SMP Debian 4.19.235-1 (2022-03-17) x86_64
2022-07-22 20:33:01 <mutante> ^ works
2022-07-22 20:34:52 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (Dzahn) This is how it actually works, using the AUTH_SOCK from keyholder, and using the correct "phab-deploy" user and not trying it as root: `...
2022-07-22 20:38:23 <wikibugs> Phabricator, Release-Engineering-Team (The Decommission Mission 💀), User-brennen: Deploy Phabricator with scap - https://phabricator.wikimedia.org/T313259 (Dzahn) @brennen Seems to me the issue is it's trying to connect as "scap" but it should use "phab-deploy" user. Then it should work together with...
2022-07-22 20:39:12 <brennen> hmm, yeah, wrong user would make sense, i think - though i don't know why it's not using the one in the config file...
2022-07-22 20:43:10 <mutante> which one, /etc/scap.cfg ?
2022-07-22 20:44:04 <brennen> /srv/deployment/phabricator/deployment/scap/scap.cfg
2022-07-22 20:44:06 <brennen> the one from the repo
2022-07-22 20:44:47 <mutante> i see. yea, that has phab-deploy
2022-07-22 20:47:09 <mutante> "Uses local .scaprc as config for each host in cluster"
2022-07-22 20:47:24 <mutante> but that is just general scap help text
2022-07-22 20:53:40 <hasharAway> brennen: mutante: maybe scap log has some more details?
2022-07-22 20:54:15 <hasharAway> there is something funky which will cause ssh to try every single key instead of the one for the user
2022-07-22 20:54:28 <hasharAway> so it tries each of the keys in the keyholder one after the other
2022-07-22 20:54:38 <hasharAway> until the remote sshd bails out cause there were too many auth failures
2022-07-22 20:55:23 <hasharAway> eg https://logstash.wikimedia.org/app/discover#/doc/0fade920-6712-11eb-8327-370b46f9e7a5/ecs-default-1-1.7.0-5-2022.29?id=bDinJ4IB86RsLKL31MDN
2022-07-22 20:55:55 <hasharAway> ran as phab-deploy@phab2001.codfw.wmnet (correct user?)
2022-07-22 20:56:08 <hasharAway> then it lists a long list of keys
2022-07-22 20:56:38 <hasharAway> it tries the first 6 then the remote bails out
2022-07-22 20:57:23 <brennen> hmm: 20:44:44 Unable to find keyholder key for phab_deploy
2022-07-22 20:57:36 <brennen> ...is it converting phab-deploy to phab_deploy or something?
2022-07-22 20:57:50 <hasharAway> yeah scap has a `get_keyholder_key()` which iirc is being passed the user (so would be phab-deploy)
2022-07-22 20:57:57 <mutante> I also noticed earlier in one place it was underscore and in the other it was -
2022-07-22 20:57:57 <hasharAway> it iterates through the key comment names
2022-07-22 20:58:03 <hasharAway> i think
2022-07-22 20:58:18 <brennen> i just saw something about underscores
2022-07-22 20:58:19 <hasharAway> we had the issue with trainbranchbot
2022-07-22 20:58:22 <brennen> digs through open tabs
2022-07-22 20:58:42 <hasharAway> is that key new? - 2048 SHA256:QpALwrv9ZQnSiC42TDpwfHSHuMxqNgxDv1M7MOP1I30 /etc/keyholder.d/phabricator (RSA)
2022-07-22 20:58:55 <hasharAway> or is that cause you are using a new username?
2022-07-22 20:59:19 <brennen> i haven't changed either
2022-07-22 20:59:26 <mutante> key is not new
2022-07-22 21:01:12 <mutante> /etc/ssh/userkeys/phab-deploy is also same on phab1001 and phab2001
2022-07-22 21:02:11 <mutante> and it also has the same checksum as /etc/keyholder.d/phabricator.pub on deploy1002
2022-07-22 21:02:22 <hasharAway> maybe the scap.cfg needs the name `keyholder_key: phabricator`
2022-07-22 21:02:55 <hasharAway> reading scap code it looks like it checks for the existence of `/etc/keyholder.d/{self.config["ssh_user"]}`
2022-07-22 21:03:05 <hasharAway> which would be `/etc/keyholder.d/phab-deploy`
2022-07-22 21:03:22 <hasharAway> which does not exist
2022-07-22 21:03:24 <mutante> yea, this sounds like a good guess
2022-07-22 21:03:37 <mutante> phabricator vs phab-deploy
2022-07-22 21:03:40 <hasharAway> so maybe on the deployment server manually amend /srv/deployment/phabricator/deployment/scap/scap.cfg
2022-07-22 21:03:41 <hasharAway> and add
2022-07-22 21:03:45 <hasharAway> keyholder_key: phab-deploy
2022-07-22 21:03:53 <hasharAway> ERROR
2022-07-22 21:04:00 <hasharAway> `keyholder_key: phabricator`
2022-07-22 21:04:08 <brennen> tries that
2022-07-22 21:04:10 <hasharAway> $ ls -la /etc/keyholder.d/phabricator
2022-07-22 21:04:10 <hasharAway> -r--r----- 1 root keyholder 1766 Nov 30 2020 /etc/keyholder.d/phabricator
2022-07-22 21:04:19 <hasharAway> I don't know why it would have broken
2022-07-22 21:04:31 <hasharAway> maybe due to some codechange done recently in scap
2022-07-22 21:05:04 <brennen> seems like it might be working
2022-07-22 21:05:14 <hasharAway> I might be the one to blame
2022-07-22 21:05:32 <hasharAway> cause I know close to nothing about scap code and if I know about that get_keyholder_key method it must be that I have altered it recently
2022-07-22 21:05:50 <brennen> that did it - thanks hasharAway!
2022-07-22 21:05:53 <hasharAway> well
2022-07-22 21:05:57 <hasharAway> great :]
2022-07-22 21:05:59 <mutante> :) nice win for Friday afternoon/night
2022-07-22 21:06:08 <hasharAway> Using key: /etc/keyholder.d/phabricator
2022-07-22 21:06:10 <hasharAway> from the scap log
2022-07-22 21:06:54 <hasharAway> while previously we had:
2022-07-22 21:06:57 <hasharAway> `Running remote deploy cmd ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'phabricator/deployment', '-g', 'default', 'fetch', '--refresh-config']`
2022-07-22 21:07:03 <hasharAway> `Unable to find keyholder key for phab_deploy`
2022-07-22 21:07:16 <hasharAway> `['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'phabricator/deployment', '-g', 'default', 'fetch', '--refresh-config'] (ran as phab-deploy@phab2001.codfw.wmnet) `
2022-07-22 21:07:49 <hasharAway> if we can't find a keyholder key using the ssh_user or the keyholder_key config value if it is set
2022-07-22 21:07:56 <hasharAway> then I think scap should abort entirely
2022-07-22 21:08:09 <hasharAway> else it tries every single key from the keyholder (see above logstash link)
2022-07-22 21:08:17 <brennen> right, and then just fails on too many auth attempts
2022-07-22 21:08:23 <hasharAway> and fails unless you deploy with one of the first 6 keys
2022-07-22 21:08:30 <brennen> my guess is that at one time the fallback might have worked because there weren't very many keys to try
2022-07-22 21:08:31 <hasharAway> which sounds like it can be filed as a task
2022-07-22 21:08:34 <mutante> yea, that explains "too many authentication failures"
2022-07-22 21:08:48 <hasharAway> I am sure I have encountered the same issue with jnuche a few weeks ago
2022-07-22 21:08:51 <brennen> i can file a task
2022-07-22 21:08:57 <mutante> there is a max number
2022-07-22 21:09:02 <hasharAway> cause at 11pm there is no way I can figure that out of thin air
2022-07-22 21:09:20 <hasharAway> I bet $7 or a drink that the faulty code would blame me :]
2022-07-22 21:09:25 <brennen> haha
2022-07-22 21:09:44 <hasharAway> for the task you can copy the few lines I have pasted above
2022-07-22 21:09:57 <brennen> what's that line - "debugging is like solving a mystery in which you are simultaneously the detective, the murderer, and the victim"
2022-07-22 21:10:03 <hasharAway> and the ssh debug log showing up the list of keys attempted (that is the message in https://logstash.wikimedia.org/app/discover#/doc/0fade920-6712-11eb-8327-370b46f9e7a5/ecs-default-1-1.7.0-5-2022.29?id=bDinJ4IB86RsLKL31MDN )
2022-07-22 21:10:30 <hasharAway> fun thing
2022-07-22 21:10:42 <hasharAway> when I had regular 1/1 hacking sessions with thcipriani
2022-07-22 21:10:59 <hasharAway> we often resorted to google search to figure out a cryptic fault we had encountered
2022-07-22 21:11:07 <hasharAway> only to find out the first hit is a phabricator task filed a few years ago
2022-07-22 21:11:17 <hasharAway> with PAGES of debugging about it often authored by one of us
2022-07-22 21:11:25 <hasharAway> fun
2022-07-22 21:11:39 <hasharAway> cause years later we encountered the exact same issue and were about to redo the whole debugging
2022-07-22 21:11:57 <hasharAway> but were thankful to have extensively captured the debugging sessions and the solution found a few years back
2022-07-22 21:12:00 <hasharAway> time saver! :]
2022-07-22 21:12:27 <hasharAway> what I am wondering is whether people in a century will still rely on those tales and lore to fix up the future infra
2022-07-22 21:12:49 <hasharAway> or maybe by that time the singularity AI will spout the nonsense we have been writing since January 1st 1970
2022-07-22 21:13:06 <hasharAway> shuts up
2022-07-22 21:15:02 <mutante> the other day I searched for something like "NodeSet syntax wildcard" and the result was a page where v.olans is talking to upstream clustershell project what syntax we could use for host selection in cumin
2022-07-22 21:16:39 <hasharAway> it has a few bugs iirc
2022-07-22 21:17:43 <hasharAway> brennen: so phab can be deployed again isn't it?
2022-07-22 21:19:31 <brennen> hasharAway: i am unblocked in getting phab deploy to work with scap
2022-07-22 21:19:38 <hasharAway> \o/
2022-07-22 21:20:04 <hasharAway> jnuche mentioned moving Phabricator to a docker image and potentially toward k8s
2022-07-22 21:20:22 <hasharAway> it is probably a good thing to do, then that is unrelated to the above or current sprint :-]
2022-07-22 21:22:40 <wikibugs> Release-Engineering-Team, Scap, User-brennen: scap should fail if it can't find a keyholder key using ssh_user or keyholder_key values - https://phabricator.wikimedia.org/T313624 (brennen)
2022-07-22 21:22:44 <mutante> if we ever do that then we should do that with phorge.it
2022-07-22 21:22:56 <wikibugs> Release-Engineering-Team, Scap, User-brennen: scap should fail if it can't find a keyholder key using ssh_user or keyholder_key values - https://phabricator.wikimedia.org/T313624 (brennen) p:Triage→Low
2022-07-22 21:27:13 <hasharAway> ideally we would want to invest some engineering time to assist phorge.it
2022-07-22 21:27:23 <hasharAway> or get involved in the community effort
2022-07-22 21:27:56 <hasharAway> anyway I have E_TOO_MANY_IDEAS
2022-07-22 21:30:02 <mutante> we already have our own patches that upstream phab doesn't have. so we need to get those into phorge then
2022-07-22 21:30:14 <mutante> but the benefit is we don't have wmf-form
2022-07-22 21:30:16 <mutante> fork
2022-07-22 21:32:04 <hasharAway> yeah forking has a price
2022-07-22 21:32:18 <hasharAway> I am super happy to be able to deploy Gerrit straight from the upstream release
2022-07-22 21:32:34 <hasharAway> and would love to achieve that for plugins as well
2022-07-22 21:44:50 <wikibugs> GitLab (Project Migration), Release-Engineering-Team (The Decommission Mission 💀), Striker, Tools: Figure out workflow for programatically adding GitLab users - https://phabricator.wikimedia.org/T313366 (demon) a:demon
2022-07-22 21:48:06 <p858snake> maybe it's time to review our custom patches and see if there are any that we could potentially ditch
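
[Editor's note] The behavior proposed in the 21:07-21:08 exchange and filed as T313624 (fail outright when no keyholder key can be found from keyholder_key or ssh_user, rather than letting ssh try every key in the agent) could look roughly like the sketch below. This is not scap's actual code. The names `resolve_keyholder_key` and `KeyholderKeyNotFound` and the `key_dir` parameter are hypothetical; the only details taken from the log are the /etc/keyholder.d/ directory and the two config names.

```python
from pathlib import Path


class KeyholderKeyNotFound(Exception):
    """Hypothetical error: no key file matches the configured name."""


def resolve_keyholder_key(config: dict, key_dir: str = "/etc/keyholder.d") -> Path:
    """Return the key file to use, or raise instead of falling back.

    Assumption: an explicit keyholder_key wins over ssh_user, as the
    log suggests ("using the ssh_user or the keyholder_key config value
    if it is set").
    """
    name = config.get("keyholder_key") or config.get("ssh_user")
    if not name:
        raise KeyholderKeyNotFound("neither keyholder_key nor ssh_user is set")

    candidate = Path(key_dir) / name
    if not candidate.is_file():
        # Abort here: without a specific key, ssh would offer every key
        # held by the agent until the remote sshd rejects the connection.
        raise KeyholderKeyNotFound(f"no key file at {candidate}")
    return candidate
```

With the configuration from the log, `{"ssh_user": "phab-deploy"}` alone would raise because /etc/keyholder.d/phab-deploy does not exist, while adding `"keyholder_key": "phabricator"` would resolve to /etc/keyholder.d/phabricator.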

This page is generated from SQL logs, you can also download static txt files from here