
Wikimedia IRC logs browser - #wikimedia-releng


2025-04-29 08:03:09 <wikibugs> ('PS8) ''Jakob: [DNM] Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 08:03:57 <jakob_WMDE> ^ hashar: bonjour! any chance you could take another look at that? :)
2025-04-29 08:04:17 <hashar> jakob_WMDE: unlikely this week unfortunately. I will see what I can do :)
2025-04-29 08:04:35 <jakob_WMDE> ok, thank you!
2025-04-29 08:04:39 <hashar> I am running the MediaWiki train this week and Thursday is a holiday here
2025-04-29 08:04:57 <hashar> but if it is calm enough hopefully I will have the bandwidth on Friday!
2025-04-29 08:05:01 <wikibugs> ('CR) ''CI reject: [V:''-1] [DNM] Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 08:05:09 <hashar> or maybe it is straightforward and I would just do it we shall see
2025-04-29 08:06:25 <hashar> jakob_WMDE: have you managed to build the image locally and test it?
2025-04-29 08:06:31 <hashar> s/test/try/ it?
2025-04-29 08:06:55 <jakob_WMDE> yes :)
2025-04-29 08:06:59 <hashar> ahh good
2025-04-29 08:07:19 <hashar> which is like 80% of the work
2025-04-29 08:07:46 <hashar> so I guess I can manage to multitask the remaining 20% once you get CI Verified +1 and you are happy with it
2025-04-29 08:08:01 <hashar> if that does not affect the rest and is behind a feature flag, I guess it is an easy review
2025-04-29 08:08:20 <hashar> the rebuild is automated (I'll just run ./fab deploy_docker) from the root of the repo
2025-04-29 08:08:29 <hashar> so yeah keep pinging me :]
2025-04-29 08:09:37 <jakob_WMDE> ok, that sounds promising, thanks!
2025-04-29 09:05:15 <wikibugs> ('PS9) ''Jakob: Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 09:18:07 <hashar> jakob_WMDE: I did the review of the Quibble patch https://gerrit.wikimedia.org/r/c/integration/quibble/+/1137857
2025-04-29 09:18:08 <hashar> :)
2025-04-29 09:18:17 <hashar> tldr, drop distutils :)
2025-04-29 09:18:19 <hashar> rest is fine
2025-04-29 09:18:33 <hashar> oh there might be a need to use maintenance/run.php from MediaWiki core
2025-04-29 09:18:52 <hashar> but I don't know whether it can find the maintenance script from an extension. I haven't checked :/
2025-04-29 09:24:37 <jakob_WMDE> thanks for the review! I think I tried using run.php for the CirrusSearch maintenance scripts and it didn't work out of the box, but I can take another look
2025-04-29 09:28:07 <hashar> jakob_WMDE: Quibble guarantees the extensions are cloned under $IP/extensions/*
2025-04-29 09:28:24 <hashar> but it could theoretically be invoked outside of the path and end up messing things up
2025-04-29 09:28:34 <hashar> whereas maintenance/run.php would resolve the paths for us
2025-04-29 09:28:35 <hashar> but then
2025-04-29 09:28:50 <hashar> don't waste too much time on it
2025-04-29 09:29:04 <hashar> if it does not work and there is no quick/easy fix, just keep it and we will use it as-is
2025-04-29 09:29:08 <hashar> distutils should be gone though
2025-04-29 09:29:36 <hashar> and in another change maybe we can roll our own copy of strtobool, but I think it is sufficient to just check for the env variable existence in order to enable the feature
2025-04-29 10:54:23 <wikibugs> ('PS9) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 10:58:58 <jakob_WMDE> hashar: I removed the use of distutils, but failed to get getMaintenanceScript() to find the CirrusSearch maintenance script :(
2025-04-29 11:55:13 <wikibugs> ('CR) ''Jakob: Add OpenSearch (''1 comment) [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 12:25:23 <wikibugs> ('CR) ''Hashar: Add OpenSearch (''1 comment) [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 12:25:49 <hashar> jakob_WMDE: so at least `php maintenance/run.php 'CirrusSearch\Maintenance\UpdateSearchIndexConfig'` works for me
2025-04-29 12:26:16 <hashar> I gotta debug getMaintenanceScript now :b
2025-04-29 12:27:37 <jakob_WMDE> hehe :)
2025-04-29 12:28:15 <jakob_WMDE> I'll change it to using run.php without getMaintenanceScript for now
2025-04-29 12:36:10 <wikibugs> 'Continuous-Integration-Infrastructure, ''Testing Support, ''ci-test-error (WMF-deployed Build Failure), ''MW-1.44-notes (1.44.0-wmf.23; 2025-04-01), ''Patch-For-Review: Selenium timeouts can cause the job to remain stuck until the build times out - https://phabricator.wikimedia.org/T389536#10776250 (''z...'
2025-04-29 12:36:30 <wikibugs> ('PS10) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 12:39:23 <wikibugs> ('CR) ''Jakob: Add OpenSearch (''2 comments) [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 12:59:14 <hashar> >>> subprocess.call(quibble.mediawiki.maintenance.getMaintenanceScript('CirrusSearch:UpdateSearchIndexConfig'))
2025-04-29 12:59:15 <hashar> Updating cluster ...
2025-04-29 12:59:15 <hashar> indexing namespaces...
2025-04-29 12:59:16 <hashar> hmm
2025-04-29 12:59:34 <hashar> jakob_WMDE: maybe because you have MW_INSTALL_PATH set to something else
2025-04-29 13:11:27 <hashar> Could not open input file: maintenance/\CirrusSearch\Maintenance\UpdateSearchIndexConfig.php
2025-04-29 13:11:30 <hashar> that is what I got
2025-04-29 13:11:38 <hashar> I am gonna fix that getMaintenanceScript()
2025-04-29 13:12:12 <jakob_WMDE> hashar: yeah, that's what I've been getting too
2025-04-29 13:12:16 <hashar> cool
2025-04-29 13:12:42 <hashar> so yeah that getMaintenanceScript does not support a class name
2025-04-29 13:14:05 <jakob_WMDE> hmm? but isn't it only telling us "Could not open input file: maintenance/\CirrusSearch\Maintenance\UpdateSearchIndexConfig.php" because it thinks there is no run.php and then tries to open it like a php file?
2025-04-29 13:14:18 <jakob_WMDE> I think it would support the class name if it had found run.php
2025-04-29 13:14:33 <hashar> yup it should ideally :b
2025-04-29 13:14:39 <hashar> else:
2025-04-29 13:14:39 <hashar> if ext == '':
2025-04-29 13:14:39 <hashar> cmd = ['php', 'maintenance/%s.php' % basename]
2025-04-29 13:15:02 <hashar> so when it is given `UpdateSearchIndexConfig`, os.path.splitext() gives no extension
2025-04-29 13:15:05 <hashar> the code enters that branch
2025-04-29 13:15:05 <jakob_WMDE> exactly, that's what ends up happening =)
2025-04-29 13:15:22 <hashar> and ends up with a funky maintenance/UpdateSearchIndexConfig.php
2025-04-29 13:15:28 <hashar> so that function is broken in that regard
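For context, the branch hashar quoted above can be sketched like this. This is a simplified reconstruction of the failure mode, not Quibble's actual getMaintenanceScript(); the function name and fallback are assumptions based on the snippet pasted in the log.

```python
import os.path


def get_maintenance_cmd(script):
    """Simplified sketch of the branch quoted above.

    Names without a file extension are assumed to be core maintenance
    scripts, so a fully qualified class name such as
    'CirrusSearch\\Maintenance\\UpdateSearchIndexConfig' has no '.php'
    extension and falls through to a nonexistent maintenance/<name>.php
    path, which is exactly the "Could not open input file" error above.
    """
    basename, ext = os.path.splitext(script)
    if ext == '':
        # class names land here too, producing a bogus file path
        return ['php', 'maintenance/%s.php' % basename]
    return ['php', script]
```

A plain script name like `update` resolves as intended, while a namespaced class name gets mangled into a `maintenance/...php` path that does not exist on disk.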
2025-04-29 13:16:21 <hashar> sorry for the misleading review!
2025-04-29 13:16:54 <hashar> then there is the who-depends-on-whom problem
2025-04-29 13:16:55 <jakob_WMDE> no worries! thanks for the review!
2025-04-29 13:17:05 <hashar> and maybe we need elasticsearch to be added to the image first
2025-04-29 13:17:22 <hashar> so this way we can have your change https://gerrit.wikimedia.org/r/c/integration/quibble/+/1137857 tested with the image
2025-04-29 13:17:25 <hashar> but well
2025-04-29 13:17:28 <hashar> I am lazy these days
2025-04-29 13:17:52 <jakob_WMDE> :D
2025-04-29 13:18:20 <hashar> for the docker image / supervisord config, did the trick `autostart = %(ENV_QUIBBLE_OPENSEARCH)s` work?
2025-04-29 13:19:52 <jakob_WMDE> yes, that worked!
2025-04-29 13:20:26 <jakob_WMDE> although I'm realizing that I haven't tested that since I changed the quibble code not to use strtobool D:
2025-04-29 13:20:48 <jakob_WMDE> i.e. I don't know what `autostart = %(ENV_QUIBBLE_OPENSEARCH)s` does when QUIBBLE_OPENSEARCH is empty/unset
2025-04-29 13:21:07 <jakob_WMDE> tries
2025-04-29 13:21:42 <hashar> ah yeah
2025-04-29 13:21:45 <hashar> what a mess
2025-04-29 13:21:45 <hashar> :/
2025-04-29 13:21:54 <hashar> I get why you went with strtobool now
2025-04-29 13:21:58 <hashar> face palms
2025-04-29 13:22:03 <hashar> palm fae
2025-04-29 13:22:04 <hashar> ce
2025-04-29 13:22:07 <hashar> whatever
2025-04-29 13:25:33 <jakob_WMDE> "Error: not a valid boolean value: '' in section 'program:opensearch'" :(
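The failing supervisord stanza presumably looks something like the following. This is a hypothetical reconstruction: the program name comes from the error message above, the command path is a guess. supervisord substitutes the literal value of the environment variable, so when QUIBBLE_OPENSEARCH is set but empty, `autostart` receives an empty string, which supervisord's boolean parser rejects (it only accepts values like true/false/1/0).

```ini
; hypothetical fragment of the image's supervisord.conf
[program:opensearch]
; command path is an assumption for illustration
command = /usr/share/opensearch/bin/opensearch
; %(ENV_QUIBBLE_OPENSEARCH)s expands to the raw env value, so this
; only works when the variable is exactly "true" or "false";
; an empty value yields "not a valid boolean value: ''"
autostart = %(ENV_QUIBBLE_OPENSEARCH)s
```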
2025-04-29 13:28:26 <hashar> :(
2025-04-29 13:28:27 <hashar> sorry
2025-04-29 13:29:16 <hashar> so my other review is wrong and we need to import strtobool from distutils
2025-04-29 13:31:36 <wikibugs> ('open) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 13:32:11 <jakob_WMDE> hashar: so the distutils dependency would be ok for now?
2025-04-29 13:32:34 <hashar> https://gist.github.com/hashar/8c08622dae4edfb8c07fb2c7d380f13f
2025-04-29 13:32:35 <hashar> :)
2025-04-29 13:33:03 <hashar> I apologize for the back and forth
2025-04-29 13:33:14 <hashar> let me add that one
2025-04-29 13:33:18 <hashar> or well now
2025-04-29 13:33:21 <hashar> it can be done in your change
2025-04-29 13:33:28 <hashar> you can stick that in quibble.utils
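A helper of the sort the gist presumably contains could look like this. distutils was deprecated by PEP 632 and removed in Python 3.12, so Quibble needs its own copy. This is a hedged sketch mirroring the documented distutils.util.strtobool semantics, not necessarily the code from hashar's gist (the original returned 1/0; returning True/False is a deliberate tweak here):

```python
def strtobool(value):
    """Minimal stand-in for distutils.util.strtobool.

    Accepts the same strings the distutils version documented
    (y/yes/t/true/on/1 and n/no/f/false/off/0, case-insensitive)
    and raises ValueError for anything else, including ''.
    """
    value = value.strip().lower()
    if value in ('y', 'yes', 't', 'true', 'on', '1'):
        return True
    if value in ('n', 'no', 'f', 'false', 'off', '0'):
        return False
    raise ValueError('not a valid boolean value: %r' % value)
```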
2025-04-29 13:33:52 <jakob_WMDE> ok! no worries :)
2025-04-29 13:33:54 <hashar> then rollback to before my misleading comment
2025-04-29 13:34:23 <hashar> and maybe leave a comment in Quibble that CI uses QUIBBLE_OPENSEARCH to set autostart=false in supervisor
2025-04-29 13:34:46 <hashar> that would prevent me from refactoring to an empty string :b
2025-04-29 13:36:04 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 13:37:03 <jakob_WMDE> next to where it's set to "false" in the dockerfile or where would be the best place for that comment?
2025-04-29 13:38:22 <hashar> well there is already a comment in supervisord.conf
2025-04-29 13:38:26 <hashar> that is probably sufficient
2025-04-29 13:38:39 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 13:39:51 <hashar> jakob_WMDE: I am polishing up the Quibble image and will build it
2025-04-29 13:39:55 <hashar> then switch the Jenkins job to them
2025-04-29 13:40:40 <jakob_WMDE> ok, thanks!
2025-04-29 13:41:52 <wikibugs> ('CR) ''Hashar: Include OpenSearch in quibble (''1 comment) [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 13:42:02 <hashar> i never know whether I am picky
2025-04-29 13:42:13 <hashar> or have too many ideas surging and overflowing the people I review
2025-04-29 13:42:15 <hashar> or somewhere in between
2025-04-29 13:42:24 <hashar> or that some hamster in my head is spurting random ideas
2025-04-29 13:42:25 <hashar> :b
2025-04-29 13:42:40 <hashar> or it is because I should really stop multitasking
2025-04-29 13:43:59 <wikibugs> ('PS10) ''Hashar: Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 13:44:08 <hashar> I tweaked the changelog files
2025-04-29 13:44:34 <hashar> it is still using Quibble 1.13.0
2025-04-29 13:44:47 <wikibugs> ('CR) ''Hashar: [C:''+2] Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 13:45:28 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 13:46:30 <wikibugs> ('Merged) ''jenkins-bot: Include OpenSearch in quibble [integration/config] - ''https://gerrit.wikimedia.org/r/1137108 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 13:47:30 <wikibugs> ('PS11) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 13:48:58 <hashar> I am building the images
2025-04-29 13:50:07 <jakob_WMDE> yay, thanks!
2025-04-29 13:50:09 <hashar> MOUAHAHAH
2025-04-29 13:50:22 <hashar> I am confused
2025-04-29 13:50:25 <jakob_WMDE> and thanks for the speedy reviews! :D
2025-04-29 13:50:29 <jakob_WMDE> oh no, what happened
2025-04-29 13:52:40 <wikibugs> ('CR) ''CI reject: [V:''-1] Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 13:52:45 <hashar> so we do not use that quibble-bullseye image
2025-04-29 13:53:00 <hashar> but we use the docker-registry.wikimedia.org/releng/quibble-buster-php74
2025-04-29 13:53:03 <hashar> which is based on Buster
2025-04-29 13:53:06 <hashar> and surely should no longer be used
2025-04-29 13:53:15 <hashar> and maybe really we should drop php7.4 eventually
2025-04-29 13:53:26 <hashar> I mixed up buster/bullseye/bookworm
2025-04-29 13:53:28 <hashar> :/
2025-04-29 13:53:29 <hashar> anyway
2025-04-29 13:55:35 <jakob_WMDE> oh... sorry, I should've checked that, too :|
2025-04-29 13:56:08 <hashar> it is 100% CI fault
2025-04-29 13:56:10 <hashar> it is messy
2025-04-29 14:03:02 <wikibugs> ('PS1) ''Hashar: Add job to test Quibble with OpenSearch [integration/config] - ''https://gerrit.wikimedia.org/r/1139866'
2025-04-29 14:03:43 <wikibugs> ('PS2) ''Hashar: Add job to test Quibble with OpenSearch [integration/config] - ''https://gerrit.wikimedia.org/r/1139866 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 14:04:12 <wikibugs> ('PS1) ''Hashar: ci: add script to test OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1139867'
2025-04-29 14:04:45 <hashar> jakob_WMDE: I am adding a new CI job to integration/quibble. It uses the image you have prepared (I have finished building it) and invokes utils/ci-opensearch.sh
2025-04-29 14:05:04 <hashar> https://gerrit.wikimedia.org/r/c/integration/quibble/+/1139867/1/utils/ci-opensearch.sh
2025-04-29 14:05:13 <jakob_WMDE> nice, thanks!
2025-04-29 14:06:09 <wikibugs> ('CR) ''Hashar: [C:''+2] Add job to test Quibble with OpenSearch [integration/config] - ''https://gerrit.wikimedia.org/r/1139866 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 14:08:02 <wikibugs> ('Merged) ''jenkins-bot: Add job to test Quibble with OpenSearch [integration/config] - ''https://gerrit.wikimedia.org/r/1139866 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 14:10:55 <wikibugs> ('PS12) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 14:21:11 <jakob_WMDE> ugh, "TimeoutError: Could not connect to port 9200 after 20 seconds" :/
2025-04-29 14:21:35 <jakob_WMDE> but I think the fact that it didn't exit with a bad status before that means that it's just slow?
2025-04-29 14:23:19 <wikibugs> ('PS1) ''Hashar: zuul: set QUIBBLE_OPENSEARCH for Quibble opensearch job [integration/config] - ''https://gerrit.wikimedia.org/r/1139870 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 14:23:27 <hashar> jakob_WMDE: ^:)
2025-04-29 14:23:50 <hashar> that is to make CI set QUIBBLE_OPENSEARCH=true
2025-04-29 14:23:54 <hashar> on that fullrun opensearch job
2025-04-29 14:24:01 <hashar> that needs to happen when supervisord starts
2025-04-29 14:24:50 <hashar> that should be the correct one
2025-04-29 14:24:55 <jakob_WMDE> I'm confused. it looks like it was already trying to start in https://integration.wikimedia.org/ci/job/integration-quibble-fullrun-opensearch-php74/2/console
2025-04-29 14:25:06 <hashar> oh
2025-04-29 14:25:20 <wikibugs> ('Abandoned) ''Hashar: zuul: set QUIBBLE_OPENSEARCH for Quibble opensearch job [integration/config] - ''https://gerrit.wikimedia.org/r/1139870 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 14:25:28 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 14:26:34 <hashar> could it be listening on another port?
2025-04-29 14:26:41 <hashar> or maybe it takes more than 20 seconds to start
2025-04-29 14:26:42 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 14:27:11 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 14:27:16 <jakob_WMDE> pretty sure the port is correct
2025-04-29 14:27:26 <wikibugs> ('CR) ''CI reject: [V:''-1] Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 14:28:13 <jakob_WMDE> taking more than 20s could be. it took longer than 10s on my laptop
2025-04-29 14:32:39 <wikibugs> ('update) ''jnuche: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487)'
2025-04-29 14:35:17 <hashar> well you can try raising it
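The "Could not connect to port 9200 after 20 seconds" error above comes from a wait loop that polls the service port before continuing. A minimal sketch of that kind of wait, with the timeout as the knob to raise, might look like this. This is an illustrative stand-in, not Quibble's actual _tcp_wait implementation:

```python
import socket
import time


def wait_for_port(host, port, timeout=20):
    """Poll until a TCP connection to (host, port) succeeds.

    Raises TimeoutError if the port is still unreachable after
    `timeout` seconds, mirroring the error seen in the CI log.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # connect succeeds as soon as something is listening
            with socket.create_connection((host, port), timeout=1):
                return
        except OSError:
            time.sleep(0.5)
    raise TimeoutError(
        'Could not connect to port %d after %d seconds' % (port, timeout))
```

Raising the timeout only helps if the service eventually does come up on that host/port; connecting to `127.0.0.1` instead of `localhost` (as discussed later in the log) avoids surprises when `localhost` resolves to an IPv6 address the service is not bound to.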
2025-04-29 14:35:39 <hashar> also my child change https://gerrit.wikimedia.org/r/c/integration/quibble/+/1139867 should be squashed into your change
2025-04-29 14:39:23 <wikibugs> ('PS13) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 14:43:28 <wikibugs> ('PS14) ''Jakob: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691)'
2025-04-29 14:50:02 <hashar> jakob_WMDE: ah
2025-04-29 14:50:15 <hashar> the job uses buster-php74:1.13.0-s1
2025-04-29 14:51:12 <jakob_WMDE> oh, and we still only have it in the bullseye image?
2025-04-29 14:51:15 <jakob_WMDE> that would explain it :)
2025-04-29 14:51:34 <hashar> I screwed it up
2025-04-29 14:58:00 <hashar> oh
2025-04-29 14:58:05 <hashar> I think I have found the issue
2025-04-29 14:59:33 <wikibugs> ('CR) ''CI reject: [V:''-1] Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 14:59:59 <jakob_WMDE> :O
2025-04-29 15:00:02 <jakob_WMDE> what is it?
2025-04-29 15:00:10 <wikibugs> ('PS1) ''Hashar: jjb: fix Quibble fullrun always having buster-php74 image [integration/config] - ''https://gerrit.wikimedia.org/r/1139881'
2025-04-29 15:00:58 <wikibugs> 'Release-Engineering-Team, ''Scap: Strange scap error after check_testservers_k8s-1_of_2 after running sync-file - https://phabricator.wikimedia.org/T392910 (''sbassett) ''NEW'
2025-04-29 15:01:27 <hashar> jakob_WMDE: the job template was hardcoded with buster-php74
2025-04-29 15:01:34 <hashar> php81 got added later but the hardcoded value was not removed
2025-04-29 15:01:42 <hashar> I went to do the same and bam
2025-04-29 15:01:47 <jakob_WMDE> ah :D
2025-04-29 15:02:45 <hashar> I have updated the job
2025-04-29 15:02:59 <wikibugs> ('CR) ''Hashar: "The job looks good now!" [integration/config] - ''https://gerrit.wikimedia.org/r/1139881 (owner: ''Hashar)'
2025-04-29 15:03:12 <wikibugs> ('CR) ''Hashar: [C:''+2] jjb: fix Quibble fullrun always having buster-php74 image [integration/config] - ''https://gerrit.wikimedia.org/r/1139881 (owner: ''Hashar)'
2025-04-29 15:03:45 <wikibugs> ('CR) ''Hashar: "recheck after https://gerrit.wikimedia.org/r/c/integration/config/+/1139881"; [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 15:03:56 <hashar> I haven't done this kind of stuff for quite a while
2025-04-29 15:03:58 <hashar> I am rusty
2025-04-29 15:03:58 <wikibugs> ('open) ''dancy: log.py: @version should be "1" [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/779'
2025-04-29 15:04:01 <wikibugs> ('update) ''dancy: log.py: @version should be "1" [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/779'
2025-04-29 15:04:41 <hashar> jakob_WMDE: I am in a meeting then will check a bit the state of mediawiki train
2025-04-29 15:04:47 <hashar> so we can pursue tomorrow
2025-04-29 15:04:47 <wikibugs> ('Merged) ''jenkins-bot: jjb: fix Quibble fullrun always having buster-php74 image [integration/config] - ''https://gerrit.wikimedia.org/r/1139881 (owner: ''Hashar)'
2025-04-29 15:05:10 <jakob_WMDE> sounds good. I also have to sign off now
2025-04-29 15:05:18 <jakob_WMDE> hashar: thanks for all the help! <3
2025-04-29 15:06:31 <wikibugs> ('update) ''dancy: log.py: @version should be "1" [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/779'
2025-04-29 15:10:55 <wikibugs> 'Release-Engineering-Team, ''Scap: Strange scap error after check_testservers_k8s-1_of_2 after running sync-file - https://phabricator.wikimedia.org/T392910#10776917 (''dancy) →''Duplicate dup:''T380958'
2025-04-29 15:10:59 <wikibugs> 'Deployments, ''Release-Engineering-Team (Radar), ''serviceops, ''Wikimedia-production-error: httpb sometimes fails upon deployment with a HTTP 503 - https://phabricator.wikimedia.org/T380958#10776919 (''dancy)'
2025-04-29 15:17:45 <wikibugs> 'Release-Engineering-Team, ''Scap, ''Dumps-Generation: scap needs to be k8s-cluster aware - https://phabricator.wikimedia.org/T388761#10776929 (''Scott_French) @brouberol - This would require changes to scap, specifically the ability to override the set of environments relevant to a particular deployment (r...'
2025-04-29 15:20:07 <wikibugs> ('Restored) ''Hashar: zuul: set QUIBBLE_OPENSEARCH for Quibble opensearch job [integration/config] - ''https://gerrit.wikimedia.org/r/1139870 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 15:21:56 <wikibugs> ('CR) ''Hashar: [C:''+2] "QUIBBLE_OPENSEARCH needs to be set when starting the container since supervisord relies on it to start opensearch and it is the entry poin" [integration/config] - ''https://gerrit.wikimedia.org/r/1139870 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 15:23:25 <wikibugs> ('Merged) ''jenkins-bot: zuul: set QUIBBLE_OPENSEARCH for Quibble opensearch job [integration/config] - ''https://gerrit.wikimedia.org/r/1139870 (https://phabricator.wikimedia.org/T386691) (owner: ''Hashar)'
2025-04-29 15:26:19 <wikibugs> ('CR) ''Hashar: "recheck after having CI to set QUIBBLE_OPENSEARCH before starting the container/supervisord (I7e4ea39c77719eda1ff096ea94789aaa63271597)" [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 15:38:13 <wikibugs> ('open) ''hnowlan: mw-cli:scripts: add case for mwscriptwikiset [repos/releng/release] - ''https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/168 (https://phabricator.wikimedia.org/T392441)'
2025-04-29 16:01:41 <hnowlan> o/ could I get a review on https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/168 please? I don't have merge rights so a second authoritative set of eyes would be nice
2025-04-29 16:03:13 <wikibugs> ('merge) ''dancy: mw-cli:scripts: add case for mwscriptwikiset [repos/releng/release] - ''https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/168 (https://phabricator.wikimedia.org/T392441) (owner: ''hnowlan)'
2025-04-29 16:12:17 <hnowlan> thanks dancy!
2025-04-29 16:25:52 <mutante> hashar: this is kind of for your wish https://gerrit.wikimedia.org/r/c/operations/puppet/+/1137840/7
2025-04-29 16:31:11 <wikibugs> ('PS15) ''Hashar: Add OpenSearch [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 16:53:43 <hashar> mutante: reviewed! :)
2025-04-29 16:53:53 <hashar> I am off for dinner+night etc
2025-04-29 16:58:04 <mutante> oh, hah! good point that I was pirating the ASCII art:)
2025-04-29 16:58:23 <mutante> adding Apache license just became automatic without questioning it
2025-04-29 16:58:37 <wikibugs> ('CR) ''Hashar: "I have changed the _tcp_wait to poke `127.0.0.1` rather than `localhost` and that solved it. I am pretty sure I previously had the issue " [integration/quibble] - ''https://gerrit.wikimedia.org/r/1137857 (https://phabricator.wikimedia.org/T386691) (owner: ''Jakob)'
2025-04-29 16:59:00 <mutante> but adding an "echo" and a shebang makes it a new work :p jk
2025-04-29 17:06:03 <dancy> Don't wanna get sued by a piece of software.
2025-04-29 17:08:16 <dancy> AI laywer
2025-04-29 17:08:20 <dancy> *lawyer
2025-04-29 17:09:55 <mutante> lol, yea. I also feel like we are the first ever to care about the license of the cowsay output but it's true.
2025-04-29 17:10:20 <mutante> also dont want to get into discussion with WMF-internal whether Artistic license is ok with us
2025-04-29 17:23:12 <wmf-insecte> Project mediawiki-core-doxygen build #10047: 'FAILURE in 4 min 47 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen/10047/'
2025-04-29 17:48:04 <wikibugs> 'Continuous-Integration-Infrastructure, ''Developer Productivity: Provide recheck option for failed jobs - https://phabricator.wikimedia.org/T392941 (''Jdlrobson-WMF) ''NEW'
2025-04-29 17:48:58 <wikibugs> 'Continuous-Integration-Infrastructure, ''Developer Productivity: Provide recheck option for only failed jobs - https://phabricator.wikimedia.org/T392941#10777830 (''Jdlrobson-WMF)'
2025-04-29 17:55:05 <wm-bot> I trust: urbanecm!.*@user/urbanecm ('admin'), .*@user/urbanecmbackup/x-3733651 ('admin'), .*@wikimedia/Martin-Urbanec ('admin'),
2025-04-29 17:55:05 <bd808> @trusted
2025-04-29 18:22:09 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945 (''bd808) ''NEW'
2025-04-29 18:23:40 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777944 (''bd808) ''Open→''In progress a:''bd808 https://meta.wikimedia.org/wiki/IRC/Instructions#Instructions_for_channel_ops `lang=irc /join #wikimedia-zuu...'
2025-04-29 18:26:37 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777950 (''bd808) https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/24 `lang=irc [16:57] < bd808> !isspull [16:57] <ircservser...'
2025-04-29 18:28:20 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777951 (''bd808) https://meta.wikimedia.org/wiki/Wm-bot `lang=irc [17:54] < bd808> @add #wikimedia-zuul [17:54] < wm-bot> Attempting to join #wikimedia-zuu...'
2025-04-29 18:29:07 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777953 (''bd808) https://wmopbot.toolforge.org/help `lang=irc [18:04] < bd808> !join #wikimedia-zuul [18:04] < wmopbot> Joined `'
2025-04-29 18:30:14 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777956 (''bd808) https://wikitech.wikimedia.org/wiki/Tool:Stashbot#Joining_a_new_channel `lang=irc [18:11] < wm-bot> !log bd808@tools-bastion-12 tools.stashbot...'
2025-04-29 18:30:32 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for discussion and coordiation - https://phabricator.wikimedia.org/T392945#10777959 (''bd808) ''In progress→''Resolved https://meta.wikimedia.org/w/index.php?title=IRC/Channels&diff=prev&oldid=28635289'
2025-04-29 19:25:29 <urbanecm> bd808: wanna do something with the wm-bot in here?
2025-04-29 19:25:38 <urbanecm> I got pinged by the trusted listing
2025-04-29 19:27:20 <wikibugs> 'Scap (SpiderPig šŸ•øļø), ''Infrastructure-Foundations: Add deployment group users to spiderpig-access ldap - https://phabricator.wikimedia.org/T392958 (''thcipriani) ''NEW'
2025-04-29 19:39:50 <wikibugs> 'Release-Engineering-Team, ''Projects-Cleanup, ''translatewiki.net, ''Essential-Work: Archive the analytics/gobblin-wmf Gerrit repository - https://phabricator.wikimedia.org/T392854#10778237 (''amastilovic) Thank you @thcipriani !'
2025-04-29 19:40:22 <wikibugs> 'Continuous-Integration-Infrastructure (Zuul upgrade): Setup IRC channel for Zuul Upgrade discussion and coordination - https://phabricator.wikimedia.org/T392945#10778241 (''Aklapper)'
2025-04-29 19:42:49 <Daimona> Hey folks! I have a selenium job that is currently hanging: https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-selenium/88019/console. Would someone be willing to gather some debug information from the agent, for T389536?
2025-04-29 19:42:50 <stashbot> T389536: Selenium timeouts can cause the job to remain stuck until the build times out - https://phabricator.wikimedia.org/T389536
2025-04-29 19:43:36 <Daimona> Uhm actually, maybe it isn't stuck. But still, something is wrong with that job, so debug information would help. Chrome logs in particular.
2025-04-29 19:44:44 <Daimona> (And specifically see if we still get the crash observed in https://phabricator.wikimedia.org/T389536#10675707)
2025-04-29 19:49:22 <wikibugs> 'Release-Engineering-Team, ''Projects-Cleanup, ''Essential-Work: Archive the analytics/gobblin-wmf Gerrit repository - https://phabricator.wikimedia.org/T392854#10778251 (''Amire80)'
2025-04-29 19:51:40 <Daimona> Also, https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=All seems to be on fire
2025-04-29 20:10:24 <thcipriani> yes...there are a massive number of ffmpeg processes running on integration-agent-docker-1048
2025-04-29 20:10:36 <thcipriani> load average: 232.20, 234.51, 231.36
2025-04-29 20:14:30 <Daimona> I'm filing a task to document it. The obvious culprit would be the core patch that runs 100x selenium
2025-04-29 20:15:05 <thcipriani> wmf-quibble-selenium-php81 is the job that's running, currently
2025-04-29 20:15:27 <thcipriani> trying to gather more but the box is...hard to use :)
2025-04-29 20:17:02 <wikibugs> 'Continuous-Integration-Infrastructure: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963 (''Daimona) ''NEW'
2025-04-29 20:17:02 <Daimona> Task filed: T392963
2025-04-29 20:17:03 <stashbot> T392963: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963
2025-04-29 20:17:07 <mutante> thcipriani: got root. want me to killall ffmpeg?
2025-04-29 20:17:20 <Daimona> Now I can kill the offending job with a task reference and feel better ;)
2025-04-29 20:18:53 <thcipriani> ah, wait, looks like it was running two jobs, probably for the same patch, one for mediawiki-quibble-selenium-vendor-mysql-php74 as well
2025-04-29 20:19:28 <Daimona> Killed both jobs, let's see.
2025-04-29 20:19:58 <mutante> ffmpeg processes still running..so far
2025-04-29 20:20:16 <thcipriani> well this'd do it :D https://gerrit.wikimedia.org/r/c/mediawiki/core/+/721790/25/package.json
2025-04-29 20:21:06 <Daimona> Normally that'd be fine. But that patch is in conjunction with a wdio version bump which I think breaks the ffmpeg termination logic.
2025-04-29 20:23:10 <wikibugs> 'Continuous-Integration-Infrastructure: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778330 (''Daimona) From `#wikimedia-releng`: ` <thcipriani> yes...there are a massive number of ffmpeg processes running on integration-agent-docker-1048 <t...'
2025-04-29 20:23:44 <Daimona> That didn't do much huh?
2025-04-29 20:24:24 <mutante> it sure is busy and swapping, but it doesn't have a disk space issue and I can use the shell.
2025-04-29 20:24:59 <Daimona> Are there still ffmpeg processes running?
2025-04-29 20:25:14 <mutante> yes. they are still there. kill?
2025-04-29 20:25:23 <thcipriani> could probably kill the docker container
2025-04-29 20:25:41 <Daimona> If there are too many of them, yeah, I'd say kill.
2025-04-29 20:26:42 <mutante> ps aux | grep ffmpeg | wc -l
2025-04-29 20:26:42 <mutante> 59
2025-04-29 20:26:48 <mutante> killall -9 ffmpeg
2025-04-29 20:26:48 <mutante> root@integration-agent-docker-1048:/var/log# ps aux | grep ffmpeg | wc -l
2025-04-29 20:26:51 <mutante> 1
2025-04-29 20:27:12 <Daimona> Great, thanks.
2025-04-29 20:27:20 <mutante> -9 shouldn't be necessary but it was
2025-04-29 20:27:41 <Daimona> I don't understand though: why is a single agent bringing everything down? Isn't there supposed to be any safeguard?
2025-04-29 20:28:19 <mutante> !log integration-agent-docker-1048.integration - killall -9 ffmpeg - T392963
2025-04-29 20:28:26 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
2025-04-29 20:28:26 <stashbot> T392963: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963
2025-04-29 20:29:50 <mutante> that grafana board you linked earlier.. load going down. but still shows that other VMs are down
2025-04-29 20:29:54 <Daimona> Also... I'm getting a 403 for https://integration.wikimedia.org/ci/computer/integration%2Dagent%2Ddocker%2D1048/builds
2025-04-29 20:30:25 <mutante> -1062 - reported as down
2025-04-29 20:30:40 <mutante> -1063 - down
2025-04-29 20:31:00 <Daimona> I seem to recall a Jenkins feature that we disabled and that now responds with 403; it was discussed recently somewhere. Is this it?
2025-04-29 20:31:40 <mutante> aha, we now have 28 instances up. just a minute ago it was only 25 up
2025-04-29 20:32:17 <Daimona> They seem to be recovering, yes.
2025-04-29 20:35:58 <mutante> https://integration.wikimedia.org/ci/computer/ - 0 of 3 executors busy now
2025-04-29 20:35:59 <wikibugs> 'Release-Engineering-Team, ''serviceops: train presync failed - https://phabricator.wikimedia.org/T387823#10778370 (''akosiaris) Change to allow #release-engineering-team members to start train-presync, train-clean and view logs has been merged and deployed.'
2025-04-29 20:37:09 <mutante> Daimona: would it make sense to click rebuild on your original selenium job now?
2025-04-29 20:37:16 <mutante> https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-selenium/88019/console ?
2025-04-29 20:38:35 <Daimona> The patch is already in gate-and-submit, so waiting to be merged... Sooner or later. Maybe later...
2025-04-29 20:39:06 <mutante> ok
2025-04-29 20:39:34 <Daimona> I've aborted https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74/25384/console to help unblock the queue
2025-04-29 20:39:41 <Daimona> Other patches for that change were already failing.
2025-04-29 20:39:42 <bd808> urbanecm: I was just checking the current @trusted list here to see the config. I was setting up things in a new #wikimedia-zuul channel and trying to remember what was common.
2025-04-29 20:41:08 <Daimona> I should've said: I clicked the thingy to abort the job. But it doesn't seem to be responding.
2025-04-29 20:42:02 <Daimona> Alright, it suddenly did. This confirms that calling jerkins out in IRC is surprisingly effective at unblocking stuff.
2025-04-29 20:43:03 <mutante> ;)
2025-04-29 20:43:43 <bd808> It never hurts to tell Jerkins to behave :)
2025-04-29 20:43:58 <mutante> is it weird that this new one failed? https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1139932
2025-04-29 20:44:19 <mutante> FAILURE No change detected against the current configuration.
2025-04-29 20:44:30 <mutante> change looks like it does change configuration
2025-04-29 20:46:43 <Daimona> The file name is the same, so that's probably interpreted as unchanged config?
2025-04-29 20:47:14 <mutante> if there was an IRC command to tell Jenkins to behave it should be something like !Leeeeroy
2025-04-29 20:47:40 <mutante> nod, was just curious and wanted to check if CI works as normal now
2025-04-29 20:47:57 <Daimona> This looks problematic though https://integration.wikimedia.org/ci/job/mwext-php74-phan/92825/console
2025-04-29 20:48:21 <Daimona> Phan was killed due to low memory, but this is from just a few minutes ago.
2025-04-29 20:49:45 <Daimona> agent 1051 still struggling it seems https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=integration-agent-docker-1051&from=now-24h&to=now
2025-04-29 20:50:26 <Daimona> But reported as idle https://integration.wikimedia.org/ci/computer/integration-agent-docker-1051/
2025-04-29 20:50:37 <Daimona> Alright, how many ffmpeg processes running there?
2025-04-29 20:51:38 <Daimona> (Also, LOL for the !Leeeeeroy)
2025-04-29 20:57:27 <Daimona> Also apparently low on memory: 1040, 1041, 1044, 1047, 1051, 1062, 1063, 1064
2025-04-29 20:58:08 <Daimona> And by "low" I mean that the line in the graph is touching the X axis
2025-04-29 20:59:47 <Daimona> May be worth checking them to see if there's a suspicious amount of ffmpeg processes running. There should never ever be more than 10-15 on a single agent at once (and that's already a worst-case scenario).
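[The by-hand check described above (count ffmpeg processes, kill only past a sane ceiling) could be sketched as a small shell helper. The ceiling of 12 and the use of `pgrep`/`killall` are assumptions drawn from this discussion, not an existing script:]

```shell
# Encode the decision being made by hand above: with 3 concurrent
# selenium jobs and 4 wdio threads each, at most ~12 legitimate ffmpeg
# recorders should run per agent (assumed ceiling). More means a leak.
should_kill() {
    count=$1
    threshold=${2:-12}
    if [ "$count" -gt "$threshold" ]; then echo yes; else echo no; fi
}

# On a real agent the count would come from:  count=$(pgrep -c ffmpeg)
# and a "yes" would be followed by e.g.:      killall ffmpeg
```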
2025-04-29 21:03:26 <mutante> back. logging in on 1051 ..fails
2025-04-29 21:03:50 <Daimona> Nice!
2025-04-29 21:03:51 <mutante> my bad. got shell.
2025-04-29 21:03:58 <mutante> yes, lots of ffmpeg
2025-04-29 21:04:16 <mutante> killing them
2025-04-29 21:04:51 <mutante> !log integration-agent-docker-1051.integration - killall -9 ffmpeg - T392963
2025-04-29 21:04:53 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
2025-04-29 21:04:53 <stashbot> T392963: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963
2025-04-29 21:05:24 <Daimona> The incantation seems to have worked: https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=integration-agent-docker-1051&from=now-15m&to=now&viewPanel=40
2025-04-29 21:07:04 <mutante> load average: 33.52, 136.06, 189.46
2025-04-29 21:08:34 <wikibugs> 'Continuous-Integration-Infrastructure: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778428 (''Daimona) FTR, this is being discussed in IRC: https://wm-bot.wmcloud.org/browser/index.php?start=04%2F29%2F2025&end=04%2F29%2F2025&display=%23wikim...'
2025-04-29 21:08:35 <mutante> looks like something restarted it
2025-04-29 21:08:51 <Daimona> how many?
2025-04-29 21:09:05 <mutante> only 24
2025-04-29 21:10:21 <mutante> it's running ffmpeg but seems like that's about half
2025-04-29 21:10:47 <Daimona> But... The agent is idle https://integration.wikimedia.org/ci/computer/integration-agent-docker-1051/
2025-04-29 21:11:36 <Daimona> Maybe killall again? Meanwhile I'm aborting jobs where there's already a failure
2025-04-29 21:11:50 <mutante> ok
2025-04-29 21:12:13 <mutante> done. down to 1.
2025-04-29 21:12:23 <mutante> ehm. 0 :)
2025-04-29 21:13:03 <mutante> the top process is now git
2025-04-29 21:13:13 <mutante> npm ci
2025-04-29 21:13:31 <Daimona> Makes sense, now it's running a real job
2025-04-29 21:14:02 <mutante> yea, looks like normal, no ffmpeg.. instead php, lua and whatnot
2025-04-29 21:14:11 <Daimona> So I'm assuming the other agents I listed above are also flooded by ffmpeg?
2025-04-29 21:14:26 <Daimona> And those would be: 1040, 1041, 1044, 1047, 1062, 1063, 1064
2025-04-29 21:14:54 <mutante> checking
2025-04-29 21:16:45 <mutante> 1040 ā˜‘ļø
2025-04-29 21:17:21 <Daimona> Yep, it went stonks https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=integration-agent-docker-1040&viewPanel=4&from=now-5m&to=now
2025-04-29 21:18:00 <mutante> 1041 ā˜‘ļø
2025-04-29 21:18:35 <wikibugs> 'Phabricator (Upstream), ''Upstream: Modified files not counted in total when attaching files - https://phabricator.wikimedia.org/T380361#10778444 (''valerio.bozzolan) @Mahabarata73 thanks again for your report. Can I ask how have you discovered this problem? Are you a translator? (I think yes) Or, have you...'
2025-04-29 21:19:05 <mutante> 1044 ā˜‘ļø
2025-04-29 21:19:37 <mutante> on each of them: yes, ffmpeg, and killall more than once
2025-04-29 21:19:50 <mutante> they come back though to some extent
2025-04-29 21:19:53 <Daimona> Sigh
2025-04-29 21:20:05 <Daimona> Some of those agents are running actual jobs so that's expected
2025-04-29 21:20:22 <wikibugs> ('open) ''dancy: spiderpig: Send HTTP access log to syslog if use_syslog enabled [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/780'
2025-04-29 21:20:25 <wikibugs> ('update) ''dancy: spiderpig: Send HTTP access log to syslog if use_syslog enabled [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/780'
2025-04-29 21:20:45 <mutante> 1047 - not busy, no action
2025-04-29 21:21:04 <Daimona> But even if an agent happens to be running 3 selenium jobs (3 being our current max concurrency), and each of those has parallel selenium enabled with 4 threads, there should never be more than 12 ffmpeg processes at any given time
2025-04-29 21:21:25 <mutante> ok, good to know 12 is the number
2025-04-29 21:22:12 <Daimona> Sorry, 1047 was fine. It's 1048 that seems busy again
2025-04-29 21:22:25 <mutante> 1062 - was very busy. killed ffmpeg. now 5 processes
2025-04-29 21:22:30 <Daimona> It seems to be choking slowly https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=integration-agent-docker-1048&from=now-1h&to=now&viewPanel=40
2025-04-29 21:23:21 <mutante> 1048 - number of ffmpeg proces = 48. killing
2025-04-29 21:24:01 <mutante> now 4 procs
2025-04-29 21:26:01 <mutante> 1063 - 55 ffmpeg procs - killall'ed
2025-04-29 21:28:23 <mutante> 1064 - 52 ffmpeg procs - killall'ed
2025-04-29 21:28:35 <mutante> that's it? last one was the slowest to even get on
2025-04-29 21:28:58 <wikibugs> ('update) ''dancy: spiderpig: Send HTTP access log to syslog if use_syslog enabled [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/780'
2025-04-29 21:29:26 <Daimona> Thanks! All instances are up now.
2025-04-29 21:29:33 <mutante> great
2025-04-29 21:29:33 <Daimona> Let me go through them one by one again
2025-04-29 21:31:00 <Daimona> Is 1044 ok? ~50% available memory but reportedly idle
2025-04-29 21:31:28 <wikibugs> 'Continuous-Integration-Infrastructure: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778467 (''Dzahn) ` 21:14 < Daimona> So I'm assuming the other agents I listed above are also flooded by ffmpeg? 21:14 < Daimona> And those would be: 1040, 10...'
2025-04-29 21:31:31 <wikibugs> ('update) ''dancy: spiderpig: Send HTTP access log to syslog if use_syslog enabled [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/780'
2025-04-29 21:32:24 <Daimona> 1051 is quite low on memory too, but currently running stuff so will come back to it later.
2025-04-29 21:34:17 <Daimona> Checked all of them and the rest is fine. I would double-check 1044 and 1051 for ffmpeg processes
2025-04-29 21:34:24 <mutante> Daimona: 1051 - also has 50 ffmpeg
2025-04-29 21:34:29 <Daimona> ...
2025-04-29 21:34:53 <Daimona> Maximum allowed given current jobs is 0, and I think 50 > 0
2025-04-29 21:35:08 <mutante> 1044 - 17 ffmpegs
2025-04-29 21:35:17 <mutante> killed on 1051, left 1044 alone
2025-04-29 21:35:28 <Daimona> Maximum allowed for 1044 is also 0 given current jobs
2025-04-29 21:35:53 <mutante> killed on 1044 as well
2025-04-29 21:36:26 <Daimona> Thank you! Will keep checking the graphs for both, just in case they go stonks again
2025-04-29 21:37:48 <Daimona> I think we should set a timeout when spawning those ffmpeg jobs anyway. I'll do that.
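[The timeout idea could be sketched with coreutils timeout(1). The 5-minute cap is a hypothetical value, SIGINT matches what stopVideo sends, and the real fix would live in the JS wrapper that spawns ffmpeg, not in shell:]

```shell
# A capped invocation might look like (display/paths as quoted later in
# this log):
#   timeout --signal=INT --kill-after=30 300 \
#       ffmpeg -f x11grab -video_size 1280x1024 -i :94 \
#              -loglevel error -y -pix_fmt yuv420p /workspace/log/out.mp4
# Demonstrated here with sleep standing in for the recorder:
timeout --signal=INT --kill-after=30 1 sleep 60
rc=$?
echo "exit=$rc"   # 124 means timeout had to stop the command itself
```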
2025-04-29 21:38:48 <wikibugs> ('merge) ''dancy: spiderpig: ensure each interaction is notified only once [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/778 (https://phabricator.wikimedia.org/T392487) (owner: ''jnuche)'
2025-04-29 21:44:18 <mutante> sounds good. ty, out for lunch now
2025-04-29 21:49:57 <wikibugs> ('open) ''volker-e: releases: Bump Codex to 2.0.0-rc.1 [repos/ci-tools/libup-config] - ''https://gitlab.wikimedia.org/repos/ci-tools/libup-config/-/merge_requests/73 (https://phabricator.wikimedia.org/T391012)'
2025-04-29 21:54:48 <Daimona> Leaked picture of agent-1051 right now: https://phabricator.wikimedia.org/F59561871
2025-04-29 21:55:27 <Daimona> 1044 not looking too healthy either
2025-04-29 22:00:32 <mutante> lol @ meme. cleaned up! but at this point it seems like it will come back anyway?
2025-04-29 22:01:31 <Daimona> I imagine these could be from the selenium test retries, so it should stop eventually
2025-04-29 22:01:40 <Daimona> As we only allow 1 retry for each test
2025-04-29 22:02:08 <bd808> selenium is the gift that just keeps on giving
2025-04-29 22:02:11 <mutante> ok
2025-04-29 22:02:20 <Daimona> it really is
2025-04-29 22:03:06 <Daimona> Ahem, what's going on in 1040?
2025-04-29 22:04:21 <mutante> had 43 processes. not anymore
2025-04-29 22:04:49 <Daimona> 1062 also...
2025-04-29 22:05:01 <Daimona> and 1064
2025-04-29 22:05:14 <Daimona> they do keep coming back it seems...
2025-04-29 22:06:31 <Daimona> old-man-yells-at-ffmpeg.jpg
2025-04-29 22:06:38 <mutante> yes, it does. always the same issue again
2025-04-29 22:08:06 <mutante> I did those 2 as well but yea...
2025-04-29 22:08:46 <Daimona> 1048 also not looking well
2025-04-29 22:08:56 <bd808> Does ffmpeg run outside of a Docker container for these tests? Trying to reason about where the processes would leak and how we could clean them up.
2025-04-29 22:09:05 <Daimona> There must be a nicer way to do this right?
2025-04-29 22:09:19 <mutante> yes, it is not inside a container
2025-04-29 22:09:43 <mutante> they are just ffmpeg processes run by user nobody
2025-04-29 22:10:25 <mutante> ffmpeg -f x11grab -video_size 1280x1024 -i :94 -loglevel error -y -pix_fmt yuv420p /workspace/log/API-Missing-Page-should-not-exist-2025-04-29T21-49-01-513Z.mp4
2025-04-29 22:10:56 <Daimona> That is https://gerrit.wikimedia.org/g/mediawiki/core/+/d3090254b0e8b2284b100d77e32c18155df75f0a/tests/selenium/wdio-mediawiki/index.js#65
2025-04-29 22:12:59 <mutante> stopVideo is ffmpeg.kill( 'SIGINT' ); ..
2025-04-29 22:13:21 <mutante> maybe I should send signal 2 (SIGINT) to properly stop them then
2025-04-29 22:14:19 <mutante> do you think they come back because they know they failed to complete the command?
2025-04-29 22:16:28 <Daimona> Maybe? I thought it was due to test retries, but on second thought, that is not possible.
2025-04-29 22:17:15 <mutante> on instance 1064 - tried it, sent a SIGINT (2). killall -2. this makes them stop but not abruptly all at once
2025-04-29 22:18:02 <mutante> this reminds me of a classic quote
2025-04-29 22:18:12 <mutante> "Generally, send 15, and wait a second or two, and if that doesn't work, send 2, and if that doesn't work, send 1. If that doesn't, REMOVE THE BINARY because the program is badly behaved! Don't use kill -9. Don't bring out the combine harvester just to tidy up the flower pot."
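[The escalation in that quote could be written as a small helper. The signal order (TERM, then INT, then HUP, with KILL only as a last resort) follows the quote; the per-step one-second wait and the helper itself are hypothetical:]

```shell
# Escalate signals against a single PID, gentlest first, checking
# between steps whether the process is already gone.
graceful_kill() {
    pid=$1
    for sig in TERM INT HUP KILL; do
        kill -0 "$pid" 2>/dev/null || return 0   # already gone
        kill -s "$sig" "$pid" 2>/dev/null
        sleep 1                                  # give it a moment to exit
    done
    ! kill -0 "$pid" 2>/dev/null                 # did even KILL work?
}
```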
2025-04-29 22:20:43 <mutante> watching the number of processes. it just crossed the 12 threshold :/
2025-04-29 22:21:37 <Daimona> Eeeeeew
2025-04-29 22:22:39 <Daimona> Is there a node process also?
2025-04-29 22:22:48 <mutante> should I see the video thing under https://integration.wikimedia.org/ci/view/Selenium/ ?
2025-04-29 22:23:13 <mutante> yes, multiple /usr/bin/node
2025-04-29 22:23:49 <mutante> sh -c for i in $(seq 1 100); do wdio ./tests/selenium/wdio.conf.js; done
2025-04-29 22:23:50 <Daimona> It should be under the artifacts from the build, like in https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php81/11735/artifact/log/
2025-04-29 22:23:57 <mutante> /usr/bin/node /workspace/src/node_modules/@wdio/local-runner/build/run.js ./tests/selenium/wdio.conf.js
2025-04-29 22:24:13 <Daimona> Okay yeah, that would be it
2025-04-29 22:24:40 <mutante> there are like 6 of those node processes and 2 of those shell loops
2025-04-29 22:24:42 <Daimona> if it keeps spawning selenium tests...
2025-04-29 22:24:55 <Daimona> those shell loops you can safely kill everywhere you see them
2025-04-29 22:25:03 <wmf-insecte> maintenance-disconnect-full-disks build 697360 integration-agent-docker-1062 (/: 26%, /srv: 100%, /var/lib/docker: 51%): OFFLINE due to disk space
2025-04-29 22:25:19 <mutante> just the loops or also the node though
2025-04-29 22:25:35 <mutante> I guess everything that relates to "wdio.conf"
2025-04-29 22:25:38 <Daimona> as for the node, if it's from the loop, it can be killed
2025-04-29 22:26:13 <Daimona> we can check one agent at a time to see if there are any legit node processes
2025-04-29 22:26:48 <mutante> ok, looks like killing the shell command and waiting a bit is also enough for it to all go away
2025-04-29 22:26:51 <Daimona> another option is to just kill everything, but CI still needs to catch up and it might be preferable not to make that worse
2025-04-29 22:26:54 <mutante> this happened on 1064 now
2025-04-29 22:27:25 <Daimona> Okay great
2025-04-29 22:27:27 <mutante> yea, "ps aux | grep wdio"
2025-04-29 22:27:39 <mutante> empty
2025-04-29 22:29:34 <mutante> 1062: similar but only a single shell loop, not 2, and fewer nodes
2025-04-29 22:29:45 <Daimona> Alright, so, the agents with suspiciously low memory right now are: 1040, 1044, 1048
2025-04-29 22:30:04 <wmf-insecte> maintenance-disconnect-full-disks build 697361 integration-agent-docker-1062 (/: 26%, /srv: 94%, /var/lib/docker: 51%): RECOVERY disk space OK
2025-04-29 22:32:16 <Daimona> 1050 also worth double checking maybe
2025-04-29 22:32:35 <thcipriani> sorry disappeared into meeting and then down a rabbit hole, catching up
2025-04-29 22:33:00 <mutante> 1062 and 1040 - check for changes
2025-04-29 22:33:04 <mutante> should be better
2025-04-29 22:33:23 <thcipriani> thanks for the cleanup mutante <3
2025-04-29 22:33:32 <Daimona> 1062 recovered, 1040 recovering
2025-04-29 22:33:52 <mutante> thcipriani: :) yw. I am now killing shell loops like this:
2025-04-29 22:34:01 <mutante> sh -c for i in $(seq 1 100); do wdio ./tests/selenium/wdio.conf.js
2025-04-29 22:34:15 <mutante> and any node process that uses wdio.conf.js
2025-04-29 22:34:23 <thcipriani> can't you docker kill the running container?
2025-04-29 22:34:49 <mutante> they are outside a container
2025-04-29 22:35:32 <thcipriani> hrm, that is a mismatch for my memory of how this worked
2025-04-29 22:35:46 <thcipriani> although my memory has been known to become outdated quickly
2025-04-29 22:36:45 <Daimona> Just did a pass of killing some redundant jobs. Some failure on agent-1062 due to full disk
2025-04-29 22:37:04 <mutante> thcipriani: can we click somewhere on or near https://integration.wikimedia.org/ci/view/Selenium/ to disable that entire "wdio" test?
2025-04-29 22:37:18 <Daimona> /srv full of garbage apparently
2025-04-29 22:37:25 <mutante> it's definitely that "wdio" conf
2025-04-29 22:38:06 <thcipriani> one of the maintenance jobs should recover 1062 if /srv is full up
2025-04-29 22:38:17 <thcipriani> if / gets full it's a manual thingy
2025-04-29 22:38:34 <mutante> 1044 and 1048 - also cleaned up
2025-04-29 22:38:35 <thcipriani> runs some docker cleanup and brings it back online
2025-04-29 22:38:42 <Daimona> Yep I see it's improving. I guess it went "oh btw here are the 500 ffmpeg video captures you asked for"
2025-04-29 22:39:11 <mutante> thcipriani: so it's basically "ps aux | grep wdio" to see it all at once and if it's gone
2025-04-29 22:39:26 <mutante> most instances had 1 of those "for i in $(seq 1 100)" shell loops
2025-04-29 22:39:33 <mutante> and like 4 to 6 node processes
2025-04-29 22:39:48 <mutante> but at least one had 2 of the loops at the same time
2025-04-29 22:39:50 <thcipriani> so are we having to kill all selenium jobs at the moment, is that what's happening?
2025-04-29 22:40:14 <mutante> afaict not all selenium jobs. just the one that creates videos
2025-04-29 22:40:57 <Daimona> well in theory all of them create videos. But the ones with the shell loop use wdio 8 which seems broken
2025-04-29 22:41:14 <mutante> thcipriani: https://gerrit.wikimedia.org/g/mediawiki/core/+/d3090254b0e8b2284b100d77e32c18155df75f0a/tests/selenium/wdio-mediawiki/index.js#65
2025-04-29 22:42:29 <mutante> on 1048 it already came back again
2025-04-29 22:43:11 <Daimona> that's a legit job
2025-04-29 22:44:12 <mutante> the "for i in $(seq 1 100)" in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/721790/25/package.json that you linked to earlier
2025-04-29 22:44:40 <Daimona> So, everything with the shell loop is evil and can be killed on sight
2025-04-29 22:44:56 <mutante> what if that would just be reverted ?
2025-04-29 22:45:09 <Daimona> Things without the loop can be legit jobs. But they might also use wdio v8 which is evil. I don't think you can tell them apart by looking at just the command
2025-04-29 22:45:10 <mutante> wait, that's a WIP change
2025-04-29 22:45:33 <Daimona> Correct. It's never been merged. But somehow it outlived the container
2025-04-29 22:46:27 <mutante> maybe that was the case on 1048 just now and it finished legit jobs
2025-04-29 22:46:41 <mutante> because ffmpeg and the node went away without me doing anything this time
2025-04-29 22:47:09 <Daimona> I'm also killing some jobs so yeah
2025-04-29 22:47:59 <Daimona> One way to do this could be to check the agents one by one and see if their node jobs are legit. But surely we can do better?
2025-04-29 22:48:06 <Daimona> s/jobs/processes/
2025-04-29 22:48:24 <thcipriani> there is a cumin instance in integration...or there was
2025-04-29 22:48:41 <mutante> as you said earlier. check number of ffmpeg processes and if it's over 12 then bad, otherwise leave alone
2025-04-29 22:48:43 <thcipriani> we could also write some groovy in jenkins
2025-04-29 22:49:19 <thcipriani> the giant load spike is looking much better: https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=All
2025-04-29 22:49:38 <Daimona> even within the same agent there might be a mix of good and evil
2025-04-29 22:50:01 <mutante> and we don't know why it started now and not before?
2025-04-29 22:50:03 <wmf-insecte> maintenance-disconnect-full-disks build 697365 integration-agent-docker-1062 (/: 26%, /srv: 99%, /var/lib/docker: 47%): OFFLINE due to disk space
2025-04-29 22:50:09 <bd808> thcipriani: I see a integration-cumin.integration.eqiad1.wikimedia.cloud instance
2025-04-29 22:50:48 <bd808> sinks back into the bushes
2025-04-29 22:51:00 <mutante> we cant just disable the one test that uses that "wdio 8"?
2025-04-29 22:51:01 <thcipriani> according to cumin "O{project:integration}" "ps aux | grep ffmpeg | wc -l" there is no host over 12
2025-04-29 22:51:23 <mutante> great!
2025-04-29 22:51:47 <mutante> if it stays like that.. it's because on some instances the shell loop had not been killed. only ffmpeg itself
2025-04-29 22:51:56 <thcipriani> well..except integration-cumin but that was due to the grep :D
2025-04-29 22:51:57 <mutante> and then later all of it
2025-04-29 22:52:41 <Daimona> 12 is the absolute max though. It's still possible that there are evil processes somewhere.
2025-04-29 22:52:57 <thcipriani> cumin "O{project:integration}" "ps aux | grep [f]fmpeg | wc -l" show 8 hosts with 1 and 24 hosts with 0
2025-04-29 22:53:02 <mutante> ;) a bunch of the numbers I mentioned you can also subtract 1 from, because I didn't bother to | grep -v grep
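[The off-by-one mutante mentions, and the `[f]fmpeg` trick in thcipriani's cumin command, both come from grep matching its own entry in the ps output. A quick simulation with fake ps lines shows the difference:]

```shell
# Case 1: plain pattern. ps shows the grep itself as "grep ffmpeg",
# which matches the pattern, so the count is one too high.
printf '%s\n' 'ffmpeg -i :94 out.mp4' 'grep ffmpeg' | grep -c 'ffmpeg'
# -> 2: the grep counted itself

# Case 2: bracket trick. ps shows the grep as "grep [f]fmpeg", and the
# regex [f]fmpeg (which only matches the literal string "ffmpeg") no
# longer matches that line.
printf '%s\n' 'ffmpeg -i :94 out.mp4' 'grep [f]fmpeg' | grep -c '[f]fmpeg'
# -> 1: only the real process

# In practice "pgrep -c ffmpeg" is the cleaner equivalent.
```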
2025-04-29 22:53:44 <Daimona> Could I have a list of `wdio` processes across all agents?
2025-04-29 22:53:50 <mutante> thcipriani: maybe let's count any process that has string "wdio" in it
2025-04-29 22:53:52 <mutante> hah
2025-04-29 22:53:53 <Daimona> So I can cross-reference it with the current jobs
2025-04-29 22:54:30 <thcipriani> looks like integration-agent-docker-1052.integration.eqiad1.wikimedia.cloud is the only host running wdio afaict
2025-04-29 22:55:10 <Daimona> That's legit
2025-04-29 22:55:11 <mutante> eh.. it seems I had never connected to 1052
2025-04-29 22:55:32 <mutante> that has one of the shell loops and a single node
2025-04-29 22:55:40 <mutante> but not 2 and 8 or more ..and no ffmpeg
2025-04-29 22:55:46 <mutante> so seems more legit indeed
2025-04-29 22:56:09 <Daimona> It should be https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php81/11748/console
2025-04-29 22:57:16 <mutante> legit wdio:
2025-04-29 22:57:18 <mutante> node /workspace/src/extensions/ProofreadPage/node_modules/.bin/wdio tests/selenium/wdio.conf.js
2025-04-29 22:57:26 <mutante> /usr/bin/node --no-wasm-code-gc /workspace/src/extensions/ProofreadPage/node_modules/@wdio/local-runner/build/run.js tests/selenium/wdio.conf.js
2025-04-29 22:57:38 <mutante> bad wdio from earlier:
2025-04-29 22:57:40 <mutante> /usr/bin/node /workspace/src/node_modules/@wdio/local-runner/build/run.js ./tests/selenium/wdio.conf.js
2025-04-29 22:57:49 <mutante> if that makes sense
2025-04-29 22:57:55 <thcipriani> load average looking good. looks like mischief managed. Just got to make sure that we run down whatever is going on with wdio 8
2025-04-29 22:58:41 <thcipriani> and probably don't run it in a loop until we do :)
2025-04-29 22:58:48 <mutante> āœ…
2025-04-29 22:58:57 <Daimona> I'm not sure if the difference in the invocation is significant, but at any rate, we should be good
2025-04-29 22:59:14 <mutante> why does it run in a loop when the change adding a loop is not merged
2025-04-29 22:59:34 <Daimona> Loops should be ok per se. I too have done it many times. It doesn't cause harm as long as everything is working correctly... Which doesn't seem to be the case with wdio 8.
2025-04-29 23:00:03 <wmf-insecte> maintenance-disconnect-full-disks build 697367 integration-agent-docker-1062 (/: 26%, /srv: 71%, /var/lib/docker: 46%): RECOVERY disk space OK
2025-04-29 23:00:50 <Daimona> But on the other hand, I imagine that the patch in question really was trying to figure out what's wrong with wdio 8.
2025-04-29 23:01:41 <thcipriani> yeah, probably not anticipating it would eat all CI resources for some reason
2025-04-29 23:01:56 <thcipriani> seems like a legit way to find flakiness
2025-04-29 23:02:41 <Daimona> It's been really useful in the past. It surely didn't have these side effects with wdio 7
2025-04-29 23:03:54 <thcipriani> random guess: `afterTest` is no longer run?
2025-04-29 23:04:03 <thcipriani> so "stopVideo" never gets called
2025-04-29 23:04:33 <thcipriani> so it was starting an ffmpeg process for every test and never stopping it
2025-04-29 23:06:24 <Daimona> That is my understanding. I think I saw something along those lines in the log. But why did it outlive the container?
2025-04-29 23:06:34 <wikibugs> 'Continuous-Integration-Infrastructure, ''Patch-For-Review: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778630 (''Daimona) >>! In T392963#10778428, @Daimona wrote: > - Is the theory in T392963#10778330 correct? Based on what we know now:...'
2025-04-29 23:07:21 <wikibugs> ('update) ''bd808: SpiderPig: auto select first backport search match [repos/releng/scap] - ''https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/731 (https://phabricator.wikimedia.org/T392508)'
2025-04-29 23:07:57 <thcipriani> could be that we killed the parent process and there was nothing in the container to reap the child processes? Dunno
2025-04-29 23:10:12 <Daimona> Yeah no idea. But I think it used to work fine with wdio v7
2025-04-29 23:14:37 <Daimona> At any rate, I left it as a "to figure out" in the task. The grafana dashboard looks much better now, thanks mutante for destroying all the ffmpeg crap. Now I'll disappear :)
2025-04-29 23:14:50 <thcipriani> ^ thanks both
2025-04-29 23:17:12 <thcipriani> oh: https://integration.wikimedia.org/ci/job/mediawiki-quibble-selenium-vendor-mysql-php74/25118/console so it looks like chrome was crashing over and over in the afterTest hook, probably causing the stopVideo afterTest hook to never be executed...for some reason. Anyway, I'll dump that theory in the task.
2025-04-29 23:24:08 <mutante> the "stopVideo" code says it is sending SIGINT (signal 2). on one host I used that (killall -2 ffmpeg), which made the processes stop but one by one and a bit more proper. as opposed to just hard kill -9 on others.
2025-04-29 23:27:11 <wikibugs> 'Continuous-Integration-Infrastructure, ''Patch-For-Review: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778640 (''Dzahn) We saw ffmpeg processes come back after being killed.. then figure out there were shell loops (sh -c for i in ...) as...'
2025-04-29 23:30:40 <wikibugs> 'Continuous-Integration-Infrastructure, ''Patch-For-Review: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778646 (''Dzahn) Where ffmpeg gets spawned: https://gerrit.wikimedia.org/g/mediawiki/core/+/d3090254b0e8b2284b100d77e32c18155df75f0a/t...'
2025-04-29 23:33:30 <wikibugs> 'GitLab (Pipeline Services Migration🐤), ''collaboration-services, ''Patch-For-Review: Move micro sites from Ganeti to Kubernetes and from Gerrit to GitLab - https://phabricator.wikimedia.org/T300171#10778647 (''Dzahn) I removed the static-rt site from the legacy miscweb VMs. Now os-reports (T350794) is the...'
2025-04-29 23:34:03 <wikibugs> 'GitLab (Pipeline Services Migration🐤), ''collaboration-services, ''Patch-For-Review: Move micro sites from Ganeti to Kubernetes and from Gerrit to GitLab - https://phabricator.wikimedia.org/T300171#10778650 (''Dzahn)'
2025-04-29 23:40:20 <wikibugs> 'Continuous-Integration-Infrastructure, ''Patch-For-Review: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778662 (''thcipriani) >>! In T392963#10778640, @Dzahn wrote: > We saw ffmpeg processes come back after being killed.. then figure out t...'
2025-04-29 23:44:43 <wikibugs> 'Continuous-Integration-Infrastructure, ''Patch-For-Review: CI is overwhelmed and lots of jobs are failing randomly (2025-04-29) - https://phabricator.wikimedia.org/T392963#10778676 (''thcipriani) ''Open→''Resolved a:''Dzahn I added a comment on the task that spawned this issue that should point fol...'