[07:57:34] Need to run to the pharmacy first thing tomorrow morning, no retro for me
[10:47:06] dcausse: not sure if this is you, but there is a dirty change on deploy2002 in helmfile.d/services/rdf-streaming-updater/helmfile.yaml. might want to reset that.
[10:47:14] - atomic: true
[10:47:37] hm... looking
[10:49:58] vim /srv/deployment-charts/helmfile.d/services/rdf-streaming-updater/helmfile.yaml on pts/2
[10:50:16] seems to be jayme
[10:51:38] ah k, it's been like that since yesterday, will ping
[10:51:50] oh you did. ty
[10:59:02] Lunch
[11:07:55] lunch 2
[11:44:57] Hello, I hope this is useful. I fixed up my old k8s app logs opensearch (logstash) dashboard: https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c
[11:44:57] you can select k8s cluster and namespace (and app label if necessary) and see logs. Works well for easily finding flink logs.
[13:03:00] nice!
[14:01:21] dcausse: bking y'all coming to the TNG thing? you answered yes
[14:14:48] dcausse LMK if you want to pair today (and sorry I wasn't much use at that meeting)
[15:03:24] inflatador: sure!
[15:03:38] inflatador: do you have a meet link?
[15:03:49] https://meet.google.com/oxm-jadj-jtp?hs=122&authuser=0
[16:01:12] pfischer, dcausse, mpham: retrospective: https://meet.google.com/eki-rafx-cxi
[16:17:00] Just did a little work on the Flink Cluster grafana dashboard: separated out a few panels and made it work with more clusters (there were still some old hardcoded session cluster label filters in there)
[16:17:01] https://grafana.wikimedia.org/goto/ID-QOV-Vz?orgId=1
[16:56:23] ottomata: thanks!
[16:56:45] ottomata: could you add me as committer to airflow-dags? :)
[16:56:57] oh for sure...
[16:57:27] hm, you were a developer. now you are a maintainer, does that work?
[16:59:35] ebernhardson: I migrated the scripts the ores_predictions DAG depends on to discolytics but mypy is giving me errors I cannot reproduce locally.
[17:00:18] pfischer: hmm, suggests there is perhaps a difference in the version of mypy used?
[17:01:26] pfischer: oh, i bet what happened is because you created a mypy.ini file
[17:01:39] pfischer: move the contents of that to the end of setup.cfg; by creating a mypy.ini, mypy stopped reading setup.cfg
[17:02:21] having per-tool .ini files is the more historical method of configuring things, setup.cfg is a slightly more modern take (python packaging/tooling is a minefield)
[17:14:00] Good to know, thanks, I’ll do that.
[17:15:14] ebernhardson: Does the #-magic in Spark work for directory-like suffixes, e.g. #dblists/0.dblist, too?
[17:15:28] https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/259#note_20272
[17:21:43] pfischer: hmm, i don't think i've ever tried to create subdirs, not sure if it would work or not
[17:22:52] pfischer: shouldn't be necessary though; when passing a list of files without the #, spark places them in the CWD of the app, so we simply pass the list of file names to the script as --db-lists and it reads them
[17:24:13] Alright, that's what I did now, anyway.
[17:30:15] So do we get the contents of mediawiki_config_path = wmf_props.get('mediawiki_config_path', '/srv/mediawiki-config') via HDFS (as with the credentials) or is this directory mounted when spark is run?
[17:31:58] pfischer: that git repo gets cloned by puppet onto the airflow instance itself (and it should auto-pull once an hour, more than plenty for our use case).
[17:32:52] pfischer: you should also be able to ssh into an-airflow1005.eqiad.wmnet to verify
[17:32:55] Okay, so we still have to map the files via # for them to become available when spark runs?
[17:34:03] pfischer: i don't think so. what happens is `files='/path/to/foo.txt,/path/to/bar.txt,/path/to/bang.txt#abc.txt'` would write foo.txt, bar.txt and abc.txt to the working directory of the spark application
[17:34:11] Because right now, the variables are named a bit confusingly. Do files get mapped automatically (even w/o #) to their suffix?
[17:35:10] i suppose there is also some magic in there that if you specify /path/to/thing.tgz#thing it will decompress the .tgz into the thing subdir
[17:35:14] same with .zip
[17:36:05] Currently dblists are passed to --files but w/o #-mapping and are then passed to spark via --db-lists
[17:37:07] pfischer: right, so the --files argument is passed to the spark loader, which ensures the files get copied into the working directory of the spark application; the --db-lists argument gets passed to our spark script and tells it the filenames to read from.
[17:38:36] and indeed the variables named here aren't super clear :( `local_dblists` is the path to the dblists on the airflow instance, `dblists` is just the filenames without any path; we pass that to our python script and without any leading path it reads them from CWD
[17:45:24] pfischer: other thoughts looking at the patch (i'll submit a review later today): the scripts need to be added to the console_scripts section of setup.cfg, that's the part that places them in a top-level bin/ directory so they are easily executable (otherwise the entry point would be something like lib/python3.10/site-packages/discolytics/cli/some_script.py). That will want a function to execute in the script, so we need to move the content of the final `if __name__ == "__main__":` into a simple function. See the other scripts in discolytics/cli/ for examples
[17:49:17] Alright, will look into that tomorrow. Thanks!
[18:24:15] back
[18:24:45] hmm, turns out our skein integration in airflow doesn't properly quote things. Passed a cli arg with a | in it and got a complaint about not finding the file to execute :)
[18:52:52] reminder: daylight saving time starts next sunday in most of the US, so meeting times are about to get wonky again :P
[18:53:13] until mar 26 when europe switches
[19:20:20] ottomata: yes I see the merge button now, thanks :)
[21:07:24] inflatador: 2 batches of eqiad elastic hosts left of the rolling restart. I'm gonna step out to go for a run
[21:07:54] ryankemper ack, will keep an eye out
[21:20:05] looks like it finished successfully
[21:44:21] * ebernhardson wonders what's up with the mjolnir failure rate alert... the dashboards show a solid 0 for eqiad and codfw over the last 7 days. Some oddity in the alert
[23:15:21] oh that's annoying, gitlab doesn't do anything to ensure, when merging a patch, that the main branch will pass CI
[23:15:50] it does it backwards, first merging the patch into main and then running the CI. So when the CI fails, that means the main branch doesn't pass
[23:46:03] > it does it backwards, first merging the patch into main and then running the CI. So when the CI fails, that means the main branch doesn't pass
[23:46:06] wow really? that seems insane
[23:46:28] yea, it's quite surprising to me
[23:46:41] spoiled by gerrit and zuul i guess :P
[23:50:14] in unrelated news, it's always nice to see actual usage of wikidata in the wild, like in this hn comment: https://news.ycombinator.com/item?id=35082275
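
A footnote on the mypy.ini/setup.cfg exchange at 17:01: the fix ebernhardson describes is to carry the [mypy] section over into setup.cfg, so mypy goes back to reading a single config file. A minimal sketch of what that could look like; the specific options shown are placeholders, not the actual contents of the mypy.ini in question:

    # setup.cfg (excerpt); once mypy.ini is removed, mypy reads this file again
    [mypy]
    python_version = 3.10
    ignore_missing_imports = True

    # per-module overrides also live here; the module name below is hypothetical
    [mypy-pyspark.*]
    ignore_errors = True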
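To make the --files semantics from 17:34-17:38 concrete, here is a sketch assuming YARN cluster mode and made-up file names; the .tgz/.zip auto-extraction mentioned at 17:35 is exposed as the separate --archives flag in plain spark-submit:

    # submit side (hypothetical paths):
    #   spark-submit --files /srv/dblists/open.dblist,/srv/dblists/all.dblist#wikis.dblist app.py \
    #       --db-lists open.dblist,wikis.dblist
    #
    # app side: in YARN cluster mode the listed files are localized into the
    # application's working directory, so bare filenames are enough to read them:
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--db-lists', required=True,
                        help='comma-separated dblist filenames, no leading paths')
    args = parser.parse_args()
    for name in args.db_lists.split(','):
        # reads from CWD, where spark placed (and possibly renamed) the file
        with open(name) as f:
            wikis = [line.strip() for line in f if line.strip()]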
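And for the console_scripts note at 17:45, a minimal sketch of the requested change, using a hypothetical script name (the real examples are the existing scripts in discolytics/cli/):

    # setup.cfg (excerpt); the left-hand name becomes an executable placed in bin/
    [options.entry_points]
    console_scripts =
        fetch-ores-predictions = discolytics.cli.fetch_ores_predictions:main

    # discolytics/cli/fetch_ores_predictions.py
    def main():
        ...  # whatever previously sat under `if __name__ == "__main__":`

    if __name__ == "__main__":
        main()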