[00:42:07] gotta love phan's auto-detected types: Argument 1 ($pages) is $pageData of type
[00:42:09] array{}|non-empty-list<\IDBAccessObject>|non-empty-list<\MediaWiki\DAO\WikiAwareEntity>|non-empty-list<\MediaWiki\DAO\WikiAwareEntityTrait>|non-empty-list<\MediaWiki\HookContainer\ProtectedHookAccessorTrait>|non-empty-list<\MediaWiki\Page\PageIdentity>|non-empty-list<\MediaWiki\Page\PageRecord>|non-empty-list<\MediaWiki\Page\PageReference>|non-empty-list<\MediaWiki\Page\ProperPageIdentity
[00:42:11] >|non-empty-list<\Page>|non-empty-list<\WikiPage>|non-empty-list<\Wikimedia\NonSerializable\NonSerializableTrait>|non-empty-list
[10:56:49] lunch
[12:03:17] lunch
[14:12:39] half day for me, see you in ~4 hrs
[14:19:57] inflatador: have fun!
[14:20:25] weekly update posted in Asana: https://app.asana.com/0/0/1204152472055058 let me know if I got anything wrong (or missing).
[14:20:30] I'll post it on wiki shortly
[14:40:19] And available on wiki: https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2023-03-10
[14:55:03] errand
[15:26:52] o/ ebernhardson: are you around? I’m struggling with the SkeinOperator/SkeinHook
[15:41:33] pfischer: sure, what's up?
[15:54:20] Ah, I think I found a solution, let me push this…
[15:59:30] So this would be my solution: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/259/diffs?diff_id=15452&start_sha=f097ff6643218dba0bf34aec8baeb49b45018d7b … it compiles and the tests still pass … still feels weird as I have no clue if it still works 🤷
[16:00:48] pfischer: one way to test would be to render out the application spec into yaml, and then you can manually run the yaml file on the hadoop cluster. Or we have a repo that will start a docker cluster on your local machine and you can test inside that
[16:00:57] s/docker cluster/hadoop cluster/
[16:01:55] it's some docker images that will start a hadoop cluster; I wrote it for testing of the old airflow 1 instance: https://gerrit.wikimedia.org/r/admin/repos/search/analytics-integration,general
[16:02:19] see also https://jcristharif.com/skein/quickstart.html#write-an-application-specification
[16:04:18] if we think it's useful we could probably migrate that repo forward as well so it starts up an airflow 2 instance, but the other teams found that what they really needed during testing wasn't just a working hadoop cluster, but real data with all the inconsistencies and data size issues it brings, so they put together the `run_dev_instance.sh` found in the airflow 2 repo which lets you
[16:04:20] start an airflow 2 instance on a stat*.eqiad.wmnet machine
[16:05:06] we would probably have to do a bit of review through everything about how we are using var_props and decide the best ways to make it easy to swap out arguments to run dags with real inputs and user-specific output paths
[16:27:26] Alright, I’ll have a look.
[16:27:29] Thanks!
[19:12:30]
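For reference, the "render the application spec into yaml and run it manually" suggestion follows the skein quickstart linked above. A minimal spec looks roughly like this; the name, queue, and script here are illustrative placeholders, not values from the merge request:

```yaml
# Minimal skein application spec (illustrative), per
# https://jcristharif.com/skein/quickstart.html#write-an-application-specification
name: hello-skein          # placeholder application name
queue: default             # YARN queue; adjust to your cluster
master:
  resources:
    memory: 512 MiB
    vcores: 1
  script: |
    echo "hello from yarn"
```

Once written to a file, the spec can be submitted to the hadoop cluster from the CLI with `skein application submit spec.yaml`, which lets you verify the rendered spec actually runs independently of the Airflow operator.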