[07:17:41] (PS1) Elukey: python: Add more info about Docker image rebuild [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/821167 (https://phabricator.wikimedia.org/T301878)
[07:18:46] hello folks! Going to take ~1hour to relocate to a new position, ttl!
[07:55:06] morning! :)
[08:41:21] (CR) AikoChou: articlequality: add code to send events to EventGate (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[08:51:33] (CR) AikoChou: "Looks nice! In this way we'll reduce a lot of repetitive code :)" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820381 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[08:57:50] (CR) AikoChou: [C: +1] python: Add more info about Docker image rebuild [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/821167 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:00:59] back :)
[09:01:02] hello aiko
[09:07:09] (CR) Elukey: articlequality: add code to send events to EventGate (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:17:27] aiko: argh I just found out something not nice
[09:18:04] we set REQUESTS_CA_BUNDLE to specify the TLS bundle, as you suggested in the code review
[09:18:30] but now that we don't use anymore a requests session, we don't use it
[09:20:32] and we don't use it with the AsyncSession code too
[09:21:51] I think that aiohttp.ClientSession by default doesn't verify the TLS cert
[09:21:52] sigh
[09:25:20] aiko: if you are ok we could remove the REQUESTS_CA_BUNDLE variable in isvcs, in theory since we deploy the wmf-certificates deb package we don't need to vary it
[09:25:25] what do you think?
[09:26:01] then I can follow up and see how to add the tls verify context to the aiohttp connector
[09:28:36] elukey: yes, that sounds good
[09:30:31] (CR) Elukey: editquality: move rev-id preprocess functions to a separate module (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820381 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:30:39] aiko: ok if I merge https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/820381 ?
[09:30:40] elukey: I was confused that we set a REQUESTS_CA_BUNDLE but we don't use it in model.py
[09:30:48] yeah you are definitely right
[09:30:52] something is missing
[09:31:17] elukey: yes sure!
[09:31:29] (CR) AikoChou: [C: +1] editquality: move rev-id preprocess functions to a separate module [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820381 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:32:09] thanksss
[09:32:16] (CR) Elukey: [C: +2] editquality: move rev-id preprocess functions to a separate module [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820381 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:32:35] I want to see how many docker images are rebuilt
[09:32:55] in theory, without the integration/config change, only editquality should be rebuilt
[09:36:51] mmm no I may be wrong
[09:36:52] https://github.com/mediawiki-utilities/python-mwapi/blob/master/mwapi/async_session.py#L85
[09:36:58] we set verify_ssl=true
[09:37:30] so at this point the wmf-certificates bundle may be picked up automatically
[09:40:18] checking https://docs.aiohttp.org/en/stable/client_reference.html
[09:42:15] it says that verify_ssl is superseded by the ssl value, None by default (using the default context)
[09:42:24] if False, it skips tls cert verification
[09:42:34] so, IIUC, it does check TLS certs
[09:45:07] (CR) Elukey: articlequality: add code to send events to EventGate (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[09:49:29] ok I think that certificates are picked up automatically
[09:49:31] goood
[09:49:56] aiko: ok if I proceed with articlequality's code change as well?
[09:59:46] nice!
[10:00:04] (CR) AikoChou: [C: +1] articlequality: add code to send events to EventGate [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[10:00:21] thanks :)
[10:00:31] (CR) Elukey: [C: +2] articlequality: add code to send events to EventGate [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[10:00:50] (CR) AikoChou: [C: +1] drafttopic: add code to send events to EventGate [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820418 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[10:01:11] (CR) AikoChou: [C: +1] draftquality: add code to send events to EventGate [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820401 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[10:01:22] aiko: one thing - I haven't refactored outlink, I wanted to leave everything to you since you know best
[10:01:43] but we'll add the config to rebuild predictor/transformer docker images when the python shared dir changes
[10:01:46] does it sound good?
[10:02:05] elukey: sounds good!
[10:02:10] (so by default the modules will be copied over, and we can load anytime)
[10:02:16] super, thanks kevinbazira and aiko for the reviews :)
[10:02:32] I'll roll them out one by one testing in staging
[10:02:40] and then I'll deploy to prod if all looks good
[10:05:20] going afk for a bit!
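Editor's note on the TLS thread above: the conclusion that aiohttp's default `ssl` setting verifies certificates rests on the behaviour of Python's default TLS context. The sketch below checks that behaviour using only the standard library. `build_tls_context` is a hypothetical helper name, not something from the inference-services repo, and the optional `ca_bundle` path is an assumption for cases where an explicit bundle is wanted.

```python
import ssl
from typing import Optional


def build_tls_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Hypothetical helper: return a TLS context that verifies server certs.

    With ca_bundle=None, create_default_context() loads the system's default
    CA certificates, which is where an OS-installed bundle would normally be
    picked up. Passing a path trusts only the CAs in that file.
    """
    return ssl.create_default_context(cafile=ca_bundle)


ctx = build_tls_context()

# The default context requires a valid certificate chain and checks that the
# certificate matches the hostname being contacted.
print(ctx.verify_mode == ssl.CERT_REQUIRED)
print(ctx.check_hostname)
```

If an explicit bundle were ever needed, a context like this could presumably be handed to the client via `aiohttp.TCPConnector(ssl=ctx)`; whether mwapi's AsyncSession exposes a way to do that is not established in the log.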
[10:08:56] (Merged) jenkins-bot: articlequality: add code to send events to EventGate [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/820388 (https://phabricator.wikimedia.org/T301878) (owner: Elukey)
[10:36:42] Lift-Wing, Machine-Learning-Team (Active Tasks): Deploy Outlinks topic model to production - https://phabricator.wikimedia.org/T287056 (achou) @Isaac Outlink topic model has been deployed on Lift Wing! :) Internal clients (e.g. from stat machines) can access it through an internal discovery endpoint. Th...
[10:54:37] * elukey lunch
[10:55:19] aiko: nice guide in https://phabricator.wikimedia.org/T287056#8136894
[11:00:02] elukey: Thanks! :D
[13:54:33] Morning all!
[13:56:18] Hi Chris
[13:59:07] Hello!
[13:59:17] It's apparently a day of meetings for me
[13:59:38] Outlink is deployed Nice!
[14:00:38] hi!
[14:00:57] elukey!
[14:02:03] kevinbazira: merged your change!
[14:02:57] thanks elukey. going to run a deployment on staging now ...
[14:08:51] the diff on staging shows more changes than I expected. e.g. swift-s3-credentials
[14:09:19] elukey are these changes you applied? should I go ahead and deply them alongside mine?
[14:09:39] **deploy
[14:10:20] kevinbazira: yes it is the first deployment, should be ok
[14:10:40] great. thanks for the confirmation. let me proceed ...
[14:13:53] the codfw staging deployment has been completed successfully
[14:13:54] checking pods now ...
[14:14:39] all new pods are up and running.
[14:14:40] NAME                                                              READY   STATUS    RESTARTS   AGE
[14:14:40] arwiki-drafttopic-predictor-default-57qjv-deployment-58456kbxr4   3/3     Running   0          85s
[14:14:40] cswiki-drafttopic-predictor-default-95ftb-deployment-74677l4l5k   3/3     Running   0          83s
[14:14:40] enwiki-drafttopic-predictor-default-kl2kc-deployment-84bb7wpjll   3/3     Running   0          84s
[14:16:13] wooot
[14:20:57] nice!
[14:22:31] if anybody has time
[14:22:37] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/821247 (and next chained)
[14:23:05] very simple, I want to test the event code in articlequality staging and rollout the editquality one to prod
[14:29:16] (merged, they were super simple)
[14:53:11] mmm weird, the articlequality code doesn't report eventgate errors but I don't see the event in the kafka topic
[14:53:40] because I was looking in the wrong topic :D
[14:53:41] ahahah it works
[14:55:52] starting to sync editquality pods in prod
[15:01:21] the rollout takes more than expected, will probably finish it tomorrow
[15:20:30] * elukey bbiab
[16:19:34] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (elukey) @Ottomata Hi! I am slowly rolling out the code to allow to all revscoring-based models to push `mediawiki.revision-score` events to...
[16:29:46] folks I have deployed editquality docker images to ml-serve-codfw but the event generation leads to errors, weird..
[16:29:52] will check tomorrow morning :)
[16:29:54] o/
[18:17:03] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (Ottomata) Hello! Separate streams for different models seems fine, but perhaps what you want are separate events for each model, not necess...
[18:26:57] Lift-Wing, Machine-Learning-Team: Deploy NSFW model to production - https://phabricator.wikimedia.org/T314810 (achou)
[18:27:47] Lift-Wing, Machine-Learning-Team: Deploy NSFW model to production - https://phabricator.wikimedia.org/T314810 (achou)
[20:18:27] Lift-Wing, Machine-Learning-Team (Active Tasks): Deploy Outlinks topic model to production - https://phabricator.wikimedia.org/T287056 (Isaac) > Outlink topic model has been deployed on Lift Wing! :) Eeeeeek! Thank you @achou!!!! I originally wrote this with multiple exclamation marks on both sentences b...
[22:57:52] Lift-Wing, Machine-Learning-Team: Deploy NSFW model to production - https://phabricator.wikimedia.org/T314810 (Aklapper) Hi, what is the relation of this task to e.g. `T250110`?
[22:58:05] artificial-intelligence, SRE, Service-deployment-requests: New Service Request 'open_nsfw' - https://phabricator.wikimedia.org/T250110 (Aklapper) What is the relation of this task to `T214201`? Does this one block the other one (=subtask)?