[01:32:01] (CR) Zabe: [C: +2] Fix usage of ApiBase::PARAM_* deprecated constants [extensions/ORES] - https://gerrit.wikimedia.org/r/776433 (https://phabricator.wikimedia.org/T275455) (owner: Gerrit maintenance bot)
[01:52:08] (Merged) jenkins-bot: Fix usage of ApiBase::PARAM_* deprecated constants [extensions/ORES] - https://gerrit.wikimedia.org/r/776433 (https://phabricator.wikimedia.org/T275455) (owner: Gerrit maintenance bot)
[05:17:33] Lift-Wing, Machine-Learning-Team: Deploy Outlinks topic model to production - https://phabricator.wikimedia.org/T287056 (Aklapper) a:ACraze→None Removing inactive task assignee
[05:17:50] Lift-Wing, artificial-intelligence, editquality-modeling, Machine-Learning-Team (Active Tasks): Upload model binaries to storage - https://phabricator.wikimedia.org/T301413 (Aklapper) a:ACraze→None Removing inactive task assignee
[05:18:14] Lift-Wing, ORES, artificial-intelligence, ML-Governance, Machine-Learning-Team (Active Tasks): Create Draft Model Deployment Guidelines - https://phabricator.wikimedia.org/T276598 (Aklapper) a:ACraze→None Removing inactive task assignee
[06:11:48] hello folks :)
[08:17:37] Janis helped me to debug the network policy issue that I had with sidecars, very interesting finding in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/776856/
[08:37:52] ok so the last issue seems gone for the sidecars, I am going to finish the ml-serve-codfw cluster and then I think that we are done
[08:49:09] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Experiment with the Istio TLS mesh - https://phabricator.wikimedia.org/T297612 (elukey) Open→Resolved We have finally our clusters running with the istio sidecars, a long journey but I hope it will pay off.
[08:49:13] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks): Lift Wing proof of concept - https://phabricator.wikimedia.org/T272917 (elukey)
[08:57:01] * elukey bbiab
[09:06:30] good morning :)
[09:10:00] o/
[09:51:51] Morning (barely) :)
[09:53:09] o/
[10:07:19] all right I have sent 4 code reviews related to the change in ip ranges for the prod clusters
[10:07:27] (eqiad/codfw for deployment-charts and puppet)
[10:08:06] I am a little busy tomorrow morning, maybe we can do codfw in the afternoon?
[10:08:45] Sounds good
[10:11:34] All LGTM'd
[10:11:39] thanks!
[10:12:04] I reviewed the ipv6 allocations and a /64 seems enough :D
[10:12:21] so we should be good to go
[10:12:45] But we're not currently configuring v6, right?
[10:14:39] in theory no, calico has the setting but we probably don't use it, never seen an ipv6 assigned to a pod
[10:14:47] Alrighty
[10:32:13] going afk for lunch in a few!
[13:13:19] Machine-Learning-Team, Patch-For-Review: Upgrade ORES to Debian Buster or Bullseye - https://phabricator.wikimedia.org/T303801 (MoritzMuehlenhoff) >>! In T303801#7789064, @elukey wrote: >> My plan was to simply also import scipy from stretch into the component as well, just didn't get to it yet. > > Def...
[13:42:12] Morning all!
[13:55:28] o/
[13:55:35] back home??
[13:55:43] I’m back! And I didn’t get Covid
[14:00:33] nice!
[14:03:51] folks in https://phabricator.wikimedia.org/T303801 Moritz added another idea to try to run ORES on python3.5 and Buster
[14:04:22] we use python3-scipy from debian upstream, the idea would be to add it as wheel to the ORES wheels repo
[14:04:27] and use that
[14:04:41] now in puppet (ores::base) there is this comment
[14:04:41] # Install scipy via debian package so we don't need to build it via pip
[14:04:44] # takes forever and is quite buggy
[14:05:03] we are using scipy 0.18.1, from 2006...
[14:06:11] so given our timeline for ORES, I'd say that we should invest a little bit of time to move its deps/venv to Python 3.7
[14:07:12] maybe a time-boxed spike to verify how feasible it would be
[14:07:31] this probably includes:
[14:07:45] 1) bump deps for revscoring (beneficial for docker images too)
[14:07:59] 2) bump scipy/numpy versions to something more up-to-date
[14:31:15] Picking a newer version of scipy that is still compatible might be tricky. I'd expect a _lot_ of API changes since 2006
[14:33:05] this is a possibility yes, but maybe for our usage it could be fine.. I am worried though that model binaries need to be pickled again
[14:33:11] this is surely more painful
[14:33:37] we can try with the old scipy dependency and see how it goes
[14:33:44] and then think about 3.7
[14:35:09] the only weird feeling that I have is to accumulate more and more tech-debt for something that we'll need to maintain for other months (if not more), that may bite us in the future
[14:35:10] Yes, the two update sets would be orthogonal, if at all possible
[14:35:48] I mean, all of us think the sooner we can turn off ORES, the better, for many different reasons.
[14:35:48] I am pretty sure that newer versions of scipy need 3.6+
[14:36:26] I think if we have to focus our update-work, it should be the interpreter first, and whatever packages that need to be updated to do that
[14:36:32] This is all giving me a panic attack
[14:36:32] https://pypi.org/project/scipy/ last one requires 3.8+ for example :D
[14:36:51] Doing it the other way around will only result in us needing a new interpreter anyway
[14:39:57] the minimal change for the moment is to remove our dependency from Debian's scipy, and add the 0.18.1 wheel to the ores-wheel repository (and possibly test deploy etc..)
[14:40:13] then we could even try to spin up a new VM on Buster, and run the ores venv with 3.7
[14:40:19] Yes, that sounds like a good zeroth step
[14:40:25] we test all the models, if we see fireworks we stop
[14:40:47] I'd expect at least some sparks and smoke, but that might be simple fixes
[14:41:17] the main problem could be pickled/serialized code, but the serialization format/version didn't change from 3.5 to 3.7
[14:41:31] I recall that Amir mentioned a lot of horrors in doing the change anyway
[14:41:46] I admittedly am not super knowledgeable on the difference of post-3.5 Python versions, but I am sure glad it's not a 2.x->3.x change :)
[14:41:49] if 3.7 fails, we can test Moritz' 3.5
[14:42:01] (on Buster)
[14:43:37] Do you think doing it incrementally might be better? going to 3.6 first, I mean
[14:44:40] it may be a problem getting the interpreter, by default we have 3.5 on stretch and 3.7 on buster
[14:44:52] so we can probably test just those
[14:48:24] yeah, might also not be worth it anyway.
[15:39:21] * elukey little break before meetings