[12:37:44] hi all!
[12:40:47] Hi mforns
[12:43:31] Hiya. Good morning to you.
[12:43:54] :]
[12:50:43] Analytics-Radar, Product-Analytics, Product-Data-Infrastructure, Language-Team (Language-2021-April-June), MW-1.37-notes (1.37.0-wmf.11; 2021-06-21): All events in the contenttranslationabusefilter data stream failing validation - https://phabricator.wikimedia.org/T283872 (Pginer-WMF) The das...
[14:19:01] (PS4) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:19:08] (CR) Ottomata: "Still need to test env var problem" (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232) (owner: Ottomata)
[14:19:47] (PS5) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:20:15] (PS6) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:20:40] (PS7) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:22:08] (PS8) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:41:55] (PS9) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:41:57] ah joal, i never actually set those env vars!
[14:41:59] try now ^^
[14:42:38] (PS10) Ottomata: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232)
[14:43:14] ottomata: will test!
[14:43:51] ottomata: I'm waiting for your approval to release the jar to archiva - 2 things - code (maybe), and jar name
[14:46:21] joal! did you add me as reviewer? i don't see it
[14:48:00] Ahhh- meh - I forgot we agreed on me adding the code as a review - doing it now (too many things in mind lately)
[14:52:45] ottomata: there is a weird thing happening with gerrit
[14:53:05] joal is it just the code in this dir?
[14:53:05] https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/gobblin/+/49c0a180e6929daaf844d1d5713e535bb60cf370/gobblin-modules/gobblin-wmf/src/main/java/org/apache/gobblin/wmf
[14:53:28] ottomata: mostly - some minor changes in other places but nothing important
[14:54:21] ok looks fine to me on a brief pass
[14:54:25] let's not worry about gerrit
[14:54:30] what is the jar name you want me to double check?
[14:54:45] (I assume you and dan reviewed each other's code ya?)
[14:55:24] ottomata: ack - jar name is currently: gobblin-wmf-0.16.0-wmf1-all.jar
[14:55:38] we reviewed our code mostly yes
[14:55:55] why not gobblin-0.16.0-wmf1-all.jar?
[14:56:04] also, all? should it be -shaded for consistency?
[14:56:12] The first -wmf- is for the module, the second for wmf version - maybe we don't need the second?
[14:56:22] hmm, i see
[14:56:26] no i see
[14:56:26] ok
[14:56:35] the second is good because we do change the code, right?
[14:56:40] in the wmf branch?
[14:56:45] elsewhere than the wmf module?
[14:56:56] We might change the code in the wmf branch before the gobblin version evolves, yes
[14:57:18] ok, so gobblin-wmf is the module name
[14:57:25] correct
[14:57:31] 0.16.0-wmf1 is the version
[14:57:32] +1
[14:57:38] and -all vs -shaded?
[14:57:48] ack - Will try to see how I can change the -all for -shaded
[14:58:05] joal: no big deal, i've just seen that convention elsewhere
[14:58:21] ottomata: so have I - gradle builds it this way by default
[14:58:26] hm
[14:59:14] ottomata: in ERC meeting now, will read but responses will be async
[14:59:49] ok joal, i'm fine with -all if -shaded is hard and not correct (was just looking up the difference, apparently shaded is a special thing, not just all deps)
[14:59:59] so, +1 to releasing however you think is best with that name
[15:00:00] wow ok
[15:00:10] https://stackoverflow.com/questions/33779185/what-are-the-differences-between-uberjar-fatjar-and-shadowjar-in-gradle
[15:00:39] Wow - interesting ottomata
[15:00:45] let's keep it with -all
[15:00:48] I'll release as is
[15:01:15] k
[15:06:46] ottomata: something I forgot about - I'll need to find the way for gradle to generate the pom.xml for us - I think it's doable
[15:11:00] Analytics, Better Use Of Data, Metrics-Platform, Performance-Team, and 3 others: Switch mw.user.sessionId back to session-cookie persistence - https://phabricator.wikimedia.org/T223931 (DAbad) Need to reach out to Legal to figure out the next steps to determine if something needs to be done
[15:14:15] Analytics, Better Use Of Data, Metrics-Platform, Performance-Team, and 3 others: Switch mw.user.sessionId back to session-cookie persistence - https://phabricator.wikimedia.org/T223931 (DAbad) a:jlinehan
[15:17:43] Analytics-EventLogging, Analytics-Radar, Metrics-Platform, Product-Data-Infrastructure, Vector (Vector (Tracking)): EventLogging revision popup gets hidden behind content in Vector - https://phabricator.wikimedia.org/T282550 (DAbad) Open→Declined This isn't critical to MVP of Metrics...
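For reference, the naming settled earlier in this exchange reads gobblin-wmf-0.16.0-wmf1-all.jar as module gobblin-wmf, version 0.16.0-wmf1, and an -all suffix. A minimal sketch of that breakdown follows. The helper name split_jar_name and the parsing rule (version starts at the first hyphen followed by a digit, suffix is the last hyphen-separated word) are assumptions made for illustration, not anything used by the gobblin build.

```python
import re
from typing import NamedTuple


class JarName(NamedTuple):
    module: str
    version: str
    suffix: str


# Assumed rule: the version begins at the first "-<digit>" and the suffix is
# the final "-<letters>" before ".jar" (for example "all" or "shaded").
JAR_PATTERN = re.compile(
    r"^(?P<module>.+?)-(?P<version>\d[\w.]*(?:-\w+)*?)-(?P<suffix>[A-Za-z]+)\.jar$"
)


def split_jar_name(filename: str) -> JarName:
    """Split a jar file name into module, version and suffix (hypothetical helper)."""
    match = JAR_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised jar name: {filename}")
    return JarName(match["module"], match["version"], match["suffix"])


if __name__ == "__main__":
    print(split_jar_name("gobblin-wmf-0.16.0-wmf1-all.jar"))
```

With this rule the first -wmf- lands in the module name and the second in the version, which matches the reading agreed in the chat.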
[15:23:23] Analytics-Radar, Better Use Of Data, Event-Platform, Metrics-Platform, Product-Data-Infrastructure: Explore sending batches of events from EPC libraries - https://phabricator.wikimedia.org/T239996 (DAbad) p:Lowest→Medium a:jlinehan→Mholloway This is work that we can do. Priori...
[15:38:03] razzi, ottomata - o/ there is a warning that started days ago in icinga about the hive server's heap size usage, which is now around 90%. Not a big issue, it may go down after the roll restarts for the openjdk upgrade, but the graphs are a little weird
[15:38:27] https://grafana.wikimedia.org/d/000000379/hive?orgId=1&from=now-60d&to=now
[15:38:32] this is the 60d view
[15:39:20] the old gen grows steadily and also the MetaSpace (that shouldn't be on the heap but it is weird nonetheless)
[15:39:50] I would roll restart the jvms on the coordinators asap (with DNS failover it should be easy) and keep monitoring what happens over the next days
[15:40:13] 90% of usage is fine but if more requests land on the hive server there is the risk of OOM
[15:40:29] (it may also be time to expand the heap size a little to account for extra load)
[15:43:28] joal: oh
[15:43:35] i mean, i guess we don't really need the pom?
[15:43:39] or. do we?
[15:43:49] ottomata: hm - not sure
[15:45:06] elukey: ack
[15:45:17] ottomata: we probably don't need it as the jar is to be used standalone - I wonder if archiva will let us not upload it
[15:45:29] it will joal, you can upload anything
[15:45:30] not even just jars
[15:45:38] ottomata: will try just after meetings
[15:45:40] k
[16:28:49] btullis: iiuc you should have +2 rights now on puppet
[16:29:05] so you can add yourself to the ops posix group in the admin module data.yaml
[16:30:32] ottomata: a formal pass from the SRE meeting may be needed (just a presentation etc.. I assume), we could add Ben to analytics-admins for these days (to get access to nodes etc..)
[16:31:23] elukey: apparently it's on his onboarding checklist and other SREs are involved? but that would make sense too, I think what I learned from razzi's onboarding was that the default was if they got hired as an SRE then they should get root
[16:32:45] ottomata: ahhhh okok then my bad, I thought that we still waited for a formal presentation during the SRE meeting (more a hello to everybody, not really to block people :)
[16:35:21] (PS2) Joal: Add webrequest and netflow gobblin jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/702075 (https://phabricator.wikimedia.org/T271232)
[16:39:40] ottomata: thanks. I just tried to push a branch and got a permission denied. [remote rejected] add_btullis_to_ops_group -> add_btullis_to_ops_group (prohibited by Gerrit: update for creating new commit object not permitted)
[16:40:35] btullis: how did you push? (did you use git review?)
[16:41:05] btullis: you can try with "git push origin HEAD:refs/for/production"
[16:41:15] https://www.mediawiki.org/wiki/Gerrit/Tutorial#Prepare_to_work_with_Gerrit
[16:41:17] it should create the code review (basically like git review)
[16:41:40] Oh, OK. Thanks. I will try git review first.
I just tried 'git push -u origin add_btullis_to_ops_group'
[16:46:57] 'remote: SUCCESS' https://gerrit.wikimedia.org/r/c/operations/puppet/+/702424
[16:48:20] yeah gerrit is weird
[16:48:34] btullis: if you do
[16:48:38] Bug: T285754
[16:48:40] in the commit message
[16:48:53] a gerrit bot will automatically link it to the phab ticket
[16:49:04] you should be able to git commit --amend
[16:49:06] edit the message
[16:49:12] then git review again
[16:49:18] and it will upload another patch
[16:49:26] using the same Change-Id
[16:52:22] Ah, Jenkins said: Invalid commit message
[16:54:45] btullis: if you have docker on your laptop, ./utils/run_ci_locally.sh is very useful (runs the same jenkins things locally)
[16:54:54] (it is in the puppet repo)
[16:56:01] (in this case there is an extra line between Bug and Change-Id)
[16:57:49] Ah yes, thanks. I had skimmed this: https://wikitech.wikimedia.org/wiki/Puppet_coding#Testing_a_patch but I hadn't seen anything obvious about commit message style and linting.
[17:00:11] btullis: yes yes those checks are extremely strict sometimes :(
[17:00:18] btullis: you could also skim https://wikitech.wikimedia.org/wiki/Puppet
[17:00:28] https://www.mediawiki.org/wiki/Gerrit/Commit_message_guidelines is about commit messages, it was written for mediawiki development but close enough for puppet too
[17:00:34] :)
[17:00:49] let's verify this one with moritzm too, moritzm could you +1 https://gerrit.wikimedia.org/r/c/operations/puppet/+/702424
[17:02:00] btullis: mostly this part
[17:02:00] https://wikitech.wikimedia.org/wiki/Puppet#Updating_operations/puppet
[17:02:03] about puppet-merge
[17:06:19] Cool, thanks. Looks good. I'm sure it will become second nature. Will install docker for local testing too.
[17:15:40] It's getting towards the end of my day, so I'll sign off and see you all tomorrow.
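The "Invalid commit message" failure discussed above came down to a blank line between the Bug: and Change-Id: lines. A minimal sketch of that one check is below, assuming the rule is simply that footer lines form a single contiguous block at the end of the message. The function name footer_is_contiguous and the Change-Id value are made up for illustration, and this is not the actual Jenkins check, which likely enforces more than this.

```python
import re

# A footer ("trailer") line looks like "Bug: T285754" or "Change-Id: I0123abc".
TRAILER = re.compile(r"^[A-Za-z][A-Za-z-]*: \S")
# The two footers mentioned in the chat.
KNOWN_FOOTER = re.compile(r"^(Bug|Change-Id): ")


def footer_is_contiguous(message: str) -> bool:
    """Return True if Bug:/Change-Id: lines all sit in the final block of the message.

    Hypothetical helper: it only detects a footer line stranded above the
    last block, for example by a stray blank line.
    """
    lines = message.rstrip("\n").split("\n")
    # Walk backwards over the final run of trailer-looking lines.
    end_of_body = len(lines)
    while end_of_body > 0 and TRAILER.match(lines[end_of_body - 1]):
        end_of_body -= 1
    # Any known footer left above that run was separated from it.
    return not any(KNOWN_FOOTER.match(line) for line in lines[:end_of_body])


if __name__ == "__main__":
    good = "Add btullis to ops group\n\nBug: T285754\nChange-Id: I0123456789abcdef\n"
    bad = "Add btullis to ops group\n\nBug: T285754\n\nChange-Id: I0123456789abcdef\n"
    print(footer_is_contiguous(good), footer_is_contiguous(bad))
```

After fixing the message with git commit --amend, running git review again uploads a new patch set under the same Change-Id, as described in the chat.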
[17:34:08] (CR) Joal: "Last round of comments - I think we're good after that" (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232) (owner: Ottomata)
[17:36:26] Bye btullis :)
[17:39:42] ok laters!
[17:40:02] ottomata: let me know if you wish me to take over your patch and implement my last comments
[17:40:19] ottomata: I'm running my last bits of testing, it all works :)
[17:40:28] joal feel free!
[17:40:31] please amend
[17:40:33] i'm working on puppett not
[17:40:34] now
[17:40:38] puppet gobblni
[17:40:40] ack - will amend
[17:43:07] joal we should make a webrequest_test_text to start with
[17:43:13] and do that on test cluster
[17:43:14] it uses a different topic
[17:43:36] Yessir - will add that in the jobs patch
[17:43:58] or do you wish me to separate patches maybe ottomata?
[17:44:12] (PS11) Joal: Add bin/gobblin wrapper and initial gobblin/ common properties files [analytics/refinery] - https://gerrit.wikimedia.org/r/701463 (https://phabricator.wikimedia.org/T271232) (owner: Ottomata)
[17:44:24] separate maybe, let's just start with that one at a time
[17:44:57] ack
[17:45:18] (PS3) Joal: Add webrequest and netflow gobblin jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/702075 (https://phabricator.wikimedia.org/T271232)
[17:52:07] (PS1) Joal: Add webrequest_test gobblin job [analytics/refinery] - https://gerrit.wikimedia.org/r/702431 (https://phabricator.wikimedia.org/T271232)
[17:52:19] ottomata: --^
[17:52:50] (CR) Ottomata: "For consistency with existing camus job lets call this webrequest_test_text" [analytics/refinery] - https://gerrit.wikimedia.org/r/702431 (https://phabricator.wikimedia.org/T271232) (owner: Joal)
[17:53:09] joal: what is the extract.namespace?
[17:53:16] and job.group?
[17:53:20] (is job.group a MR property?)
[17:54:00] oh wait, worry
[17:54:02] sorry
[17:54:09] the camus job in test cluster is just called 'webrequest'
[17:54:10] hm
[17:54:16] I don't think job.group is a MR prop - namespace can be used to refine how data is stored in folders; group I don't know, I think it's to facilitate job organization when running in a specific gobblin mode
[17:54:19] let's do it the same for this, i guess, right?
[17:54:36] oh but
[17:54:37] right.
[17:54:43] ottomata: we'll have 2 webrequest jobs then
[17:54:46] nevermind joal i think webrequest_text is correct
[17:54:47] right.
[17:54:52] webrequest_test***
[17:55:03] ack :) I had actually for once thought about that :)
[17:55:04] (CR) Ottomata: "Oh, nevermind, webrequest_test is probably correct." [analytics/refinery] - https://gerrit.wikimedia.org/r/702431 (https://phabricator.wikimedia.org/T271232) (owner: Joal)
[17:55:09] :)
[17:55:36] joal: how often should this job run?
[17:55:45] let me check
[17:56:39] every 10 minutes is the current standard
[17:56:42] ottomata: --^
[17:57:00] right, wasn't sure if that needed to be changed
[17:57:02] ok cool
[17:57:04] ottomata: data is very small, we can make it longer if you prefer
[17:57:14] but it shouldn't be any problem
[17:57:15] no that is good
[17:57:32] joal: so the bin/gobblin wrapper works for ya?
[17:57:46] it does indeed ottomata
[17:58:04] ok, it's getting late for you ya?
[17:58:08] ottomata: I had to put the gobblin jar at the beginning of the hadoop path and that's it :)
[17:58:14] we could merge that and webrequest_test.pull, deploy to test cluster
[17:58:21] run manually
[17:58:24] then merge the puppet and try?
[17:58:44] ottomata: I wish to start the upload of the artifact to archiva - this will take ages for me (bad connection)
[17:58:50] oh right
[17:58:51] ok
[17:59:02] ok start that and we can actually try it out tomorrow
[17:59:04] ottomata: I also need some help with gerrit - mind a quick batcave?
[17:59:10] sure
[18:00:59] (PS1) Joal: Rename parent package to follow wmf convention [analytics/gobblin] (wmf) - https://gerrit.wikimedia.org/r/702436 (https://phabricator.wikimedia.org/T271232)
[18:03:03] (CR) Ottomata: [C: +2] Rename parent package to follow wmf convention [analytics/gobblin] (wmf) - https://gerrit.wikimedia.org/r/702436 (https://phabricator.wikimedia.org/T271232) (owner: Joal)
[18:03:11] (CR) Ottomata: [V: +2 C: +2] Rename parent package to follow wmf convention [analytics/gobblin] (wmf) - https://gerrit.wikimedia.org/r/702436 (https://phabricator.wikimedia.org/T271232) (owner: Joal)
[18:06:25] ottomata: shall I add a tag to git for this release?
[18:08:30] we can do that tomorrow :)
[18:08:49] Upload of jar to archiva started - ending my day now - later!
[18:12:13] joal: yeah probably a good idea!
[18:12:22] you can actually just do that in gerrit UI i think
[18:12:28] laters!
[18:19:47] !log unmount and remount /mnt/hdfs on an-test-client1001 for java security update
[18:19:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:20:40] ottomata: I see some java processes you started on an-test-client1001 that have the old java files open, such as: `java 26361 otto DEL REG 254,1 8394165 /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/dnsns.jar` (from `sudo lsof -Xd DEL`), do you know what to do about those?
[18:21:09] ottomata: tag added :)
[18:21:32] Working on https://phabricator.wikimedia.org/T283067 on the hadoop test cluster
[18:26:19] uh no, looking
[18:27:09] oh it might be a running spark shell
[18:27:43] razzi: killed some stuff
[18:37:34] (CR) Ottomata: "Joal think we can merge this?"
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/676075 (owner: DCausse)
[18:42:41] (PS1) Milimetric: Switch to codfw events until August [analytics/refinery] - https://gerrit.wikimedia.org/r/702440
[18:44:25] (CR) Milimetric: [V: +2 C: +2] "self-merging because it's trivial and I want to deploy. I thought I remembered some other automatic fix, but I don't see anything in the " [analytics/refinery] - https://gerrit.wikimedia.org/r/702440 (owner: Milimetric)
[18:52:30] (PS1) Milimetric: Revert "Switch to codfw events until August" [analytics/refinery] - https://gerrit.wikimedia.org/r/702389
[18:52:39] (PS2) Milimetric: Revert "Switch to codfw events until August" [analytics/refinery] - https://gerrit.wikimedia.org/r/702389
[18:52:43] (CR) Milimetric: [V: +2 C: +2] Revert "Switch to codfw events until August" [analytics/refinery] - https://gerrit.wikimedia.org/r/702389 (owner: Milimetric)
[21:39:15] (PS1) MewOphaswongse: Add postedit-task-refresh to helppanel schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/702472 (https://phabricator.wikimedia.org/T272664)
[21:40:32] (PS2) MewOphaswongse: Add postedit-task-refresh to analytics/legacy/helppanel [schemas/event/secondary] - https://gerrit.wikimedia.org/r/702472 (https://phabricator.wikimedia.org/T272664)
[21:51:19] (CR) MewOphaswongse: [C: +1] link_suggestion_interaction: Add outdatedsuggestions_dialog interface [schemas/event/secondary] - https://gerrit.wikimedia.org/r/701376 (https://phabricator.wikimedia.org/T283109) (owner: Kosta Harlan)