[00:07:39] Data-Engineering, Platform Engineering Roadmap: Audit/review pageviews test cases - https://phabricator.wikimedia.org/T305502 (Eevans) See in inline for a few (randomish) comments: >>! In T305502#7837105, @BPirkle wrote: > I did an initial inventory of the current production service test file ([[ https:...
[00:35:02] RECOVERY - Check unit status of monitor_refine_eventlogging_legacy on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_legacy https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[01:02:30] Data-Engineering, Platform Engineering Roadmap: Audit/review pageviews test cases - https://phabricator.wikimedia.org/T305502 (BPirkle) >>! In T305502#7846935, @Eevans wrote: > The AQS endpoints are read-only. The POSTs you're seeing are a RESTBase'ism. What we're doing instead, is using a test environ...
[01:06:54] (PS1) Milimetric: Add more gulp dependencies for fomantic build [analytics/dashiki] - https://gerrit.wikimedia.org/r/779155
[01:07:18] (CR) Milimetric: [V: +2 C: +2] Add more gulp dependencies for fomantic build [analytics/dashiki] - https://gerrit.wikimedia.org/r/779155 (owner: Milimetric)
[07:06:11] (CR) Aqu: "I have fixed all suggestions." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/774383 (https://phabricator.wikimedia.org/T300039) (owner: Aqu)
[07:06:28] (CR) Aqu: [C: +2] Add archiving job for Airflow [analytics/refinery/source] - https://gerrit.wikimedia.org/r/774383 (https://phabricator.wikimedia.org/T300039) (owner: Aqu)
[07:09:49] Hi all, in analytics/refinery/source, are tags auto incremented ? We are currently on 0.1.25, and I am about to merge.
[07:15:54] (Merged) jenkins-bot: Add archiving job for Airflow [analytics/refinery/source] - https://gerrit.wikimedia.org/r/774383 (https://phabricator.wikimedia.org/T300039) (owner: Aqu)
[07:25:34] Hello, I'm trying to give access to `/user/urbanecm/growth_data` in HDFS to anyone (it's used by a superset dashboard I created), but `hdfs dfs -chmod -R u+rx '/user/urbanecm/growth_data'` doesn't seem to change the permissions. Am I doing the magic incorrectly?
[07:25:43] this is what happens https://www.irccloud.com/pastebin/fTJn82Qn/
[07:35:39] doesn't the 'u' in your chmod mean 'the current user', not 'everybone else' (which would be 'o')
[07:36:02] urbanecm: ^
[07:38:00] ehm...that was a stupid mistake
[07:38:05] that did it
[07:47:30] (thanks)
[10:21:41] Data-Engineering, MediaWiki-extensions-EventLogging, Readers-Web-Backlog: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (phuedx)
[10:28:02] (PS6) Snwachukwu: [WIP] Create a Hive to Graphite job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/775376 (https://phabricator.wikimedia.org/T304623)
[10:41:48] just a heads-up, aqs1007 has a SMART alert
[10:42:08] although those device names look odd, I am not sure they're real devices
[10:42:59] oh, iDRAC disks
[12:51:22] aqu: tags are incremented during the release process (handled by jenkins and maven-release-plugin)
[12:51:36] https://boards.greenhouse.io/wikimedia/jobs/4088632?gh_src=bd67b59d1us
[12:59:40] ottomata: Thanks!
[12:59:53] OOPS
[12:59:55] wrong paste
[12:59:56] haha
[13:00:12] aqu: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery-source
[13:18:37] (CR) Mforns: "Left some comments on code formatting." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/775376 (https://phabricator.wikimedia.org/T304623) (owner: Snwachukwu)
[14:08:41] a-team: the weekly deployment train is empty: https://etherpad.wikimedia.org/p/analytics-weekly-train ALL ABOARD! Train leaves in 90 minutes
[14:16:28] Heyy milimetric :]
[14:16:48] we are code reviewing a refinery-source change, that we'd like to deploy today if possible
[14:17:05] I'll ping you as soon as poss
[14:21:23] hey empty trains don't make sense, I'll keep it in the station until you're ready :)
[14:21:31] no rush
[14:25:31] thanks :]
[14:44:01] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:55:17] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:58:16] Hi team, I'm starting the clouddb hosts upgrade for https://phabricator.wikimedia.org/T299480
[15:00:40] razzi: nice. On Manuel's point about our analytics instance, "Not sure if that requires special treatment in order not to affect any service.": you probably know this, but that's just there to help us sqoop at the beginning of the month. The last sqoop is long done, and the next one isn't for another couple weeks, so you're all clear
[15:01:39] mforns: when you have a sec, I'm trying to track down some delayed airflow jobs
[15:01:51] yep, agreed re: the analytics instance milimetric
[15:02:17] milimetric: ok!
[15:02:30] milimetric: we have the airflow sync meeting now, but after!
[15:04:39] k
[15:04:51] ooh, or I can crash :)
[15:05:21] oh no, it's with Gabrielle, nono, yall focus
[15:08:42] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=55df59f3-2152-4064-a9d7-eecc09c55982) set by razzi@cumin1001 for...
[15:11:05] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by razzi@cumin1001 for host clouddb1013.eqiad.wmnet with OS b...
[15:36:09] Data-Engineering, MediaWiki-extensions-EventLogging, Readers-Web-Backlog: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (Jdlrobson) I'm down to delete it. My understanding is we introduced it back in a time when WikimediaEvents was not the defacto location...
[15:36:30] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by razzi@cumin1001 for host clouddb1013.eqiad.wmnet with OS bulls...
[15:40:01] Data-Engineering, Data-Engineering-Kanban, Airflow: [Airflow] Research, discuss and decide on DAG/task dependencies VS. success/failure files (Oozie style) - https://phabricator.wikimedia.org/T301568 (mforns) So, as a recap: * It seems, we need the ability for Airflow jobs to generate success files...
[15:40:17] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (razzi) I forgot to tell netboot to treat these hosts as database hosts, which I have now done in https://gerrit.wikimedia.org/r/c/...
[15:41:34] Analytics: jmads requesting Kerberos password - https://phabricator.wikimedia.org/T250560 (jmads) Resolved→Open Reopening this ticket in case I need a new Kerberos password after shell access is restored.
[15:59:14] Analytics, Analytics-Jupyter, Data-Engineering, Data-Engineering-Kanban: Autocomplete is very slow (unusable) in Newpyter - https://phabricator.wikimedia.org/T290008 (EChetty) p:Medium→Low Marking as Low for now since the work around seems to be working and will fix itself when we get thr...
[16:10:58] Data-Engineering, LDAP-Access-Requests, SRE: Request to add user gmodena to analytics-research-admins group - https://phabricator.wikimedia.org/T305880 (jcrespo) @gmodena Did the access work?
[16:12:41] Data-Engineering, SRE, SRE-Access-Requests: Request to add user gmodena to analytics-research-admins group - https://phabricator.wikimedia.org/T305880 (Zabe)
[16:19:48] Data-Engineering, MediaWiki-extensions-EventLogging, Readers-Web-Backlog: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (Krinkle) I believe the bulk of usage was removed over time by me, due to it encouraging performance anti-patterns. It made it hard to do...
[16:23:57] Data-Engineering, MediaWiki-extensions-EventLogging, Readers-Web-Backlog: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (Milimetric) +1 to delete outright, deprecation would be nice but I think this API is pretty tightly controlled and used by a small set o...
[16:31:25] (PS7) Snwachukwu: [WIP] Create a Hive to Graphite job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/775376 (https://phabricator.wikimedia.org/T304623)
[16:43:05] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by razzi@cumin1001 for host clouddb1013.eqiad.wmnet with OS b...
[17:02:53] Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T264528 (rook) This seems to be working in a fork https://quarry.wmcloud.org/query/63748 I'll close this for now. Though please re-open it if it can be repeated.
[17:03:13] Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T264528 (rook) Open→Declined
[17:12:08] Quarry, Data-Services: SQL requests to DB replicas became work much slower, both from Quarry and from process on Toolforge - https://phabricator.wikimedia.org/T262757 (rook) This looks like it was resolved, but perhaps not marked so. Is there anything else to be done with this ticket?
[17:12:28] Quarry, Data-Services: Unable to use `force index` on replicas due to view layer intervention (Key 'PRIMARY' doesn't exist in table 'page') - https://phabricator.wikimedia.org/T251980 (rook) This looks like it was meant to be closed, but perhaps not marked so. Is there anything else to be done with this...
[17:20:59] (CR) Mforns: [C: +2] "LGTM!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/775376 (https://phabricator.wikimedia.org/T304623) (owner: Snwachukwu)
[17:24:39] mforns: I have a haircut appointment now, but I'll ping you when I'm back, we can talk deploy and those airflow jobs
[17:25:05] ok! milimetric we can also pair on the train if you want :]
[17:25:41] (PS1) Luke Bowmaker: Fixed requirement to user_id from performer [schemas/event/secondary] - https://gerrit.wikimedia.org/r/779517
[17:27:08] (CR) jerkins-bot: [V: -1] Fixed requirement to user_id from performer [schemas/event/secondary] - https://gerrit.wikimedia.org/r/779517 (owner: Luke Bowmaker)
[17:30:07] (Merged) jenkins-bot: [WIP] Create a Hive to Graphite job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/775376 (https://phabricator.wikimedia.org/T304623) (owner: Snwachukwu)
[17:30:22] (PS1) NOkafor: updated usage comment [analytics/refinery] - https://gerrit.wikimedia.org/r/779521
[17:30:24] (PS1) NOkafor: Merge branch 'master' of ssh://gerrit.wikimedia.org:29418/analytics/refinery [analytics/refinery] - https://gerrit.wikimedia.org/r/779522
[17:30:26] (PS1) NOkafor: updated usage from hive to spark2-sql Bug T300025, T302875 [analytics/refinery] - https://gerrit.wikimedia.org/r/779523
[17:31:10] Data-Engineering, MediaWiki-extensions-EventLogging, Readers-Web-Backlog: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (Ottomata) +1
[17:37:20] Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset, Patch-For-Review: Upgrade Superset to 1.4.2 - https://phabricator.wikimedia.org/T304972 (kzimmerman) A few of us (@mpopov, @MNeisler , @Iflorez , @cchen , @Mayakp.wiki, and I) will each test a chart or dashboard in stagin...
[18:47:24] (Abandoned) Luke Bowmaker: Fixed requirement to user_id from performer [schemas/event/secondary] - https://gerrit.wikimedia.org/r/779517 (owner: Luke Bowmaker)
[19:16:02] (CR) Ottomata: Image Suggestions feature schema (9 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/779052 (owner: Luke Bowmaker)
[19:54:30] Data-Engineering-Radar, Data-Services, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by razzi@cumin1001 for host clouddb1013.eqiad.wmnet with OS bullseye completed: - cloud...
[20:43:59] about to deploy refinery source
[20:47:48] (PS1) Milimetric: Update changelog.md to v0.1.26 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/779552
[20:48:02] (CR) Milimetric: [V: +2 C: +2] Update changelog.md to v0.1.26 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/779552 (owner: Milimetric)
[20:48:33] Starting build #103 for job analytics-refinery-maven-release-docker
[21:00:56] Data-Engineering-Radar, Data-Services, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (razzi)
[21:02:36] Project analytics-refinery-maven-release-docker build #103: SUCCESS in 14 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/103/
[21:03:37] Starting build #62 for job analytics-refinery-update-jars-docker
[21:04:09] (PS1) Maven-release-user: Add refinery-source jars for v0.1.26 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/779555
[21:04:10] Project analytics-refinery-update-jars-docker build #62: SUCCESS in 33 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/62/
[21:14:39] (CR) Milimetric: [V: +2 C: +2] Add refinery-source jars for v0.1.26 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/779555 (owner: Maven-release-user)
[22:12:13] !log deployed and synced refinery-source 0.1.26 to hdfs
[22:12:15] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[22:16:17] Data-Engineering, Data-Engineering-Kanban, Product-Analytics: Change ownership of wmf_product.new_editors to analytics-product - https://phabricator.wikimedia.org/T305109 (Mayakp.wiki) Open→Resolved
[22:39:04] (PS1) NOkafor: Merge branch 'master' of ssh://gerrit.wikimedia.org:29418/analytics/refinery Change-Id: If55278116320fb91c2475cb6d0fecd93048af221 [analytics/refinery] - https://gerrit.wikimedia.org/r/779564
[22:42:28] (PS1) NOkafor: update usage from hive to spark2-sql Bug: T300025, T302875 [analytics/refinery] - https://gerrit.wikimedia.org/r/779565 (https://phabricator.wikimedia.org/T300025)
[22:46:33] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=b835e643-0d45-43b4-9fdc-04e643305c67) set by razzi@cumin1001 for...
[22:48:24] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by razzi@cumin1001 for host clouddb1014.eqiad.wmnet with OS b...
[23:18:00] Quarry, Data-Services: SQL requests to DB replicas became work much slower, both from Quarry and from process on Toolforge - https://phabricator.wikimedia.org/T262757 (MBH) Open→Resolved a:MBH Yeah, now such request executed in 34.38 seconds.
[23:23:18] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by razzi@cumin1001 for host clouddb1014.eqiad.wmnet with OS bulls...
[23:27:27] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (razzi)
[23:28:22] Data-Engineering-Radar, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Upgrade clouddb* hosts to Bullseye - https://phabricator.wikimedia.org/T299480 (razzi) Ok after some help with wmf-pt-kill in https://phabricator.wikimedia.org/T305974 and a patch to update netboot for other cl...
[23:39:02] Quarry, Data-Services: Long-running Quarry query (querry?) produces strangely incorrect results - https://phabricator.wikimedia.org/T135087 (bd808) Open→Declined 6 years is long enough to rot