[01:25:38] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (razzi) @JAllemandou could you explain what happened with safe mode and the yarn rmadmin? Maybe put a small comment here and then we can crea...
[01:28:47] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (razzi) Finally this task is done! Mar - July 2021 🥲
[05:01:28] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (Marostegui) I have switched m3-master from dbproxy1020 to dbproxy1016: https://gerrit.wikimedia.org/r/705789
[05:02:00] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (Marostegui)
[06:01:38] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (elukey) First of all, great work :) The problem should be the following: ` == Yarn view: 2021-07-20 16:36:32,071 WARN org.apache.hadoop.h...
[06:08:21] Analytics: Deprecate profile::analytics::cluster::users - https://phabricator.wikimedia.org/T287063 (elukey)
[06:08:53] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (elukey) Last but not least: T287063
[06:09:32] joal: bonjour, if I forgot anything please let me know :) --^
[07:05:30] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (MoritzMuehlenhoff)
[08:18:23] Analytics-Radar, WMDE-Templates-FocusArea, Patch-For-Review, WMDE-TechWish (Sprint-2021-02-03), and 2 others: Add missing normalization to CodeMirror Grafana board - https://phabricator.wikimedia.org/T273748 (Lena_WMDE)
[08:33:55] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (MoritzMuehlenhoff)
[08:36:08] Analytics, DBA, Infrastructure-Foundations, SRE, and 3 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (cmooney) Open→Resolved
[08:41:40] great writing elukey - thank you a lot :)
[08:51:32] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (cmooney)
[08:52:41] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (cmooney)
[09:40:25] https://issues.apache.org/jira/browse/HADOOP-16795
[09:40:42] so hadoop 3.3 supports java 11, but the bytecode needs to be java-8
[09:53:16] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (BTullis) I was going to mention the small snag that an-master1001 prompted during the installation for which partman recipe to use. ...but...
[10:13:15] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (elukey) @BTullis the partman reuse script that we use auto-selects everything in d-i, the operator has only to confirm the partitions layout...
[10:23:26] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-razzi: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (BTullis) Oh I see. So the "Enter" might just have been interpreted as "Return to partman menu". In that case, everything would be fine, as y...
[12:32:05] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Update Spicerack cookbooks to follow the new class API conventions - https://phabricator.wikimedia.org/T269925 (BTullis)
[13:06:53] Analytics, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (fgiunchedi)
[13:52:49] Analytics: Purge gobblin files - https://phabricator.wikimedia.org/T287084 (JAllemandou)
[13:53:09] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover - https://phabricator.wikimedia.org/T273642 (BTullis) For the DNS discovery, will we need to allocate a new IP addresses for these services? Will I need to use LVS? * `analy...
[13:54:04] Analytics, Analytics-Kanban: When gobblin fails, we should know about it - https://phabricator.wikimedia.org/T286559 (JAllemandou)
[14:20:33] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover - https://phabricator.wikimedia.org/T273642 (jbond) > For the DNS discovery, will we need to allocate a new IP addresses for these services? Will I need to use LVS? You don't...
[14:54:58] (PS1) Mforns: Update languages.json to include newly translated languages [analytics/wikistats2] - https://gerrit.wikimedia.org/r/705926
[14:55:55] fdans: if you're here, can you please have a quick look to: https://gerrit.wikimedia.org/r/c/analytics/wikistats2/+/705926
[14:56:11] fdans: I tested that and all languages seem to work fine!
[14:56:59] (CR) jerkins-bot: [V: -1] Update languages.json to include newly translated languages [analytics/wikistats2] - https://gerrit.wikimedia.org/r/705926 (owner: Mforns)
[14:57:15] :(
[14:59:01] fdans: is this supposed to fail, it complains about a library, but I just changed a language config file...
[15:27:58] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover - https://phabricator.wikimedia.org/T273642 (BTullis) >> I can do the analytics-test-presto just by giving it the single IP address of an-test-coord1001 - because there's on...
[15:47:12] Analytics-Clusters, Analytics-Kanban, Patch-For-Review, User-MoritzMuehlenhoff: Reduce manual kinit frequency on stat100x hosts - https://phabricator.wikimedia.org/T268985 (BTullis) I have updated the patch with comments from @elukey so that a parameter `enable_autorenew` is present with a defaul...
[15:53:24] mforns: I can take it from here... it might be to do with semantic
[15:53:32] thank you so much for testing!
[16:15:24] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover - https://phabricator.wikimedia.org/T273642 (jbond) > So I agree that it is overkill to do it this way, but it would be at least consistent with production. ack understood, n...
[16:44:27] Analytics: Create aggregate alarms for Hadoop daemons running on worker nodes - https://phabricator.wikimedia.org/T287027 (BTullis) We discussed this in the SRE sync and agreed that this should be high priority, given the level of IRC logspam caused by a node manager failure. I'm happy to take this on if you...
[16:57:51] Analytics-Clusters, Analytics-Kanban: Disk filling up on `/` on an-coord1001 - https://phabricator.wikimedia.org/T279304 (BTullis)
[16:57:59] (CR) Nettrom: [C: +1] "This looks good to me!" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/704402 (https://phabricator.wikimedia.org/T278115) (owner: MewOphaswongse)
[17:09:17] (CR) Nettrom: [C: +1] "Looks good to me!" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/705493 (https://phabricator.wikimedia.org/T268708) (owner: MewOphaswongse)
[17:24:10] (PS1) Joal: Don't use Gobblin lock but rather yarn check [analytics/refinery] - https://gerrit.wikimedia.org/r/705970 (https://phabricator.wikimedia.org/T286559)
[17:24:43] (CR) Joal: "Warning: not tested!" [analytics/refinery] - https://gerrit.wikimedia.org/r/705970 (https://phabricator.wikimedia.org/T286559) (owner: Joal)
[17:59:02] I'm taking the afternoon off for a mental health half day! See you all tomorrow
[18:40:55] Analytics-EventLogging, Analytics-Radar, Discovery: '.event.pageViewId' should be string, '.event.subTest' should be string, '.event.searchSessionId' should be string - https://phabricator.wikimedia.org/T286814 (Krinkle)
[20:46:55] Analytics: [EventGate] Failures when getting stream config from MediaWiki API - https://phabricator.wikimedia.org/T286793 (mforns) Looking a bit more into this I just saw that there's a big amount of requests to streamconfigs per hour: ` select uri_query, count(1) as freq from wmf_raw.webrequest where year=2...
[20:55:17] Analytics, Growth-Team, Product-Analytics: Add geolocation information to Growth schemas - https://phabricator.wikimedia.org/T287121 (nettrom_WMF)
[20:57:03] Analytics, Growth-Team, Product-Analytics: Add geolocation information to Growth schemas - https://phabricator.wikimedia.org/T287121 (nettrom_WMF) From what I've been able to find, this is the first time this has been requested, and so I'm unsure what exactly to ask for and how to do this. I'm hoping...
[21:27:36] Analytics: [EventGate] Failures when getting stream config from MediaWiki API - https://phabricator.wikimedia.org/T286793 (Mholloway) >>! In T286793#7228705, @mforns wrote: > Looking a bit more into this I just saw that there's a big amount of requests to streamconfigs per hour: These requests are from rece...