[00:37:04] PROBLEM - Check unit status of monitor_refine_eventlogging_legacy on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_eventlogging_legacy https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[05:29:18] Data-Engineering: Deprecate GeoIP Legacy Download - https://phabricator.wikimedia.org/T303464 (odimitrijevic)
[05:32:31] Data-Engineering: Migrate to MaxMind GeoIP2 - https://phabricator.wikimedia.org/T302989 (odimitrijevic) Open→Declined Data engineering already uses GeoIP2 datasets.
[05:38:29] Data-Engineering: Disable GeoIP Legacy Download - https://phabricator.wikimedia.org/T303464 (odimitrijevic)
[05:39:51] Data-Engineering, SRE, Traffic, Trust-and-Safety, serviceops: Disable GeoIP Legacy Download - https://phabricator.wikimedia.org/T303464 (odimitrijevic)
[10:12:13] Data-Engineering, Data-Engineering-Kanban, Airflow, Data-Catalog: Complete monitoring setup of datahubsearch nodes - https://phabricator.wikimedia.org/T302818 (BTullis)
[10:16:01] Data-Engineering, Data-Engineering-Kanban, Data-Catalog: Complete monitoring setup of datahubsearch nodes - https://phabricator.wikimedia.org/T302818 (BTullis) All checks are green, now that the prometheus exporter has been fixed. Marking this ticket as done.
[10:20:20] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Patch-For-Review: Define LVS load-balancing for OpenSearch cluster - https://phabricator.wikimedia.org/T301458 (BTullis) I have moved this to the `monitoring_setup` state, so the cluster will be monitored by Icinga, but it will not page. I...
[11:46:59] RECOVERY - Check unit status of monitor_refine_eventlogging_legacy on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_legacy https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:51:58] Data-Engineering, MediaWiki-extensions-EventLogging: Generate $wgEventLoggingSchemas from $wgEventStreams - https://phabricator.wikimedia.org/T303602 (Ottomata) Okay so I do have thoughts! We made `$wgEventLoggingStreamNames` before we talked about and decided to do https://wikitech.wikimedia.org/wiki/E...
[11:52:35] Data-Engineering, MediaWiki-extensions-EventLogging: Generate $wgEventLoggingSchemas from $wgEventStreams - https://phabricator.wikimedia.org/T303602 (Ottomata) OH WAIT, this is exactly what you are proposing! But without bothering to make the EventStreamConfig API do it. Okay great!
[11:55:06] Data-Engineering-Kanban, LDAP-Access-Requests: Grant Access to LDAP wmf group for NOkafor - https://phabricator.wikimedia.org/T303512 (Ottomata) Approved!
[11:58:11] Data-Engineering-Kanban, SRE, SRE-Access-Requests: Requesting access to DataEngineering Team Resources for NOkafor - https://phabricator.wikimedia.org/T303516 (BTullis)
[11:58:44] Data-Engineering-Kanban, LDAP-Access-Requests: Grant Access to LDAP wmf group for NOkafor - https://phabricator.wikimedia.org/T303512 (BTullis)
[12:04:44] Data-Engineering-Kanban, LDAP-Access-Requests: Grant Access to LDAP wmf group for NOkafor - https://phabricator.wikimedia.org/T303512 (BTullis) I have added Njideka to the wmf group in LDAP. ` btullis@mwmaint1002:~$ ldapsearch -x cn=wmf|grep nokafor btullis@mwmaint1002:~$ sudo modify-ldap-group wmf Sear...
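For context on the LDAP change quoted above, a minimal sketch of the verify-then-modify pattern it follows, assuming a maintenance host such as mwmaint1002 and the `wmf` group from the ticket; the username grep is purely illustrative:

    # confirm the user is not yet in the group (no output expected)
    ldapsearch -x cn=wmf | grep nokafor
    # add the member using the modify-ldap-group helper shown in the ticket
    sudo modify-ldap-group wmf
    # re-run the search; the new uid should now appear in the group entry
    ldapsearch -x cn=wmf | grep nokafor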
[12:11:56] Data-Engineering-Kanban, SRE, SRE-Access-Requests: Requesting access to DataEngineering Team Resources for NOkafor - https://phabricator.wikimedia.org/T303516 (BTullis) LDAP membership of the `wmf` group has been added in T303512. I have created the kerberos principal. ` btullis@krb1001:~$ sudo ma...
[12:15:15] Data-Engineering, Data-Engineering-Kanban, Data-Catalog: Complete monitoring setup of datahubsearch nodes - https://phabricator.wikimedia.org/T302818 (BTullis) a:razzi→BTullis
[12:17:14] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Patch-For-Review: Define LVS load-balancing for OpenSearch cluster - https://phabricator.wikimedia.org/T301458 (BTullis) The monitoring check in Icinga for this service is now fixed.
[12:51:56] Data-Engineering-Radar, Growth-Team, MediaWiki-extensions-GuidedTour: Finish decommissioning the legacy GuidedTour schemas - https://phabricator.wikimedia.org/T303712 (phuedx)
[13:00:30] (PS9) Ottomata: [WIP] Add prometheus metrics reporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:01:44] (CR) jerkins-bot: [V: -1] [WIP] Add prometheus metrics reporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:18:28] gehel: o/
[13:18:35] i'm trying to enable CheckStyle-IDEA
[13:18:42] to work with the discovery-parent-pom stuff
[13:18:50] i've got the plugin installed and i can see where to enable it
[13:19:02] but, afaict we don't have a checkstyle.xml config file?
[13:19:59] OH WAIT i found docs
[13:20:06] https://github.com/wikimedia/wikimedia-discovery-discovery-parent-pom#maven-checkstyle-plugin
[13:20:08] should have looked first, sorry!
[13:21:02] btw, the maven central link is broken
[13:30:36] (PS10) Ottomata: [WIP] Add prometheus metrics reporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:32:48] (CR) jerkins-bot: [V: -1] [WIP] Add prometheus metrics reporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:34:31] Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Send some existing Gobblin metrics to prometheus - https://phabricator.wikimedia.org/T294420 (Ottomata) Update: It won't be possible (at least not without a lot more work) to get anything but metrics from the Gobblin...
[13:37:28] (PS11) Ottomata: Add PrometheusEventReporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:37:36] joal: https://gerrit.wikimedia.org/r/c/analytics/gobblin-wmf/+/767178 is ready for review!
[13:38:05] jenkins is failing because of some javadoc issues in the copy/ module (won't fix), and because of some spotbugs thing i don't quite understand
[13:38:58] (CR) jerkins-bot: [V: -1] Add PrometheusEventReporter [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[13:47:38] (CR) Ottomata: "Ready for review!" [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/767178 (owner: Joal)
[14:14:05] Hey ottomata - will review :)
[14:23:18] ty!
[14:31:42] joal: max.incremental.fetch.session.cache.slots=2000 ready to go
[14:32:03] i can deploy now, perhaps it will be good to deploy and let it sit for a day or so before you try your thing?
[14:32:10] to see if we don't go above 2000 in regular operations?
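For context, max.incremental.fetch.session.cache.slots is a per-broker Kafka setting introduced by KIP-227 (linked a little further down); it bounds how many incremental fetch sessions a broker will keep cached. A minimal sketch of what the change amounts to, assuming it is applied as a static broker property with a rolling restart; the JMX metric names come from KIP-227 and may vary slightly between Kafka versions:

    # server.properties on each jumbo broker (static config, rolling restart to apply)
    max.incremental.fetch.session.cache.slots=2000

    # afterwards, watch the FetchSessionCache metrics over JMX to see whether the
    # cache fills up or evicts sessions under normal load, e.g.:
    #   kafka.server:type=FetchSessionCache,name=NumIncrementalFetchSessions
    #   kafka.server:type=FetchSessionCache,name=IncrementalFetchSessionEvictionsPerSec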
[14:37:21] works for me ottomata :)
[14:37:27] okay
[14:37:50] elukey: FYI and any objections to https://gerrit.wikimedia.org/r/c/operations/puppet/+/770505 ?
[14:41:20] ottomata: also, after talking with dcausse I have tested with a different config for reading data (read from consumer-groups) -> at the second run the job fails almost instantly, meaning my problem seems related to data more than anything else
[14:42:12] ottomata: np +1, I am wondering though what a cache slot represents, is it a client->partition consumer? (Super ignorant about it)
[14:45:23] i was too!
[14:45:24] elukey: https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
[14:46:02] apparently it is a cache kept on brokers that maps FetchSessionIds (e.g. a consumer client process, or a replica fetcher process) to metadata about the partitions they are interested in
[14:46:24] so that the amount of data transferred on new connections can be reduced
[14:47:55] okay meetings about to start, i will probably merge and apply tomorrow my morn
[14:48:30] sure sure, seems very complicated to judge the effects on kafka, but probably good for the jumbo use case if we are hitting limits
[14:49:27] Data-Engineering, Data-Engineering-Kanban, Airflow: Investigate using a HiveToGraphite connector job instead of individual jobs - https://phabricator.wikimedia.org/T303308 (Snwachukwu) a:Snwachukwu
[14:55:45] elukey: indeed. i think the only consequence will be slightly more memory used for this cache
[14:55:56] so very slightly less memory for messages in page cache
[14:56:09] but i don't think it will be much, it is just partition metadata
[14:58:55] yep seems something good to try
[15:03:45] a-team standup
[15:34:10] Data-Engineering, Data-Engineering-Kanban, Airflow: Unifying HDFS Sensor and FSSPEC Sensor - https://phabricator.wikimedia.org/T302392 (EChetty)
[15:43:51] Data-Engineering, Airflow, Platform Engineering: Replace Airflow's HDFS client (snakebite) with pyarrow - https://phabricator.wikimedia.org/T284566 (EChetty) Open→Declined
[15:43:53] Data-Engineering, Airflow, Epic, Platform Team Workboards (Image Suggestion API): Airflow collaborations - https://phabricator.wikimedia.org/T282033 (EChetty)
[16:00:25] Data-Engineering, Data-Catalog: Set up karapace instance for datahub - https://phabricator.wikimedia.org/T301562 (EChetty) a:BTullis→razzi
[16:06:59] .7
[16:07:01] uff
[16:07:05] :)
[16:40:33] razzi o/
[16:40:48] hi ottomata
[16:41:04] sooOoOO what's up how can I help!/
[16:41:04] ?
[16:41:42] I'm thinking about how to get the python dependencies for karapace into a superset_deploy style repository
[16:42:08] oh right cuz you need more than just the dependencies
[16:42:15] HMMM razzi want to try the new conda_dist stuff?
[16:42:16] ???
[16:42:26] instead of putting all deps in git?
[16:42:32] yeah show me the way
[16:42:45] https://gitlab.wikimedia.org/repos/data-engineering/workflow_utils#building-project-conda-environments-for-distribution
[16:43:02] but, we might be able to do that a little better with gitlab CI
[16:43:28] but, ultimately, if you have workflow_utils with the conda-dist CLI installed on your build box (local? docker?)
[16:43:33] in your python project (karapace)
[16:43:37] hopefully you can just run
[16:43:38] conda-dist
[16:43:41] and it will do all the right stuff
[16:44:12] Where does the dist environment get stored?
[16:44:54] conda dist will just make a .tgz file of it
[16:45:00] then we'll have to put it somewhere
[16:45:07] the intention is to use conda-dist in your project's CI
[16:45:20] to generate the conda .tgz, and then upload it somewhere, probably to gitlab
[16:45:28] but...then we have to get it from gitlab to your server
[16:45:45] perhaps...for karapace, since we will want to remove it anyway, just copying it there manually will be okay for now?
[16:46:30] or, we could use scap and the artifact syncing stuff like we set up for airflow
[16:56:34] hmm, razzi we might need to add a conda-environment.yml file with the python dep specified
[16:58:05] can you screenshare me ottomata ?
[17:01:24] (we sharin)
[18:41:51] (CR) Vivian Rook: [C: +2] view.js: Show full run date in UTC [analytics/quarry/web] - https://gerrit.wikimedia.org/r/517145 (https://phabricator.wikimedia.org/T215831) (owner: Framawiki)
[18:46:10] (Merged) jenkins-bot: view.js: Show full run date in UTC [analytics/quarry/web] - https://gerrit.wikimedia.org/r/517145 (https://phabricator.wikimedia.org/T215831) (owner: Framawiki)
[18:58:26] Quarry, Patch-For-Review: Show query run date above outputs section - https://phabricator.wikimedia.org/T215831 (rook) Open→Resolved
[21:03:14] a-team: sorry, I tried to run a big hive query on stat1004 and it went very sour and I can't even kill it (pid 15674)
[21:03:33] bearloga: would you like me to try to kill it?
[21:03:40] razzi: yes please
[21:04:28] ok it is done bearloga
[21:04:39] razzi: thank you!!!
[21:04:49] Didn't respond to the usual kill signal so I gave it the -9
[21:05:23] oooh I saw that somewhere but didn't know how to use it or that I should
[21:05:39] !log `sudo kill -9 15674` to stop unresponsive hive query
[21:05:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:07:24] bearloga: it'd probably be fine if it's your own process, but kill -9 doesn't give the process a chance to clean up, so it can leave things in a messy state
[21:08:20] all looks well to me, carry on querying :)
[21:16:53] thanks! :D
[21:36:29] Data-Engineering, Data-Catalog, Patch-For-Review: Create debian package of karapace - https://phabricator.wikimedia.org/T301565 (razzi) a:razzi I have been working on this and there is a deb at `deneb.codfw.wmnet:/home/razzi/karapace-temp/karapace_2.1.3-py3.7-0_amd64.deb`. To build this deb, I us...
[21:36:58] Data-Engineering, Data-Catalog, Patch-For-Review: Create debian package of karapace - https://phabricator.wikimedia.org/T301565 (razzi) Still todo: upload the .deb to apt.wikimedia.org and iterate on https://gerrit.wikimedia.org/r/c/operations/puppet/+/770605 to install the package and set up a syste...
[21:46:56] Data-Engineering-Radar, Product-Analytics: Support on understanding traffic and behaviors for users on legacy browsers (somewhat timely) - https://phabricator.wikimedia.org/T303301 (mpopov) Not sure how it bypassed the triage column and appeared straight in the backlog
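Tying together the conda-dist conversation from earlier in the afternoon, a minimal sketch of what that packaging flow could look like for karapace. This assumes workflow_utils and its conda-dist CLI are already installed on the build box, per the README linked above; the environment file contents, the output location, and the destination host are illustrative only (the Python and karapace versions are taken from the .deb filename mentioned in T301565):

    # in the karapace project root, declare the environment to build
    cat > conda-environment.yml <<'EOF'
    name: karapace
    dependencies:
      - python=3.7
      - pip
      - pip:
          - karapace==2.1.3
    EOF

    # build the distributable environment; per the workflow_utils README this
    # resolves the env and packs it into a .tgz (exact output location may differ)
    conda-dist

    # until CI or scap artifact syncing is set up, copy the archive to the target
    # host by hand (hostname here is hypothetical)
    scp karapace-*.tgz an-test-host1001.eqiad.wmnet:/srv/karapace/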