[00:09:05] <wikibugs>	 (CR) Jforrester: [C: +1] "I don't have merge/deploy rights in this world, but it LGTM." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/879595 (https://phabricator.wikimedia.org/T326825) (owner: Cicalese)
[01:34:39] <wikibugs>	 (CR) Gergő Tisza: [C: +1] image-suggestions-feedback: Bump to version 2.0.0 (5 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: Kosta Harlan)
[02:26:46] <icinga-wm>	 RECOVERY - MegaRAID on an-worker1086 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[02:58:34] <icinga-wm>	 PROBLEM - MegaRAID on an-worker1086 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[09:10:32] <wikibugs>	 Data-Engineering, Equity-Landscape: Affiliates input metric - https://phabricator.wikimedia.org/T309275 (KCVelaga_WMF) @JAnstee_WMF Affiliate inputs QA at https://docs.google.com/spreadsheets/d/1yx4x96407HT9fTq1KrQxB_ZChKK8bJ9_NKGRPqynNjA/edit?pli=1#gid=0&range=Q3
[09:16:06] <wikibugs>	 Data-Engineering, Equity-Landscape: Affiliates input metric - https://phabricator.wikimedia.org/T309275 (KCVelaga_WMF) a:ntsako→JAnstee_WMF
[09:50:22] <icinga-wm>	 RECOVERY - MegaRAID on an-worker1086 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[10:13:05] <wikibugs>	 (CR) Joal: "Two small things:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769) (owner: Snwachukwu)
[10:22:10] <icinga-wm>	 PROBLEM - MegaRAID on an-worker1086 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[10:36:00] <wikibugs>	 Data-Engineering-Planning, Epic, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Decide on installation details for new ceph cluster - https://phabricator.wikimedia.org/T326945 (BTullis) Thanks for those insights @MatthewVernon - I think I'll go ahead and try the packag...
[11:18:12] <wikibugs>	 Data-Engineering, Equity-Landscape: Affiliates output rank metrics - https://phabricator.wikimedia.org/T306619 (KCVelaga_WMF) @JAnstee_WMF: affiliate outputs are QC'ed  Transformations within the sheet from the inputs: https://docs.google.com/spreadsheets/d/1yx4x96407HT9fTq1KrQxB_ZChKK8bJ9_NKGRPqynNjA/ed...
[11:38:10] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review, Technical-Debt: Productionize HDFS fsimage data analysis job - https://phabricator.wikimedia.org/T261283 (EChetty)
[11:38:27] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review, Technical-Debt: Productionize HDFS fsimage data analysis job - https://phabricator.wikimedia.org/T261283 (EChetty)
[11:38:58] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review, Technical-Debt: Productionize HDFS fsimage data analysis job - https://phabricator.wikimedia.org/T261283 (EChetty)
[11:39:46] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review, Technical-Debt: Productionize HDFS fsimage data analysis job - https://phabricator.wikimedia.org/T261283 (EChetty)
[11:39:56] <wikibugs>	 Data-Engineering-Planning, Product-Analytics, Data Pipelines (Sprint 05-06): Investigate wikimedia and wikidata unique devices per-project-family overcount offset - https://phabricator.wikimedia.org/T301403 (EChetty) Open→Resolved
[11:40:58] <wikibugs>	 Data-Engineering, Data Pipelines: Migrate 1+ Druid load jobs - https://phabricator.wikimedia.org/T307508 (EChetty)
[11:41:05] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: Back-fill Wikidata reliability Graphite metrics - https://phabricator.wikimedia.org/T321838 (EChetty)
[11:41:10] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: Add Python Linter Checks to CI - https://phabricator.wikimedia.org/T318346 (EChetty)
[11:41:15] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Product-Analytics: Review why total_edits on Mediawiki_History differs from the total_edits on Editors_Daily - https://phabricator.wikimedia.org/T316896 (EChetty)
[11:41:19] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: Implement periodical cleaning of Airflow databases - https://phabricator.wikimedia.org/T322036 (EChetty)
[11:41:23] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: NEW FEATURE REQUEST: sqoop (all) user properties from mariadb to wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T323456 (EChetty)
[11:41:48] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review: Update sqoop for CheckUser table - https://phabricator.wikimedia.org/T326330 (EChetty)
[11:42:12] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Product-Analytics: Add TikTok's in-app browser to ua-parser library - https://phabricator.wikimedia.org/T325611 (EChetty)
[11:45:47] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: When moving oozie webrequest-load to airflow/spark avoid the error-check corner case - https://phabricator.wikimedia.org/T324757 (EChetty)
[11:46:01] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: Drop MediaViewer and MultimediaViewer* tables - https://phabricator.wikimedia.org/T311229 (EChetty)
[11:47:12] <wikibugs>	 Data-Engineering-Planning, Data Pipelines (Sprint 07): When moving oozie webrequest-load to airflow/spark avoid the error-check corner case - https://phabricator.wikimedia.org/T324757 (EChetty)
[11:47:26] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: NEW FEATURE REQUEST: Dataset with active and non-active Wikis - https://phabricator.wikimedia.org/T323662 (EChetty) p:Triage→Medium
[11:47:46] <wikibugs>	 Data-Engineering-Planning, Data Pipelines (Sprint 07): Drop MediaViewer and MultimediaViewer* tables - https://phabricator.wikimedia.org/T311229 (EChetty)
[11:48:58] <wikibugs>	 Data-Engineering-Planning, Data Pipelines, Patch-For-Review: Update sqoop for CheckUser table - https://phabricator.wikimedia.org/T326330 (Zabe) In 3 days or so the `cuc_comment_id` will be fully populated (it already is everywhere except wikidatawiki), thus you can also migrate to read from that ins...
[12:12:02] <wikibugs>	 Data-Engineering-Planning, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): NEW FEATURE REQUEST: Upgrade superset to 1.5.3 - https://phabricator.wikimedia.org/T323458 (BTullis)
[12:49:48] <icinga-wm>	 PROBLEM - Host aqs2007 is DOWN: PING CRITICAL - Packet loss = 100%
[12:49:49] <icinga-wm>	 PROBLEM - Host aqs2008 is DOWN: PING CRITICAL - Packet loss = 100%
[12:50:26] <icinga-wm>	 PROBLEM - Host aqs2006 is DOWN: PING CRITICAL - Packet loss = 100%
[12:50:26] <icinga-wm>	 PROBLEM - Host aqs2005 is DOWN: PING CRITICAL - Packet loss = 100%
[13:00:39] <btullis>	 --^ There is an issue at the moment affecting codfw - It's being discussed in #mediawiki_security but I don't believe that we need to do anything at the moment.
[13:00:42] <wikibugs>	 (PS1) Simone Cuomo: Add new action to be able to track sessions [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663)
[13:00:54] <joal>	 ack btullis - thanks
[13:01:12] <joal>	 btullis: Would you mind checking if this will affect webrequest traffic data please?
[13:04:55] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: NEW FEATURE REQUEST: Dataset with active and non-active Wikis - https://phabricator.wikimedia.org/T323662 (EChetty) @kzimmerman   Do we have an existing definition of active we want to use here?  dan has:     from editors   where edits > 4  and      from active_...
[13:07:19] <wikibugs>	 (PS2) Simone Cuomo: Update searchPreview schema to be inline with required changes [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663)
[13:07:47] <wikibugs>	 (CR) CI reject: [V: -1] Update searchPreview schema to be inline with required changes [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663) (owner: Simone Cuomo)
[13:20:31] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2006 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top/{project}/{access}/{year}/{month}/{day} (Get top page views) is CRITICAL: Test Get top page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:20:43] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2007 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:20:49] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2008 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top/{project}/{access}/{year}/{month}/{day} (Get top page views) is CRITICAL: Test Get top page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:21:11] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:24:33] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:28:12] <btullis>	 joal: Yes, I will check.
[13:32:01] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:34:19] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:40:25] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2008 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:41:45] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2006 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:45:15] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2008 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:54:34] <wikibugs>	 (CR) Matthias Mullie: Update searchPreview schema to be inline with required changes (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663) (owner: Simone Cuomo)
[13:54:35] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[13:57:59] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:01:34] <wikibugs>	 (PS3) Simone Cuomo: Update searchPreview schema to be inline with required changes [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663)
[14:02:09] <wikibugs>	 (CR) Simone Cuomo: "Yeah I just realised that while testing the UI! All fixed now" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/880953 (https://phabricator.wikimedia.org/T326663) (owner: Simone Cuomo)
[14:04:25] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2007 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:06:01] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:12:37] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2008 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:12:57] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:18:41] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2006 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:21:01] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:22:37] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:23:35] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:28:27] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2006 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:28:39] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2007 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:29:07] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:32:21] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:34:47] <wikibugs>	 (CR) Snwachukwu: Refactor and Expand External referer classification (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769) (owner: Snwachukwu)
[14:35:14] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2008 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:35:15] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp2034 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2034%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[14:36:43] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:37:12] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top/{project}/{access}/{year}/{month}/{day} (Get top page views) is CRITICAL: Test Get top page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:38:05] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: varnishkafka on cp2034 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2034%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[14:41:39] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2007 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:42:07] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:46:57] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end} (Get aggregate page views) is CRITICAL: Test Get aggregate page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end} (Get per file requests) is CRITICAL: Test Get per file requests returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:54:59] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[14:55:51] <icinga-wm>	 RECOVERY - MegaRAID on an-worker1086 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[14:59:49] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top/{project}/{access}/{year}/{month}/{day} (Get top page views) is CRITICAL: Test Get top page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:01:56] <wikibugs>	 (CR) Mforns: [V: +2 C: +2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/873013 (https://phabricator.wikimedia.org/T293583) (owner: Addshore)
[15:05:45] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:06:13] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:10:37] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2007 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end} (Get per article page views) is CRITICAL: Test Get per article page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:11:03] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs2005 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end} (Get aggregate mediarequests) is CRITICAL: Test Get aggregate mediarequests returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:22:27] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:27:39] <icinga-wm>	 PROBLEM - MegaRAID on an-worker1086 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[15:31:29] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:33:15] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:34:57] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs2008 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[15:50:17] <wikibugs>	 (CR) Ottomata: image-suggestions-feedback: Bump to version 2.0.0 (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: Kosta Harlan)
[16:00:45] <wikibugs>	 Data-Engineering, Equity-Landscape: Affiliates output rank metrics - https://phabricator.wikimedia.org/T306619 (KCVelaga_WMF) a:KCVelaga_WMF→JAnstee_WMF
[16:02:15] <wikibugs>	 Data-Engineering, Equity-Landscape: Overall Engagement output rank metric - https://phabricator.wikimedia.org/T306622 (KCVelaga_WMF) @JAnstee_WMF The QA of overall  engagement metric is ready for your review: https://docs.google.com/spreadsheets/d/1GnKHC9yT5tN_xmEltCGdHEONI5GqjTiWaVI9zXNp4rQ/edit#gid=155...
[16:02:20] <wikibugs>	 Data-Engineering, Equity-Landscape: Overall Engagement output rank metric - https://phabricator.wikimedia.org/T306622 (KCVelaga_WMF) a:KCVelaga_WMF→JAnstee_WMF
[16:07:11] <icinga-wm>	 PROBLEM - Host furud is DOWN: PING CRITICAL - Packet loss = 100%
[16:09:22] <icinga-wm>	 RECOVERY - Host furud is UP: PING OK - Packet loss = 0%, RTA = 30.19 ms
[16:20:43] <icinga-wm>	 RECOVERY - MegaRAID on an-worker1086 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[16:25:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp2036 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2036%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[16:28:21] <btullis>	 The incident affecting the network at codfw is largely over, in that the network connectivity appears stable again. codfw is about to be repooled, I believe.
[16:29:53] <btullis>	 joal: There aren't expected to be any issues with webrequest or anything else related to the event platform. Neither kafka nor hadoop was affected. aqs/cassandra is almost back to normal.
[16:30:12] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: varnishkafka on cp2036 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2036%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[16:40:49] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on an-worker1086 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough Btullis Another BBU failure - I will add it to: T326127 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[16:45:33] <wikibugs>	 Data-Engineering, SRE, ops-eqiad: Check BBU on an-worker1080, an-worker1084, and an-worker1086 - https://phabricator.wikimedia.org/T325984 (BTullis)
[16:47:50] <inflatador>	 (☞ﾟヮﾟ)☞
[16:48:13] <wikibugs>	 (PS13) Snwachukwu: Refactor and Expand External referer classification [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769)
[16:50:33] <btullis>	 !log shutdown an-worker1086 for RAID BBU replacement
[16:50:35] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:52:59] <wikibugs>	 (CR) CI reject: [V: -1] Refactor and Expand External referer classification [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769) (owner: Snwachukwu)
[17:00:45] <wikibugs>	 Data-Engineering-Planning, Data Pipelines (Sprint 07), Patch-For-Review: Update sqoop for CheckUser table - https://phabricator.wikimedia.org/T326330 (EChetty)
[17:01:20] <wikibugs>	 Data-Engineering-Planning, Data Pipelines: Drop MediaViewer and MultimediaViewer* tables - https://phabricator.wikimedia.org/T311229 (EChetty)
[17:04:59] <joal>	 thanks btullis for the heads up on codfw issue
[17:09:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp2030 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2030%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[17:14:03] <wikibugs>	 (CR) Joal: "All my comments have been tackled - the jenkins tests don't pass though :(" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769) (owner: Snwachukwu)
[17:14:12] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: (2) varnishkafka on cp2030 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[17:24:46] <wikibugs>	 Data-Engineering-Planning, DC-Ops, SRE, Shared-Data-Infrastructure, ops-eqiad: Q1:rack/setup/install druid10[09-11] - https://phabricator.wikimedia.org/T314335 (Papaul) @BTullis any update on this?
[17:25:22] <wikibugs>	 (CR) BPirkle: [C: +2] Update pingback MediaWiki versions to include new values [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/879595 (https://phabricator.wikimedia.org/T326825) (owner: Cicalese)
[17:27:17] <wikibugs>	 (CR) Snwachukwu: Refactor and Expand External referer classification (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/864772 (https://phabricator.wikimedia.org/T309769) (owner: Snwachukwu)
[17:27:45] <wikibugs>	 Data-Engineering-Planning, Epic, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Decide on installation details for new ceph cluster - https://phabricator.wikimedia.org/T326945 (BTullis) Having examined the puppet manifests that we have for ceph, I believe that we can r...
[17:28:20] <wikibugs>	 (CR) BPirkle: [V: +2 C: +2] Update pingback MediaWiki versions to include new values [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/879595 (https://phabricator.wikimedia.org/T326825) (owner: Cicalese)
[17:30:50] <wikibugs>	 Data-Engineering-Planning, Epic, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Decide on installation details for new ceph cluster - https://phabricator.wikimedia.org/T326945 (BTullis)
[18:10:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp2034 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_upload&var-instance=cp2034%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[18:26:33] <ebernhardson>	 are there any examples of deployments that ship custom conda environments via spark or skein onto hadoop workers?
[18:27:49] <ebernhardson>	 starting to review our update to spark3, and probably the main thing is moving from virtualenv's to conda
[19:00:37] <joal>	 ebernhardson: I don't know of any :(
[19:06:41] <ebernhardson>	 no worries, i'm sure i'll figure something out :) might save a day or two if there were examples but it looks doable from docs
[19:09:17] <joal>	 Thanks a lot for you debunking this ebernhardson - I guess we'll take examples :)
[19:14:56] <ebernhardson>	 the sizes are a bit scary though :S first creation of a conda env is 300M without even installing anything.  But hopefully can figure out how to get it to reference the conda-analytics that's already on the nodes
[19:17:40] <joal>	 ebernhardson: this is a known issue unfortunately :( your feedback will be very welcome on that front
[19:18:40] <ottomata>	 joal: we do, no?  our airflow deployment does it
[19:19:08] <ottomata>	 https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Airflow/Developer_guide#Artifacts
[19:19:08] <joal>	 ottomata: I can't recall that we do - possibly I didn't know we do!
[19:19:43] <ottomata>	 that plus our SparkSubmitOperator with launcher=skein
[19:19:56] <ottomata>	 ebernhardson: ...want to switch to airflow 2 and our airflow-dags repo? :)
[19:20:04] <ottomata>	 we can make you a new airflow instance
[19:20:28] <ottomata>	 i think maybe the image suggestions folks do this with spark3 now?  cc xcollazo ?
[19:21:19] <ottomata>	 https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Airflow/Developer_guide#SparkSubmitOperator
[19:21:20] <ebernhardson>	 ottomata: hmm, maybe. I'd have to review how much work that would be. it's about 20 dags
[19:22:03] <ottomata>	 https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/wmf_airflow_common/operators/spark.py#L63-69
[19:22:12] <ottomata>	 ebernhardson:  if you are making a new spark 3 / conda job anyway, you could just start with that one?
[19:22:38] <ottomata>	 and migrate the others as a separate task
[19:23:12] <ebernhardson>	 well, we have one spark3 job already and all it involved was changing the spark-submit executable in the airflow connection. But that one is a plain pyspark with no additional deps
[19:23:41] <ebernhardson>	 switched it last week
[19:24:01] <ottomata>	 i'm sure you could do it all in yours, and/or we could move our SparkSubmitOperator to a more easily shared lib (we kept it in airflow-dags to make it easier to develop together)
[19:24:31] <ebernhardson>	 i also have a custom SparkSubmitOperator, would have to see how they vary :)
[19:24:34] <ottomata>	 if you switch though, you get artifact (conda env) deployment and Spark skein stuff built in
[19:24:56] <ottomata>	 Operator: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/wmf_airflow_common/operators/spark.py
[19:25:00] <ottomata>	 but more interesting is hook
[19:25:14] <ottomata>	 we have a simple skein one and a spark skein one
[19:25:14] <ottomata>	 https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/tree/main/wmf_airflow_common/hooks
[19:25:28] <ottomata>	 the SparkSkein one
[19:25:29] <ottomata>	 https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/wmf_airflow_common/hooks/spark.py#L185
[19:25:35] <ottomata>	 handles doing the right thing with the artifacts
[19:25:53] <ebernhardson>	 whats the benefit of skein over spark in cluster mode?
[19:25:55] <ottomata>	 e.g. 
[19:25:55] <ottomata>	 https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/wmf_airflow_common/hooks/spark.py#L289
[19:26:17] <ottomata>	 in docs of hook:
[19:26:20] <ottomata>	 spark in cluster mode does keep work off of the airflow executor, but
[19:26:20] <ottomata>	       still requires that e.g. python scripts or other resources needed to
[19:26:20] <ottomata>	       launch the spark job are deployed locally to the executor.  By using
[19:26:20] <ottomata>	       skein, we can pull down files/archives.
[19:26:51] <ottomata>	 for java there is no real diff, as you can do e.g. hdfs://path/to/app.jar 
[19:27:08] <ebernhardson>	 ahh, i suppose so far i've always shipped those with --files, which decompresses into the target. but indeed i didn't figure out how to have custom setup other than whats in the zip being unzip'd
[19:27:10] <ottomata>	 and the yarn app master in cluster mode will handle launching from that jar
[19:27:27] <ottomata>	 for python, you can't launch unless your python script is where you are launching from.
[19:28:42] <ottomata>	 skein yarn client also works a little nicer with airflow UI, you get spark master logs in airflow UI
[19:29:58] <ebernhardson>	 yea, we ship the python script to run with --files as well. I can check this all out, it seems to replicate what we already have in spark2
[19:32:05] <ottomata>	 oh ebernhardson  we also have a nice for_virtualenv factory for SparkSubmitOperator
[19:32:06] <ottomata>	 example: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/platform_eng/dags/image_suggestions_dag.py#L265-272
[19:32:56] <ottomata>	 https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/wmf_airflow_common/operators/spark.py#L211-290
[19:48:07] <wikibugs>	 Data-Engineering-Planning, serviceops, Discovery-Search (Current work), Event-Platform Value Stream (Sprint 07), Patch-For-Review: Flink on Kubernetes Helm charts - https://phabricator.wikimedia.org/T324576 (Ottomata) Rats, neither the [[ https://gerrit.wikimedia.org/r/879618 | NetworkPolicy...
[19:57:44] <wikibugs>	 Data-Engineering: Requesting Kerberos identity for Hxi-ctr - https://phabricator.wikimedia.org/T325857 (mpopov) > Would that affect my ability to login to Jupyter because I haven't been able to?  Yep. The original ticket has been re-opened and the username will need to be updated before you're able to log in.
[20:36:55] <wikibugs>	 (CR) Kosta Harlan: image-suggestions-feedback: Bump to version 2.0.0 (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: Kosta Harlan)
[20:44:10] <wikibugs>	 (CR) Ottomata: image-suggestions-feedback: Bump to version 2.0.0 (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: Kosta Harlan)
[20:54:53] <xcollazo>	 !log dropping old partitions from image_suggestions Hive tables as per https://phabricator.wikimedia.org/T325837
[20:54:55] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:55:57] <wikibugs>	 Data-Engineering-Planning, Event-Platform Value Stream (Sprint 07), Patch-For-Review: Flink application and flink-kubernetes-operator production docker images - https://phabricator.wikimedia.org/T316519 (Ottomata) Hm, am confused by a production-images vs blubber user thing.  In operation/production-...
[22:26:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp1081 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=eqiad%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp1081%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:31:12] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: varnishkafka on cp1081 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=eqiad%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp1081%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:43:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: varnishkafka on cp2037 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp2037%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:48:12] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: varnishkafka on cp2037 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp2037%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:53:42] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: (5) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:53:42] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: (3) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:54:12] <jinxer-wm>	 (VarnishkafkaNoMessages) firing: (3) varnishkafka on cp2032 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:57:36] <wikibugs>	 Data-Engineering, Data-Engineering-Kanban: Apache atlas build fails due to expired certificate (https://maven.restlet.com) - https://phabricator.wikimedia.org/T297841 (TimTheK) I just tried 2.3.0 and I got the same error:  Failed to execute goal on project atlas-testtools: Could not resolve dependencies...
[22:58:42] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: (5) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:58:42] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: (3) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[22:59:12] <jinxer-wm>	 (VarnishkafkaNoMessages) resolved: (3) varnishkafka on cp2032 is not sending enough cache_upload requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka  - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages