[10:22:53] errand+lunch
[12:38:55] Do cirrusSearch jobs hit parsoid?
[13:08:36] claime not that I know of... gehel dcausse any thoughts on ^^ ?
[13:09:43] nevermind, it's something else
[13:09:46] heh
[13:09:57] It's a big commons pagebundle reparse
[13:10:48] which would probably explain the time correlation with a bump in cirrusSearch jobs at about the same time
[13:13:21] no problema, good luck!
[13:59:51] o/
[14:00:08] claime: sorry I'm late, no, CirrusSearch still uses the old php parser
[14:00:52] dcausse: ack, ty
[15:06:59] \o
[15:08:36] according to #wikimedia-sre , flink in eqiad was accidentally depooled for a short time...still checking dashboards, but it doesn't seem to have affected anything
[15:10:18] kk
[15:18:33] o/
[15:24:21] trying to dig up elastic settings checks...looks like they exist per https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/icinga/manifests/monitor/elasticsearch/cirrus_settings_check.pp#25 , but I can't find that file on a prod codfw host
[15:25:37] inflatador: settings are global to the cluster, so they are not checked on every host. I think we check on all master-eligible nodes.
[15:27:21] gehel ACK, looks like you're correct (as usual ;P )...I see them on a CODFW master host
[15:31:53] we need to figure out how to do this with blackbox/remote polling as well...but for now, will just use icinga
[16:15:01] workout, back in ~40
[16:41:02] errand
[17:31:31] OK, so the icinga check is going to get nasty...wondering if we could maybe just run a systemd timer to set Elastic running config to the rendered config in cirrus_settings_check.yml every hour/day/whatever?
[17:43:26] I guess we could render the health checks dynamically thru puppet a la https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/icinga/manifests/monitor/elasticsearch/cirrus_settings_check.pp#9 . Looks nasty though
[17:43:37] hmm
[17:44:00] there generally wouldn't be any harm just applying the config every hour.
[17:44:21] I don't know if elastic is smart enough to throw out cluster state updates that don't do anything, but even if it didn't...it's once an hour
[17:46:08] Yeah...I don't like either option. I guess we're going to have to move this logic out of icinga anyway since it's going away. hmm...
[17:46:28] would be nice to have a source of truth outside of a config mgmt repo, but of course I don't have any helpful suggestions ;)
[17:46:39] nmap? :P
[17:46:54] packets don't lie!
[17:47:18] lol
[17:48:20] anyway, grabbing a quick lunch, back in time for pairing
[18:04:35] dinner
[18:32:21] * ebernhardson is mildly surprised a java instant can represent year -1 billion
[18:45:35] back
[19:24:54] I'm hitting _cluster/settings on the Elastic API. Can I give it a query string so it only gives back the CCS settings, or do I have to filter that client-side?
[19:25:58] hmm, looks like client-side https://www.elastic.co/guide/en/elasticsearch/reference/7.10/cluster-get-settings.html
[19:33:48] ryankemper gehel SRE pairing if y'all wanna join
[20:44:04] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1012703 (CCS monitoring) is ready...this is just for the short-term fix
[21:12:51] another small change to turn off alerts for a host that's about to be decommed: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1012751
[22:13:02] thanks for the reviews! Heading out for the day...
[22:30:45] ebernhardson: are you around? Would you have a moment to help me with airflow/discolytics (T358472)?
[22:30:46] T358472: Search dag image_suggestions_weekly failed with: Empty dataframe provided - https://phabricator.wikimedia.org/T358472
[22:31:00] sure
[22:31:25] pfischer: certainly, what's up?
[22:32:18] meet.google.com/qhu-iezx-bab
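
(editor's note on the 19:24:54 question) A rough sketch of the client-side filtering discussed above, not the check that was actually merged in the Gerrit change. It assumes the CCS settings live under the `cluster.remote.` prefix and that `_cluster/settings` is requested with `flat_settings=true`; the function names, the `omega` cluster alias and the seed hostnames are all made up for illustration.

```python
import json
import urllib.request

# Assumption: cross-cluster search (CCS) settings are the keys under this prefix.
CCS_PREFIX = "cluster.remote."


def fetch_cluster_settings(base_url):
    """GET _cluster/settings with flattened (dotted) keys."""
    url = base_url.rstrip("/") + "/_cluster/settings?flat_settings=true"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def ccs_settings(settings, prefix=CCS_PREFIX):
    """Keep only keys under `prefix`, filtering client-side.

    Transient settings are applied last because they take precedence
    over persistent ones.
    """
    merged = {}
    for scope in ("persistent", "transient"):
        for key, value in settings.get(scope, {}).items():
            if key.startswith(prefix):
                merged[key] = value
    return merged


def diff_settings(expected, actual):
    """Return {key: (expected, actual)} for every key whose values differ."""
    keys = sorted(set(expected) | set(actual))
    return {
        k: (expected.get(k), actual.get(k))
        for k in keys
        if expected.get(k) != actual.get(k)
    }
```

Something like `diff_settings(rendered, ccs_settings(fetch_cluster_settings(url)))` could then back either an alert (non-empty diff) or the hourly re-apply idea from 17:31:31, where `rendered` would be the flattened contents of the puppet-rendered settings file.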