[07:20:33] hello folks, an-airflow1001's root partition seems to be getting full
[07:22:04] some GBs are in ebernhardson's home dir, but long term we may want to add another virtual disk / partition
[08:34:42] elukey: thanks for the ping! I think we will be able to decommission this host now that we've migrated to Airflow 2. To be confirmed by inflatador / ebernhardson.
[10:18:09] lunch
[12:30:27] https://wiki.bitplan.com/index.php/Wikidata_Import_2023-04-26 - munging is done now. Loading was started but i fear the logging options are too "DEBUG". The log file grows way too quickly. I'd need to get some more options that are set via ENV variables and config files that are referenced in runBlazegraph.sh, e.g. /etc/default/wdqs-blazegraph, LOG_CONFIG env var to avoid too big a logfile ... RWStore.properties and maybe others
[12:36:28] seppl2023: there should be a "logback*.xml" file somewhere to configure logging. It should be automatically reloaded, so you should not need to restart blazegraph for the logging changes to take effect.
[12:37:07] @team: goals for Q4 are published on wiki (yes, I know, quite a bit late): https://wikitech.wikimedia.org/wiki/Search_Platform/Goals/OKR_Q4_2022-2023
[12:37:09] I have already rebooted the server to start over again and get rid of the multi GB logfile
[12:40:48] Isn't loaddata.sh the file to be called that used to be loadAll? https://github.com/wikimedia/wikidata-query-rdf/blob/master/docs/getting-started.md - how is the progress traced these days? There used to be ".good" files to mark the successful loads - is that still so?
[12:44:06] logback files https://www.irccloud.com/pastebin/XnqcMmMg/
[12:44:13] the new one does not
[12:49:39] We are tracking the data reload via the logs. I'm not sure I've ever used those .good files, so I don't remember when that option was removed (if ever).
[12:50:28] where did you get the Blazegraph binaries from?
[12:50:33] seppl2023: ^^
[12:50:52] the binaries are the result of the mvn package command
[12:52:11] https://www.irccloud.com/pastebin/4ksUaYsf/
[12:52:16] and you're using the resulting tar.gz?
[12:53:18] https://www.irccloud.com/pastebin/lscUNXV9/
[12:53:56] so i assume blazegraph-service-0.3.124-SNAPSHOT.war has the core binaries
[12:54:34] I'll be away for a few minutes and then try to continue
[12:57:22] looks like runBlazegraph.sh does not configure a logback file by default. Which makes sense, since we bundle a default configuration. But you can use the LOG_CONFIG environment variable to supply a logback config that will override the provided default.
[12:58:05] what should be in the LOG_CONFIG to get warning only?
[12:58:21] or better even ERROR only
[13:02:16] seppl2023: you can have a look at the config we use in production: https://github.com/wikimedia/operations-puppet/blob/production/modules/query_service/templates/logback.xml.erb and adapt it to your needs. The full logback documentation might help you as well: https://logback.qos.ch/manual/configuration.html
[13:02:34] Trey314159: congratulations on the blog post! (https://diff.wikimedia.org/2023/04/28/language-harmony-and-unpacking-a-year-in-the-life-of-a-search-nerd/)
[13:02:51] Trey314159: do you want to add it to https://meta.wikimedia.org/wiki/Talk:Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Product_%26_Technology/OKRs#SDS3:_Using_and_distributing_data ?
[14:10:08] legoktm: thanks for the links!
[14:11:06] gehel: Thanks! I will add a link there.
[14:42:14] @gehel - thanks for trying to explain the LOG configuration. Unfortunately i don't have any clue about Embedded Ruby template syntax nor do i know the intrinsics of https://logback.qos.ch/manual/configuration.html. I just want to make sure that i don't get a ton of log messages that will eat up the SSD disk space and make the import impossible. Is there a way to reduce the complexity here and get a more straightforward log configuration working?
[14:50:34] doh, sec i gotta find my 2-factor...
[15:04:17] pfischer: triage meeting: https://meet.google.com/eki-rafx-cxi
[15:05:59] apparently it's a public holiday in germany today
[15:09:03] yes it is
[16:37:06] * ebernhardson just learned that `git checkout -p` works much like `git add -p`
[16:40:30] ryankemper: small clusters still weren't being collected for the new *_titlesuggest metric. After a ponder realized it was a data collection issue: https://gerrit.wikimedia.org/r/c/operations/puppet/+/913959
[17:27:12] ebernhardson: excellent, looking now
[17:28:58] Very straightforward, merging
[17:58:38] lunch/errands, back in ~1h
[17:59:25] looks good, getting 6 clusters in the metrics now
[18:28:45] gehel: finishing up some food, 2' late to 1:1
[18:28:51] ack
[19:05:10] back
[19:37:14] Dispatched WDQS uptime SLO email to SREs; discovery-private is BCC'd
[19:39:27] ryankemper nice email!
[19:39:40] thanks!
[21:04:10] ryankemper I'm in the Google Meet if you're around
[21:06:09] inflatador: lost track of time, brt
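
Note on the LOG_CONFIG question (12:58 and 14:42 above): the sketch below is one possible "straightforward" answer, not the configuration used in production and not something confirmed in the channel. It generates a small logback file whose root logger only emits ERROR (or WARN) and whose log file is size-capped. Assumptions: LOG_CONFIG is taken to be a path to a logback XML file, as the 12:57 message suggests; the file name blazegraph.log, the 50MB cap, the number of rotated files, and the function name build_logback_config are all hypothetical. The logback elements themselves (RollingFileAppender, FixedWindowRollingPolicy, SizeBasedTriggeringPolicy, scan) are standard logback.

```python
from string import Template

# Standard logback level names; anything else is rejected early.
VALID_LEVELS = {"ALL", "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "OFF"}

# string.Template is used (not str.format) because the logback pattern
# itself contains braces and percent signs.
LOGBACK_TEMPLATE = Template("""\
<configuration scan="true" scanPeriod="30 seconds">
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>$log_file</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
      <fileNamePattern>$log_file.%i</fileNamePattern>
      <minIndex>1</minIndex>
      <maxIndex>$max_index</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <maxFileSize>$max_size</maxFileSize>
    </triggeringPolicy>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="$level">
    <appender-ref ref="FILE"/>
  </root>
</configuration>
""")


def build_logback_config(level="ERROR", log_file="blazegraph.log",
                         max_size="50MB", max_index=3):
    """Return a logback XML document as a string.

    The root logger is set to `level`; the log file is rotated once it
    reaches `max_size`, keeping at most `max_index` rotated files.
    All default values are illustrative assumptions.
    """
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown logback level: {level}")
    return LOGBACK_TEMPLATE.substitute(
        level=level, log_file=log_file,
        max_size=max_size, max_index=max_index,
    )


if __name__ == "__main__":
    # Print the config; redirect it into a file of your choice.
    print(build_logback_config(level="ERROR"))
```

Possible usage, under the same assumptions: save the script under any name, redirect its output to a file such as logback-quiet.xml, then export LOG_CONFIG with the absolute path of that file before calling runBlazegraph.sh. Passing level="WARN" gives the "warning only" variant asked about at 12:58.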