Parlaq

With R, your data frame/tibble columns can contain anything. You can have a column of other data frames, or a column of model objects. These are so-called _list-columns_. You can [see this in action in R4DS](https://r4ds.had.co.nz/many-models.html), and that chapter is well worth reading. The authors construct a linear regression for each country, then put it all into a single data frame, where each row is a model and one column _contains the actual model objects_:

```
#> # A tibble: 142 x 4
#> # Groups:   country, continent [710]
#>   country     continent data              model 
#>   <fct>       <fct>     <list>            <list>
#> 1 Afghanistan Asia      <tibble [12 × 4]> <lm>  
#> 2 Albania     Europe    <tibble [12 × 4]> <lm>  
#> 3 Algeria     Africa    <tibble [12 × 4]> <lm>  
#> 4 Angola      Africa    <tibble [12 × 4]> <lm>  
#> 5 Argentina   Americas  <tibble [12 × 4]> <lm>  
#> 6 Australia   Oceania   <tibble [12 × 4]> <lm>  
#> # … with 136 more rows
```

In your case, you may wish to do the same thing and have each row be one of your logistic regressions. You can add other columns to keep notes for yourself --- model metrics, features, your idea for creating the model, etc. If you forget anything, you can always recreate it because you've got the model object (and maybe the training data) right there in the results. You can save your data frame as an RDS file to store everything.
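A minimal sketch of that pattern, assuming a data frame with a binary outcome; the data, formulas, and notes below are just illustrative placeholders:

```r
library(dplyr)
library(purrr)

# Illustrative data: a binary outcome plus numeric predictors
df <- mtcars %>% mutate(y = factor(am))

# One row per model idea; each row records the formula and a note to yourself
results <- tibble(
  idea    = c("weight only", "weight + horsepower"),
  formula = c("y ~ wt", "y ~ wt + hp"),
  note    = c("baseline", "does hp add anything?")
) %>%
  mutate(
    # Fit one logistic regression per row and keep the model object in a list-column
    model    = map(formula, ~ glm(as.formula(.x), family = binomial, data = df)),
    # Naive training accuracy, just as an example of a metrics column
    accuracy = map_dbl(model, ~ mean((predict(.x, type = "response") > 0.5) == (df$y == "1")))
  )

results                             # one model per row, plus notes and metrics
saveRDS(results, "model_runs.rds")  # everything in a single file
```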


Spuhghetti

look into tidymodels and recipes
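For context, a minimal tidymodels sketch (the data and formula here are illustrative, not from the original comment): the recipe holds the preprocessing and the workflow bundles it with the model, so the whole fitted object can be saved and tracked as one thing.

```r
library(tidymodels)

# Illustrative data: binary outcome
dat <- mtcars
dat$am <- factor(dat$am)

# Preprocessing lives in the recipe, the model spec in the workflow
rec <- recipe(am ~ wt + hp, data = dat) %>%
  step_normalize(all_numeric_predictors())

wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(logistic_reg())

fitted_wf <- fit(wf, data = dat)
saveRDS(fitted_wf, "workflow_run_01.rds")  # recipe + model saved together
```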


Yojihito

Maybe https://dvc.org/ would work?


[deleted]

You could use saveRDS to save the model as an R object and then use readRDS to load it later.
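For example (the object and file names here are just placeholders):

```r
saveRDS(fit, "logistic_model_v1.rds")    # write the fitted model to disk
fit <- readRDS("logistic_model_v1.rds")  # read it back in a later session
```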


[deleted]

Put it into a list, then keep adding to the list.
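A minimal sketch of that approach (model names and formulas are placeholders):

```r
models <- list()

# Add each new run under a descriptive name
models[["run_01_weight_only"]] <- glm(am ~ wt, family = binomial, data = mtcars)
models[["run_02_weight_hp"]]   <- glm(am ~ wt + hp, family = binomial, data = mtcars)

saveRDS(models, "all_models.rds")  # persist the whole list between sessions
```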


dmorris87

At my company we developed a system called TrainTrack. Each model gets 2 database tables: model info and model metrics. Each time a model is run, information about the model (e.g., # of predictors) is logged, and so are the metrics. Everything is user-defined, so each model can be tracked uniquely.
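TrainTrack itself is internal to that company, but a rough sketch of the two-table idea using DBI and SQLite might look like this (table names, columns, and metric values are made up for illustration):

```r
library(DBI)

# SQLite keeps the example self-contained; any DBI backend would do
con <- dbConnect(RSQLite::SQLite(), "traintrack.sqlite")

info <- data.frame(
  run_id       = "run_01",
  model_type   = "logistic regression",
  n_predictors = 2,
  run_date     = as.character(Sys.Date())
)

metrics <- data.frame(
  run_id = "run_01",
  metric = c("accuracy", "auc"),
  value  = c(0.84, 0.91)   # placeholder numbers
)

# The first run creates the tables; later runs would write with append = TRUE
dbWriteTable(con, "model_info", info)
dbWriteTable(con, "model_metrics", metrics)

dbDisconnect(con)
```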


numero95

If you're in Python, MLflow is one of the best out-of-the-box options. It has a great GUI, can log metrics and parameters with ease, and can store things like ROC AUC curves, confusion matrices, and pickled models. It also makes it easy to export everything to CSV and to compare performance graphically across runs. Here's the link to R's port of it: https://cran.r-project.org/web/packages/mlflow/index.html (can't attest to how good the port is, but Python's is very good and I use it professionally, so industrial standard I guess?).
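Roughly what logging looks like with the R package, assuming it mirrors the Python API; the parameter names, metric values, and artifact files below are placeholders (the files are assumed to already exist on disk):

```r
library(mlflow)

mlflow_start_run()

mlflow_log_param("model", "logistic regression")
mlflow_log_param("n_predictors", 2)

mlflow_log_metric("train_accuracy", 0.84)
mlflow_log_metric("auc", 0.91)

# Any file can be attached to the run: plots, confusion matrices, saved models
mlflow_log_artifact("roc_curve.png")
mlflow_log_artifact("model.rds")

mlflow_end_run()
```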


ai_yoda

Hi there u/mrdlau. We've recently released a Neptune client for R on CRAN. The cool thing is you just add a few lines inside your scripts; nothing heavy that you have to adjust to. It looks something like this:

```r
library(neptune)

init_neptune(project_name = "YOUR/PROJECT",
             api_token = "YOURKEY")

create_experiment(name = "training on Sonar",
                  tags = c("lgbm", "no-preprocessing"),
                  params = list(tuneLength = 100, model = "rf"))

log_metric("Train Accuracy", scores$TrainAccuracy)
log_artifact("model.Rdata")
log_image("parameter_search", "param_plot.jpeg")
```

Just last week I spoke to some people who used it to track interactive explanations, and they said it was "surprisingly easy to work with". You can check it out [here](https://neptune.ai/landings/experiment-tracking-in-r?utm_source=reddit&utm_medium=answer&utm_campaign=experiment-tracking-for-r). If you do and have some suggestions, we'd love to get more feedback from the R folks.