broom - Convert Statistical Objects into Tidy Tibbles
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots, and work consistently with large numbers of models at once. broom provides three verbs, each returning a different type of information about a model. tidy() summarizes information about model components, such as the coefficients of a regression. glance() reports information about an entire model, such as goodness-of-fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
Last updated 14 days ago
modeling, tidy-data
21.66 score 1.4k stars 1.4k packages 32k scripts 817k downloads

infer - Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Last updated 2 months ago
15.94 score 726 stars 14 packages 3.1k scripts 38k downloads

workflows - Modeling Workflows
Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
Last updated 6 days ago
13.77 score 207 stars 38 packages 812 scripts 31k downloads

workflowsets - Create a Collection of 'tidymodels' Workflows
A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse, as well as to train them and visualize the results.
Last updated 28 days ago
12.32 score 92 stars 16 packages 286 scripts 26k downloads

stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Last updated 29 days ago
11.66 score 295 stars 796 scripts 2.0k downloads

bonsai - Model Wrappers for Tree-Based Models
Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al., 2017), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015; Hothorn et al., 2006 <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al., 2022 <doi:10.5281/zenodo.7116854>).
Last updated 28 days ago
9.07 score 51 stars 522 scripts 2.2k downloads

shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Last updated 28 days ago
shiny
6.37 score 46 stars 49 scripts 182 downloads

anyflights - Query 'nycflights13'-Like Air Travel Data for Given Years and Airports
Supplies a set of functions to query air travel data for user-specified years and airports. Datasets include on-time flights, airlines, airports, planes, and weather.
Last updated 1 year ago
5.43 score 45 stars 20 scripts 418 downloads

syrup - Measure Memory and CPU Usage for Parallel R Code
Measures memory and CPU usage of R code by regularly taking snapshots of calls to the system command 'ps'. The package provides an entry point (albeit coarse) to profile usage of system resources by R code run in parallel.
Last updated 4 months ago
4.95 score 15 stars 20 scripts 157 downloads

gbfs - Interface with Live Bikeshare Data
Supplies a set of functions to interface with bikeshare data following the General Bikeshare Feed Specification, allowing users to query and accumulate tidy datasets for specified cities/bikeshare programs.
Last updated 10 months ago
4.65 score 37 stars 24 scripts 373 downloads

forested - Forest Attributes in Washington State
A small subset of plots in Washington State is sampled and assessed "on-the-ground" as forested or non-forested by the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program, but the FIA also has access to remotely sensed data for all land in the state. The 'forested' package contains a data frame of the same name, intended for use in predictive modeling applications where the more easily accessible remotely sensed data can be used to predict whether a plot is forested or non-forested.
Last updated 4 months ago
4.62 score 7 stars 30 scripts 169 downloads

detectors - Prediction Data from GPT Detectors
Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.
Last updated 10 months ago
3.62 score 7 stars 12 scripts 151 downloads

readmission - Hospital Readmission Data for Patients with Diabetes
Clinical care data from 130 U.S. hospitals in the years 1999-2008. Each row describes an "encounter" with a patient with diabetes, including variables on demographics, medications, patient history, diagnostics, payment, and readmission.
Last updated 12 months ago
3.00 score 1 star 10 scripts 132 downloads