
broom - Convert Statistical Objects into Tidy Tibbles
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs, each of which supplies a different type of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
Last updated 4 months ago
modeling, tidy-data
21.56 score 1.5k stars 1.4k dependents 37k scripts 747k downloads
infer - Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Last updated 6 months ago
15.69 score 734 stars 17 dependents 3.5k scripts 28k downloads
workflows - Modeling Workflows
Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
Last updated 24 days ago
13.80 score 207 stars 43 dependents 876 scripts 29k downloads
workflowsets - Create a Collection of 'tidymodels' Workflows
A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.
Last updated 5 months ago
12.21 score 93 stars 19 dependents 294 scripts 24k downloads
stacks - Tidy Model Stacking
Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.
Last updated 5 months ago
11.50 score 295 stars 840 scripts 1.8k downloads
bonsai - Model Wrappers for Tree-Based Models
Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al., 2017), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015, and Hothorn et al., 2006 <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al., 2022 <doi:10.5281/zenodo.7116854>).
Last updated 1 month ago
9.97 score 52 stars 2 dependents 620 scripts 2.1k downloads
shinymodels - Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Last updated 5 months ago
shiny
6.21 score 48 stars 48 scripts 236 downloads
anyflights - Query 'nycflights13'-Like Air Travel Data for Given Years and Airports
Supplies a set of functions to query air travel data for user-specified years and airports. Datasets include on-time flights, airlines, airports, planes, and weather.
Last updated 2 months ago
5.90 score 49 stars 23 scripts 570 downloads
syrup - Measure Memory and CPU Usage for Parallel R Code
Measures memory and CPU usage of R code by regularly taking snapshots of calls to the system command 'ps'. The package provides an entry point (albeit coarse) to profile usage of system resources by R code run in parallel.
Last updated 8 months ago
5.03 score 18 stars 20 scripts 141 downloads
gbfs - Interface with Live Bikeshare Data
Supplies a set of functions to interface with bikeshare data following the General Bikeshare Feed Specification, allowing users to query and accumulate tidy datasets for specified cities/bikeshare programs.
Last updated 2 months ago
4.68 score 38 stars 25 scripts 468 downloads
forested - Forest Attributes in Washington State
A small subset of plots in Washington State is sampled and assessed "on-the-ground" as forested or non-forested by the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program, but the FIA also has access to remotely sensed data for all land in the state. The 'forested' package contains a data frame by the same name intended for use in predictive modeling applications where the more easily accessible remotely sensed data can be used to predict whether a plot is forested or non-forested.
Last updated 7 months ago
4.66 score 7 stars 33 scripts 157 downloads
streamy - Inline Asynchronous Generator Results into Documents
Given a 'coro' asynchronous generator instance that produces text, write that text into a document selection in 'RStudio' and 'Positron'. This is particularly helpful for streaming large language model responses into the user's editor.
Last updated 1 month ago
4.43 score 1 star 3 dependents 656 downloads
detectors - Prediction Data from GPT Detectors
Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.
Last updated 1 year ago
3.62 score 7 stars 12 scripts 181 downloads
readmission - Hospital Readmission Data for Patients with Diabetes
Clinical care data from 130 U.S. hospitals in the years 1999-2008. Each row describes an "encounter" with a patient with diabetes, including variables on demographics, medications, patient history, diagnostics, payment, and readmission.
Last updated 1 year ago
2.70 score 1 star 10 scripts 176 downloads