R packages by simonpcouch

broom - Convert Statistical Objects into Tidy Tibbles

Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots, and work consistently with large numbers of models at once. broom provides three verbs, each returning a different type of information about a model. tidy() summarizes information about model components, such as the coefficients of a regression. glance() reports information about an entire model, such as goodness-of-fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
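As a minimal sketch of the three verbs, assuming 'broom' is installed and using the built-in mtcars data with an arbitrary illustrative formula (neither comes from the package description):

```r
library(broom)

# Illustrative model only: a linear regression on a built-in dataset
model <- lm(mpg ~ wt + hp, data = mtcars)

coefs <- tidy(model)     # one row per model term (intercept, wt, hp)
stats <- glance(model)   # a single row of model-level summaries
obs   <- augment(model)  # one row per observation, with added columns such as .fitted

print(coefs)
print(stats)
print(head(obs))
```

Each call returns a tibble, so the results can be passed directly to the usual data-frame tooling for reporting or plotting.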

Last updated 4 months ago

modeling, tidy-data

21.56 score 1.5k stars 1.4k dependents 37k scripts 747k downloads

infer - Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Last updated 6 months ago

15.69 score 734 stars 17 dependents 3.5k scripts 28k downloads

workflows - Modeling Workflows

Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.
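The bundling described above might look like the following sketch, assuming 'workflows' and 'parsnip' are installed. The formula, the "lm" engine, and the mtcars data are illustrative choices, not part of the package description:

```r
library(parsnip)
library(workflows)

# Model specification: ordinary linear regression via the "lm" engine
spec <- set_engine(linear_reg(), "lm")

# Bundle the preprocessor (here a formula) and the model in one object
wf <- workflow()
wf <- add_formula(wf, mpg ~ wt + hp)
wf <- add_model(wf, spec)

# Fitting and predicting both go through the single workflow object
wf_fit <- fit(wf, data = mtcars)
preds <- predict(wf_fit, new_data = head(mtcars))
print(preds)
```

Because the preprocessor travels with the model, the same formula is applied at both fit time and prediction time without being restated.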

Last updated 24 days ago

13.80 score 207 stars 43 dependents 876 scripts 29k downloads

workflowsets - Create a Collection of 'tidymodels' Workflows

A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as to train them and visualize the results.

Last updated 5 months ago

12.21 score 93 stars 19 dependents 294 scripts 24k downloads

stacks - Tidy Model Stacking

Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.

Last updated 5 months ago

11.50 score 295 stars 840 scripts 1.8k downloads

bonsai - Model Wrappers for Tree-Based Models

Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al., 2017), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015, and Hothorn et al., 2006 <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al., 2022 <doi:10.5281/zenodo.7116854>).

Last updated 1 month ago

9.97 score 52 stars 2 dependents 620 scripts 2.1k downloads

chores - A Collection of Large Language Model Assistants

Provides a collection of ergonomic large language model assistants designed to help you complete repetitive, hard-to-automate tasks quickly. After selecting some code, press the keyboard shortcut you've chosen to trigger the package app, select an assistant, and watch your chore be carried out. While the package ships with a number of chore helpers for R package development, users can create custom helpers just by writing some instructions in a markdown file.

Last updated 22 days ago

7.91 score 90 stars 6 scripts

gander - High Performance, Low Friction Large Language Model Chat

Introduces a 'Copilot'-like completion experience, but it knows how to talk to the objects in your R environment. 'ellmer' chats are integrated directly into your 'RStudio' and 'Positron' sessions, automatically incorporating relevant context from surrounding lines of code and your global environment (like data frame columns and types). Open the package dialog box with a keyboard shortcut, type your request, and the assistant will stream its response directly into your documents.

Last updated 24 days ago

6.39 score 55 stars 1 script

shinymodels - Interactive Assessments of Models

Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.

Last updated 5 months ago

shiny

6.21 score 48 stars 48 scripts 236 downloads

anyflights - Query 'nycflights13'-Like Air Travel Data for Given Years and Airports

Supplies a set of functions to query air travel data for user-specified years and airports. Datasets include on-time flights, airlines, airports, planes, and weather.

Last updated 2 months ago

5.90 score 49 stars 23 scripts 570 downloads

syrup - Measure Memory and CPU Usage for Parallel R Code

Measures memory and CPU usage of R code by regularly taking snapshots of calls to the system command 'ps'. The package provides an entry point (albeit coarse) to profile usage of system resources by R code run in parallel.

Last updated 8 months ago

5.03 score 18 stars 20 scripts 141 downloads

gbfs - Interface with Live Bikeshare Data

Supplies a set of functions to interface with bikeshare data following the General Bikeshare Feed Specification, allowing users to query and accumulate tidy datasets for specified cities/bikeshare programs.

Last updated 2 months ago

4.68 score 38 stars 25 scripts 468 downloads

forested - Forest Attributes in Washington State

A small subset of plots in Washington State are sampled and assessed "on-the-ground" as forested or non-forested by the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program, but the FIA also has access to remotely sensed data for all land in the state. The 'forested' package contains a data frame by the same name intended for use in predictive modeling applications where the more easily accessible remotely sensed data can be used to predict whether a plot is forested or non-forested.

Last updated 7 months ago

4.66 score 7 stars 33 scripts 157 downloads

streamy - Inline Asynchronous Generator Results into Documents

Given a 'coro' asynchronous generator instance that produces text, write that text into a document selection in 'RStudio' and 'Positron'. This is particularly helpful for streaming large language model responses into the user's editor.

Last updated 1 month ago

4.43 score 1 star 3 dependents 656 downloads

detectors - Prediction Data from GPT Detectors

Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.

Last updated 1 year ago

3.62 score 7 stars 12 scripts 181 downloads

readmission - Hospital Readmission Data for Patients with Diabetes

Clinical care data from 130 U.S. hospitals in the years 1999-2008. Each row describes an "encounter" with a patient with diabetes, including variables on demographics, medications, patient history, diagnostics, payment, and readmission.

Last updated 1 year ago

2.70 score 1 star 10 scripts 176 downloads