Title: | Prediction Data from GPT Detectors |
---|---|
Description: | Researchers carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated. |
Authors: | Simon Couch [cre, aut] |
Maintainer: | Simon Couch <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0.9000 |
Built: | 2024-11-01 11:30:05 UTC |
Source: | https://github.com/simonpcouch/detectors |
Data derived from the paper "GPT detectors are biased against non-native English writers." The study authors carried out a series of experiments passing a number of essays to different GPT detection models. Juxtaposing detector predictions for papers written by native and non-native English writers, the authors argue that GPT detectors disproportionately classify real writing from non-native English writers as AI-generated.
detectors
A data frame with 6,185 rows and 9 columns:
Whether the essay was written by a "Human" or "AI".
The class probability from the GPT detector that the inputted text was written by AI.
The uncalibrated class prediction, encoded as if_else(.pred_AI > .5, "AI", "Human").
The name of the detector used to generate the predictions.
For essays written by humans, whether the essay was written by a native English writer or not. These categorizations are coarse; essays with a value of "Yes" may actually have been written by people who do not write English natively. NA indicates that the text was not written by a human.
A label for the experiment that the predictions were generated from.
For essays that were written by AI, the name of the model that generated the essay.
A unique identifier for the supplied essay. Some essays were supplied to multiple detectors. Note that some essays are AI-revised derivatives of others.
For essays that were written by AI, a descriptor for the form of "prompt engineering" passed to the model.
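The threshold encoding of the uncalibrated class prediction can be sketched on toy data. This is a minimal illustration under stated assumptions, not the package's own code: the probabilities are made up, the column name .pred_class is an assumption (only .pred_AI is named in the text above), and base R's ifelse() stands in for dplyr's if_else() so that the snippet needs no extra packages.

```r
# Made-up probabilities standing in for detector output.
toy <- data.frame(.pred_AI = c(0.02, 0.50, 0.51, 0.99))

# Apply the 0.5 threshold described above. The comparison is strict,
# so a probability of exactly 0.5 is labelled "Human".
# `.pred_class` is an assumed column name.
toy$.pred_class <- ifelse(toy$.pred_AI > 0.5, "AI", "Human")

print(toy)
```

Because the prediction is uncalibrated, a different cutoff may suit a given detector better; 0.5 is simply the value stated in the column description.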
For more information on these data, see the source paper.
doi:10.1016/j.patter.2023.100779
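The kind of comparison the paper draws can be sketched as the share of human-written essays that a detector flags as AI, split by native status. The sketch below uses a small made-up data frame rather than the real data, and the column names kind, native, and .pred_class are assumptions, since the text above describes the columns without naming them.

```r
# Made-up rows for illustration only; these are not values from the dataset.
toy <- data.frame(
  kind        = c("Human", "Human", "Human", "Human", "AI", "AI"),
  native      = c("Yes", "Yes", "No", "No", NA, NA),
  .pred_class = c("Human", "Human", "AI", "Human", "AI", "Human"),
  stringsAsFactors = FALSE
)

# Keep only essays written by humans; native is NA for AI-written text.
human <- toy[toy$kind == "Human", ]

# Proportion of human-written essays predicted to be AI, per native status.
flagged_rate <- tapply(human$.pred_class == "AI", human$native, mean)

print(flagged_rate)
```

With the package installed, the same steps should carry over to the detectors data frame, provided its column names match the ones assumed here. Grouping by detector as well would show whether the gap varies across detection models.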