
rio - A Swiss-Army Knife for Data I/O
Streamlined data import and export by making assumptions that the user is probably willing to make: 'import()' and 'export()' determine the data format from the file extension, reasonable defaults are used for data import and export, web-based import is natively supported (including from SSL/HTTPS), compressed files can be read directly, and fast import packages are used where appropriate. An additional convenience function, 'convert()', provides a simple method for converting between file types.
Last updated 3 months ago
csv, csvy, data, data-science, excel, io, rio, sas, spss, stata
17.08 score 605 stars 71 dependents 7.8k scripts 44k downloads
readODS - Read and Write ODS Files
Read ODS (OpenDocument Spreadsheet) files into R as data frames. Also supports writing data frames to ODS files.
Last updated 2 months ago
cpp
12.77 score 55 stars 24 dependents 808 scripts 12k downloads
oolong - Create Validation Tests for Automated Content Analysis
Intended to create standard human-in-the-loop validity tests for typical automated content analysis such as topic modeling and dictionary-based methods. This package offers a standard workflow with functions to prepare, administer and evaluate a human-in-the-loop validity test. This package provides functions for validating topic models using word intrusion, topic intrusion (Chang et al. 2009, <https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models>) and word set intrusion (Ying et al. 2021) <doi:10.1017/pan.2021.33> tests. This package also provides functions for generating gold-standard data which are useful for validating dictionary-based methods. The default settings of all generated tests match those suggested in Chang et al. (2009) and Song et al. (2020) <doi:10.1080/10584609.2020.1723752>.
Last updated 16 days ago
textanalysis, topicmodeling, validation
7.57 score 54 stars 23 scripts 314 downloads
minty - Minimal Type Guesser
A port of the type guesser from 'readr' (the so-called 'readr' first edition parsing engine, now superseded by 'vroom').
Last updated 2 months ago
cpp
7.16 score 5 stars 26 dependents 5 scripts 9.3k downloads
rang - Reconstructing Reproducible R Computational Environments
Resolve the dependency graph of R packages at a specific time point based on the information from various 'R-hub' web services <https://blog.r-hub.io/>. The dependency graph can then be used to reconstruct the R computational environment with 'Rocker' <https://rocker-project.org>.
Last updated 1 month ago
reproducibility, reproducible-research
6.32 score 80 stars 13 scripts 258 downloads
grafzahl - Supervised Machine Learning for Textual Data Using Transformers and 'Quanteda'
Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) <doi:10.21105/joss.00774> to modern Transformer-based text classification models (Wolf et al., 2020) <doi:10.18653/v1/2020.emnlp-demos.6>, in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to set up the 'Python' environment to use the pretrained models from 'Hugging Face' <https://huggingface.co/>. More information: <doi:10.5117/CCR2023.1.003.CHAN>.
Last updated 22 days ago
5.91 score 41 stars 3 scripts 235 downloads
sweater - Speedy Word Embedding Association Test and Extras Using R
Conduct various tests for evaluating implicit biases in word embeddings: Word Embedding Association Test (Caliskan et al., 2017) <doi:10.1126/science.aal4230>, Relative Norm Distance (Garg et al., 2018) <doi:10.1073/pnas.1720347115>, Mean Average Cosine Similarity (Manzini et al., 2019) <arXiv:1904.04047>, SemAxis (An et al., 2018) <arXiv:1806.05521>, Relative Negative Sentiment Bias (Sweeney & Najafian, 2019) <doi:10.18653/v1/P19-1162>, and Embedding Coherence Test (Dev & Phillips, 2019) <arXiv:1901.07656>.
Last updated 1 month ago
bias-detection, textanalysis, wordembedding, cpp
4.80 score 30 stars 14 scripts 501 downloads
ngramrr - A Simple General Purpose N-Gram Tokenizer
A simple n-gram (contiguous sequences of n items from a given sequence of text) tokenizer to be used with the 'tm' package with no 'rJava'/'RWeka' dependency.
Last updated 9 years ago
4.48 score 10 stars 2 dependents 5 scripts 678 downloads
sehrnett - A Very Nice Interface to 'WordNet'
A very nice interface to Princeton's 'WordNet' without 'rJava' dependency. 'WordNet' data is not included. Princeton University makes 'WordNet' available to research and commercial users free of charge provided the terms of their license (<https://wordnet.princeton.edu/license-and-commercial-use>) are followed, and proper reference is made to the project using an appropriate citation (<https://wordnet.princeton.edu/citing-wordnet>).
Last updated 2 years ago
wordnet
3.48 score 6 stars 3 scripts 167 downloads