270 results for “topic:training-data”
A system for quickly generating training data with weak supervision
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
Synthetic data generators for tabular and time-series data
skweak: A software toolkit for weak supervision applied to NLP tasks
Computer vision-based ML training data generation tool 🚀
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.
Web application for image labeling and segmentation
🏖 TagEditor - Annotation tool for spaCy
A lightweight web application for brushing labels onto time series data; useful for building training sets.
Augmenty is an augmentation library based on spaCy for augmenting texts.
Aubo i5 Dual Arm Collaborative Robot - RealSense D435 - 3D Object Pose Estimation - ROS
Natural Language Data Augmentation Tool for Conversational Systems
Generating training data from the CARLA driving simulator in the KITTI dataset format
Collection of casual conversations that can be used with the Rasa Stack
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
Evidence-based endurance coaching protocol for any AI/LLM. Deterministic training guidance with Intervals.icu integration.
Convert all files in git repository to .txt files. Useful for training LLMs on your codebase.
COVID-19 cough files for training AI models
Full resources supporting the publication "A Pragmatic Guide to Geoparsing Evaluation."
Data Programming by Demonstration (DPBD) for Document Classification
🔎 Classification helper for sex classification feature of InstaPy
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
Labeled training data for detection of aircraft in Planet satellite imagery
PyTorch reimplementation of computing Shapley values via Truncated Monte Carlo sampling from "What is your data worth? Equitable Valuation of Data" by Amirata Ghorbani and James Zou [ICML 2019]
Benchmarking tools for applying AI/ML to data assimilation
Machine Learning project aimed at converting images into .obj 3D models by representing them as Blender hair-type particle systems.
AI Assisted Image and Video Training Data Labeling @ Scale
KL3M training data collection and preprocessing