dataset

synthetic-healthcare-tool-calling-grpo-rlvr-1k

pranavvmurthy26

Overview

Name

Source

pranavvmurthy26

Episodes

Robot count

Format

parquet

Description

🏥 Synthetic Healthcare Tool Calling Dataset for GRPO and RLVR

This synthetic dataset is designed for training language models on clinical decision support tool calling using GRPO (Group Relative Policy Optimization) and reinforcement learning with verifiable rewards (RLVR). It contains ~1.1K examples of clinical scenarios, each paired with the expected tool calls and answers.

Dataset sample schema:

{ "prompt": [ { "role": "system", "content": "You are a clinical decision support assistant…

See the full description on the dataset page: https://huggingface.co/datasets/pranavvmurthy26/synthetic-healthcare-tool-calling-grpo-rlvr-1k.
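As a rough illustration of how expected tool calls can serve as verifiable rewards in GRPO-style training, the sketch below scores a model completion against an expected call and then normalizes rewards within a group of sampled completions. The field names `name` and `arguments`, the `calculate_bmi` tool, the partial-credit value of 0.5, and both function names are assumptions made for illustration. The schema above is truncated, so the dataset's real field names and scoring rules may differ.

```python
import json
from statistics import mean, pstdev


def tool_call_reward(completion: str, expected: dict) -> float:
    """Score one completion against an expected tool call.

    Hypothetical scoring rule: 1.0 for an exact match, 0.5 for the right
    tool with wrong arguments, 0.0 for anything else (including bad JSON).
    """
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(call, dict) or call.get("name") != expected["name"]:
        return 0.0
    return 1.0 if call.get("arguments") == expected["arguments"] else 0.5


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical expected call and three sampled completions for one prompt.
expected_call = {"name": "calculate_bmi", "arguments": {"weight_kg": 70, "height_m": 1.75}}
completions = [
    '{"name": "calculate_bmi", "arguments": {"weight_kg": 70, "height_m": 1.75}}',
    '{"name": "calculate_bmi", "arguments": {"weight_kg": 70}}',
    "I think the BMI is about 23.",
]
rewards = [tool_call_reward(c, expected_call) for c in completions]
advantages = group_relative_advantages(rewards)
```

Because the reward is computed by comparing structured output to a known answer, no learned reward model is needed. The group normalization step means a completion is rewarded only relative to the other samples drawn for the same prompt.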

Robots used

null

Links

HuggingFace dataset

pranavvmurthy26/synthetic-healthcare-tool-calling-grpo-rlvr-1k