dataset
Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1
nvidia
or hover any field below to flag it
Overview
Name
Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1
Source
nvidia
Episodes
0
Robot count
0
Format
json
Description
Dataset Description:
We created an RL dataset for conversational tool-use by utilizing existing expert tool-use trajectories. We pose each assistant step of the trajectory as a separate behavior cloning problem where the policy model is incentivized to match the tool call choices of the expert model. Each trajectory includes the use of tools for authentication, data lookup, servicing (i.e. booking reservations, changing them, getting discounts, etc), and more across 838 different… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/Nemotron-RL-Agentic-Conversational-Tool-Use-Pivot-v1.
Robots used
null
Links
HuggingFace dataset