policy

wordgen_curiosity

eremissimo · PyTorch

Overview

Name
wordgen_curiosity
Author
eremissimo
Framework
PyTorch
License
unknown
Skill type
other
Evidence level
untested
Task description
Exploring whether adding a curiosity-based intrinsic reward during RL fine-tuning of language models increases the variability of sampled output
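
This entry does not describe how the intrinsic reward is computed. As a rough illustration of the general idea only, the sketch below assumes a count-based novelty bonus, which is one simple stand-in for a curiosity signal and not necessarily what this policy uses. The names `novelty_bonus`, `shaped_reward`, `scale`, and `beta` are hypothetical.

```python
import math
from collections import Counter


def novelty_bonus(sample: str, counts: Counter, scale: float = 1.0) -> float:
    """Hypothetical count-based bonus: shrinks as the same sample recurs."""
    counts[sample] += 1  # record this occurrence before scoring it
    return scale / math.sqrt(counts[sample])


def shaped_reward(task_reward: float, sample: str, counts: Counter,
                  beta: float = 0.1) -> float:
    """Task reward plus a weighted intrinsic bonus (beta is an assumed weight)."""
    return task_reward + beta * novelty_bonus(sample, counts)


# Repeated samples earn a smaller bonus than samples seen for the first time.
counts = Counter()
rewards = [shaped_reward(1.0, word, counts) for word in ["cat", "cat", "dog"]]
```

Under this assumption, the second "cat" is rewarded less than the first, so an RL update would be nudged toward samples it has produced less often.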

Spaces

Action space
other · 0-dim · 0Hz
Observation space
  • type: other

Links

HuggingFace repo
null
Paper (arXiv)
null

Compatible robots

3+17 mentioned but not in catalog yet

Compatible environments

0

No environments list wordgen_curiosity yet.

Datasets that reference this policy

0

No datasets reference wordgen_curiosity yet.