policy
Self-Critical-Sequential-training-with-RL-for-chatbots
ajayn1997 · PyTorch
Overview
Name
Self-Critical-Sequential-training-with-RL-for-chatbots
Author
ajayn1997
Framework
PyTorch
License
unknown
Skill type
other
Evidence level
untested
Task description
A chatbot implemented as a sequence-to-sequence (seq2seq) model and first trained with a cross-entropy loss. Its performance is then improved by sequence-level training with the REINFORCE algorithm. In order to apply the REINFORCE algorithm (Williams, 1992; Zaremba & Sutskever, 2015) to the problem of sequence g
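The sequence-level objective named above can be illustrated with a minimal sketch. It assumes the self-critical variant suggested by the project name, in which the reward of a greedily decoded reply serves as the REINFORCE baseline; the function name, the reward values, and the use of a BLEU-like sentence score are hypothetical and not taken from the repository.

```python
from typing import Sequence


def self_critical_loss(
    sample_log_probs: Sequence[float],
    sample_reward: float,
    greedy_reward: float,
) -> float:
    """REINFORCE loss for one sampled reply with a greedy-decoding baseline.

    loss = -(r(sampled) - r(greedy)) * sum_t log p(w_t)
    (hypothetical helper, not the repository's code)
    """
    # Advantage: how much better the sampled reply scored than the greedy one.
    advantage = sample_reward - greedy_reward
    # Minimizing this loss raises the log-probability of replies with a
    # positive advantage and lowers it for replies with a negative one.
    return -advantage * sum(sample_log_probs)


# Hypothetical inputs: per-token log-probabilities of a sampled reply and
# sentence-level rewards (for example a BLEU score against a reference).
log_probs = [-0.5, -1.0, -0.25]
loss_better = self_critical_loss(log_probs, sample_reward=0.75, greedy_reward=0.5)
loss_worse = self_critical_loss(log_probs, sample_reward=0.25, greedy_reward=0.5)
print(loss_better, loss_worse)
```

In a PyTorch training loop the log-probabilities would typically be tensors produced by the decoder, the two rewards would be treated as constants with no gradient, and the resulting loss would be backpropagated in place of the cross-entropy loss.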
Spaces
Action space
other · 0-dim · 0Hz
Observation space
- type: other
Links
HuggingFace repo
null
Paper (arXiv)
null
Compatible robots
3+17 mentioned but not in catalog yet
Compatible environments
0
No environments list Self-Critical-Sequential-training-with-RL-for-chatbots yet.
Datasets that reference this policy
0
No datasets reference Self-Critical-Sequential-training-with-RL-for-chatbots yet.