policy

Self-Critical-Sequential-training-with-RL-for-chatbots

ajayn1997 · PyTorch

Overview

Name
Self-Critical-Sequential-training-with-RL-for-chatbots
Author
ajayn1997
Framework
PyTorch
License
unknown
Skill type
other
Evidence level
untested
Task description
A chatbot implemented as a seq2seq model and first trained with cross-entropy. Its performance is then improved by sequence-level training with the REINFORCE algorithm. In order to apply the REINFORCE algorithm (Williams, 1992; Zaremba & Sutskever, 2015) to the problem of sequence g
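
The sequence-level REINFORCE step named in the task description can be illustrated with a minimal sketch. It is not taken from the repository: it assumes the "self-critical" variant suggested by the project name, in which the reward of the model's own greedy decode serves as the baseline. The helper names (`sequence_log_prob`, `greedy_decode`, `self_critical_loss`) and the reward function are hypothetical, and the per-step distributions are fixed lists of probabilities, whereas a real seq2seq decoder conditions each step on the tokens emitted so far.

```python
import math


def sequence_log_prob(step_probs, tokens):
    """Sum of log-probabilities of the chosen token at each decoding step."""
    return sum(math.log(probs[tok]) for probs, tok in zip(step_probs, tokens))


def greedy_decode(step_probs):
    """Pick the most probable token at every step (first index wins ties)."""
    return [max(range(len(p)), key=p.__getitem__) for p in step_probs]


def self_critical_loss(step_probs, sampled_tokens, reward_fn):
    """REINFORCE loss with the greedy decode's reward as baseline (assumed variant).

    loss = -(r(sampled) - r(greedy)) * log p(sampled)
    Minimising it raises the probability of samples that beat the greedy output.
    """
    baseline_tokens = greedy_decode(step_probs)
    advantage = reward_fn(sampled_tokens) - reward_fn(baseline_tokens)
    return -advantage * sequence_log_prob(step_probs, sampled_tokens)


# Toy usage: two decoding steps over a two-token vocabulary.
reference = [1, 1]  # hypothetical target reply


def overlap_reward(tokens):
    """Hypothetical reward: number of positions that match the reference."""
    return sum(t == r for t, r in zip(tokens, reference))


step_probs = [[0.5, 0.5], [0.25, 0.75]]
loss = self_critical_loss(step_probs, [1, 1], overlap_reward)
```

In a PyTorch training loop the log-probabilities would come from the decoder's output and the loss would be backpropagated. When the sample scores no better than the greedy decode, the advantage is zero and the sample contributes no gradient.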

Spaces

Action space
other · 0-dim · 0 Hz
Observation space
  • type: other

Links

HuggingFace repo
null
Paper (arXiv)
null

Compatible robots

3+17 mentioned but not in catalog yet

Compatible environments

0

No environments list Self-Critical-Sequential-training-with-RL-for-chatbots yet.

Datasets that reference this policy

0

No datasets reference Self-Critical-Sequential-training-with-RL-for-chatbots yet.