policy

Self-Critical-Sequential-training-with-RL-for-chatbots

ajayn1997 · PyTorch


Overview

Name

Self-Critical-Sequential-training-with-RL-for-chatbots

Author

ajayn1997

Framework

PyTorch

License

unknown

Skill type

other

Evidence level

untested

Task description

A chatbot implemented as a seq2seq model and trained using the cross-entropy method. The chatbot's performance is then improved with sequence-level training using the REINFORCE algorithm. In order to apply the REINFORCE algorithm (Williams, 1992; Zaremba & Sutskever, 2015) to the problem of sequence g
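As a rough illustration of the sequence-level objective described above, the following minimal sketch computes a self-critical REINFORCE surrogate loss for a single reply. It is not taken from the repository: the function names, the token-overlap reward, and the use of plain Python floats instead of PyTorch tensors are all assumptions made for illustration. The baseline is the reward of the greedy decode, which is what "self-critical" conventionally refers to.

```python
import math
from typing import Sequence


def overlap_reward(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """Hypothetical stand-in reward: fraction of candidate tokens found in the reference."""
    if not candidate:
        return 0.0
    ref = set(reference)
    return sum(tok in ref for tok in candidate) / len(candidate)


def self_critical_loss(
    sample_log_probs: Sequence[float],
    sample_tokens: Sequence[str],
    greedy_tokens: Sequence[str],
    reference: Sequence[str],
) -> float:
    """REINFORCE surrogate loss using the greedy decode's reward as the baseline."""
    advantage = overlap_reward(sample_tokens, reference) - overlap_reward(
        greedy_tokens, reference
    )
    # Minimizing this value raises the log-probability of sampled replies
    # that score higher than the greedy reply, and lowers it otherwise.
    return -advantage * sum(sample_log_probs)


# Illustrative inputs (made up): the sampled reply matches the reference,
# while the greedy reply gets one token wrong.
reference = ["i", "am", "fine"]
sampled = ["i", "am", "fine"]
greedy = ["i", "am", "bad"]
log_probs = [math.log(0.5)] * len(sampled)

loss = self_critical_loss(log_probs, sampled, greedy, reference)
```

In a PyTorch training loop the log-probabilities would be tensors attached to the decoder's graph, so that calling `backward()` on the loss yields the policy gradient. When the sampled and greedy replies earn the same reward, the advantage is zero and the sample contributes no gradient.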

Spaces

Action space

other · 0-dim · 0Hz

Observation space

type: other

Links

HuggingFace repo

null

Paper (arXiv)

null

Compatible robots

3 listed, plus 17 mentioned but not in catalog yet

Spot (Boston Dynamics), T1 (Booster Robotics), Apollo (Apptronik)

Compatible environments

No environments list Self-Critical-Sequential-training-with-RL-for-chatbots yet.

Datasets that reference this policy

No datasets reference Self-Critical-Sequential-training-with-RL-for-chatbots yet.