dataset

navigation-corpus-ewe-speech

ghananlpcommunity

Overview

Name
navigation-corpus-ewe-speech
Source
ghananlpcommunity
Episodes
0
Robot count
0
Format
parquet
Description
Ewe Speech Segments (sentence splitting): 49,348 speech-text pairs split from long recordings.
Processing pipeline:
- Source audio from ghananlpcommunity/navigation-corpus-speech-full-ewe
- Full-file CTC forced alignment (MMS-300M) for word-level timestamps
- Sentence-boundary splits (. ? !); long sentences re-chunked to 16 words
- Leading/trailing silence trimmed with VAD (-40 dBFS threshold)
- Filtered: min 1.0s, max 15.0s
- Original sample rate preserved
Usage from… See the full description on the dataset page: https://huggingface.co/datasets/ghananlpcommunity/navigation-corpus-ewe-speech.
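The splitting and filtering steps in the description can be illustrated with a small sketch. This is not the dataset authors' code: the `Word` and `Segment` types and the function names are hypothetical, and "re-chunked to 16 words" is read here as "cut into consecutive chunks of at most 16 words", which is an assumption. Forced alignment and VAD trimming are not shown; the sketch starts from word-level timestamps that are assumed to exist already.

```python
from dataclasses import dataclass
from typing import List

SENTENCE_END = (".", "?", "!")   # sentence-boundary marks from the description
MAX_WORDS = 16                   # assumed upper bound per chunk
MIN_DUR, MAX_DUR = 1.0, 15.0     # duration filter, in seconds


@dataclass
class Word:
    """Hypothetical aligned word: text plus start/end time in seconds."""
    text: str
    start: float
    end: float


@dataclass
class Segment:
    """Hypothetical output pair: transcript text plus its time span."""
    text: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def split_sentences(words: List[Word]) -> List[List[Word]]:
    """Group words into sentences, closing a sentence at . ? or !"""
    sentences, current = [], []
    for word in words:
        current.append(word)
        if word.text.endswith(SENTENCE_END):
            sentences.append(current)
            current = []
    if current:  # trailing words without final punctuation
        sentences.append(current)
    return sentences


def rechunk(sentence: List[Word], max_words: int = MAX_WORDS) -> List[List[Word]]:
    """Cut a sentence into consecutive chunks of at most max_words words."""
    return [sentence[i:i + max_words] for i in range(0, len(sentence), max_words)]


def segment(words: List[Word]) -> List[Segment]:
    """Split, re-chunk, and keep only chunks within the duration bounds."""
    segments = []
    for sentence in split_sentences(words):
        for chunk in rechunk(sentence):
            seg = Segment(
                text=" ".join(w.text for w in chunk),
                start=chunk[0].start,
                end=chunk[-1].end,
            )
            if MIN_DUR <= seg.duration <= MAX_DUR:
                segments.append(seg)
    return segments
```

Under these assumptions, a 20-word sentence yields one 16-word chunk and one 4-word chunk, and any chunk shorter than 1.0s or longer than 15.0s is dropped. A real pipeline would then cut the audio at each segment's `start` and `end` before trimming silence.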
Robots used
null

Links