VoxConverse

Download

The labels for the development set can be downloaded from here.
The wav files can be downloaded from here:

File	Download	MD5 Checksum
Dev WAV files	Download	`2a6e07e7473d9841abb132554a698a36`
Test WAV files	Download	`834558bbd9b1ffd2d4893181556ceddd`
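
After downloading, each archive can be checked against the MD5 checksums listed above. The following is a minimal sketch in Python and not part of the official release. The archive file names `voxconverse_dev_wav.zip` and `voxconverse_test_wav.zip` and the helper names `md5_of_file` and `verify` are assumptions made for illustration; substitute the names of the files you actually downloaded.

```python
import hashlib
from pathlib import Path

# Checksums copied from the table above.
# The archive file names are assumptions; adjust them to your local files.
EXPECTED_MD5 = {
    "voxconverse_dev_wav.zip": "2a6e07e7473d9841abb132554a698a36",
    "voxconverse_test_wav.zip": "834558bbd9b1ffd2d4893181556ceddd",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_MD5.items():
        archive = Path(name)
        if not archive.exists():
            print(f"{name}: not found, skipping")
            continue
        status = "OK" if verify(archive, expected) else "MISMATCH"
        print(f"{name}: {status}")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be downloaded again.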

License

The VoxConverse dataset is available to download for research purposes under a Creative Commons Attribution 4.0 International License. The copyright remains with the original owners of the video.

In order to obtain videos with a large amount of overlapping speech, we used data consisting of political debates and news segments. The views and opinions expressed by speakers in the dataset are those of the individual speakers and do not necessarily reflect positions of the University of Oxford, Naver Corporation, KAIST or the authors of the paper.

We would also like to note that the distribution of identities in this dataset may not be representative of the global human population. Please be mindful of unintended societal, gender, racial, linguistic and other biases when training or deploying models trained on this data.

Publications

Please cite the following if you make use of the dataset.

Spot the conversation: speaker diarisation in the wild
J. S. Chung*, J. Huh*, A. Nagrani*, T. Afouras, A. Zisserman
Interspeech, 2020
PDF