Speech+Music data

The signal and annotation for music part detection are available!

Speech+Music data for scene detection

The speech data come from the ATR speech database (ATR/SDB) [1], and the popular music comes from the RWC Music Database [2].

Format: WAV
Sampling rate: 22050 [Hz]
Quantization: 16 [bit]
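
To confirm that a downloaded signal file matches the format listed above, a short check along the following lines may help. This is a minimal sketch using Python's standard `wave` module. The function name `check_wav_format` and the file name in the usage note are hypothetical, and the number of channels is only reported, not checked, since it is not stated here.

```python
import wave

EXPECTED_RATE_HZ = 22050    # sampling rate stated in the format list
EXPECTED_SAMPLE_WIDTH = 2   # 16 bit = 2 bytes per sample


def check_wav_format(path):
    """Read the WAV header and compare it with the stated format.

    Returns (ok, info), where info holds the values found in the file.
    """
    with wave.open(path, "rb") as wav:
        info = {
            "rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),  # reported only; not specified above
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    ok = (
        info["rate_hz"] == EXPECTED_RATE_HZ
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH
    )
    return ok, info
```

For example, `ok, info = check_wav_format("signal.wav")` (with `signal.wav` standing in for whatever the downloaded file is called) returns `ok == True` when the header reports 22050 Hz and 16-bit samples.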

Signal is here.
Annotation is here.


[1] A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara and K. Shikano: "ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis", Speech Communication, Vol. 9, No. 4, pp. 357-363 (1990).
[2] M. Goto, H. Hashiguchi, T. Nishimura and R. Oka: "RWC Music Database: Popular, Classical, and Jazz Music Databases", Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR 2002), pp. 287-288 (2002).