Japanese TTS Arena: Benchmarking Japanese TTS Models in the Wild

Vote to help the community find the best available text-to-speech model!

This arena is inspired by and built on TTS Arena.

We are actively maintaining this project. Suggestions via contact/discussion are welcome!

๐Ÿ—ณ๏ธ Vote

  • Enter text (Japanese only) to synthesize audio, or press 🎲 for random text.
  • Listen to the two audio clips, one after the other.
  • Vote on which audio sounds more natural to you.
  • Note: Model names are revealed after the vote is cast.

Note: It may take up to 30 seconds to synthesize audio.

If you use this data in your publication, please cite us!

Copy the BibTeX citation to cite this source:

@misc{tts-arena-ja,
        title        = {Japanese Text to Speech Arena},
        author       = {Kotoba Technologies.},
        year         = 2024,
        publisher    = {Hugging Face},
        howpublished = "\url{https://huggingface.co/spaces/kotoba-tech/TTS-Arena-JA}"
}

Please remember that all generated audio clips should be assumed unsuitable for redistribution or commercial use.