XLS-R: Self-supervised cross-lingual speech representation learning at scale A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu, N Goyal, K Singh, ... arXiv preprint arXiv:2111.09296, 2021 | 604 | 2021 |
Transformer-based acoustic modeling for hybrid speech recognition Y Wang, A Mohamed, D Le, C Liu, A Xiao, J Mahadeokar, H Huang, ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 264 | 2020 |
Listening while speaking: Speech chain by deep learning A Tjandra, S Sakti, S Nakamura 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017 | 209 | 2017 |
Scaling speech technology to 1,000+ languages V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ... Journal of Machine Learning Research 25 (97), 1-52, 2024 | 208 | 2024 |
Compressing recurrent neural network with tensor train A Tjandra, S Sakti, S Nakamura 2017 International Joint Conference on Neural Networks (IJCNN), 4451-4458, 2017 | 143 | 2017 |
VQVAE unsupervised unit discovery and multi-scale code2spec inverter for ZeroSpeech Challenge 2019 A Tjandra, B Sisman, M Zhang, S Sakti, H Li, S Nakamura arXiv preprint arXiv:1905.11449, 2019 | 87 | 2019 |
Machine speech chain with one-shot speaker adaptation A Tjandra, S Sakti, S Nakamura arXiv preprint arXiv:1803.10525, 2018 | 67 | 2018 |
Tensor decomposition for compressing recurrent neural network A Tjandra, S Sakti, S Nakamura 2018 International Joint Conference on Neural Networks (IJCNN), 1-8, 2018 | 60 | 2018 |
Combining depth image and skeleton data from Kinect for recognizing words in the sign system for Indonesian language (SIBI [Sistem Isyarat Bahasa Indonesia]) E Rakun, M Andriani, IW Wiprayoga, K Danniswara, A Tjandra 2013 International Conference on Advanced Computer Science and Information …, 2013 | 57 | 2013 |
Local monotonic attention mechanism for end-to-end speech and language processing A Tjandra, S Sakti, S Nakamura arXiv preprint arXiv:1705.08091, 2017 | 56 | 2017 |
Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks A Tjandra, C Liu, F Zhang, X Zhang, Y Wang, G Synnaeve, S Nakamura, ... arXiv preprint arXiv:1910.10324, 2019 | 50 | 2019 |
Audiobox: Unified audio generation with natural language prompts A Vyas, B Shi, M Le, A Tjandra, YC Wu, B Guo, J Zhang, X Zhang, ... arXiv preprint arXiv:2312.15821, 2023 | 49 | 2023 |
Speech-to-speech translation between untranscribed unknown languages A Tjandra, S Sakti, S Nakamura 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 49 | 2019 |
End-to-end feedback loss in speech chain framework via straight-through estimator A Tjandra, S Sakti, S Nakamura ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 47 | 2019 |
Improved language identification through cross-lingual self-supervised learning A Tjandra, DG Choudhury, F Zhang, K Singh, A Conneau, A Baevski, ... ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 46 | 2022 |
Transformer VQ-VAE for unsupervised unit discovery and speech synthesis: ZeroSpeech 2020 challenge A Tjandra, S Sakti, S Nakamura arXiv preprint arXiv:2005.11676, 2020 | 46 | 2020 |
Machine speech chain A Tjandra, S Sakti, S Nakamura IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 976-989, 2020 | 44 | 2020 |
Sequence-to-sequence ASR optimization via reinforcement learning A Tjandra, S Sakti, S Nakamura 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 36 | 2018 |
Gated recurrent neural tensor network A Tjandra, S Sakti, R Manurung, M Adriani, S Nakamura 2016 International Joint Conference on Neural Networks (IJCNN), 448-455, 2016 | 36 | 2016 |
Speech chain for semi-supervised learning of Japanese-English code-switching ASR and TTS S Nakayama, A Tjandra, S Sakti, S Nakamura 2018 IEEE Spoken Language Technology Workshop (SLT), 182-189, 2018 | 34 | 2018 |