Publications

2020

  • Aswin Sivaraman and Minje Kim, “Sparse Mixture of Local Experts for Efficient Speech Enhancement,” in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), Shanghai, China, October 25-29, 2020. [pdf, demo, code] (accepted for publication)
  • Sanna Wager, George Tzanetakis, Cheng-i Wang, and Minje Kim, “Deep Autotuner: A Pitch Correcting Network for Singing Performances,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020. [pdf, demo, code, presentation video]
  • Sunwoo Kim, Haici Yang, and Minje Kim, “Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020. [pdf, demo, code, presentation video]
    <Finalist for the Best Student Paper Award>
  • Kai Zhen, Mi Suk Lee, Jongmo Sung, Seungkwon Beack, and Minje Kim, “Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020. [pdf, demo, code, presentation video]
  • Kai Zhen, Mi Suk Lee, and Minje Kim, “A Dual-Staged Context Aggregation Method Towards Efficient End-to-End Speech Enhancement,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020. [pdf, demo, presentation video]
  • Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, and Shiva Sundaram, “Fully Learnable Front-end For Multi-channel Acoustic Modeling Using Semi-supervised Learning,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020. [pdf]
  • Qian Lou, Feng Guo, Minje Kim, Lantao Liu, and Lei Jiang, “AutoQ: Automated Kernel-Wise Neural Network Quantization,” in Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia, Apr. 26-30, 2020. [pdf]

2019

  • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, and Minje Kim, “Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding,” in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), Graz, Austria, September 15-19, 2019. [pdf]
  • Geoffrey Fox, James A. Glazier, JCS Kadupitiya, Vikram Jadhao, Minje Kim, Judy Qiu, James P. Sluka, Endre Somogyi, Madhav Marathe, Abhijin Adiga, Jiangzhuo Chen, Oliver Beckstein, and Shantenu Jha, “Learning Everywhere: Pervasive Machine Learning for Effective High-Performance Computation,” in Proceedings of the IEEE International Workshop on High-Performance Big Data, Deep Learning, and Cloud Computing (HPBDC), Rio de Janeiro, Brazil, May 20, 2019. [pdf]
  • Vibhatha Abeykoon, Geoffrey Fox, and Minje Kim, “Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM,” in Proceedings of the High Performance Machine Learning Workshop (HPML), Cyprus, May 14, 2019. [pdf]
  • Sunwoo Kim, Mrinmoy Maity, and Minje Kim, “Incremental Binarization On Recurrent Neural Networks for Single-Channel Source Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 12-17, 2019. [pdf, code]
  • Sanna Wager, George Tzanetakis, Stefan Sullivan, Cheng-i Wang, John Shimmin, Minje Kim, and Perry Cook, “Intonation: a Dataset of Quality Vocal Performances Refined by Spectral Clustering on Pitch Congruence,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 12-17, 2019. [pdf, dataset]

2018

  • Michael Bechtel, Elise McEllhiney, Minje Kim and Heechul Yun, “DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car,” in Proceedings of the 24th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), Hakodate, Japan, Aug. 28-31, 2018. [pdf, code]
  • Sanna Wager and Minje Kim, “Collaborative speech dereverberation: regularized tensor factorization for crowdsourced multi-channel recordings,” in Proceedings of the 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, Sep. 3-7, 2018. [pdf]
  • Matt Setzler, Tyler Marghetis, and Minje Kim, “Creative leaps in musical ecosystems: early warning signals of critical transitions in professional jazz,” in Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci), Madison, WI, Jul. 25-28, 2018. [pdf]
  • Lijiang Guo and Minje Kim, “Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Canada, Apr. 15-20, 2018. [pdf]
  • Minje Kim and Paris Smaragdis, “Bitwise Neural Networks for Efficient Single-Channel Source Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Canada, Apr. 15-20, 2018. [pdf, demo]

2017

  • Lei Jiang, Minje Kim, Wujie Wen, and Danghui Wang, “XNOR-POP: A Processing-in-Memory Architecture for Binary Convolutional Neural Networks in Wide-IO2 DRAMs,” in Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), Taipei, Taiwan, Jul. 24-26, 2017. [pdf]
  • Hongwei Wang, Yunlong Gao, Shaohan Hu, Shiguang Wang, Renato Mancuso, Minje Kim, Poliang Wu, Lu Su, Lui Sha, and Tarek Abdelzaher, “On Exploiting Structured Human Interactions to Enhance Sensing Accuracy in Cyber-physical Systems,” ACM Transactions on Cyber-Physical Systems, vol. 1, no. 3, article 16, pp. 16:1-16:19, Jul. 2017. [pdf]
  • Minje Kim, “Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, LA, Mar. 5-9, 2017. [pdf]
  • Sanna Wager, Liang Chen, Minje Kim, and Christopher Raphael, “Towards Expressive Instrument Synthesis Through Smooth Frame-By-Frame Reconstruction: From String To Woodwind,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, LA, Mar. 5-9, 2017. [pdf, demo]

2016

  • Minje Kim and Paris Smaragdis, “Efficient Neighborhood-Based Topic Modeling for Collaborative Audio Enhancement on Massive Crowdsourced Recordings,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 20-25, 2016. [pdf]

2015

  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis, “Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136-2147, Dec. 2015. [pdf]
  • Minje Kim and Paris Smaragdis, “Adaptive Denoising Autoencoders: A Fine-tuning Scheme to Learn from Test Mixtures,” in Proceedings of the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Liberec, Czech Republic, August 25-28, 2015. [pdf]
    <Finalist for the best student paper on audio signal processing>
  • Minje Kim, Paris Smaragdis, and Gautham J. Mysore, “Efficient Manifold Preserving Audio Source Separation Using Locality Sensitive Hashing,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, Australia, April 19-24, 2015. [pdf]
  • Yunlong Gao, Shaohan Hu, Renato Mancuso, Hongwei Wang, Minje Kim, Poliang Wu, Lu Su, Lui Sha, and Tarek Abdelzaher, “Exploiting Structured Human Interactions to Enhance Estimation Accuracy in Cyber-physical Systems,” in Proceedings of the International Conference on Cyber-Physical Systems (ICCPS), Seattle, WA, April 14-16, 2015. [pdf]
  • Minje Kim and Paris Smaragdis, “Bitwise Neural Networks,” International Conference on Machine Learning (ICML) Workshop on Resource-Efficient Machine Learning, Lille, France, Jul. 6-11, 2015. [pdf]
  • Minje Kim and Paris Smaragdis, “Mixtures of Local Dictionaries for Unsupervised Speech Enhancement,” IEEE Signal Processing Letters, vol. 22, no. 3, pp. 288-292, Mar. 2015. [pdf]
    (Also presented in ICASSP 2015)

2014

  • Minje Kim and Paris Smaragdis, “Collaborative Audio Enhancement: Crowdsourced Audio Recording,” Neural Information Processing Systems (NIPS) Workshop on Crowdsourcing and Machine Learning, Montreal, Canada, Dec. 8-13, 2014. [pdf]
  • Minje Kim and Paris Smaragdis, “Efficient Model Selection for Speech Enhancement Using a Deflation Method for Nonnegative Matrix Factorization,” in Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, December 3-5, 2014. [pdf]
  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis, “Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks,” in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Taipei, Taiwan, Oct. 27-31, 2014. [pdf]
  • Ding Liu, Paris Smaragdis, and Minje Kim, “Experiments on Deep Learning for Speech Denoising,” in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), Singapore, September 14-18, 2014. [pdf]
  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis, “Deep Learning for Monaural Speech Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014. [pdf, demo, bib]
    <Starkey Signal Processing Research Student Grant>
  • Johannes Traa, Minje Kim, and Paris Smaragdis, “Phase and Level Difference Fusion for Robust Multichannel Source Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014. [pdf, bib]

2013

  • Paris Smaragdis and Minje Kim, “Non-Negative Matrix Factorization for Irregularly-Spaced Transforms,” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, Oct. 20-23, 2013. [pdf, bib]
  • Minje Kim and Paris Smaragdis, “Single Channel Source Separation Using Smooth Nonnegative Matrix Factorization with Markov Random Fields,” in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Southampton, UK, Sep. 22-25, 2013. [pdf, bib]
  • Minje Kim and Paris Smaragdis, “Manifold Preserving Hierarchical Topic Models for Quantization and Approximation,” in Proceedings of the International Conference on Machine Learning (ICML), Atlanta, GA, Jun. 16-21, 2013. [pdf, bib]
  • Minje Kim and Paris Smaragdis, “Collaborative Audio Enhancement Using Probabilistic Latent Component Sharing,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, BC, Canada, May 26-31, 2013. [pdf, demo, bib]
    <Google ICASSP Student Travel Grant>
    <Best Student Paper in the Audio and Acoustic Signal Processing (AASP) area>
  • C. Zhang, G.G. Ko, J.W. Choi, S.-N. Tsai, Minje Kim, A.G. Rivera, R. Rutenbar, P. Smaragdis, M.S. Park, V. Narayanan, H. Xin, O. Mutlu, B. Li, L. Zhao, M. Chen, and R. Iyer, “EMERALD: Characterization of Emerging Applications and Algorithms for Low-power Devices,” in Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, TX, Apr. 21-23, 2013. [pdf, bib]

2012

  • Minje Kim, Paris Smaragdis, Glenn G. Ko, and Rob A. Rutenbar, “Stereophonic Spectrogram Segmentation Using Markov Random Fields,” in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Santander, Spain, Sep. 23-26, 2012. [pdf, demo, bib]

2011

  • Seungkwon Beack, Taejin Lee, Minje Kim, and Kyeongok Kang, “An Efficient Time-Frequency Representation for Parametric-Based Audio Object Coding,” ETRI Journal, vol. 33, no. 6, pp. 945-948, Dec. 2011. [pdf, bib]
  • Minje Kim, Jiho Yoo, Kyeongok Kang, and Seungjin Choi, “Nonnegative Matrix Partial Co-Factorization for Spectral and Temporal Drum Source Separation,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 6, pp. 1192-1204, Oct. 2011. [pdf, demo, bib]
  • Minje Kim, Seungkwon Beack, Keunwoo Choi, and Kyeongok Kang, “Gaussian Mixture Model for Singing Voice Separation From Stereophonic Music,” in Proceedings of the Audio Engineering Society 43rd International Conference (AES Conference), Pohang, Korea, Sep. 29 – Oct. 1, 2011. [pdf, demo, bib]

2010

  • Minje Kim, Jiho Yoo, Kyeongok Kang, and Seungjin Choi, “Blind Rhythmic Source Separation: Nonnegativity and Repeatability,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, TX, Mar. 14-19, 2010. [pdf, demo, bib]
  • Jiho Yoo, Minje Kim, Kyeongok Kang, and Seungjin Choi, “Nonnegative Matrix Partial Co-Factorization for Drum Source Separation,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, TX, Mar. 14-19, 2010. [pdf, demo, bib]

2008

  • Minje Kim, Seungkwon Beack, Taejin Lee, Daeyoung Jang, and Kyeongok Kang, “Segmented Dimensionality Reduction Coding on Frequency Domain Signal,” in Proceedings of the Audio Engineering Society 34th International Conference (AES Conference), Jeju Island, Korea, Aug. 28-30, 2008. [pdf, bib]

2007

  • Minje Kim, Minsik Park, Seung-jun Yang, Ji Hoon Choi, and Han-kyu Lee, “System Aspects of TV-Anytime Metadata Codec in a Uni-directional Broadcasting Environment,” in Proceedings of the IEEE International Symposium on Consumer Electronics (ISCE), Dallas, TX, Jun. 20-23, 2007. [pdf, bib]
  • Seung-jun Yang, Jung Won Kang, Dong-San Jun, Minje Kim, and Han-kyu Lee, “TV-Anytime Metadata Authoring Tool for Personalized Broadcasting Services,” in Proceedings of the IEEE International Symposium on Consumer Electronics (ISCE), Dallas, TX, Jun. 20-23, 2007. [pdf, bib]

2006

  • Minje Kim and Seungjin Choi, “ICA-based clustering for resolving permutation ambiguity in frequency-domain convolutive source separation,” in Proceedings of the IEEE International Conference on Pattern Recognition (ICPR), Hong Kong, Aug. 20-24, 2006. [pdf, bib]
  • Minje Kim and Seungjin Choi, “Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-invariance,” in Proceedings of the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), pp. 617-624, Charleston, SC, Mar. 5-8, 2006 (LNCS 3889). [pdf, demo, bib]

2005

  • Minje Kim and Seungjin Choi, “On Spectral Basis Selection for Single Channel Polyphonic Music Separation,” in Proceedings of the International Conference on Artificial Neural Networks (ICANN), Warsaw, Poland, Sep. 11-15, 2005 (LNCS 3697). [pdf, demo, bib]

Ph.D. Dissertation

  • Minje Kim, “Audio Computing in the Wild: Frameworks for Big Data and Small Computers,” Ph.D. Dissertation, Department of Computer Science, University of Illinois at Urbana-Champaign, May 2016. [pdf]

M.S. Thesis

  • Minje Kim, “Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-Invariance,” Master’s Thesis, Department of Computer Science and Engineering, POSTECH, Feb. 2006. [pdf]

Talks, Posters, Other Presentations

  • [Invited Talk] Dept. of Electrical and Computer Engineering, University of Rochester, Rochester, NY, Dec. 11, 2019
  • [Invited Talk] Amazon Lab126, Sunnyvale, CA, Dec. 6, 2019
  • [Talk/Workshop] Midwest Music and Audio Day, Bloomington, IN, May 12, 2018
  • [Poster/workshop] “Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics,” Speech and Audio in the Northeast (SANE) 2018, Oct. 18, 2018
  • [Talk/Workshop] Seventh Annual Midwest Cognitive Science Conference, Bloomington, IN, May 12, 2018
  • [Poster/workshop] U.S. Air Force Science and Technology 2030, Bloomington, IN, May 10, 2018
  • [Invited Talk] Data Science Online Immersion Weekend, Indiana University, Bloomington, IN, Mar. 3, 2018
  • [Invited Talk] Intelligent & Interactive Systems Talk Series, School of Informatics and Computing, Indiana University, Bloomington, IN, Feb. 5, 2018
  • [Poster/workshop] Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Audio, Long Beach, CA, Dec. 8, 2017 (two posters)
  • [Poster/workshop] IEEE EnCON, Indiana University, Bloomington, IN, Nov. 10-11, 2017
  • [Invited Talk/workshop] International Conference on Parallel Architectures and Compilation Techniques (PACT) Workshop on Computational Intelligence and Soft Computing (CISC 2017), Portland, OR, Sep. 10, 2017
  • [Talk/workshop] Midwest Music and Audio Day, Northwestern University, Evanston, IL, Jun. 23, 2017
  • [Talk/workshop] Applied Research Institute Sensor Fusion Workshop, Indiana University, Bloomington, IN, Jun. 2, 2017
  • [Talk/workshop] Indiana University Bloomington/Bielefeld University Cognitive Interaction Technology Workshop, Indiana University, Bloomington, IN, May 17, 2017
  • [Talk/workshop] IBM CIO’s visit to IUB, May 3, 2017
  • [Talk/workshop] Improvising Brain III: Cultural Variation and Analytical Techniques Symposium, Atlanta, GA, Feb. 2017
  • [Invited Talk] Department of Statistics Colloquium Series, Indiana University, Bloomington, IN, Oct. 31, 2016
  • [Invited Talk] Intelligent & Interactive Systems Talk Series, School of Informatics and Computing, Indiana University, Bloomington, IN, Oct. 31, 2016
  • [Invited Talk] Graduate School of Culture Technology, KAIST, Daejeon, Korea, Oct. 7, 2016
  • [Invited Talk] Graduate School of Convergence Science and Technology, Seoul National University, Suwon, Korea, Oct. 6, 2016
  • [Invited Talk] Qualcomm Korea, Seoul, Korea, Oct. 6, 2016

Selected Patents

Out of more than 50 patent applications:

  • “Recurrent multimodal attention system based on expert gated networks,” US Patent App. 16/417,554
  • “Audio Signal Encoding Method and Device, and Audio Signal Decoding Method and Device,” US Patent App. 16/541,959
  • “Audio signal encoding method and apparatus and audio signal decoding method and apparatus using psychoacoustic-based weighted error function,” US Patent App. 16/122,708
  • “Irregular Pattern Identification Using Landmark Based Convolution,” US Patent No. 10,002,622, 2018
  • “Irregularity detection in music,” US Patent No. 9,734,844, 2017
  • “Automatic detection of dense ornamentation in music,” US Patent No. 9,514,722, 2016
  • “Pattern Matching of Sound Data Using Hashing,” US Patent No. 9,449,085, 2016
  • “Multichannel Sound Source Identification and Localization,” US Patent No. 9,351,093, 2016
  • “Sound Data Identification,” US Patent No. 9,215,539, 2015
  • “Method and System for Separating Music Sound Source Using Time and Frequency Characteristics,” US Patent No. 8,563,842, 2013
  • “Method and System for Separating Music Sound Source,” US Patent No. 8,340,943, 2012
  • “Method and system for separating musical sound source without using sound source database,” US Patent No. 8,080,724, 2011