Minguk Kang

I am a Founding Research Scientist at Pika. I received my Ph.D. from the Graduate School of AI at POSTECH in 2026, advised by Prof. Suha Kwak (2023-2026) and Prof. Jaesik Park (2020-2023). Previously, I was a Research Scientist Intern at Adobe Research, where my work on GigaGAN contributed to Adobe Firefly. I received my B.S. from Pusan National University.

At Pika, I have contributed to PikaStream1.0 (a real-time video agent system), an audio-driven performance model, and the video generation models Pika 1.5, 2.0, 2.1, and 2.2. My work spans tokenizers, diffusion distillation, fast super-resolution, and components for real-time video agent systems.

My research focuses on efficient generative modeling for real-time content generation across video, audio, and multimodal media. I am particularly interested in tokenizer design (high compression, low latency, and diffusibility), few-step diffusion distillation, fast super-resolution, and multimodal generative modeling.

Email: minguk@pika.art, mingukkang1994@gmail.com

Education

Feb, 2020 - Feb, 2026	Pohang University of Science and Technology (POSTECH), Pohang, South Korea. Ph.D. in Graduate School of AI. Advisors: Prof. Suha Kwak (2023-2026) and Prof. Jaesik Park (2020-2023). Thesis: Efficient Deep Generative Models for Visual Content Generation.
Mar, 2013 - Aug, 2019	Pusan National University, Pusan, South Korea. B.S. in Engineering (Major: Mechanical Engineering; Minor: Statistics). Graduated summa cum laude, ranked 1st among 394 students in the College of Engineering.

Experience

Nov, 2024 - Present	Pika Labs, South Korea (Remote). Founding Research Scientist. Core contributor to video generation models, an audio-driven performance model, and PikaStream1.0. Developed tokenizers, distillation pipelines, fast super-resolution methods, and real-time video agent system components.
Jun, 2024 - Oct, 2024	Pika Labs, South Korea (Remote) / Palo Alto, USA. Research Scientist Intern. Worked with Chenlin Meng on video generation research.
Jul, 2022 - May, 2024	Adobe Research Creative Intelligence Lab, South Korea (Remote) / San Francisco, USA. Research Scientist Intern. Mentors: Taesung Park, Connelly Barnes, Eli Shechtman, Jun-Yan Zhu, Richard Zhang, and Sylvain Paris. Our GigaGAN (CVPR 2023) contributed to the development of Adobe Firefly.
Feb, 2020 - Feb, 2026	Computer Vision Lab, POSTECH, Pohang, South Korea. Graduate Researcher / Ph.D. Student. Advisors: Prof. Suha Kwak (2023-2026) and Prof. Jaesik Park (2020-2023).
Aug, 2017 - Jan, 2020	Vision and Intelligent System Lab, Pusan National University, Pusan, South Korea. Undergraduate Research Student. Advisor: Prof. Dongjoong Kang.

Products & Software

PikaStream1.0: core contributor to a real-time video agent system for group video chat, focusing on low-latency generation and multimodal capabilities.

Audio-Driven Performance Model: developed generation and acceleration pipelines.

Pika Video Generation Models: contributed to Pika 1.5, Pika 2.0, Pika 2.1, and Pika 2.2, with work on tokenizers, distillation, and fast super-resolution.

Adobe Firefly: Adobe’s visual generative AI product suite; my GigaGAN research contributed to its development.

PyTorch StudioGAN: an open-source PyTorch library for training and evaluating representative GANs.

Publications

FlashDecoder: Real-Time Latent-to-Pixel Streaming Decoder with Transformers

Minguk Kang and Suha Kwak

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

Paper Code

Distilling Diffusion Models into Conditional GANs

Minguk Kang, Richard Zhang, Connelly Barnes, Sylvain Paris, Suha Kwak, Jaesik Park, Eli Shechtman, Jun-Yan Zhu, and Taesung Park

European Conference on Computer Vision (ECCV), 2024

Paper Code Website

Extending CLIP’s Image-Text Alignment to Referring Image Segmentation

Seoyeon Kim, Minguk Kang, Dongwon Kim, Jaesik Park, and Suha Kwak

Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Paper Code

Fill-Up: Balancing Long-Tailed Data with Generative Models

Joonghyuk Shin, Minguk Kang, and Jaesik Park

arXiv preprint arXiv:2306.07200, 2023

Paper Website

StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis

Minguk Kang, Joonghyuk Shin, and Jaesik Park

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Paper Code

Holistic Evaluation of Text-to-Image Models

Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Benita Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, and Percy Liang

Advances in Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, Spotlight, 2023

Paper Website

Scaling up GANs for Text-to-Image Synthesis

Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, and Taesung Park

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Highlight (top 2.5% among 9,155 submissions), 2023

Paper Website

Context-Aware Image Completion

Jinoh Cho, Minguk Kang, Vibhav Vineet, and Jaesik Park

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, 2023

Paper

Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training

Minguk Kang, Woohyeon Shim, Minsu Cho, and Jaesik Park

Advances in Neural Information Processing Systems (NeurIPS), 2021

Paper Code

ContraGAN: Contrastive Learning for Conditional Image Generation

Minguk Kang and Jaesik Park

Advances in Neural Information Processing Systems (NeurIPS), 2020

Paper Code