Real-Time High-Resolution Background Matting | Paper Review
This post is a summary of the paper by Lin et al. (2020), in which they propose a two-stage deep neural network model for real-time segmentation of subjects from the background.
It's pretty remarkable what has been accomplished here. Future work might benefit from:
- Efforts to further optimize for latency while maintaining fine-grained segmentation (e.g. using residual U-blocks as proposed in the U2Net paper). This might lead to usable FPS values on commodity CPU machines.
- Efforts to optimize for usability by eliminating the background image requirement. For example, by reframing the ML problem, we could train the network to jointly predict the background (image completion) in addition to the alpha matte values, and leverage this knowledge to predict better matte values. Ideally, this formulation would use background images (as labels) during training but not require them during inference.
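To make the second idea a bit more concrete, here is a minimal sketch of what a joint training objective could look like. It is purely illustrative and not from the paper: the function names, the choice of an L1 penalty, and the `background_weight` parameter are all assumptions of mine. Pixels are represented as flat lists of floats to keep the example dependency-free.

```python
from typing import Sequence


def l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equally sized flat pixel lists."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def joint_matting_loss(
    pred_alpha: Sequence[float],
    pred_background: Sequence[float],
    true_alpha: Sequence[float],
    true_background: Sequence[float],
    background_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Hypothetical joint loss: alpha matte error plus weighted background error.

    The captured background appears only as a label (true_background),
    so a network trained this way would not need it as an input at inference.
    """
    alpha_term = l1(pred_alpha, true_alpha)
    background_term = l1(pred_background, true_background)
    return alpha_term + background_weight * background_term


# Toy two-pixel example with made-up values.
example_loss = joint_matting_loss(
    pred_alpha=[0.9, 0.1],
    pred_background=[0.2, 0.4],
    true_alpha=[1.0, 0.0],
    true_background=[0.2, 0.6],
)
```

The point of the sketch is only that the background shows up on the label side of the loss. Whether the auxiliary background term actually improves the matte is an open question that would need experiments.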
Overall, a well-written paper and a well-produced video explaining the work.
I will be updating this post as I experiment with the model itself.
- Lin, S., Ryabtsev, A., Sengupta, S., Curless, B., Seitz, S., & Kemelmacher-Shlizerman, I. (2020). Real-Time High-Resolution Background Matting. arXiv preprint arXiv:2012.07810. CVPR 2021.