Real-Time High-Resolution Background Matting | Paper Review

This post is a summary of the paper by Lin et al. (2020) [1], in which they propose a two-stage deep neural network model for real-time segmentation of subjects from the background.

The technique employed is based on background matting, where an additional frame of the background is captured and used in recovering the alpha matte and the foreground layer. Source: Lin et al 2020.
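
To make the roles of the alpha matte and the foreground layer concrete, here is a minimal sketch of the standard compositing equation that matting inverts, I = alpha * F + (1 - alpha) * B. It is not code from the paper: the function names `composite_pixel` and `composite` are made up for illustration, images are assumed to be plain nested lists of RGB tuples with values in [0, 1], and a real pipeline would use an array library instead.

```python
def composite_pixel(alpha, fg, bg):
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def composite(alpha, fg, bg):
    """Composite whole images given as nested lists indexed [row][col].

    alpha holds one float per pixel; fg and bg hold one RGB tuple per pixel.
    """
    return [
        [composite_pixel(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, fg, bg)
    ]


# Hypothetical 1x2 image: an opaque red subject pixel and a half-transparent one,
# placed over a new blue background.
alpha = [[1.0, 0.5]]
foreground = [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
new_background = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
result = composite(alpha, foreground, new_background)
```

Once a model has recovered `alpha` and the foreground, swapping in any new background is just this blend, which is why errors in the matte around hair and edges are so visible.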

Review TL;DR

It's pretty remarkable what has been accomplished here. Future work might benefit from:

  • Efforts to further optimize for latency while maintaining fine-grained segmentation (e.g. using residual U-blocks as proposed in the U2Net paper). This might lead to usable FPS values on commodity CPU machines.
  • Efforts to optimize for usability by eliminating the background image requirement. E.g. by reframing the ML problem, we can train the network to jointly predict the background (image completion) in addition to the alpha matte values, and leverage this knowledge in predicting better matte values. Ideally, this formulation would use background images (as labels) during training but not require them during inference.
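
As a rough illustration of the second suggestion, the joint formulation could be trained with a combined objective: one term for the alpha matte and one for the predicted background, with the captured background serving only as a training label. The sketch below is my own assumption, not something from the paper; the names `l1`, `joint_loss`, and `bg_weight` are hypothetical, and predictions are flattened to plain lists of floats to keep it self-contained.

```python
def l1(pred, target):
    """Mean absolute error between two equal-length flat sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def joint_loss(alpha_pred, alpha_gt, bg_pred, bg_gt, bg_weight=0.5):
    """Hypothetical multi-task loss: matte error plus weighted background error.

    bg_gt (the captured background) is needed only here, at training time.
    At inference the network would take the input frame alone.
    """
    return l1(alpha_pred, alpha_gt) + bg_weight * l1(bg_pred, bg_gt)


# Toy values: a perfect matte with a poor background guess still incurs a loss.
loss = joint_loss(
    alpha_pred=[0.0, 1.0],
    alpha_gt=[0.0, 1.0],
    bg_pred=[1.0, 1.0],
    bg_gt=[0.0, 0.0],
)
```

The weight on the background term is a free design choice here; whether this auxiliary task actually improves matte quality would need to be tested experimentally.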

Overall, a well-written paper and a well-produced video explaining their work.

I will be updating this post as I experiment with the model itself.


  1. Lin, S., Ryabtsev, A., Sengupta, S., Curless, B., Seitz, S., & Kemelmacher-Shlizerman, I. (2020). Real-Time High-Resolution Background Matting. arXiv preprint arXiv:2012.07810. CVPR 2021.