
Real-Time High-Resolution Background Matting | Paper Review

A review of the CVPR 2021 paper "Real-Time High-Resolution Background Matting".

This post is a summary of the paper by Lin et al. (2020) [1], which proposes a two-stage deep neural network for real-time segmentation of subjects from their background.

The technique employed is based on background matting, where an additional frame of the background is captured and used in recovering the alpha matte and the foreground layer. Source: Lin et al 2020.
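To make the terms concrete: matting is usually framed around the compositing equation I = αF + (1 − α)B, where α is the per-pixel alpha matte, F the foreground layer, and B the background. The sketch below is a minimal NumPy illustration of that standard equation, not code from the paper; the function name `composite` and the toy pixel values are assumptions made for this example.

```python
import numpy as np


def composite(alpha, foreground, background):
    """Blend a foreground over a background: I = alpha * F + (1 - alpha) * B.

    Hypothetical helper for illustration only. Values are assumed to be
    floats in [0, 1]; alpha has shape (H, W), the images (H, W, 3).
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    foreground = np.asarray(foreground, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    if alpha.ndim == foreground.ndim - 1:
        # Add a channel axis so one alpha value applies to all colour channels.
        alpha = alpha[..., np.newaxis]
    return alpha * foreground + (1.0 - alpha) * background


# Toy 1x2 RGB image: the left pixel is fully foreground, the right mostly background.
alpha = np.array([[1.0, 0.25]])
foreground = np.full((1, 2, 3), 0.8)
new_background = np.zeros((1, 2, 3))

result = composite(alpha, foreground, new_background)
```

Once the alpha matte and foreground have been recovered, the same equation is what lets the subject be placed over any replacement background.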

Review TL;DR

It's pretty remarkable what has been accomplished here. Future work might benefit from:

  • Further optimizing for latency while maintaining fine-grained segmentation, e.g. by using residual U-blocks as proposed in the U2Net paper. This might lead to usable FPS values on commodity CPU machines.
  • Optimizing for usability by eliminating the background image requirement. For example, by reframing the ML problem, the network could be trained to jointly predict the background (image completion) in addition to the alpha matte values, and to leverage this knowledge in predicting better matte values. Ideally, this formulation would use background images (as labels) during training but not require them during inference.
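As a rough illustration of the second suggestion, a joint objective could add a background-reconstruction term to the usual matte loss. This is only a sketch of my proposal, not something from the paper: the function name `joint_loss`, the choice of L1 for both terms, and the weight `bg_weight` are all assumptions.

```python
import numpy as np


def joint_loss(alpha_pred, alpha_true, bg_pred, bg_true, bg_weight=0.5):
    """Hypothetical joint objective: L1 matte loss + weighted L1 background loss.

    The captured background is used only as a training label (bg_true);
    at inference time the network would predict bg_pred on its own.
    """
    alpha_term = np.mean(np.abs(np.asarray(alpha_pred) - np.asarray(alpha_true)))
    bg_term = np.mean(np.abs(np.asarray(bg_pred) - np.asarray(bg_true)))
    return alpha_term + bg_weight * bg_term


# Toy values: a 1x2 matte and a 1x2 RGB background.
alpha_pred = np.array([[0.9, 0.1]])
alpha_true = np.array([[1.0, 0.0]])
bg_pred = np.zeros((1, 2, 3))
bg_true = np.full((1, 2, 3), 0.2)

loss = joint_loss(alpha_pred, alpha_true, bg_pred, bg_true)
```

Whether such an auxiliary term actually improves the predicted matte is an open question that would need to be tested experimentally.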

Overall, a well-written paper and a well-produced video explaining the work.

I will be updating this post as I experiment with the model itself.

References


  1. Lin, S., Ryabtsev, A., Sengupta, S., Curless, B., Seitz, S., & Kemelmacher-Shlizerman, I. (2020). Real-Time High-Resolution Background Matting. arXiv preprint arXiv:2012.07810. CVPR 2021.