Real-Time High-Resolution Background Matting | Paper Review
A review of the CVPR 2021 paper "Real-Time High-Resolution Background Matting".
This post summarizes the paper by Lin et al. 2020[^1], which proposes a two-stage deep neural network for real-time segmentation of subjects from their background.
Review TL;DR
It's pretty remarkable what has been accomplished here. Future work might benefit from:
Efforts to further optimize for latency while
maintaining fine-grained segmentation (e.g. using residual U-blocks as proposed
in the U2Net paper). This might lead to usable
FPS values on commodity CPU machines.
Efforts to optimize for usability by eliminating the background image requirement.
For example, by reframing the ML problem, we could train the network to jointly predict
the background (image completion) in addition to the alpha matte values, and
leverage this knowledge to predict better matte values. Ideally, this formulation
would use background images (as labels) during training but not require them
during inference.
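
To make the joint-prediction idea concrete, here is a minimal sketch of what such a training loss might look like. This is a toy illustration, not the paper's method: the function names (`l1`, `composite`, `joint_matting_loss`), the `background_weight` parameter, and the use of flat lists of grayscale pixel values are all assumptions made for brevity. The only standard ingredient is the matting equation `I = alpha * F + (1 - alpha) * B`.

```python
def l1(pred, target):
    """Mean absolute error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def composite(alpha, foreground, background):
    """Per-pixel matting equation: I = alpha * F + (1 - alpha) * B."""
    return [a * f + (1.0 - a) * b
            for a, f, b in zip(alpha, foreground, background)]


def joint_matting_loss(pred_alpha, pred_background,
                       true_alpha, true_background,
                       foreground, image,
                       background_weight=0.5):
    """Hypothetical joint loss: supervise alpha and background together.

    The captured background is used only as a label here, so a network
    trained this way would not need it as an input at inference time.
    """
    # Supervise the alpha matte directly.
    alpha_loss = l1(pred_alpha, true_alpha)
    # Supervise the predicted (completed) background against the real one.
    background_loss = l1(pred_background, true_background)
    # Tie both predictions together: recompositing should reproduce the input.
    reconstruction = composite(pred_alpha, foreground, pred_background)
    reconstruction_loss = l1(reconstruction, image)
    return alpha_loss + background_weight * background_loss + reconstruction_loss


# Toy two-pixel example: one fully foreground pixel, one fully background pixel.
true_alpha = [1.0, 0.0]
true_background = [0.2, 0.4]
foreground = [0.8, 0.6]
image = composite(true_alpha, foreground, true_background)

perfect = joint_matting_loss(true_alpha, true_background,
                             true_alpha, true_background, foreground, image)
imperfect = joint_matting_loss([0.5, 0.0], true_background,
                               true_alpha, true_background, foreground, image)
```

The reconstruction term is what lets the background prediction inform the matte: an alpha error that is cheap under the direct alpha loss still gets penalized if the recomposited image drifts from the input. In a real model these terms would be computed on image tensors in a deep learning framework, and the weighting would need tuning.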
Overall, a well-written paper and a well-produced video explaining the work.
I will be updating this post as I experiment with the model itself.