Benjamin Planche

I am a passionate senior research scientist at UII America in Cambridge, MA. There, I develop novel computer vision and machine learning solutions, focusing on data scarcity problems for a variety of clinical scenarios.

I obtained my PhD summa cum laude from the Faculty of Computer Science and Mathematics at the University of Passau, under the supervision of Prof. Dr. Harald Kosch.

I have worked in various research labs around the world (LIRIS in France, Mitsubishi Electric in Japan, Siemens in Germany and the US, etc.). I have a double master's degree from INSA-Lyon (France) and the University of Passau (Germany), with first-class honors and a multinational excellence award. I co-authored a book on applied computer vision. I also share my knowledge and experience on online platforms such as StackOverflow, and apply them to the creation of aesthetic demos. I love visual art.

Email  /  Resume  /  Google Scholar  /  StackOverflow  /  LinkedIn


We have several openings (internship & full-time). More details and application here.



I am interested in computer vision, domain adaptation, image rendering, inverse problems, and photography. Much of my research is about training robust recognition systems on scarce data and bridging the gap between real and synthetic modalities.

Paper Teaser - Result from our Method

Progressive Multi-view Human Mesh Recovery with Self-Supervision

Xuan Gong, Liangchen Song, Meng Zheng, Benjamin Planche, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu
AAAI Conference on Artificial Intelligence (AAAI), 2023 [oral]

We propose a novel simulation-based training pipeline for multi-view human mesh recovery, which (a) relies on intermediate 2D representations that are more robust to the synthetic-to-real domain gap; (b) leverages learnable calibration and triangulation to adapt to more diverse camera setups; and (c) progressively aggregates multi-view information in a canonical 3D space to remove ambiguities in 2D representations.

Paper Teaser - Results of our sky coverage prediction module

Vision on the Bog: Cranberry Crop Risk Evaluation with Deep Learning

Peri Akiva, Benjamin Planche, Aditi Roy, Peter Oudemans, Kristin Dana
Computers and Electronics in Agriculture, 2022

Our goal is to develop state-of-the-art computer vision algorithms for image-based crop evaluation and weather-related risk assessment to support real-time decision-making for growers. Our cranberry bog monitoring system maps cranberry density (based on fruit instance segmentation) and predicts short-term cranberry internal temperatures (predicting solar irradiation and fruit temperature in an end-to-end differentiable network).

Paper Teaser - Proposed Federated Learning Pipeline

Federated Learning with Privacy-Preserving Ensemble Attention Distillation

Xuan Gong, Liangchen Song, Rishi Vedula, Abhishek Sharma, Meng Zheng, Benjamin Planche, Arun Innanje, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu
IEEE Transactions on Medical Imaging (TMI), 2022

We propose a privacy-preserving FL framework leveraging unlabeled public data for one-way offline knowledge distillation. The central model is learned from local knowledge via ensemble attention distillation. Our technique uses decentralized and heterogeneous local data like existing FL approaches, but more importantly, it significantly reduces the risk of privacy leakage. We demonstrate that our method achieves very competitive performance with more robust privacy preservation based on extensive experiments on image classification, segmentation, and reconstruction tasks.

Paper Teaser - Dense Motion Modeling from Proposed Method

PREF: Predictability Regularized Neural Motion Fields

Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu
European Conference on Computer Vision (ECCV), 2022 [oral]

We leverage a neural motion field for estimating the motion of all points in a multiview setting. Modeling the motion from a dynamic scene with multiview data is challenging due to the ambiguities in points of similar color and points with time-varying color. We propose to regularize the estimated motion to be predictable. If the motion from previous frames is known, then the motion in the near future should be predictable. Therefore, we introduce a predictability regularization by first conditioning the estimated motion on latent embeddings, then by adopting a predictor network to enforce predictability on the embeddings. [project webpage]

Paper Teaser - Diagram Of Proposed Cross-Representation Alignment

Self-supervised Human Mesh Recovery with Cross-Representation Alignment

Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu
European Conference on Computer Vision (ECCV), 2022

We propose cross-representation alignment, which exploits the complementary information from a robust but sparse representation (2D keypoints) and a richer, dense one. Specifically, the alignment errors between the initial mesh estimation and both 2D representations are forwarded to the regressor and dynamically corrected in the subsequent mesh regression. This adaptive cross-representation alignment explicitly learns from the deviations and captures complementary information: robustness from the sparse representation and richness from the dense representation.

Paper Teaser - Results of Proposed Method on Brain MRI

PseudoClick: Interactive Image Segmentation with Click Imitation

Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, Ziyan Wu
European Conference on Computer Vision (ECCV), 2022

We ask the question: can our model directly predict where to click, so as to further reduce the user interaction cost? To this end, we propose PseudoClick, a generic framework that enables existing segmentation networks to propose candidate next clicks. These automatically generated clicks, termed pseudo clicks in this work, serve as an imitation of human clicks to refine the segmentation mask. We build PseudoClick on existing segmentation backbones and show how our click prediction mechanism leads to improved performance.

Paper Teaser - Results of Proposed Methods for Patients under Cover

Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion

Meng Zheng, Xuan Gong, Benjamin Planche, Fan Yang, Ziyan Wu
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2022 [early accept]

We propose a generic, modularized 3D patient modeling method consisting of (a) a multi-modal keypoint detection module with attentive fusion for 2D patient joint localization, which learns complementary cross-modality patient body information, leading to improved keypoint localization robustness and generalizability in a wide variety of imaging and clinical scenarios; and (b) a self-supervised 3D mesh regression module which does not require expensive 3D mesh parameter annotations to train, bringing immediate cost benefits for clinical deployment.

Paper Teaser - Prediction of In-Vivo Organ Deformation Under New Patient Poses

SMPL-A: Modeling Person-Specific Deformable Anatomy

Hengtao Guo, Benjamin Planche, Meng Zheng, Srikrishna Karanam, Terrence Chen, Ziyan Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

We present the first learning-based approach to estimate the patient's internal organ deformation for arbitrary human poses in order to assist with radiotherapy and similar medical protocols. The underlying method first leverages medical scans to learn a patient-specific representation that potentially encodes the organ's shape and elastic properties. During inference, given the patient's current body pose information and the organ's representation extracted from previous medical scans, our method can estimate their current organ deformation to offer guidance to clinicians.

Paper Teaser - Toy Application of DDS to Inverse Problem

Physics-based Differentiable Depth Sensor Simulation

Benjamin Planche, Rajat Vikram Singh
IEEE/CVF International Conference on Computer Vision (ICCV), 2021

We introduce DDS, a novel end-to-end differentiable simulation pipeline for the generation of realistic depth scans, built on physics-based 3D rendering and custom block-matching algorithms. Each module can be differentiated w.r.t. sensor and scene parameters; e.g., to automatically tune the simulation for new devices over some provided scans or to leverage the pipeline as a 3D-to-2.5D transformer within larger computer-vision applications. [full version, with sup-mat]

Paper Teaser - Results of our sky coverage prediction module

AI on the Bog: Monitoring and Evaluating Cranberry Crop Risk

Peri Akiva (EC), Benjamin Planche (EC), Aditi Roy, Kristin Dana, Peter Oudemans, Michael Mars
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021 (EC: equal contribution)

We propose an end-to-end cranberry health monitoring system to enable and support real-time cranberry over-heating assessment and facilitate informed decisions that may sustain the economic viability of farms. Our system performs (1) cranberry fruit segmentation to delineate fruit regions that are exposed to the sun, and (2) prediction of cloud coverage and sun irradiance to estimate the inner temperature of exposed cranberries.

Paper Teaser - Pipeline for Incremental Scene Synthesis

Incremental Scene Synthesis

Benjamin Planche, Xuejian Rong, Ziyan Wu, Srikrishna Karanam, Harald Kosch, YingLi Tian, Jan Ernst, Andreas Hutter
Annual Conference on Neural Information Processing Systems (NeurIPS), 2019

We present a method to incrementally generate complete 2D or 3D scenes. Our framework can register observations from a non-localized agent in a global representation, which can be used to synthesize new views as well as fill in gaps in the representation while observing global consistency.

Paper Teaser - Pipeline and Results for Reverse Domain Adaptation

Seeing Beyond Appearance - Mapping Real Images into Geometrical Domains for Unsupervised CAD-based Recognition

Benjamin Planche (EC), Sergey Zakharov (EC), Ziyan Wu, Harald Kosch, Andreas Hutter, Slobodan Ilic
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019 (EC: equal contribution)

Tackling real/synthetic domain adaptation from a different angle, we introduce a pipeline to map unseen target samples into the synthetic domain used to train task-specific methods. Denoising the data and retaining only the features these recognition algorithms are familiar with, our solution greatly improves their performance.

Paper Teaser - Pipeline and Results for Reverse Domain Adaptation

Keep it Unreal: Bridging the Realism Gap for 2.5D Recognition with Geometry Priors Only

Sergey Zakharov (EC), Benjamin Planche (EC), Ziyan Wu, Harald Kosch, Andreas Hutter, Slobodan Ilic
International Conference on 3D Vision (3DV), 2018 [oral] (EC: equal contribution)

We propose a novel approach leveraging only CAD models to bridge the realism gap for depth images. Purely trained on synthetic data, playing against an extensive augmentation pipeline in an unsupervised manner, our GAN learns to effectively segment depth images and recover the clean synthetic-looking depth information even from partial occlusions.

Paper Teaser - Pipeline and Results for Depth Sensor Simulation

DepthSynth: Real-Time Realistic Synthetic Data Generation from CAD Models for 2.5D Recognition

Benjamin Planche, Ziyan Wu, Kai Ma, Shanhui Sun, Stefan Kluckner, Oliver Lehmann, Terrence Chen, Andreas Hutter, Sergey Zakharov, Harald Kosch, Jan Ernst
International Conference on 3D Vision (3DV), 2017 [oral]

We present an end-to-end framework which simulates the whole mechanism of depth sensors, generating realistic depth data from 3D models by comprehensively modeling vital factors, e.g., sensor noise, material reflectance, surface geometry. Our solution covers a wider range of devices and achieves more realistic results than previous methods.

Paper Teaser - Illustration of Triplets and Pairs

3D Object Instance Recognition and Pose Estimation Using Triplet Loss with Dynamic Margin

Sergey Zakharov, Wadim Kehl, Benjamin Planche, Andreas Hutter, Slobodan Ilic
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017

Inspired by the descriptor learning approach of Wohlhart et al. [link], we propose a method that introduces a dynamic margin into the triplet loss function used for manifold learning. The dynamic margin allows for faster training and better accuracy of the resulting low-dimensional manifolds.
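
As a rough illustration of the dynamic-margin idea, here is a minimal sketch in plain Python. It is not the paper's implementation: the margin rule (the angle between two views of the same object, a large constant for different objects), the ratio-style loss term, and all function names are assumptions made for this example.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dynamic_margin(same_object, view_angle, other_object_margin=2 * math.pi):
    """Hypothetical margin rule (an assumption for this sketch).

    Same object: the margin grows with the angle (in radians) between the
    anchor view and the pusher view. Different objects: a large constant.
    """
    return view_angle if same_object else other_object_margin


def triplet_term(anchor, puller, pusher, margin, eps=1e-12):
    """Ratio-style triplet term with a per-triplet margin.

    The term is zero once the pusher is far enough from the anchor,
    relative to the puller distance plus the margin.
    """
    d_pull = squared_distance(anchor, puller)
    d_push = squared_distance(anchor, pusher)
    return max(0.0, 1.0 - d_push / (d_pull + margin + eps))


# Toy descriptors: the pusher is only slightly farther than the puller.
anchor, puller, pusher = [0.0, 0.0], [0.0, 1.0], [1.0, 1.0]

# Nearby view of the same object: small margin, no penalty.
loss_similar_view = triplet_term(anchor, puller, pusher, dynamic_margin(True, 0.5))

# Different object: large margin, so the same geometry is still penalized.
loss_other_object = triplet_term(anchor, puller, pusher, dynamic_margin(False, 0.0))
```

With a fixed margin, both triplets above would be treated identically; letting the margin depend on the pair is what pushes descriptors of different objects further apart than descriptors of neighboring views.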

Paper Teaser - Pipeline and Results for Reverse Domain Adaptation

The Brightnest Web-Based Home Automation System

Benjamin Planche (EC), Bryan Isaac Malyn (EC), Daniel Buldon Blanco (EC), Manuel Cerrillo Bermejo (EC)
International Conference on Ubiquitous Computing and Ambient Intelligence (UCAmI), 2014 [oral] (EC: equal contribution)

Brightnest is a generic and user-friendly web-based Home Automation System. Its interface provides users with information on the whole system or with control over the devices and their rules. The modular architecture is based on "JS Drivers", their REST API imitating the way a computer usually handles new devices.


A few years ago, I got the opportunity to co-author a book teaching how to leverage deep learning to create powerful image processing apps with TensorFlow 2.0 and Keras. While some technical examples in the book are now a bit outdated (with regard to the TensorFlow API), the book also covers the foundations of deep learning, illustrated with publicly available code examples (see GitHub link below).

Book Cover - Hands-On Computer Vision With TensorFlow 2

Hands-On Computer Vision With TensorFlow 2

Benjamin Planche and Eliot Andres
Packt Publishing, 2019

Computer vision solutions are becoming increasingly common, making their way into fields such as health, automobile, social media, and robotics. With the release of TensorFlow 2, the brand-new version of Google's open-source framework for machine learning, it is the perfect time to jump on board and start leveraging deep learning for your visual applications!
This book is a practical guide to building high performance systems for object detection, segmentation, video processing, smartphone applications, and more. By its end, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0. [Amazon | Packt | GitHub]