I build perception and learning systems for robotics - from grasp detection and occlusion-robust vision to vision-language models, validated end-to-end on physical robots.
I am a second-year PhD student in Robotics Engineering at WPI, working in the Manipulation and Environmental Robotics Lab with Prof. Berk Calli.
Watching Yamaha's Motobot struggle against Valentino Rossi got me thinking about the gap between perceiving an environment and making the right decision in it - especially in conditions the system wasn't built for. That question has shaped most of my research.
I came to robotics through Electronics Engineering in Pune, India, which means I think about systems from the hardware up. My work spans robot perception, deep learning for manipulation, and vision-language reasoning - all validated on real robots, not just in simulation.
Outside the lab I am a competitive swimmer and a die-hard soccer fan (Glory Glory Man United!). The early-morning training discipline has been useful, especially when a model training run fails at 3 a.m.
I am looking for internship roles in robot perception, computer vision, deep learning, and physical AI - with teams building systems that need to work when the situation changes.
Real-time perception stack for an encoderless Franka Panda. WGAN-GP inpainting reconstructs occluded joints from markerless images; keypoint detection with UKF temporal smoothing feeds Image-Based Visual Servoing, enabling stable control under severe occlusion. Outperforms LaMa (FID 11.7, precision 0.91).
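A minimal sketch of the temporal-smoothing step, assuming a constant-velocity pixel-space state and filterpy's UKF; the state layout, noise magnitudes, and the stand-in `detections` array are illustrative placeholders, not the project's actual parameters.

```python
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

def fx(x, dt):
    # constant-velocity motion model over state [u, v, du, dv]
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])
    return F @ x

def hx(x):
    # the detector observes only the pixel position (u, v)
    return x[:2]

points = MerweScaledSigmaPoints(n=4, alpha=0.1, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=4, dim_z=2, dt=1 / 30, fx=fx, hx=hx,
                            points=points)
ukf.x = np.array([320.0, 240.0, 0.0, 0.0])  # initial keypoint guess
ukf.R = np.eye(2) * 4.0    # detector noise (pixels^2), illustrative
ukf.Q = np.eye(4) * 0.01   # process noise, illustrative

# stand-in per-frame (u, v) keypoint detections from the network
detections = np.array([[320.0, 240.0], [322.1, 241.3], [323.9, 242.8]])

for z in detections:
    ukf.predict()
    ukf.update(z)
    smoothed_uv = ukf.x[:2]  # smoothed keypoint fed to the IBVS control law
```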
Feature-level fusion across multiple grasp expert networks. Complementarity quantified via Q-statistics and error correlation, showing that mid-accuracy, moderately correlated ensembles outperform SOTA individual models. Evaluated on Cornell, Jacquard, and GraspNet-1B, and on a physical Franka Panda system.
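For the complementarity measurement, a hedged sketch of the pairwise Q-statistic (Yule's Q) over per-sample correctness masks; the expert masks below are made-up stand-ins for real grasp-success outcomes, not results from the paper.

```python
import numpy as np

def q_statistic(correct_a: np.ndarray, correct_b: np.ndarray) -> float:
    """Yule's Q-statistic between two experts' per-sample correctness masks.

    Q -> 1 when the experts fail on the same samples; Q near 0 or negative
    indicates complementary errors, which is what an ensemble wants.
    Assumes the denominator is nonzero (experts disagree on something).
    """
    a, b = correct_a.astype(bool), correct_b.astype(bool)
    n11 = np.sum(a & b)      # both correct
    n00 = np.sum(~a & ~b)    # both wrong
    n10 = np.sum(a & ~b)     # only A correct
    n01 = np.sum(~a & b)     # only B correct
    return float((n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10))

# hypothetical per-sample success masks for two expert networks
expert_a = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
expert_b = np.array([1, 0, 1, 1, 1, 0, 1, 0], dtype=bool)
print(f"Q = {q_statistic(expert_a, expert_b):.2f}")  # ~ -0.14: complementary
```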
Proposal-conditioned VLA framework for cluttered tabletop grasping. RGB-D Mask R-CNN generates object proposals; confidence-based skill selection chooses the best planner across known, unknown, and occluded multi-object scenarios. Deployed end-to-end on a Yale OpenHand gripper.
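A sketch of the confidence-based skill-selection idea under stated assumptions: `GraspSkill`, the proposal dict, and the 0.5 threshold are hypothetical names and values for illustration, not the framework's actual interface.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

@dataclass
class GraspSkill:
    """One grasp planner plus a scorer for its confidence on a proposal."""
    name: str
    confidence: Callable[[dict], float]  # proposal -> score in [0, 1]
    plan: Callable[[dict], np.ndarray]   # proposal -> grasp pose

def select_and_plan(skills: list[GraspSkill], proposal: dict,
                    min_conf: float = 0.5):
    """Pick the planner most confident on this object proposal.

    Returns (skill name, grasp pose), or None when no skill clears the
    threshold - e.g. a heavily occluded object that should be deferred
    or re-viewed before grasping.
    """
    scored = [(skill.confidence(proposal), skill) for skill in skills]
    best_conf, best_skill = max(scored, key=lambda s: s[0])
    if best_conf < min_conf:
        return None  # no planner trusts this scene; defer
    return best_skill.name, best_skill.plan(proposal)
```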
Real-time fusion of RGB-D, EEG, and audio for human activity tracking and attention estimation. Led a three-person team to full deployment, integrating OpenPose, temporal attention, GMMs, and optical flow into a full-stack GUI - not just a research prototype.
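One reading of the GMM stage, as a rough sketch: fit a mixture over fused per-frame features and treat component posteriors as a soft activity/attention signal. The feature dimensions, component count, and random features here are invented for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# hypothetical fused feature vectors: one row per time step, columns mixing
# pose keypoints (RGB-D), EEG band power, and audio energy, time-aligned
features = np.random.default_rng(0).normal(size=(500, 12))

# cluster time steps into latent activity states with a GMM
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(features)

# per-frame posterior over states doubles as a soft attention/activity signal
state_posteriors = gmm.predict_proba(features)   # shape (500, 3)
dominant_state = state_posteriors.argmax(axis=1)
```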
Coursework and independent projects spanning 3D vision, SLAM, and robotics systems.

Enhanced NeRF with improved rendering and sampling for photorealistic novel view synthesis.

Full SfM pipeline with epipolar geometry and bundle adjustment for monocular depth.

3.2% IoU improvement via voxel grid filtering and bird's-eye-view projection with PointNet++.

VIO at 29 FPS via stereo camera + IMU fusion for real-time 6-DOF pose estimation.

RRT-APF hybrid planner for dynamic environments with moving obstacles.

Multi-view calibration with robust homography estimation and distortion correction.
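For the calibration project above, a minimal sketch of the robust homography step using OpenCV's RANSAC estimator; the point sets are synthetic stand-ins for real feature matches.

```python
import cv2
import numpy as np

# synthetic matched pixel coordinates between two views (stand-ins for
# real feature matches from e.g. SIFT or ORB)
rng = np.random.default_rng(1)
pts_src = rng.uniform(0, 640, size=(50, 1, 2)).astype(np.float32)
pts_dst = pts_src + rng.normal(0, 0.5, size=pts_src.shape).astype(np.float32)

# RANSAC rejects outlier correspondences before the final fit
H, inlier_mask = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC,
                                    ransacReprojThreshold=3.0)
print("inliers:", int(inlier_mask.sum()), "of", len(pts_src))
```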
Looking for research and engineering internship roles in robot perception, computer vision, deep learning, and physical AI. If your team cares about what happens when the system meets the real world - let's talk.