Making robots work
when the world
gets messy.

I build perception and learning systems for robotics - from grasp detection and occlusion-robust vision to vision-language models, validated end-to-end on physical robots.

Venkatesh Mullur
4+
Years DL &
Robotics Research
3
Publications &
Conferences
3
Robot Platforms
Deployed On
🏊
Swimmer, Soccer player

A little about who I am

I am a second-year PhD student in Robotics Engineering at WPI, working in the Manipulation and Environmental Robotics Lab with Prof. Berk Calli.

Watching Yamaha's Motobot struggle against Valentino Rossi got me thinking about the gap between perceiving an environment and making the right decision in it - in conditions the system wasn't built for. That question has shaped most of my research.

I came to robotics through Electronics Engineering in Pune, India, which means I think about systems from the hardware up. My work spans robot perception, deep learning for manipulation, and vision-language reasoning - all validated on real robots, not just in simulation.

Outside the lab I am a competitive swimmer and a die-hard soccer fan (Glory Glory Man United!). The early-morning training discipline has been useful, especially when a model training run fails at 3 a.m.

I am looking for internship roles in robot perception, computer vision, deep learning, and physical AI - with teams building things that need to work when the situation changes.

What I can contribute from day one
⚙️
Deploy perception pipelines on real robots: Franka Panda, Yale OpenHand, ROS2. I've taken systems from training to hardware execution, not just simulation.
🧠
Train and evaluate deep learning models end-to-end: dataset curation, architecture design, benchmark evaluation, and ablations with PyTorch, CUDA, and HPC clusters.
📐
Design rigorous evaluations and benchmarks: not just best-case numbers. I quantify failure modes, diversity, and complementarity across datasets.
📄
Contribute to both research and engineering: comfortable writing papers and shipping code. I've done both under deadline across three publications.

Selected Projects

Flagship · MER Lab · Franka Panda · GANs · Keypoint Detection · UKF · IBVS
2023 – 2024

Occlusion-Robust Robot Perception Pipeline

Real-time perception stack for an encoderless Franka Panda. WGAN-GP inpainting reconstructs occluded joints from markerless images; keypoint detection with UKF temporal smoothing feeds Image-Based Visual Servoing (IBVS), enabling stable control under severe occlusion. Outperforms LaMa inpainting with 11.7 FID and 0.91 precision.

91.6%
Occlusion accuracy
<2%
Pixel error
26%
Settling time ↓
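The UKF temporal-smoothing stage can be sketched with a minimal constant-velocity filter. This is a linear Kalman stand-in for the UKF, tracking a single pixel coordinate; the noise parameters `q` and `r` are illustrative, not the pipeline's actual values:

```python
# Minimal sketch: constant-velocity Kalman smoothing of one keypoint
# coordinate, a linear stand-in for the UKF used in the pipeline.
# q and r are illustrative process / measurement noise values.

class KeypointSmoother:
    """Tracks [position, velocity] for a single pixel coordinate."""

    def __init__(self, q=1e-3, r=4.0):
        self.x = [0.0, 0.0]                 # state: position, velocity
        self.P = [[1e3, 0.0], [0.0, 1e3]]   # covariance (large = uncertain)
        self.q, self.r = q, r

    def update(self, z, dt=1.0):
        # Predict with the constant-velocity model.
        px = self.x[0] + dt * self.x[1]
        vx = self.x[1]
        P = self.P
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        # Kalman gain for a position-only measurement.
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        y = z - px                           # innovation
        self.x = [px + k0 * y, vx + k1 * y]
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x[0]
```

In the real stack one such filter (or a joint UKF over all joints) smooths each detected keypoint before the coordinates are handed to the IBVS controller.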
Robotic Grasping · Mixture-of-Experts · Feature Fusion · GraspNet-1B · Ensemble Learning
2024 – 2026

Feature-Level Mixture-of-Experts for Grasping

Feature-level fusion across multiple grasp expert networks. Complementarity is quantified via Q-statistics and error correlation, showing that moderately correlated, mid-accuracy ensembles outperform state-of-the-art individual models. Evaluated on Cornell, Jacquard, GraspNet-1B, and a Franka Panda system.

MoE Architecture
13%
Fewer grasp failures
+7%
Over individual models
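The complementarity measure used here is Yule's Q-statistic over paired per-sample correctness: Q near +1 means two experts fail on the same inputs, while Q near 0 or negative means their errors are complementary. A minimal sketch (the correctness lists in any example are made up):

```python
# Yule's Q-statistic between two experts, from their per-sample
# correctness (True = expert got that sample right).

def q_statistic(correct_a, correct_b):
    """Q in [-1, 1]; high Q = correlated errors, low Q = complementary."""
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1          # both correct
        elif a and not b:
            n10 += 1          # only A correct
        elif b and not a:
            n01 += 1          # only B correct
        else:
            n00 += 1          # both wrong
    num = n11 * n00 - n01 * n10
    den = n11 * n00 + n01 * n10
    return num / den if den else 0.0
```

Two experts with identical correctness give Q = 1.0; experts whose errors never overlap give Q = -1.0, the ideal case for an ensemble.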
Physical AI · Active · Yale OpenHand · VLA · 6-DoF Grasping · Dexterous Picking
2026 – Present

Vision-Language-Action Robotic Grasping

Proposal-conditioned VLA framework for cluttered tabletop grasping. An RGB-D Mask R-CNN generates object proposals; confidence-based skill selection chooses the best planner across known, unknown, and occluded multi-object scenarios. Deployed end-to-end with a Yale OpenHand gripper.

VLA Grasping
92%
Grasp success in clutter
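Confidence-based skill selection can be sketched as a dispatcher that asks each planner to score a proposal and routes to the most confident one. The planner names, threshold, and fallback behavior below are hypothetical illustrations, not the deployed system's API:

```python
# Hypothetical sketch of confidence-based skill selection: each grasp
# planner exposes a scoring function, and the pipeline dispatches the
# proposal to the most confident planner. Names and the 0.5 threshold
# are illustrative.

def select_skill(proposal, planners, fallback="push_declutter", min_conf=0.5):
    """Return (planner_name, confidence) for this object proposal."""
    scored = [(name, score(proposal)) for name, score in planners.items()]
    name, conf = max(scored, key=lambda nc: nc[1])
    # If no planner is confident enough, fall back to decluttering.
    return (name, conf) if conf >= min_conf else (fallback, conf)
```

A known-object planner might score by visibility while an unknown-object planner returns a fixed prior; the dispatcher then picks whichever wins for the scene at hand.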
Industry · CogniAux · VLA · Sensor Fusion · Optical Flow · Computer Vision
2024

Multimodal Human Activity Tracking

Real-time fusion of RGB-D, EEG, and audio for human activity tracking and attention estimation. Led a three-person team to full deployment, shipping OpenPose, temporal attention, GMMs, and optical flow in a full-stack GUI rather than a research prototype.

CogniAux demo
+17%
mAP improvement
43%
Error rate ↓
+8%
Attention estimation
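A confidence-weighted late-fusion step over per-modality activity scores gives the flavor of the combination the RGB-D / EEG / audio tracker performs; the modality names and weights below are illustrative, not the product's values:

```python
# Illustrative late fusion: each modality emits per-class activity
# scores, which are combined with per-modality confidence weights.

def fuse_modalities(scores, weights):
    """scores: {modality: {class: score}}; returns (top_label, fused)."""
    fused = {}
    total = sum(weights[m] for m in scores)
    for m, per_class in scores.items():
        for cls, s in per_class.items():
            fused[cls] = fused.get(cls, 0.0) + weights[m] * s / total
    return max(fused, key=fused.get), fused
```

With a high-confidence RGB-D stream and a noisier audio stream, the fused decision follows the vision modality unless audio evidence is strong.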

Research & publications

Under Review IROS 2026
Feature-Level Mixture-of-Experts for Robust Robotic Grasp Detection
IEEE IROS 2026 · Under review
Venkatesh Mullur, Vinayak Kapoor, Prof. Berk Calli
Under Review IJRR 2025
Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators
IJRR 2025 · Under review
Presented CASE 2025
Novel Sweeping Methods for Robotic Rearrangement of Object Piles
IEEE CASE 2025 · Los Angeles · August 2025 · Presenter
Abhijeet Sanjay Rathi, Filip Radil, Hrishikesh Dhairyasheel Pawar, Prof. Berk Calli

More projects

Coursework and independent projects spanning 3D vision, SLAM, and robotics systems.

NeRF
3D Vision

Neural Radiance Fields

Enhanced NeRF with improved rendering and sampling for photorealistic novel view synthesis.

PyTorch · NeRF
GitHub ↗
SfM
3D Reconstruction

Structure from Motion

Full SfM pipeline with epipolar geometry and bundle adjustment for monocular 3D reconstruction.

OpenCV · Bundle Adjustment
GitHub ↗
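The epipolar constraint underpinning the pipeline: for calibrated, normalized image points x1 and x2 related by relative pose (R, t), the essential matrix E = [t]_x R satisfies x2^T E x1 = 0. A pure-Python sketch with a made-up example pose:

```python
# Epipolar constraint check for two calibrated views. The pose in the
# example (identity rotation, sideways translation) is illustrative.

def skew(t):
    """Cross-product matrix [t]_x of a 3-vector t."""
    tx, ty, tz = t
    return [[0, -tz, ty], [tz, 0, -tx], [-ty, tx, 0]]

def matmul(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def epipolar_residual(E, x1, x2):
    """x2^T E x1 for homogeneous normalized image coordinates."""
    Ex1 = [sum(E[i][k] * x1[k] for k in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))
```

In the full pipeline this residual drives RANSAC inlier selection before triangulation and bundle adjustment refine the structure.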
3D Perception

Point Cloud Semantic Segmentation

3.2% IoU improvement via voxel grid filtering and bird's-eye-view projection with PointNet++.

PointNet++ · PCL
GitHub ↗
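The voxel-grid filtering step can be sketched as bucketing points into cubic voxels and keeping one centroid per voxel; the voxel size here is illustrative:

```python
# Voxel-grid downsampling: replace all points that fall into the same
# cubic voxel with their centroid, reducing density before segmentation.

def voxel_downsample(points, voxel=0.1):
    """points: iterable of (x, y, z) tuples -> one centroid per voxel."""
    buckets = {}
    for p in points:
        key = tuple(int(c // voxel) for c in p)   # integer voxel index
        buckets.setdefault(key, []).append(p)
    return [tuple(sum(c) / len(pts) for c in zip(*pts))
            for pts in buckets.values()]
```

Libraries like PCL and Open3D provide the same operation; this shows the underlying idea.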
VIO
SLAM & Odometry

Visual-Inertial Odometry (MSCKF)

VIO at 29 FPS via stereo camera + IMU fusion for real-time 6-DOF pose estimation.

MSCKF · IMU Fusion
GitHub ↗
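Between camera updates, an MSCKF-style filter dead-reckons the IMU state. A toy propagation step for position and velocity only; the real filter also propagates orientation and covariance, and the values in the example are illustrative:

```python
# Toy IMU propagation between camera updates: integrate a
# gravity-compensated world-frame acceleration over one IMU interval.

def imu_propagate(p, v, a_world, dt):
    """Constant-acceleration integration of position and velocity."""
    p_next = tuple(pi + vi * dt + 0.5 * ai * dt * dt
                   for pi, vi, ai in zip(p, v, a_world))
    v_next = tuple(vi + ai * dt for vi, ai in zip(v, a_world))
    return p_next, v_next
```

The stereo camera updates then correct the drift this integration accumulates, which is what lets the full system hold 29 FPS 6-DOF pose estimates.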
Motion Planning
Robotics

Motion Planning in Adversarial Environments

RRT-APF hybrid planner for dynamic environments with moving obstacles.

RRT · APF
GitHub ↗
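The APF component of the hybrid planner follows the negative gradient of attractive-plus-repulsive potentials, while RRT handles global exploration. A 2-D toy sketch; the gains and obstacle influence radius `d0` are illustrative:

```python
# One gradient-descent step in a 2-D artificial potential field:
# attractive pull toward the goal plus repulsion from nearby obstacles.

import math

def apf_step(q, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0, step=0.1):
    """Return the next 2-D configuration after one normalized APF step."""
    fx = k_att * (goal[0] - q[0])           # attractive force toward goal
    fy = k_att * (goal[1] - q[1])
    for ox, oy in obstacles:
        dx, dy = q[0] - ox, q[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < d0:                      # repulsion inside influence radius
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx
            fy += mag * dy
    n = math.hypot(fx, fy) or 1.0
    return (q[0] + step * fx / n, q[1] + step * fy / n)
```

The hybrid uses steps like this for fast local reaction to moving obstacles, falling back to RRT when the field's local minima trap the robot.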
Calibration
Computer Vision

Automatic Camera Calibration

Multi-view calibration with robust homography estimation and distortion correction.

OpenCV · Homography
GitHub ↗
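The pinhole-plus-radial-distortion projection model that calibration fits: normalize by depth, apply radial distortion, then map through the intrinsics. The parameter values in the example are made up:

```python
# Forward camera model used by calibration: calibration searches for
# the intrinsics (fx, fy, cx, cy) and distortion (k1, k2) that make
# these projections match detected pattern corners.

def distort(x, y, k1, k2):
    """Two-term radial distortion of normalized image coordinates."""
    r2 = x * x + y * y
    f = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * f, y * f

def project(pt3d, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3-D camera-frame point to pixel coordinates."""
    X, Y, Z = pt3d
    x, y = X / Z, Y / Z                 # normalize by depth
    x, y = distort(x, y, k1, k2)        # radial distortion
    return fx * x + cx, fy * y + cy     # apply intrinsics
```

Robust homography estimation gives the initial intrinsics; nonlinear refinement then minimizes the reprojection error of this model over all views.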

Technical depth

Robotics
  • ROS1 / ROS2
  • MoveIt! · RViz
  • Gazebo · Isaac Sim
  • Franka Panda
  • Yale OpenHand
  • IBVS Control
  • Visual SLAM
Perception
  • RGB-D · Point Clouds
  • Keypoint Detection
  • Pose Estimation
  • Occlusion Handling
  • OpenCV · Open3D
  • Epipolar Geometry
  • Sensor Fusion
Deep Learning
  • PyTorch · TensorFlow
  • GANs · GNNs
  • Vision-Language Models
  • Mixture-of-Experts
  • Mask R-CNN
  • CUDA · HuggingFace
  • Few-shot Learning
Engineering
  • Python · C/C++
  • Docker · AWS
  • Git · CI/CD
  • Bash · MATLAB
  • HPC Clusters
  • Lidar / Radar
  • Kubernetes

Let's build robots that survive the real world.

Looking for research and engineering internship roles in robot perception, computer vision, deep learning, and physical AI. If your team cares about what happens when the system meets the real world - let's talk.