Seeking research and engineering roles in robotic manipulation, robot perception, and physical AI — where the goal is systems that deploy, not just papers that publish.
I'm a PhD candidate in Robotics Engineering at Worcester Polytechnic Institute, advised by Prof. Berk Calli. I work on robot perception, grasp detection, and vision-language systems for manipulation in cluttered and uncertain environments.
My background is in Electronics Engineering from Savitribai Phule Pune University, which gave me a strong foundation in embedded systems and low-level signal processing before I moved into robotics and deep learning at WPI. That path from hardware to software shapes how I think — I care about the full stack, not just the model.
I'm actively looking for roles in robotic manipulation, robot perception, and physical AI at companies building systems that need to work outside a controlled lab. I'm also affiliated with CogniAux, a faculty-founded computer vision startup.
Real-time perception stack for an encoderless Franka Panda. WGAN-GP inpainting reconstructs occluded joints from markerless images; keypoint detection with UKF temporal smoothing feeds Image-Based Visual Servoing (IBVS), enabling stable control under severe occlusion. Outperforms LaMa (11.7 FID, 0.91 precision).
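The IBVS step above follows the classic control law v = -λ L⁺ (s − s*): the pseudo-inverse of the stacked interaction matrix maps image-plane feature error to a camera twist. A minimal numpy sketch for point features, assuming normalized image coordinates and known depths (function and parameter names here are illustrative, not the project's actual code):

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction (image Jacobian) matrix for one normalized point feature
    at depth Z: maps the 6-DOF camera twist to image-plane feature velocity."""
    return np.array([
        [-1 / Z, 0, x / Z, x * y, -(1 + x**2), y],
        [0, -1 / Z, y / Z, 1 + y**2, -x * y, -x],
    ])

def ibvs_velocity(features, targets, depths, gain=0.5):
    """Camera twist v = -gain * L^+ * (s - s*) for a set of point features."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    error = (np.asarray(features, float) - np.asarray(targets, float)).ravel()
    return -gain * np.linalg.pinv(L) @ error
```

With three or more non-degenerate points the stacked Jacobian constrains all six twist components; when the features reach their targets the commanded velocity goes to zero.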
Feature-level fusion across multiple grasp expert networks. Complementarity is quantified via Q-statistics and error correlation, showing that moderately correlated, mid-accuracy ensembles outperform state-of-the-art individual models. Evaluated on Cornell, Jacquard, and GraspNet-1B; validated on a Franka Panda.
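The Q-statistic used to quantify complementarity is Yule's Q over paired per-sample correctness: Q = (N₁₁N₀₀ − N₀₁N₁₀) / (N₁₁N₀₀ + N₀₁N₁₀), where Nₐᵦ counts samples on which expert A is correct (a) and expert B is correct (b). A minimal sketch (illustrative names, not the project's code):

```python
import numpy as np

def q_statistic(correct_a, correct_b):
    """Yule's Q between two experts' per-sample correctness (boolean arrays).
    Q near +1: experts fail together; Q near 0: independent errors;
    Q near -1: complementary errors (the useful case for ensembling)."""
    a = np.asarray(correct_a, bool)
    b = np.asarray(correct_b, bool)
    n11 = np.sum(a & b)      # both correct
    n00 = np.sum(~a & ~b)    # both wrong
    n10 = np.sum(a & ~b)     # only A correct
    n01 = np.sum(~a & b)     # only B correct
    den = n11 * n00 + n01 * n10
    if den == 0:
        return float("nan")  # Q is undefined for degenerate outcome tables
    return (n11 * n00 - n01 * n10) / den
```

Low pairwise Q between experts means their failure modes do not overlap, which is exactly when feature-level fusion can beat the best single model.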
Proposal-conditioned VLA framework for cluttered tabletop grasping with Yale OpenHand gripper. RGB-D Mask R-CNN generates object proposals; confidence-based skill selection chooses the best planner across known, unknown, and occluded multi-object scenarios. Deployed end-to-end on real hardware.
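The confidence-based skill selection above can be reduced to a simple rule: every available planner scores every object proposal, and the highest-confidence (planner, proposal) pair is executed. A toy sketch of that selection logic, with all names and the scoring interface assumed for illustration:

```python
def select_skill(proposals, planners):
    """Return the (planner_name, proposal, confidence) triple with the
    highest confidence across all planner/proposal pairings.

    proposals: list of object proposals (any representation).
    planners:  dict mapping planner name -> scoring function
               (proposal -> confidence in [0, 1]).
    """
    return max(
        ((name, prop, score_fn(prop))
         for name, score_fn in planners.items()
         for prop in proposals),
        key=lambda triple: triple[2],
    )
```

In practice each scoring function would wrap a real grasp planner (e.g. a known-object pose-based planner vs. a model-free one), so the selector naturally routes known, unknown, and occluded objects to the skill best suited to them.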
Real-time fusion of RGB-D, EEG, and audio for human activity tracking and attention estimation. Led a three-person team to full deployment, integrating OpenPose, temporal attention, GMMs, and optical flow into a full-stack GUI, not just a research prototype.
Coursework and independent projects spanning 3D vision, robotics systems, and deep learning.
Enhanced NeRF with improved rendering and sampling strategies for photorealistic novel view synthesis.
Full SfM pipeline with epipolar geometry and bundle adjustment for monocular depth estimation.
3.2% IoU improvement via voxel grid filtering and bird's-eye-view representation with PointNet++.
VIO at 29 FPS via stereo camera and IMU fusion for real-time 6-DOF pose estimation.
RRT-APF hybrid planner for robot navigation in dynamic environments with moving obstacles.
Automated multi-view calibration pipeline with robust homography estimation and distortion correction.
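Among the projects above, the RRT-APF hybrid planner biases sampling-based exploration with local potential-field forces. The classic (Khatib-style) force term, attractive toward the goal plus repulsive inside an obstacle influence radius, can be sketched as follows; all names, gains, and the point-obstacle model are illustrative:

```python
import numpy as np

def apf_force(q, goal, obstacles, k_att=1.0, k_rep=1.0, rho0=1.0):
    """Net artificial-potential-field force at configuration q:
    linear attraction toward the goal, plus a repulsive term from each
    point obstacle closer than the influence radius rho0."""
    q = np.asarray(q, float)
    force = k_att * (np.asarray(goal, float) - q)
    for obs in obstacles:
        d = q - np.asarray(obs, float)
        rho = np.linalg.norm(d)
        if 0 < rho < rho0:
            # Gradient of the standard repulsive potential, pushing away
            # from the obstacle; grows unbounded as rho -> 0.
            force += k_rep * (1 / rho - 1 / rho0) / rho**2 * (d / rho)
    return force
```

In a hybrid planner this force can steer RRT extensions (or locally deform the tree's edges) so the sampler keeps its global completeness while the field handles moving obstacles reactively.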
Actively seeking roles in robotic manipulation, robot perception, and deep learning for physical AI. If your system needs to work in clutter, occlusion, and real deployment conditions — let's talk.