My professional experience spans a range of responsibilities and workplaces. I have 4+ years of experience across healthcare analytics, consumer technology, marketplaces, financial technology, and enterprise software.

At a Glance



Data Scientist Intern

Mathematica

  • Multivariate data modeling and causal inference for healthcare data
  • Random Forest and Decision Tree-based methods for estimate prediction using pruned estimators
  • Dimensionality reduction and explainable variable selection using Sparse PCA (PCA with lasso penalty)
  • Framework for database creation in Redshift from multi-source raw data using Pandas, Boto3, and Step Functions
  • Modeling standalone and crosswalk datasets in Redshift and RDS for a medical insurance pricing database

Deep Learning Researcher (part-time)

Adobe Research

  • Learning optimal image compression algorithms for computer vision pipelines
  • Generalized latent representations of salient features for object detection, segmentation, and classification
  • Latency and memory reduction using an autoencoder-based approach with information from pretrained hyperpriors
  • GPU training and testing with modular code using PyTorch Lightning

Software Engineer II

Uber

  • Batch analytics at scale (100PB+) using Spark and Hive query engines on HDFS and S3
  • Real-time streaming pipelines and analytics using Apache Flink engine on Kafka topics
  • Query performance and resource optimizations using Spark and SQL best practices
  • Python and Java application development to support monitoring and governance use cases
  • Writing PRDs and proposal documentation for adding new features and integrating data resources from new acquisitions
  • Data modeling for efficient querying, fault tolerance, and memory optimization using advanced SQL and design patterns

Software Engineer

SAP Labs

  • Workflow orchestration and CI/CD using Git and Jenkins
  • Infrastructure provisioning, configuration management, and deployments using Terraform, Ansible, and Chef
  • Analytics data lake and warehouse solutions using Hadoop HDFS, Spark, Hive, and Elasticsearch
  • Automated testing suite for application features using Java and Selenium
  • Health, server, and API traffic monitoring using Splunk, Zabbix, Kibana, and Grafana


More details in the resume