Shivam Rajput - Data Engineer

About

Analytical Data Engineer with a Bachelor of Technology from IIT (BHU) Varanasi, specializing in building and optimizing scalable data pipelines and distributed data lake architectures. Proven ability to use AWS services, Apache technologies, and Python to deliver measurable efficiency gains, including a 30% reduction in data migration time. Recognized for strong problem-solving skills, evidenced by a top 0.5% rank in JEE Advanced and a 2nd place finish in the Gen AI Hackathon for an AI-powered staffing recommendation system.

Work Experience

Data Engineer

Accordion

Jun 2024 - May 2025

Bengaluru, Karnataka, IN

Engineered and maintained ETL pipelines on AWS Glue, Amazon Redshift, and Amazon S3 for data ingestion, transformation, and loading.

Developed a Python script using boto3 and concurrent.futures to optimize data migration between Amazon S3 buckets, achieving a 30% reduction in transfer time.
Created and optimized stored procedures in Amazon Redshift for complex data operations, improving performance and scalability.

Data Engineer

Physics Wallah

May 2025 - Present

Bengaluru, Karnataka, IN

Lead the development and optimization of data pipelines and data lake architecture for scalable, performant analytics.

Built and scheduled data pipelines using Apache Airflow to ingest data from Google Sheets, REST APIs, and MongoDB into Trino tables, ensuring reliable data availability.
Implemented Debezium and Kafka for real-time change data capture from MongoDB collections, centralizing data into the core platform.
Contributed to an in-house data architecture built on Apache Iceberg, Amazon S3, and Trino, optimizing query performance.
Supported data transformation workflows using Apache Spark for efficient batch processing within a distributed data lake environment.

Education

Bachelor of Technology

Indian Institute of Technology (BHU), Varanasi

Nov 2020 - May 2024

Varanasi, Uttar Pradesh, IN

Projects

Staffing Assistant (Gen AI Hackathon Project)

Nov 2024

Developed an AI-powered staffing recommendation system leveraging NLP and embedding-based similarity search, securing 2nd place in the Gen AI Hackathon.

Product Management System

Aug 2024 - Sep 2024

Developed a microservices-based system for product management and user interactions, focusing on efficient, decoupled services.

Awards

2nd Rank, Accordion Gen AI Hackathon 2024

Accordion

Nov 2024

Awarded for developing an innovative AI-powered staffing recommendation system that leveraged NLP and embedding-based similarity search.

Codeforces Expert (Max rating 1727, Global Rank 675)

Codeforces

Mar 2024

Achieved Codeforces Expert status with a max rating of 1727 and a Global Rank of 675 in Codeforces Round 927, having solved over 350 Data Structures & Algorithms problems.

Top 0.5% Rank, JEE ADVANCED 2020

JEE ADVANCED

Jan 2020

Achieved a top 0.5% ranking among over 1 million candidates in the highly competitive JEE ADVANCED 2020 examination.

Languages

English

Skills

Programming Languages

Python
SQL
C++

Big Data & Data Engineering

Apache Airflow
Apache Spark
Kafka
Apache Iceberg
Debezium
Trino
ETL
Data Pipelines
Data Lake
Distributed Systems
Data Warehousing

Cloud Platforms & Databases

AWS Glue
Amazon Redshift
Amazon S3
boto3
MySQL
FAISS

Web Frameworks & DevOps

Django
Flask
Streamlit
Docker
RabbitMQ

Artificial Intelligence & Machine Learning

NLP
Embedding-based Similarity Search
AI-powered Recommendation Systems