Open to opportunities

Aryaman Malikk

Data Engineer  ·  GenAI Specialist  ·  Cloud Architect

Building scalable data solutions with measurable enterprise impact. Specializing in ETL pipelines, intelligent document processing, and cloud-native architectures.

📍 Gurugram, India ✉ aryamanmalik04@gmail.com 📞 +91 84273 31664
$57M+
Business Impact Delivered
~97%
OCR + GenAI Extraction Accuracy
20K+
Daily Booking Requests Processed

Tech Stack

⚙️

Data Engineering

SQL · PySpark · Databricks · ETL / ELT · Data Architecture
☁️

Cloud & Platforms

Azure · Cloud Workflows · Data Lakes · Cloud Infrastructure
🤖

GenAI & Automation

Generative AI · Prompt Engineering · Document Processing · RPA
📊

Analytics & BI

Power BI · Tableau · DAX · Real-time Dashboards
💻

Software Engineering

C# / .NET · Python · OOP · Version Control
🔬

Specializations

Identity Resolution · Data Cleansing · OCR & ML · Big Data

Professional Experience

BTSA (ETL) — Data Engineer

Current

ZS Associates · Gurugram

May 2025 – Present

  • Engineered an intelligent document processing system using OCR + GenAI, achieving ~97% accuracy in extracting data from emails & attachments for global booking operations
  • Designed & deployed centralized reservation system serving 130+ luxury properties, reducing guest communication response time by ~90%
  • Built scalable cloud-based workflows handling 20,000+ daily booking requests, driving ~$50M in business impact through intelligent customer insights
  • Delivered RPA-driven automation reducing processing time from hours to seconds, generating ~$7M in annual efficiency gains
  • Contributed to enterprise-scale initiatives for a leading global bank & luxury hotel groups, supporting high-value business-critical operations
  • Structured revenue insights & contribution tracking for a major banking client, improving visibility for leadership decision-making

Data Engineer Intern

Intern

Neostats Analytics · Bengaluru

Sept 2024 – Mar 2025

  • Designed & implemented scalable ETL pipelines ingesting data from 10+ channels, ensuring continuous flow & consistency across systems
  • Applied data cleansing & standardization techniques on 2M+ records, significantly improving data quality & reliability
  • Built identity resolution & record linkage logic to deduplicate customer profiles, creating a unified master dataset for analytics
  • Modeled layered data architecture enriched with historical & behavioral attributes supporting downstream analytics & targeted marketing
  • Developed real-time dashboards for the oil & gas domain, enabling live monitoring of operational efficiency
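
As an illustration of the identity resolution & record linkage bullet above, here is a minimal sketch of one common approach: normalize the matching fields, then merge records that share a value using a union-find structure. It is not the production logic; the field names (`id`, `email`, `phone`) and the exact-match rule are assumptions made for the example, and real pipelines usually add fuzzy matching and survivorship rules.

```python
from collections import defaultdict


def normalize(record: dict) -> dict:
    """Lowercase and trim the email; keep digits only for the phone (assumed fields)."""
    return {
        "id": record["id"],
        "email": (record.get("email") or "").strip().lower(),
        "phone": "".join(ch for ch in (record.get("phone") or "") if ch.isdigit()),
    }


def link_records(records: list[dict]) -> list[set]:
    """Group record ids that share a non-empty normalized email or phone."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # (field, value) -> first record id holding that value
    for rec in map(normalize, records):
        for field in ("email", "phone"):
            value = rec[field]
            if not value:
                continue  # empty values must never link records together
            key = (field, value)
            if key in first_seen:
                union(rec["id"], first_seen[key])
            else:
                first_seen[key] = rec["id"]

    clusters = defaultdict(set)
    for rid in parent:
        clusters[find(rid)].add(rid)
    return list(clusters.values())
```

Each returned set is one resolved customer: records linked transitively (A shares an email with B, B shares a phone with C) end up in the same cluster, which is the property a unified master dataset needs.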

Software Engineer Intern

Intern

Becton Dickinson and Co. · Chandigarh

Jul 2023 – Dec 2023

  • Contributed to enhancements in a .NET-based application for device monitoring, improving functionality & supporting business requirements
  • Applied C# & OOP principles to implement features and manage code updates through structured version control practices
🎓
B.Tech in Computer Science
UPES, Dehradun
Sept 2020 – May 2024 · Specialized in Big Data Analytics

GenAI & Prompt Engineering

📄

Document Intelligence

Built an end-to-end intelligent document processing pipeline combining OCR with Generative AI to automate data extraction from unstructured sources such as emails, attachments, and PDFs. Achieved ~97% accuracy in validating and extracting structured data from booking and event opportunities at enterprise scale.

Prompt Engineering

Designed and optimized prompts for GenAI models to extract complex data patterns from unstructured text. Implemented validation logic to ensure data integrity and consistency in enterprise-scale document processing pipelines handling thousands of documents daily.
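
The validation step described above can be pictured with a small sketch. This is an illustrative example under stated assumptions, not the production code: the schema (`guest_name`, `check_in`, `check_out`, `guests`) and the rules are hypothetical, chosen only to show how model output can be checked before it enters a pipeline.

```python
from datetime import date

# Hypothetical schema for a booking record extracted by a GenAI model.
REQUIRED_FIELDS = ("guest_name", "check_in", "check_out", "guests")


def validate_booking(extracted: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    errors = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if extracted.get(field) in (None, "")
    ]
    if errors:
        return errors  # no point checking values when fields are absent

    try:
        check_in = date.fromisoformat(extracted["check_in"])
        check_out = date.fromisoformat(extracted["check_out"])
    except (TypeError, ValueError):
        return ["dates must be ISO formatted (YYYY-MM-DD)"]

    if check_out <= check_in:
        errors.append("check_out must be after check_in")

    guests = extracted["guests"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(guests, bool) or not isinstance(guests, int) or guests < 1:
        errors.append("guests must be a positive integer")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a pipeline log every issue for a document and route it to manual review in one pass.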

🔗

Production AI Integration

Integrated Generative AI into production ETL workflows, enabling seamless centralized processing of large-scale deals while maintaining data quality standards. Reduced manual intervention by automating complex extraction workflows end-to-end.

Projects & Portfolio

Featured

Sales Insights Dashboard

MySQL · Power BI · Analytics

Comprehensive analysis of a large-scale sales dataset. Utilized MySQL for robust data storage and Power BI for interactive dashboards revealing key business insights and performance metrics.

Featured

HR Analytics Dashboard

Power BI · DAX · HR Data

In-depth HR data analysis examining employee location preferences and leave patterns across 100+ employees. Leveraged DAX calculations and Power BI visualizations for actionable HR insights.

Featured

Fake News Detector

Machine Learning · NLP · Classification

ML-powered classifier to identify and flag fake news articles. Demonstrates proficiency in classification algorithms and NLP techniques applied to a real-world misinformation problem.


London Bike Rides

Tableau · Data Viz · Analysis

Interactive Tableau dashboard visualizing bike-sharing patterns correlated with weather conditions and holidays. Integrated multiple data sources for comprehensive transportation insights.


Let's Build Something Together

Interested in data engineering, GenAI solutions, or new opportunities? I'm always open to a good conversation.