
Archana S
Data Engineer
Skills



Professional experience
Data Engineer
Deloitte • Full-time
Mar 2026 - Present • 3 mos
Core Responsibility: Design, develop, and optimize scalable Databricks-based data engineering solutions for enterprise-wide analytics platforms, enabling real-time data integration, transformation, and consumption across multiple business functions.
Project: EDH Core Data Platform
• Developed and maintained scalable data ingestion and transformation pipelines using Databricks, PySpark, and SQL for enterprise-wide analytics initiatives.
• Implemented data transformation and validation processes across Bronze, Silver, and Gold layers to ensure data quality and consistency.
• Optimized PySpark and SQL workloads, improving pipeline performance and supporting growing data volumes.
• Collaborated with business stakeholders to translate requirements into scalable data engineering solutions.
• Supported data governance, metadata management, and monitoring processes to enhance platform reliability and operational efficiency.
• Contributed to cloud-based data modernization initiatives, enabling self-service analytics and data-driven decision-making across the organization.
Data Engineer
Technologies • Full-time
Jul 2022 - Feb 2026 • 3 yrs 7 mos
Core Responsibility: Lead the architecture and development of cloud-native data platforms, migrating legacy systems to Databricks and optimizing pipelines for high-volume data ingestion.
Project 1: Telecom Data Lakehouse & Revenue Analytics (Databricks/PySpark)
• Architected a centralized Data Lakehouse solution on Databricks, ingesting semi-structured Call Detail Records (CDRs) and telemetry data to support critical revenue reporting.
• Engineered highly optimized PySpark pipelines to process over 250 GB of streaming and batch data daily, reducing data latency from 24 hours to near real-time.
• Implemented Delta Lake architecture (Bronze/Silver/Gold layers) to standardize raw telecom data, ensuring ACID transactions and schema enforcement across the data lifecycle.
• Optimized complex SQL transformation logic within the Silver layer, reducing query execution time.
• Developed automated Python scripts to validate data integrity between source systems and the Data Lake, identifying and resolving anomalies in a 5 TB historical dataset.
Project 2: Intelligent Data Agents & Quality Framework (Python/Cloud)
• Designed the backend data infrastructure for AI-driven Data Agents, creating robust pipelines that aggregate unstructured data from diverse sources into a structured Knowledge Base.
• Built a comprehensive Data Quality Framework using Python, automating schema validation and anomaly detection, which increased the reliability of analytical outputs by 20%.
• Modeled complex business entities to support context-aware querying, ensuring the agents can scale to handle increasing request loads.
• Collaborated with data scientists to translate business logic into efficient data access layers, accelerating the deployment of intelligent agents.
Project 3: Campaign Management System & Migration
• Led the end-to-end migration of a campaign management database to the cloud, writing custom PySpark scripts to transfer and transform 300 GB of data with zero data loss.