leonidlupko

Leonid Lupko

@leonidlupko

Cloud Data Engineer, BigQuery, Snowflake, dbt, Python, ETL

Ukraine
English, Ukrainian
About me
Senior Data Engineer specializing in scalable cloud data platforms and production-ready ETL/ELT pipelines. I help businesses build reliable data solutions using AWS, BigQuery, Snowflake, dbt, Python, and SQL.

Experience includes:
- API integrations
- automated data pipelines
- large-scale web scraping
- data warehouse design
- Power BI backend optimization
- serverless architectures
- incremental processing
- data quality and monitoring

Focused on reliability, scalability, cost optimization, and clean architecture for long-term maintainability....

Skills

Average response time: 1 hour

My services

Data Engineering
I will automate API ingestion into BigQuery with Python
Data ETL
I will fix and optimize your dbt pipelines and SQL models

Portfolio

Work experience

Self-Employed

High-Load Web Scraping Platform (AWS)

Self-Employed • Freelance

Jan 2025 - Present • 1 yr 4 mos

Designed and implemented a scalable web scraping platform on serverless AWS infrastructure. It currently runs in production for price monitoring of 120,000 SKUs on each of 6 websites (0.72M SKUs in total), with reliable change tracking and stable daily execution. The system uses curl_cffi for high-performance requests and integrates with Bright Data to bypass anti-bot protections (Cloudflare, Akamai, DataDome).

Architecture:
- Distributed workers (AWS Lambda / ECS) with SQS queues
- S3-based data lake (raw → normalized → curated)
- Parquet + partitioning
- SQL analytics via Amazon Athena

Scalability:
- 🚀 Designed to scale up to 5M+ pages/day
- Horizontal scaling via queue-based architecture
- Ready for TB-scale datasets

Results:
- ⚡ 200–800 ms average request latency
- 💰 60–85% cost reduction vs browser-based scraping
- 📦 Efficient data pipeline with optimized storage
- 🔍 Athena queries in 2–10 seconds
- 📉 $0.01–$0.20 per query
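
The change tracking and partitioned data lake layout mentioned above could look roughly like the sketch below. This is an illustration only, not the platform's actual code: the function names (`partition_key`, `record_hash`, `detect_changes`), the tracked fields (`price`, `currency`, `in_stock`), and the `site=` / `dt=` partition columns are all assumptions made for the example. It uses only the Python standard library and leaves out the S3, SQS, and Parquet parts.

```python
import hashlib
import json
from datetime import date


def partition_key(layer: str, site: str, day: date, filename: str) -> str:
    """Build a Hive-style object key (hypothetical layout).

    Keys shaped like `site=.../dt=...` let a query engine such as Athena
    skip partitions that a query does not touch.
    """
    return f"{layer}/site={site}/dt={day.isoformat()}/{filename}"


def record_hash(record: dict, fields=("price", "currency", "in_stock")) -> str:
    """Hash only the tracked fields (assumed names), ignoring everything else."""
    payload = json.dumps({f: record.get(f) for f in fields}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: list) -> list:
    """Return the records whose tracked fields differ from the prior run.

    `previous` maps SKU -> hash from the last run and is updated in place,
    so it can be reused as the baseline for the next run.
    """
    changed = []
    for record in current:
        digest = record_hash(record)
        if previous.get(record["sku"]) != digest:
            changed.append(record)
            previous[record["sku"]] = digest
    return changed


if __name__ == "__main__":
    seen = {}
    run_1 = [{"sku": "A1", "price": 9.99, "currency": "USD", "in_stock": True}]
    run_2 = [{"sku": "A1", "price": 8.49, "currency": "USD", "in_stock": True}]

    print(len(detect_changes(seen, run_1)))  # first sighting counts as a change
    print(len(detect_changes(seen, run_1)))  # identical data, nothing reported
    print(len(detect_changes(seen, run_2)))  # price moved, reported again
    print(partition_key("raw", "shop-a", date(2025, 1, 15), "part-0.parquet"))
```

Hashing only the tracked fields keeps cosmetic page changes from showing up as price changes, and storing one hash per SKU keeps the baseline small even at hundreds of thousands of SKUs.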