3 years designing & optimizing scalable data pipelines and analytics solutions. Currently at Procter & Gamble, building ETL in Python / PySpark / Databricks across the Bronze and Silver layers, shipping Power BI dashboards stakeholders actually open, and enforcing data-quality checks on production pipelines.
I'm an electronic engineer who became a data engineer. Five years as a teaching assistant at PUCP, an exchange semester in Vigo, hardware roles building drones, air-quality monitors and Peru's first oxygen concentrator — then an MSc in Big Data & AI at the University of Barcelona to pivot into data.
Most of what I do now is unglamorous on purpose: reading other people's PySpark, making it faster, and writing the data-quality checks they wish they'd had. I prefer pipelines that recover quietly to pipelines that look impressive in a slide deck.
Friendliest with Databricks / PySpark / Power BI, comfortable across the rest of the modern data stack, and quick to pick up the next tool when the work calls for it.
Untangled a PySpark codebase nobody wanted to touch — narrower DataFrames, fewer shuffles, cleaner partitioning. Diffs the team could actually review. Runtime dropped from several hours to about one.
Two dashboards tracking pipeline health and business KPIs. Modeled the semantic layer, owned the data contracts, set up alerting.
Selenium + Sheets/BigQuery/APIs glue and tiny automations that added up to real hours saved for the sales-ops team.
Full e-commerce site for my mom's artisanal jewelry brand. 77-piece catalog with category filters, a live customizer for real-time engraving preview, and a smart cart that auto-generates WhatsApp orders. Built an offline AI chatbot ("Mona") handling 12+ customer intents — materials, pricing, shipping — zero external APIs. GSAP scroll animations & 3D card tilt. Deployed on Cloudflare Pages. Not data work — proof I ship.
Wrapper UI on top of an LLM — prompt scaffolding, response formatting, light persistence.
Generator that turns an exported chat into a visual word cloud — with stopword cleanup & emoji handling.
NLP notebook scoring sentiment across WhatsApp chats over time — who's the optimist?
ML regression to predict used-car prices from listing features — baseline + feature engineering iterations.
Web scraper analysing shared-bedroom listing prices across neighbourhoods — a personal house-hunting tool that escaped.
5 years explaining electronics at PUCP. Learned how to translate dense things into "I get it now."
3 years on Peru's national team. Recognized as a qualified athlete. I bring team-sport rhythm to standups.
Elected to the University Assembly at PUCP. Also swam on the university swim team — early lessons in showing up.
Coached swimming and water polo. A good coach asks better questions than a junior data engineer.
Open to Data Engineer roles in Madrid or fully-remote across the EU. Email is fastest — happy to share a case-study deep-dive if you want one.
davidwp37@gmail.com