Jeremiah Daniel Serenge

Data Engineer | Business Intelligence Developer and MLOps Specialist

Nairobi, Kenya

ABOUT

Data Engineer with 6+ years of experience designing, building, and maintaining scalable data pipelines, analytics platforms, and machine learning workflows in regulated and data-intensive environments. Strong expertise in Python, SQL, ETL processes, big data technologies, and cloud-based data platforms, with hands-on experience supporting BI reporting layers, data governance, and end-to-end ML pipelines. Proven ability to translate business requirements into robust data architectures that enable analytics, automation, and informed decision-making.

SKILLS

ETL / ELT pipeline design and orchestration • Data ingestion from ERP systems (Business Central, SAP) • Batch and near real-time pipelines • Data quality checks, validation & monitoring
Dimensional modeling (facts & dimensions) • Data warehouse & lakehouse concepts • SQL (complex queries, transformations) • Metadata management & documentation
Apache Spark • Hadoop ecosystem • NoSQL databases (MongoDB)
End-to-end ML pipelines (training, validation, deployment) • Feature engineering pipelines • Model versioning & reproducibility • API-based model serving (FastAPI) • CI/CD concepts for data & ML pipelines
Power BI (semantic models, dashboards, UAT) • Executive & operational reporting
Python • SQL • Git • CI/CD • Linux & Windows
SPSS • STATA • Logistic Regression • Random Forest

EXPERIENCE

Data Scientist / Data Engineer

2024-01 - Present

Dataposit Africa • Nairobi, Kenya

• Designed and maintained scalable ETL pipelines integrating ERP systems (Business Central, SAP) into analytics and reporting layers
• Built and optimized Python- and Spark-based data processing pipelines for large datasets
• Developed and maintained data models supporting BI dashboards and downstream analytics use cases
• Implemented data quality checks, validation rules, and monitoring to ensure reliable and auditable data pipelines
• Built end-to-end ML pipelines integrating data ingestion, feature engineering, model training, validation, and API-based deployment
• Deployed production-ready APIs and data services using Python SDKs and cloud infrastructure
• Supported CI/CD workflows using Git and automated deployments for analytics and ML solutions
• Collaborated with business stakeholders to translate requirements into robust data structures and pipelines, especially for BI-related reporting and analysis

Intern, Economic Planning & Development

2023-01 - 2023-04

Kakamega County Government • Kakamega, Kenya

• Conducted field data collection and used SPSS for policy-informing statistical analysis
• Created Excel dashboards to communicate insights to county planners and decision-makers
• Authored comprehensive reports summarizing key trends and actionable recommendations

Freelance Data Analyst

2019-01 - 2023-12

Upwork & Fiverr • Remote

• Developed a credit risk model (Logistic Regression/Random Forest) for a Singaporean micro-finance client, achieving 96% accuracy
• Reduced the client's default rates by 15% in the first quarter post-deployment through improved risk assessment
• Created model documentation and validation reports for regulatory compliance and stakeholder review
• Performed complex statistical analyses using STATA and SPSS for international finance and academic clients
• Built predictive models and visualization dashboards translating complex datasets into clear business insights
• Conducted data mining to discover patterns in large financial datasets, informing strategic decisions

EDUCATION

Moi University

2018-08 - 2023-12

Bachelor of Science in Applied Statistics with Computing