Saisree
Company Background
Urban Apparel is a fast-growing e-commerce fashion retailer with operations across North America. The company processes thousands of transactions daily across multiple sales channels including their website, mobile app, and third-party marketplaces. As the business scaled, their legacy data infrastructure struggled to keep pace with the volume and variety of marketing data needed for effective campaign optimization.
Problem Statement
Marketing data was delivered daily as CSV files and stored in Azure Blob Storage, but the ingestion process was manual: a team member had to download the files and upload them by hand every day. The process was time-consuming, taking several hours each week, and reports were often delayed by late or missed uploads. Manual handling also increased the risk of errors and inconsistent data. These challenges affected data freshness, reporting accuracy, and timely decision-making.
Objectives
• Automate the ingestion of daily CSV files from Azure Blob Storage
• Eliminate manual file handling and reduce human error
• Load clean, reliable data into a OneLake Lakehouse in Delta format
• Ensure data is refreshed daily and ready for Power BI reporting
• Create a scalable and reliable pipeline using Microsoft Fabric Data Factory
Design
• A single Microsoft Fabric Data Factory pipeline, dedicated to ingestion, orchestrates the end-to-end flow
• A Copy Data activity reads the daily CSV files directly from Azure Blob Storage
• The destination is a Fabric Lakehouse table stored in Delta format, so the data is immediately queryable from Power BI and SQL
• Load behavior is Overwrite by default, with the option to switch to Append when required
• Timeout and retry settings on the Copy Data activity guard against transient failures

Execution
Pipeline Overview
A Microsoft Fabric Data Factory pipeline was implemented using a Copy Data activity to ingest data from Azure Blob Storage into a Fabric Lakehouse.
Purpose:
To automate the ingestion of daily CSV files and eliminate manual file uploads.
Design highlights:
- Dedicated pipeline focused solely on ingestion
- Simple and easy-to-maintain structure
- Built using native Microsoft Fabric orchestration capabilities
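To make the flow concrete, the following is a minimal PySpark sketch of what the Copy Data activity performs end to end, assuming a Fabric notebook attached to the Lakehouse. The storage account, container, and table names (urbanapparelstorage, marketing-data, marketing_daily) are illustrative placeholders, not the actual project configuration.

```python
# Minimal PySpark sketch of the ingestion performed by the Copy Data activity.
# Account, container, and table names are illustrative placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

storage_account = "urbanapparelstorage"   # hypothetical storage account
container = "marketing-data"              # hypothetical container

# Grant Spark access to the Blob Storage account (an account key is shown
# for simplicity; a SAS token or service principal works equally well).
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.blob.core.windows.net",
    "<storage-account-key>",
)

# Read the daily CSV drop directly from Blob Storage.
source_path = f"wasbs://{container}@{storage_account}.blob.core.windows.net/daily/*.csv"
df = spark.read.option("header", "true").option("inferSchema", "true").csv(source_path)

# Land the data as a Delta table in the Lakehouse, mirroring the pipeline's
# Overwrite load behaviour.
df.write.format("delta").mode("overwrite").saveAsTable("marketing_daily")
```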
1. Source Configuration – Azure Blob Storage
Implementation:
The pipeline source was configured to read CSV files directly from Azure Blob Storage.
Objective:
To enable automated, scheduled ingestion of daily source files.
Key aspects:
- Direct integration with Azure Blob Storage
- Designed for recurring file ingestion
- Minimal configuration for reliability and maintainability
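As an illustration of what the source connection amounts to, the sketch below uses the azure-storage-blob SDK to enumerate the day's CSV drop. The connection string, container name, and date-based folder prefix are assumptions made for the example.

```python
# Sketch of enumerating the day's CSV files in Azure Blob Storage with the
# azure-storage-blob SDK. Connection string, container, and prefix are
# placeholders, not the project's real values.
from datetime import date
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<blob-storage-connection-string>"
CONTAINER = "marketing-data"   # hypothetical container

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container_client = service.get_container_client(CONTAINER)

# Daily files are assumed to land under a date-based prefix, e.g. daily/2024-06-01/
prefix = f"daily/{date.today().isoformat()}/"

for blob in container_client.list_blobs(name_starts_with=prefix):
    if blob.name.endswith(".csv"):
        print(f"Found source file: {blob.name} ({blob.size} bytes)")
```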
2. Destination Configuration – Fabric Lakehouse
Implementation:
The target destination was set as a Fabric Lakehouse table.
Configuration details:
- Data written to the Tables section of the Lakehouse
- Stored in Delta Lake format
- Explicit table naming for clarity and governance
- Load behavior configured as Overwrite, with flexibility to switch to Append when required
Result:
Data is immediately available in an analytics-ready format, enabling seamless querying via Power BI and SQL.
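The load behavior described above maps directly onto Delta write modes. The sketch below shows both options against an assumed table name (marketing_daily) and an assumed staging path in the Lakehouse Files area.

```python
# Hedged sketch of the two load behaviours against the Lakehouse table.
# The table name and staging path are assumed for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read the staged CSV files (the relative Files/ path assumes a default
# Lakehouse attached to the notebook).
df = spark.read.option("header", "true").csv("Files/daily/")

# Overwrite: replace the table contents with today's full snapshot (default behaviour).
df.write.format("delta").mode("overwrite").saveAsTable("marketing_daily")

# Append: switch to incremental loads when required, with no other changes.
# df.write.format("delta").mode("append").saveAsTable("marketing_daily")
```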
3. Pipeline Execution & Run History
Implementation:
The pipeline was executed multiple times and monitored through the Run History view.
Validation outcomes:
- Successful end-to-end data movement
- No execution failures observed
- Consistent and repeatable ingestion behavior
Each execution confirmed that data was accurately transferred from the source to the Lakehouse destination.
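A lightweight post-run check such as the following sketch (table name assumed) can complement the Run History view by confirming that each execution actually landed fresh rows.

```python
# Post-run sanity check: confirm the Lakehouse table received rows.
# The table name marketing_daily is assumed for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

row_count = spark.table("marketing_daily").count()
print(f"marketing_daily row count after the latest run: {row_count}")

# Delta Lake keeps its own transaction history, so the most recent writes
# can also be inspected directly.
spark.sql("DESCRIBE HISTORY marketing_daily") \
    .select("version", "timestamp", "operation") \
    .show(5, truncate=False)
```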
4. Reliability & Failure Handling
Implementation:
Basic resiliency and fault-tolerance settings were applied to the Copy Data activity.
Configured safeguards:
- Execution timeout to prevent stalled runs
- Retry mechanism enabled
- Defined retry interval for transient failures
These configurations ensure a stable and reliable ingestion process.
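The timeout, retry count, and retry interval configured on the Copy Data activity follow the generic pattern sketched below; the specific values shown are illustrative, not the pipeline's actual settings.

```python
# Generic illustration of the retry/timeout pattern the Copy Data activity
# applies. The retry count, interval, and timeout values are illustrative.
import time

MAX_RETRIES = 3            # retry mechanism enabled
RETRY_INTERVAL_SECS = 30   # defined retry interval for transient failures
TIMEOUT_SECS = 3600        # execution timeout to prevent stalled runs

def run_copy_with_retries(copy_fn):
    """Run copy_fn, retrying transient failures and enforcing an overall timeout."""
    start = time.monotonic()
    for attempt in range(1, MAX_RETRIES + 1):
        if time.monotonic() - start > TIMEOUT_SECS:
            raise TimeoutError("Copy exceeded the configured execution timeout")
        try:
            return copy_fn()
        except Exception as exc:  # transient failure, e.g. throttling or a network blip
            if attempt == MAX_RETRIES:
                raise
            print(f"Attempt {attempt} failed ({exc}); retrying in {RETRY_INTERVAL_SECS}s")
            time.sleep(RETRY_INTERVAL_SECS)
```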

