Helped India's Leading Quick Commerce Platform Reduce Average Delivery Time by 30%
About the Customer
Customer is a leading Quick Commerce platform that has disrupted the grocery delivery model in India, offering 10-minute deliveries and transforming the industry. With its e-grocery delivery app, customers can conveniently purchase from a wide range of 2500+ products, including fresh produce, cooking essentials, dairy, and more. Leveraging its technology and optimized delivery centers across ten locations, the platform ensures swift order fulfillment within 10 minutes.
Customer Challenge
- Batch Data Pipeline: The initial data pipelines ran in batch mode, resulting in significant data lag
- Near real-time ETA and Demand-Supply Metrics: Difficulty identifying and incorporating the right driver metrics to accurately calculate the estimated time of arrival (ETA) and manage demand-supply dynamics in near real-time
- Packer Tracking: Difficulty in tracking and monitoring the movement of packers within the warehouse
- Near real-time Inventory Management: Challenges managing warehouse inventory in near real-time, compromising product availability for customer orders
Risks if the above challenges are not addressed
- The delay in data processing adversely affected the timeliness and accuracy of key performance indicators (KPIs), hindering near real-time decision-making and operational efficiency
- The absence of timely and precise metrics hindered effective delivery coordination and efficient resource allocation
- The lack of near real-time visibility into packer activities impeded efficient task allocation, coordination, and performance evaluation, leading to potential delays in order fulfillment
- Difficulty maintaining optimal stock levels, tracking inventory movement, and ensuring seamless order fulfillment
Solution Implemented
- Identify Required Datasets and Tables: Conducted a thorough analysis to determine all the datasets and tables needed to calculate the required metrics, and identified the tables affected by data delays
- Set up CDC-Based Replication with MSK and Debezium Connectors: Utilized Debezium connectors on MSK Connect to establish Change Data Capture (CDC)-based replication from the identified datasets (a connector sketch follows this list)
- Debezium source connectors capture changes from the Aurora PostgreSQL database and publish them to MSK topics, from which S3 sink connectors write the change events to S3
- Incremental data from S3 is loaded into Redshift tables on a per-table schedule, as frequently as every 5 minutes, using Amazon Managed Workflows for Apache Airflow (MWAA); a DAG sketch follows this list
- Scheduled aggregation queries in Redshift to derive stock-level details and resource-allocation metrics (an aggregation sketch follows this list)
- Implemented a data mesh architecture on Redshift to segregate workloads across the customer's teams, so that operations, revenue, and analytics workloads run independently and each team can create its own aggregation tables as required. With around 500 TB of data, this improved both performance and security (a data-sharing sketch follows this list)
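
For reference, the following is a minimal sketch of how the Debezium PostgreSQL source connector could be registered on MSK Connect with boto3. All hostnames, ARNs, credentials, and table names are placeholders, not the customer's actual values, and the property names follow Debezium 1.x (Debezium 2.x renames database.server.name to topic.prefix).

```python
import boto3

# Debezium source connector properties for CDC from Aurora PostgreSQL.
# All values below are illustrative placeholders.
debezium_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "aurora-pg.cluster-example.ap-south-1.rds.amazonaws.com",
    "database.port": "5432",
    "database.user": "cdc_user",
    "database.password": "********",
    "database.dbname": "orders",
    "database.server.name": "aurora-pg",      # topic prefix for change events
    "table.include.list": "public.orders,public.inventory",
    "plugin.name": "pgoutput",                # PostgreSQL's native logical decoding plugin
    "tasks.max": "1",
}

# Register the connector on MSK Connect; cluster details and ARNs are placeholders.
client = boto3.client("kafkaconnect")
client.create_connector(
    connectorName="aurora-pg-cdc-source",
    connectorConfiguration=debezium_config,
    capacity={"provisionedCapacity": {"mcuCount": 1, "workerCount": 1}},
    kafkaCluster={
        "apacheKafkaCluster": {
            "bootstrapServers": "b-1.example-msk.kafka.ap-south-1.amazonaws.com:9098",
            "vpc": {"securityGroups": ["sg-0123456789"], "subnets": ["subnet-0123456789"]},
        }
    },
    kafkaClusterClientAuthentication={"authenticationType": "IAM"},
    kafkaClusterEncryptionInTransit={"encryptionType": "TLS"},
    kafkaConnectVersion="2.7.1",
    plugins=[{"customPlugin": {
        "customPluginArn": "arn:aws:kafkaconnect:ap-south-1:123456789012:custom-plugin/debezium-postgres/example",
        "revision": 1,
    }}],
    serviceExecutionRoleArn="arn:aws:iam::123456789012:role/msk-connect-role",
)
```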
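
The 5-minute incremental load can be expressed as an MWAA DAG. Below is a hedged sketch for a single table using the Amazon provider's S3ToRedshiftOperator; the DAG name, bucket, schema, table, key column, and connection IDs are assumptions for illustration.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

with DAG(
    dag_id="s3_to_redshift_orders_incremental",  # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="*/5 * * * *",             # the minimum 5-minute frequency
    catchup=False,
) as dag:
    # COPY the latest CDC files for one table from S3 into Redshift;
    # UPSERT merges change events into the target on its key column.
    load_orders = S3ToRedshiftOperator(
        task_id="load_orders",
        schema="public",
        table="orders",                          # hypothetical target table
        s3_bucket="cdc-landing-bucket",          # hypothetical CDC landing bucket
        s3_key="topics/aurora-pg.public.orders/",
        redshift_conn_id="redshift_default",
        copy_options=["FORMAT AS JSON 'auto'"],  # assumes JSON change events
        method="UPSERT",
        upsert_keys=["order_id"],                # hypothetical primary key
    )
```

In practice, one such task (or DAG) per replicated table allows each table's load frequency to be tuned independently, as the solution describes.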
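
One way the scheduled aggregations could be issued is through the Redshift Data API. The sketch below is illustrative only; the cluster, database, user, table, and column names are assumptions, not the customer's actual schema.

```python
import boto3

client = boto3.client("redshift-data")

# Aggregate inventory movements into a stock-level summary table.
STOCK_LEVEL_SQL = """
INSERT INTO analytics.stock_levels_agg
SELECT warehouse_id,
       sku,
       SUM(quantity_delta) AS stock_on_hand,
       GETDATE()           AS computed_at
FROM public.inventory_movements
GROUP BY warehouse_id, sku;
"""

response = client.execute_statement(
    ClusterIdentifier="analytics-cluster",  # hypothetical cluster name
    Database="dev",
    DbUser="etl_user",
    Sql=STOCK_LEVEL_SQL,
)
print(response["Id"])  # statement id, pollable via describe_statement
```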
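
A common way to realize workload segregation on Redshift is data sharing between a producer and per-team consumers. The following hedged sketch grants one hypothetical team read access to an operations schema; the share, schema, cluster, and namespace GUID are placeholders.

```python
import boto3

client = boto3.client("redshift-data")

# Share the operations schema from the producer to one team's consumer,
# so the team can query it and build its own aggregation tables.
DATA_MESH_DDL = [
    "CREATE DATASHARE operations_share;",
    "ALTER DATASHARE operations_share ADD SCHEMA operations;",
    "ALTER DATASHARE operations_share ADD ALL TABLES IN SCHEMA operations;",
    "GRANT USAGE ON DATASHARE operations_share TO NAMESPACE 'consumer-namespace-guid';",
]

for statement in DATA_MESH_DDL:
    client.execute_statement(
        ClusterIdentifier="producer-cluster",  # hypothetical producer cluster
        Database="dev",
        DbUser="admin",
        Sql=statement,
    )
```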
Results and Benefits
- Enabled near real-time analytics by reducing data latency from 1 day to 15 minutes
- 5x improvement in overall operational process efficiency
- Improved driver availability metrics by 30%
- Cut ETL tooling costs by 40% by implementing a custom in-house ETL framework leveraging Airflow and Redshift