SUM, AVG, COUNT)CASE WHEN)GROUP BY)CAST, CONCAT)A retail grocery chain wanted to gain insights into its sales performance, customer preferences, and outlet efficiency. The raw sales dataset contained information on product types, fat content, outlet characteristics, sales, and ratings, but it required cleaning and structured analysis to extract actionable insights.
==> The main objectives of the project were:
Data Cleaning – Standardize product categories for accurate analysis.
Sales Analysis – Calculate total and average sales, identify top-selling items, and understand revenue patterns.
Product Analysis – Examine product diversity, average ratings, and performance by item type and fat content.
Outlet Analysis – Evaluate sales by outlet characteristics (size, type, location, establishment year) and identify which factors drive revenue.
Customer Insights – Understand preferences for “Low Fat” vs. “Regular” products across different outlets.
==> The project involved a step-by-step approach using SQL queries:
1. Data Cleaning:
Standardized the Item_Fat_Content field to "Low Fat" and "Regular" for consistency.
Ensured the dataset was ready for analysis by replacing inconsistent labels like "LF" and "reg".
2. Total Sales and Revenue Analysis:
Calculated total sales and average sales per transaction.
Identified the total number of different items sold and computed average product ratings.
3. Metrics by Product Categories:
Aggregated sales metrics by Item Fat Content and Item Type, including total sales, average sales, and transaction count.
Highlighted high-performing categories for better inventory and marketing planning.
4. Outlet-Level Analysis:
Evaluated sales by outlet size, computing the percentage contribution to overall revenue.
Analyzed sales by outlet location type (Urban, Semi-Urban, Rural) and outlet type.
Cross-analyzed item fat content and outlet location to understand customer preferences in different areas.
Assessed sales performance by outlet establishment year, combining total and average sales with customer ratings.
==> the following major outcomes were generated
Data Cleaning: Standardized fat content categories, ensuring accurate reporting and analysis.
Sales Insights: Total sales and average transaction values quantified overall revenue, while product-level metrics highlighted top-selling categories.
Product Performance: Identified popular product types and customer satisfaction trends through average ratings.
Outlet Insights:
Larger outlets and urban locations contributed most to revenue.
Low Fat products were preferred in certain areas, while Regular items dominated elsewhere.
Outlet establishment year showed trends in revenue generation and customer ratings.
Actionable Outcomes: Provided recommendations for inventory management, marketing focus, outlet expansion, and customer targeting, enabling the grocery chain to make data-driven strategic decisions.
Skills & Tools:
SQL: Advanced querying, aggregation, conditional logic, and CTEs
Data Cleaning: Standardizing inconsistent data entries
Analytical Thinking: Translating raw data into actionable business insights
Reporting: Presenting metrics in a clear, interpretable format
Objective:
To clean and standardize inconsistent entries in the Item_Fat_Content column of the groceryd dataset in order to ensure data consistency and accuracy in subsequent analyses.
Description:
The raw dataset contained variations in the Item_Fat_Content field, such as "LF" and "reg", which represent "Low Fat" and "Regular" respectively.
These inconsistencies could lead to duplicate categories and incorrect aggregations when analyzing sales or performance by fat content.
Purpose & Outcome:
All inconsistent values were successfully cleaned and standardized.
The dataset is now ready for reliable KPI computation and segmentation analysis by fat content.
Improved data integrity ensures that insights derived from this column (e.g., sales by fat content) are accurate and meaningful.
SQL Query:
Objective:
To compute the overall revenue and average revenue per transaction from the grocery dataset, providing a clear picture of the business’s financial performance at both macro and micro levels.
Description:
This task involves calculating two key KPIs:
Total Sales — Sum of all Total_Sales values across the dataset, converted to millions for better readability.
Average Sales per Transaction — Mean of all Total_Sales values, reflecting the typical revenue generated by a single sale.
Purpose & Outcome:
Purpose:
To measure total revenue as a high-level KPI reflecting overall business size.
To measure average revenue per sale, providing insight into typical transaction value.
Outcome:
Total Sales: Identified the company’s revenue scale in millions.
Average Sales: Revealed the typical revenue generated per customer transaction.
These metrics form the foundation for further analysis, such as segmentation by product type, outlet, or fat content, and enable strategic decision-making for sales optimization.
SQL Query:
Objective:
To analyze the variety of items sold and determine the average customer rating for products, providing insights into product diversity and overall customer satisfaction.
Description:
This task focuses on two aspects of the grocery dataset:
Total Number of Different Items Sold by Item Type — Counting how many sales were recorded for each type of product to understand product distribution and popularity.
Average Rating of Items Sold — Calculating the mean rating to gauge overall customer satisfaction and product performance.
Purpose & Outcome:
Purpose:
To quantify the product diversity in terms of sales and understand which product types contribute most to revenue.
To assess overall customer satisfaction by calculating the average rating.
Outcome:
Item Count: Revealed the number of sales per product type, highlighting the most and least sold categories.
Average Rating: Provided a summary measure of customer satisfaction across all items.
These insights help in inventory planning, marketing focus, and quality assessment for products.
SQL Query:
Objective:
To analyze sales performance by product fat content and item type, enabling insights into which categories generate the most revenue and identifying patterns in customer purchases.
Description:
This task calculates multiple metrics for each product category:
Metrics by Fat Content — Summarizes total sales, average sales per transaction, and total number of sales for "Low Fat" and "Regular" items.
Metrics by Item Type — Performs a similar aggregation for each product type (e.g., beverages, snacks, dairy), showing total revenue, average sale, and number of transactions.
Purpose & Outcome:
Purpose:
To understand how different product categories perform in terms of total and average sales.
To identify which fat content and item types drive the most revenue and transactions.
Outcome:
By Fat Content: Highlighted whether “Low Fat” or “Regular” products contributed more to total sales.
By Item Type: Showed top-selling product categories and their average sales per transaction.
Provided actionable insights for inventory management, marketing focus, and product promotion strategies.
SQL Query:
Objective:
To analyze sales performance across outlets based on size, location, type, and product fat content, providing insights into which outlet characteristics drive revenue and customer behavior.
Description:
This task includes several analyses to understand sales patterns at the outlet level:
Outlet Size Sales Percentage — Calculates the contribution of each outlet size (Small, Medium, High) to total sales.
Metrics by Outlet Location Type — Computes total sales, average sales, and number of transactions for each location type (Urban, Semi-Urban, Rural).
Metrics by Outlet Type — Summarizes total and average sales, transaction count, and average product rating for each outlet type.
Sales by Outlet Location and Fat Content — Determines sales breakdown of “Low Fat” vs. “Regular” products for each outlet location type.
Purpose & Outcome:
Purpose:
To evaluate how outlet characteristics (size, location, type) affect sales.
To understand which product types perform best in specific outlet locations.
Outcome:
Outlet Size Analysis: Identified the contribution of each outlet size to total sales.
Location & Type Analysis: Highlighted high-performing locations and outlet types.
Fat Content by Location: Revealed customer preference patterns for Low Fat vs. Regular products by area.
Provided actionable insights for store expansion, marketing campaigns, and inventory allocation.
SQL Query:
Objective:
To analyze sales performance and customer ratings based on the year outlets were established, providing insights into how outlet age influences revenue and customer satisfaction.
Description:
This task involves calculating several key metrics grouped by Outlet Establishment Year:
Total Sales by Year — Aggregates the revenue generated by all outlets established in a specific year.
Average Sales per Transaction — Calculates the mean revenue per transaction for outlets of each establishment year.
Total Number of Sales — Counts the number of sales transactions for each outlet year.
Average Customer Rating — Determines the mean rating of products sold by outlets established in each year.
Purpose & Outcome:
Purpose:
To assess whether older or newer outlets generate more revenue.
To analyze the relationship between outlet age and customer satisfaction.
Outcome:
Sales Metrics: Revealed trends in revenue generation based on outlet age.
Customer Insights: Showed how average ratings vary across outlets of different establishment years.
Enabled data-driven decisions for outlet expansion, resource allocation, and marketing strategies.
SQL Query:
This project successfully demonstrated the power of data-driven decision making in the retail grocery sector. By cleaning and analyzing the dataset, we were able to uncover key insights about product performance, customer preferences, and outlet efficiency. The analysis revealed patterns in sales by item type, fat content, outlet size, location, and establishment year, providing actionable recommendations for inventory planning, marketing strategies, and outlet expansion.
Overall, this project highlights how structured data analysis can transform raw transactional data into strategic business intelligence, enabling better revenue optimization and improved customer satisfaction. It reinforced the importance of data cleaning, aggregation, and insightful reporting as core skills in the field of data analytics.
Emmanuel Olorunda
Copyright © 2025 | All Rights Reserved
Every project starts with a chat, Emmanuel is keen to conversation and open to discussing your project. he will pull in the right expertise when needed