Advanced Analytics Database For Increasing client sales

Project to leverage Enterprise Analytics tools to help our client, shoe manufacturer, understand their customer segments better. It involves brainstorming process in creation of a Star Schema based database design.

Project Introduction

You have been consulted by a major footwear manufacturer, say, adidas or Nike, to provide advanced analytics to help advance their business. As with most businesses, the ultimate goal is to increase sales (profits) and decrease costs, and there are of course several ways to achieve the goal. After certain sessions with your client, you understand that as a manufacturer, they do not know a lot about their consumers, but they might be missing important opportunities for not knowing enough about their customers. Therefore, they are hoping, through data analytics, to learn more about their customer profiles, to understand where they stand with other competitors or potential competitors, to tap on opportunities, to be able to predict future trends.

Problem Identification: First Step was identifying what the problem was

Based on the given data regarding our client, we have identified that it would be most beneficial for our client to have a deeper understanding of their customer base. We plan to design a database to help our client understand their customer segments better by addressing the following key questions:

● Who is Joy Shoes’ customer?

● Where do these customers come from?

● What are the customers’ shopping preferences? (Place, Product, Price)


These key questions would help our client with customer segmentation-- which would enable the marketing team of the shoe manufacturers to draft a promotion strategy catering to different customer segments.

We are further granularizing these questions with sub questions related to these central questions regarding customers; this will aid further comparative analysis on customer shopping behavior, sales trends, formulation of price-product mix, etc.

A list of Sub Questions was asked to accurately represent the overall data required by the client

The sub-questions have been divided into the following categories:

Location Sub-Questions:

● What proportion of customers shop online vs in store?

● Customers shopping in stores by geography (N, E, W, S)?

Sales Sub-Questions:

● What proportion of sales do we do online vs in store?

● Instore sales by geography (N, E, W, S)? Comparing this to the store locations

● What are the most/least popular product categories online and offline?

● Average sales on holidays?

● Comparing sales by states, analyzing how tax in each state could affect sales?

Promotion Sub-Questions:

● What is the best time to run a promotion campaign?

● What products to promote by product lifecycle to maximize revenue?

● Whether the products under promotion showed a gain in sales but other products showed a corresponding sales decrease (cannibalization)?

Rival Sub-Questions:

● Comparison of rival sales trend vs client’s sales trend by product subcategory

● Comparison between average product price at product subcategory

Customer Sub-Questions:

● Average spending per customer by geographical region and store type

● Customer shopping trends by time of year

● Most popular product category by customer age

● Total number of returning customers per

A Close up look at the Star Schema

FACT

DIMENSIONS

A FACT DIMENSION CUBE

The fact dimension cube would look something like this query:

with FactsDimension as

(

select * from SalesFacts a

inner join Store_D b on a.StoreID=b.StoreID

inner join Customer_D c on a.CustomerID=c.CustomerID

inner join Product_D d on a.ProductID = d.ProductID

inner join Promotions_D e on a.PromotionID=e.PromotionID

inner join Date_D f on a.DateID=f.DateID

)

Benefits to the Client ?

Using this database system the client would be able to use simple queries to quickly gather insights about the following questions ? I have also attached the sample queries for pulling out these request from the above database:

Query1: What proportion of customers shop online vs in store?

select distinct a. StoreType,count(distinct a.StoreID) /(select count(distinct b.StoreID) as Ratio% from FactsDimension b) from FactsDimension a group by a. StoreType;

Query 2: Customers shopping in stores by geography (N, E, W, S)?

select distinct a.StoreRegion,count(distinct a.CustomerID) as NumOfCustomer from FactsDimension a

where a.StoreType=’0’ group by a.StoreRegion;

Query3: What proportion of sales do we do online vs in store?

select(select sum(SalesQuantity)from FactsDimension where Storetype = 1;)/sum(SalesQuantity) from FactsDimension where Storetype = 0;

Query4: Instore sales by geography (N, E, W, S)? Comparing this to the store locations

select distinct a.StoreRegion,sum(a.SalesQuantity) / (select sum(a.SalesQuantity) from FactsDimension) as RevenueProportion from FactsDimension group by a.StoreRegion;

Query5: Proportion of sales by store type by product category:

select distinct a.ProductCategory,sum(a.Quantity)/(select sum(a.Quantity) from FactsDimension) from FactsDimension a group by a.ProductCategory

Query6: What are the most popular product categories online and offline?

select

( select count(ProductCategory) from FactsDimension where StoreType = ‘Online’ )as `count online`,

(

select count(ProductCategory) from FactsDimension where StoreType = ‘Offline’ )as `count offline` from FactsDimension;

Query7: Comparison of rival sales trend vs client’s sales trend by product subcategory

-- Assuming Rival BrandName = ‘Adidas’

select

(f.ProductCategory) `Product Category`, (select sum(SalesQuantity) from FactsDimensions) as `Our Sales`, (select sum(SalesQuantity) from FactsDimensions where BrandName = ‘Adidas’) as ‘Rival Sales’

From FactsDimensions group by `Product Category`;

Query8: What is the best time to run a promotion campaign?

select b.Year,b.Month,(b.CurrentQty/b.PreviousQty-1) as RollingGR

(

select distinct a.Year,a.Month,sum(a.Quantity) as CurrentQty,lag(CurrentQty,1) over(order by a.Year,a.Month asc) as PreviousQty from FactsDimension a group by a.Year,a.Month

) b ;

When the monthly sales have decreased for two consecutive months, then we have assumed the product to be ready to be listed under promoted products, therefore, to maximize the sales of products whose monthly sales are decreasing we advise our client to run a promotion campaign.

Query9: What products to promote by product lifecycle to maximize revenue?

select distinct a.Year,a.Month,a.ProductName, sum(a.Quantity) as MonthlyQuantity from FactsDimension a group by a.Year,a.Month,a.ProductName

order by a.Year,a.Month,a.ProductName;

This query result requires further graphical analysis through data visualization tools like tableau.

Query10: Whether the products under promotion showed a gain in sales but other products showed a corresponding sales decrease (cannibalization)?

select distinct a.ProductCategory,a.Product,a.Year,a.Month,sum(a.SalesQuantity) as MonthlyQty from FactsDimension a where a.PromotionCode=’CIS505’ and a.ProductCategory=’Sport Shoes’

group by a.ProductCategory,a.Product,a.Year,a.Month

order by a.ProductCategory,a.Product,a.Year,a.Month;


This project was a team based effort to create a Database system to facilitate Enterprise Analytics