Analysis of Footwear Purchasing Pattern Among Students in Udaipur Using Machine Learning with One Hot Encoding
Dr. J.Shreemali
Professor,
Management Studies,
GITS, Udaipur
Dhruv Paneri
B.Tech (CSE) Student,
GITS, Udaipur
Himanshi Rangwani
B.Tech (CSE) Student,
GITS, Udaipur
Dimple Kunwar Rao
B.Tech (CSE) Student,
GITS, Udaipur
Garv Parihar
B.Tech (CSE) Student,
GITS, Udaipur
Abstract:
Footwear, traditionally treated as a utility item, has emerged as an important component of a person’s attire. The demand for footwear is projected to grow at a very high rate (estimated at over 10 percent) for several years to come, with convenient and cozy footwear contributing to this rise in a big way [2]. The study focused on identifying the main factors that drive footwear purchase among students. Based on a pilot study, a structured questionnaire was prepared and used for collecting data on various factors that enhance the propensity to purchase a pair of footwear. A total of ninety-four responses were received based on purposive sampling and subjected to statistical analysis. Factor analysis was carried out to ascertain the existence of underlying factors. The data collected was coded using one hot encoding and put through a machine learning model based on the ‘relu’ activation function and the ‘adam’ optimizer, with ‘mean square error’ being used as the loss function. The model was trained for fifty thousand epochs. The key factors that explain about 70 percent variation in annual spend are: (i) Number of pairs owned presently; (ii) Frequency of purchase; (iii) Number of cars at home (as an indication of the economic status of the person); (iv) Reasonable price (includes discounts, sale etc.), and (v) Frequency of social events a person attends.
Key Words: Questionnaire, Sampling, Machine Learning, One-hot-encoding, Relu activation
Introduction:
Footwear is among the basic needs of all mankind and, with growing affluence in society, consumer taste in footwear has changed too, leading to a change in market dynamics all the way from production to distribution. The market for footwear includes the wide range of shoes designed and used for a wide range of activities. This usage could be routine or casual use, formal or athletic use, as well as safety boots for specific purposes. The actual manufacturing process and material used depend on their specific function and purpose. Thus, shoes may be made of leather, rubber, textile/cloth, plastic or wood (non-conducting and strong). Consumers of shoes include the entire human population, so the potential market covers all men, women and children, each forming a segment of prospective customers. In this respect, the shoe market is like the apparel market and is similarly impacted heavily by evolving consumer preferences as well as by consumers' extensive use of online purchasing, sometimes in preference to brick-and-mortar stores. The most recent spike in the footwear market came after the Covid-19 pandemic, with revenue growth showing a CAGR of about 18 percent in the four-year period from 2021 to 2025 (2021 hit the lowest point in the recent past due to the Covid-19 impact). In terms of products of choice, sandals fetch the highest revenue followed by boots and sneakers [11].
Choudary International Pvt. Limited (2025) estimated that the global footwear market will reach USD 588.22 Billion by 2030 from USD 438.62 Billion in 2023, with the growth rate from 2024 to 2030 being 4.3 percent. Total worldwide production in 2022 was over 24 Billion pairs, with customer preferences being driven by athleisure, sustainability and customization, and their choices also encompassing luxury and premium brands. India and China play an important part in Asia-Pacific, making up about 40 percent of global sales as well as production of footwear. North America and Europe saw a rise in demand for high-end and sustainable footwear that, in turn, caused a move towards premium materials. The Middle East and Latin America witnessed increased demand for casual footwear as well as sportswear and may emerge as another important destination for footwear producers. As far as India is concerned, the market is expected to grow to USD 46 Billion by 2033 from USD 18.8 Billion in 2024, with the CAGR being a whopping 10.1 percent. However, this growth may require addressing challenges like the need for higher automation and top quality leather [4]. Estimates of actual market size as well as growth are seen to vary, probably because of the role of the unorganized sector in this market. Statista (2025) reports the total footwear market in India to be much higher at 33.86 Billion in 2025, with the CAGR between 2025 and 2030 being a more moderate 7.73 percent. Growth in terms of numbers is such that the Indian market size may reach 4 Billion pairs by 2030 with the growth in 2026 being close to 2026. An interesting feature of the Indian footwear market is that about 97 percent of sales in 2025 fall in the non-luxury category, suggesting that footwear is largely seen as a utility item and not a luxury item.
The rapid growth of the footwear industry presents several opportunities to budding professionals, especially in the areas of (i) footwear design and development with focus on customization and use of ICT (Information and Communication Technology) for design; (ii) improved precision and quality in manufacturing; (iii) optimizing use of the internet for online marketing; and (iv) ensuring compliance and sustainability for an improved image [4]. An important part of demand for Indian footwear producers is likely to come from exports, with India accounting for about 13 percent of global footwear production, way behind China that accounts for about 67 percent of the global market. At the moment, the focus is almost entirely on domestic markets in India, which absorb about 95 percent of the more than 2.2 Billion pairs produced. As far as the Indian footwear production industry is concerned, it carries all the limitations that go with smaller production units. The industry is made up of about 15000 SMEs operating primarily in the unorganized sector and providing jobs to about 11 lac people, making it among the biggest employers in India besides agriculture, textile and retail industry. The biggest challenge this industry faces in India is the very little change in installed footwear production capacity, signaling the need for developing a policy framework that encourages mechanization as well as growth in this otherwise stagnating sector [3]. With indigenous production in India largely falling in the unorganized sector, foreign players have entered this arena and are performing quite well. This study focuses on exploring two questions: (i) How can the market size, within a region, be estimated for the unorganized sector to benefit from the market; and (ii) What policy changes are required for encouraging capacity building by the organized sector at the national level.
Inevitably, the factors that determine choice are seen to be common, with some of the factors being: (i) knowing the feet type, whether flat, high-arch-type or neutral-arch-type; (ii) the occasion of use; (iii) the accompanying outfit; (iv) comfort; (v) quality; and (vi) season of use [7].
Literature Review:
Shoes are believed to contribute immensely to a person’s appearance and reflect her/his attitude, personality, choice and sense of fashion. Shoes are also a source of comfort and confidence. The key attributes customers look for in a pair of shoes are comfort with cushioning and appropriate balance between toes and heels, purpose (whether formal or informal) and the season for using those shoes [9]. Well made shoes can last very long, meaning that those owning well made shoes will go for an added purchase largely because of factors like taste or desire for choice. Unlike jewelry or other expensive items, rental use of footwear is rare despite the rather high prices of luxury or designer footwear. Some customers do choose to purchase used (second hand) shoes, though that is, again, not very common [1]. The choice of footwear varies enormously from customer to customer. However, some of the factors that impact customer choice of a particular pair of shoes are: (i) comfort and flexibility; (ii) fitting the feet without negatively impacting feet health; and (iii) purpose of use [14]. Yet another approach for choosing a pair could be based on the occasion, the person’s chosen outfit for the occasion, comfort and functionality [5]. The Indian footwear market is likely to grow at a CAGR of 11.7 percent till 2028 with new varieties of convenient and cozy footwear and evolving fashion. Yet another segment that may dominate in the not-too-distant future is the therapeutic footwear segment. The big advantage policy makers have in their quest to support footwear production in India is the fact that about 65 percent of total domestic consumption as well as 28 percent of total footwear exports in India come from production in Agra [2]. The industry represents a combination of the traditional and the modern, with a rapidly rising role for e-commerce and digital platforms for marketing and sale of footwear.
The share of online revenue as a percentage of total sales revenue has gone up from a few percentage points in 2017 to about twenty-five percent in the 2024-2025 period. This added role of online sales increases the opportunity for the SME sector, especially in relatively smaller cities and towns [13]. As far as the organized sector goes, some of the best known brands include the following names given along with their unique selling propositions [10]:
The list of top companies does vary based on the source, with the list above coming from a blog on the GreenSole website. Another blog on the NoStrain website includes NoStrain, Liberty and Paragon but not GreenSole or Mochi from the list above. The key strength of NoStrain is flat footwear, that of Liberty is affordability coupled with superior quality, and that of Paragon is slippers for men and women [8].
The impressive prospects notwithstanding, the footwear industry requires significant investments to rise to its potential in terms of increased production and employment generation. The DPIIT (Department for Promotion of Industry and Internal Trade) reported that an investment of one crore Indian Rupees (INR) can generate about 250 jobs and that every 1000 pairs produced for sale in India can generate about 425 jobs spanning multiple sectors/industries. While the sector is expected to continue growing, with the overall growth being over 10 percent, it is the women’s footwear segment and the non-leather segment that are expected to grow faster [6].
Research Methodology:
This research employs a quantitative research design with data being collected through a structured questionnaire. The objective was to understand factors driving decisions associated with purchase of footwear by respondents. This was done with the intention of understanding user preferences as regards purchase of footwear and assessing what makes consumers take the purchasing decision.
The structured questionnaire used to collect primary data was developed after piloting it among the target population. Respondents were told about the purpose of the study and that their participation was entirely voluntary. No inducements were offered for participation in the study. The sampling technique used was purposive sampling. A sample size of 94 was used for the analysis. The sample was largely made up of college going students in and around Udaipur, with 88 responses coming from students. The data collected was statistically analysed. While the summary at a higher level of abstraction with all 94 responses is also presented here, a more detailed analysis of the 88 responses from students was carried out and the findings are presented below.
As the responses were collected using Google Forms, some ambiguities were found to have crept into the questions and, therefore, the responses. As a result, the data had to be cleaned up before being used. For example, when a response was ‘6 or 7’ to the question ‘How often do you buy shoes (approximate number of times/year)’, the value assigned was the average of 6 and 7, i.e. 6.5. Imputing or estimating any value inevitably introduces error into a model. One hot encoding was employed for encoding the categorical responses before training the machine learning model.
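The cleaning and encoding steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' actual script; the function names `clean_count` and `one_hot` and the sample answers are assumptions introduced for illustration.

```python
import re

def clean_count(answer):
    """Turn a free-text count such as '6 or 7' into one number.

    Every number found in the answer is averaged, so '6 or 7' becomes 6.5.
    Returns None when the answer contains no number at all.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", answer)]
    if not numbers:
        return None
    return sum(numbers) / len(numbers)

def one_hot(values):
    """One hot encode a list of categorical answers.

    Returns the sorted category names and one 0/1 row per answer,
    with a single 1 marking the category of that answer.
    """
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Hypothetical answers, for illustration only
frequency = [clean_count(a) for a in ["6 or 7", "4", "n/a"]]
gender_categories, gender_rows = one_hot(["Male", "Female", "Male"])
print(frequency)
print(gender_categories, gender_rows)
```

Averaging a range keeps the response in the dataset at the cost of some imprecision, which is the trade-off the text above points out.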
Data and Analysis:
Given below is the summary of data as generated by Google Forms. Subsequently, the data collected was analysed statistically and a machine learning model was trained to estimate annual spend on footwear purchase.
Fig. 1: Gender distribution of responses
Fig. 2: Present role of respondents
Fig. 3: Factors that influence choice of footwear
Fig. 4: Relative importance of factors that influence choice of footwear
Output of statistical (regression) analysis of the responses after cleaning the dataset is given below:
Table 1: R-Square and Adjusted R-Square: Summary Output
| Regression Statistics | Value |
|---|---|
| Multiple R | 0.791104 |
| R Square | 0.625845 |
| Adjusted R Square | 0.497844 |
| Standard Error | 2719.452 |
The R square value of 0.625 (adjusted R square of 0.497) indicates that about half the variation can be explained by the factors identified. The significant factors (p-value below 0.05) in determining the total value of shoes purchased in a year are: (i) Number of pairs owned presently; (ii) Frequency of purchase; (iii) Number of cars at home (as an indication of the economic status of the person); (iv) Weightage given to ‘Reasonable price (includes discounts, sale etc.)’, and (v) Frequency of social events a person attends (as these often require footwear suited to the gentry participating).
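The relationship between R square and adjusted R square in Table 1 can be checked with the standard adjustment formula. The sketch below assumes the 13 predictors listed in Table 2 (excluding the intercept) and searches for the number of observations that reproduces the reported adjusted value. The implied count is an inference from the reported figures, not a number stated in the study.

```python
import math

r_square = 0.625845           # from Table 1
adjusted_reported = 0.497844  # from Table 1
predictors = 13               # assumption: rows of Table 2 excluding the intercept

def adjusted_r_square(r2, n, p):
    """Adjusted R square for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Find the sample size whose adjusted R square matches the reported value
implied_n = None
for n in range(predictors + 2, 95):
    if abs(adjusted_r_square(r_square, n, predictors) - adjusted_reported) < 1e-4:
        implied_n = n
        break

multiple_r = math.sqrt(r_square)  # Multiple R is the square root of R square
print(implied_n, round(multiple_r, 6))
```

Under these assumptions the reported values are mutually consistent with a regression run on 52 complete observations, which suggests that a number of responses were dropped during cleaning.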
Table 2: Ascertaining Significant Variables for Estimating Customer’s Purchase Value/Year
| Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -9145.89 | 4147.065 | -2.20539 | 0.033549588 | -17541.2 | -750.6 |
| How many pairs of footwear do you own presently? | 757.5091 | 177.6672 | 4.26364 | 0.000128185 | 397.8405 | 1117.178 |
| Shoe purchase frequency (approx. nos. times/yr) | 826.6166 | 301.2219 | 2.744211 | 0.009208244 | 216.8247 | 1436.408 |
| Family Income (Lacs/annum) | 51.0178 | 89.5719 | 0.569574 | 0.572318821 | -130.311 | 232.3466 |
| No. of Cars at home | 1464.188 | 651.4575 | 2.247557 | 0.030489375 | 145.3812 | 2782.995 |
| No. of ACs at home | -339.43 | 291.7872 | -1.16328 | 0.251968813 | -930.122 | 251.2621 |
| Weight of factor (Reasonably priced) | 1295.19 | 519.5144 | 2.493078 | 0.017137333 | 243.488 | 2346.892 |
| Weight of factor (Comfort in wearing) | 506.0084 | 1130.915 | 0.447433 | 0.657102804 | -1783.41 | 2795.425 |
| Weight of factor (Shoe’s looks (appearance)) | 610.4219 | 756.9778 | 0.806393 | 0.425034194 | -922 | 2142.843 |
| Weight of factor (Shoe’s brand & image) | 190.3603 | 614.7758 | 0.309642 | 0.758525 | -1054.19 | 1434.909 |
| Weight of factor (Reputation or earlier experience at shoe store) | 166.983 | 578.0208 | 0.288888 | 0.7742376 | -1003.16 | 1337.125 |
| Weight of factor (An upcoming event/ celebration) | -765.32 | 498.5662 | -1.53504 | 0.133058503 | -1774.61 | 243.9743 |
| Weight of factor (Latest fashion among peers/friends) | 586.2434 | 513.6394 | 1.141352 | 0.260865234 | -453.565 | 1626.052 |
| Approximate number of social events you attend every month (on an average) | -886.072 | 384.723 | -2.30314 | 0.026837109 | -1664.9 | -107.241 |
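The columns of Table 2 are linked by simple arithmetic: the t Stat is the coefficient divided by its standard error, and the 95% bounds are the coefficient plus or minus a critical t value times the standard error. The sketch below checks this for three rows copied from the table. The critical value is recovered from the intercept row rather than assumed, and the dictionary layout is only an illustration.

```python
# (coefficient, standard error, lower 95%, upper 95%) copied from Table 2
rows = {
    "Intercept": (-9145.89, 4147.065, -17541.2, -750.6),
    "Pairs owned presently": (757.5091, 177.6672, 397.8405, 1117.178),
    "No. of Cars at home": (1464.188, 651.4575, 145.3812, 2782.995),
}

# Half-width of the interval divided by the standard error gives the critical t
coef, se, lower, upper = rows["Intercept"]
t_crit = (upper - lower) / (2 * se)

t_stats = {}
intervals = {}
for name, (coef, se, lower, upper) in rows.items():
    t_stats[name] = coef / se
    intervals[name] = (coef - t_crit * se, coef + t_crit * se)

print(round(t_crit, 4))
for name in rows:
    print(name, round(t_stats[name], 5), [round(b, 3) for b in intervals[name]])
```

A variable is significant at the 5 percent level exactly when its 95% interval excludes zero, which is why the five significant factors are the rows whose lower and upper bounds share the same sign.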
One of the questions examined was whether men and women differed in their purchasing patterns on the five significant factors above, namely: (i) number of pairs owned presently; (ii) frequency of purchase; (iii) number of cars at home; (iv) weightage given to reasonable price; and (v) frequency of social events attended.
Statistical analysis of the data collected shows that male and female respondents did not differ significantly on any of these factors. The degrees of freedom and t-value for each factor are given below. The p values are all well above 0.05, so the null hypothesis that male and female respondents do not differ cannot be rejected for any of the key parameters that determine the annual purchase value of footwear:
Table 3: Hypotheses testing on purchasing habits of men and women not being different
| Sl. No. | Significant Factor | df* | t-value | p value |
|---|---|---|---|---|
| 1 | Number of pairs owned presently | 49 | 1.2625 | > 0.05 |
| 2 | Frequency of purchase | 29 | 1.5328 | > 0.05 |
| 3 | Number of cars at home | 47 | 0.6616 | > 0.05 |
| 4 | Weightage to Reasonable price | 30 | 1.4815 | > 0.05 |
| 5 | Frequency of social events attending | 44 | 0.8360 | > 0.05 |
* Degrees of freedom rounded down
All p values are well above the level of significance of 0.05.
Thus, one cannot reject the null hypotheses that purchasing habits do not differ between men and women. This was further checked using the average value of annual purchases by men and women, and once again the difference was not found to be significant.
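The statement that all p values lie well above 0.05 can be checked from the t-values and degrees of freedom in Table 3. The sketch below assumes two-sided tests and integrates the Student t density numerically with Simpson's rule so that it needs only the standard library; where SciPy is available, `scipy.stats.t.sf` would be the usual shortcut. The function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_value, df, steps=2000):
    """Two-sided p value via Simpson's rule on the interval [0, |t|]."""
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

# (df, t-value) copied from Table 3
table3 = {
    "Number of pairs owned presently": (49, 1.2625),
    "Frequency of purchase": (29, 1.5328),
    "Number of cars at home": (47, 0.6616),
    "Weightage to Reasonable price": (30, 1.4815),
    "Frequency of social events attending": (44, 0.8360),
}
p_values = {name: two_sided_p(t, df) for name, (df, t) in table3.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Degrees of freedom that have to be rounded down usually arise from an unequal-variance (Welch) test, though the study does not state which variant was used.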
Fig. 5(a): Relationship between footwear owned and Annual Spend on Footwear
An R2 value of 0.5914 indicates a higher propensity to spend among those who own more pairs, suggesting the possibility of perceived need for footwear extending beyond its utility value as assessed by its utilization.
Fig. 5(b): Relationship between Family Income and Footwear Purchase Frequency
The graph above highlights the weak relationship between the two variables. A low R2 value of 0.1355 indicates that family income or affluence does not play a key role in determining frequency of purchases and that other factors are more important in this regard.
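For a straight-line fit with a single predictor, R2 equals the square of the Pearson correlation coefficient, so the two R2 values above can be read as correlation strengths. The short sketch below performs that conversion, assuming that both trend lines are simple linear fits, which the text does not state explicitly.

```python
import math

# R2 values quoted in the text for the two scatter plots
r2_pairs_vs_spend = 0.5914
r2_income_vs_frequency = 0.1355

# Magnitude of the correlation implied by each R2 (the sign is not recoverable)
r_pairs_vs_spend = math.sqrt(r2_pairs_vs_spend)
r_income_vs_frequency = math.sqrt(r2_income_vs_frequency)

print(round(r_pairs_vs_spend, 3), round(r_income_vs_frequency, 3))
```

A correlation of roughly 0.77 is fairly strong, while one of roughly 0.37 is weak, which matches the interpretation given for the two plots.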
To examine the possibility of underlying factors driving behaviour in so far as it relates to purchase of footwear, Factor Analysis was carried out. That would inevitably lead to dimensionality reduction. Bartlett’s test of Sphericity at a level of significance of 0.01 showed that the null hypothesis of the correlation matrix being an identity matrix can be rejected and that correlation does exist among the variables. The KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy was found to be above 0.5 (at 0.531), thus justifying the use of factor analysis. Given below are the key output values of the KMO and Bartlett’s test of Sphericity:
Table 4: KMO and Bartlett’s test of Sphericity
KMO measure of sampling adequacy 0.531
Bartlett’s test of Sphericity (Approx. Chi-Square) 210.837 for df = 91 and significance = 0.000
The cut off value for Chi-Square with df = 91 is in the range of 114 to 130 for levels of significance from 0.05 to 0.005. The observed value of 210.837 is well above this range, thus justifying rejection of the null hypothesis of no correlation among variables.
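The quoted cut off range can be reproduced approximately without statistical tables. The sketch below uses the Wilson-Hilferty approximation to the chi-square quantile, which is a stand-in technique chosen here so that only the standard library is needed; `scipy.stats.chi2.ppf` would give exact values. It also assumes 14 input variables, which is consistent with the 14 components in Table 6 and with df = 14 × 13 / 2 = 91.

```python
from statistics import NormalDist

def chi_square_critical(alpha, df):
    """Approximate upper-tail chi-square critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)   # standard normal quantile
    k = 2 / (9 * df)
    return df * (1 - k + z * k ** 0.5) ** 3

variables = 14                            # assumption: one per component in Table 6
df = variables * (variables - 1) // 2     # distinct off-diagonal correlations
reported_chi_square = 210.837             # from Table 4

critical = {alpha: chi_square_critical(alpha, df) for alpha in (0.05, 0.01, 0.005)}
rejects_identity = all(reported_chi_square > value for value in critical.values())

for alpha, value in critical.items():
    print(f"alpha = {alpha}: critical value is about {value:.1f}")
print("Reject identity correlation matrix:", rejects_identity)
```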
Table 5: Correlation Matrix Highlighting Correlation Over 0.3
Table 6: Explaining variance using factor analysis
Total Variance Explained
| Component | Initial Eigenvalues: Total | Initial Eigenvalues: % of Variance | Initial Eigenvalues: Cumulative % | Extraction Sums of Squared Loadings: Total | Extraction Sums of Squared Loadings: % of Variance | Extraction Sums of Squared Loadings: Cumulative % | Rotation Sums of Squared Loadings*: Total |
|---|---|---|---|---|---|---|---|
| 1 | 2.794 | 19.955 | 19.955 | 2.794 | 19.955 | 19.955 | 2.575 |
| 2 | 2.510 | 17.927 | 37.881 | 2.510 | 17.927 | 37.881 | 2.448 |
| 3 | 1.711 | 12.222 | 50.103 | 1.711 | 12.222 | 50.103 | 1.767 |
| 4 | 1.360 | 9.717 | 59.820 | 1.360 | 9.717 | 59.820 | 1.537 |
| 5 | 1.246 | 8.902 | 68.722 | 1.246 | 8.902 | 68.722 | 1.653 |
| 6 | .955 | 6.820 | 75.542 | | | | |
| 7 | .718 | 5.127 | 80.669 | | | | |
| 8 | .595 | 4.252 | 84.921 | | | | |
| 9 | .553 | 3.952 | 88.872 | | | | |
| 10 | .470 | 3.360 | 92.233 | | | | |
| 11 | .394 | 2.814 | 95.047 | | | | |
| 12 | .304 | 2.173 | 97.220 | | | | |
| 13 | .213 | 1.519 | 98.739 | | | | |
| 14 | .177 | 1.261 | 100.000 | | | | |
Extraction Method: Principal Component Analysis.
* For correlated components, sums of squared loadings can’t be added to obtain total variance.
Five factors are required to explain about 69 percent of the total variance. This is further borne out by the scree plot, with an eigenvalue of 1 being the cut off for accepting a factor. The following scree plot brings out the five factors whose eigenvalue is above 1.
Fig 6: Scree Plot showing Eigen values that exceed a value of 1
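The selection of five factors can be reproduced directly from the initial eigenvalues in Table 6. The sketch below applies the eigenvalue-above-1 cut off and recomputes the percentage of variance; the variable names are illustrative.

```python
# Initial eigenvalues copied from Table 6 (components 1 to 14)
eigenvalues = [2.794, 2.510, 1.711, 1.360, 1.246, 0.955, 0.718,
               0.595, 0.553, 0.470, 0.394, 0.304, 0.213, 0.177]

# With standardized variables the eigenvalues sum to the number of variables
total = sum(eigenvalues)

# Share of total variance carried by each component
percent = [100 * value / total for value in eigenvalues]

# Keep only components whose eigenvalue exceeds 1
retained = [value for value in eigenvalues if value > 1.0]
cumulative_retained = sum(percent[:len(retained)])

print(len(retained), round(cumulative_retained, 2))
```

Small differences in the third decimal place relative to Table 6 come from the eigenvalues being rounded to three digits in the table.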
As a final step, the data collected was put through a machine learning model using one hot encoding. The basis of the model was the demand function given by [12]:
Qd = D (X, P, U)
Where Qd and P represent Quantity Demanded and Price respectively; and
X represents all demand shifters.
Demand shifters can be viewed as parameters that are within the customer’s control (choice of footwear, frequency of purchase and the individual customer’s preference for various features like reasonable price, comfort, looks/appearance and upcoming event) as well as factors that are driven by the economy at large and by footwear manufacturers (e.g. advertising spend, overall economic climate, economic prospects for future growth etc.). This study focused primarily on parameters that are within the customer’s control and not on macro-economic parameters or company driven parameters (like advertising spend or number of outlets set up for selling company products).
The machine learning model was a ‘Sequential’ model based on the ‘relu’ activation function and the ‘adam’ optimizer, with ‘mean square error’ used as the loss function. To enhance accuracy, a deep learning model was trained for 50000 epochs. The Python code of the machine learning model used for this study is given below:
Fig. 7: Machine learning model using 21 features (including the one hot encoded)
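import tensorflow as tf
from tensorflow import keras

# X_train: array of shape (n_samples, 21) with the 21 features (including the one hot encoded columns)
# y_train: annual spend on footwear for each respondent
# Both are assumed to have been prepared before this point.
model = keras.Sequential([
    keras.layers.Dense(21, input_shape=(21,), activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(1, activation='relu')  # single output: predicted annual spend
])
# Note: 'accuracy' is a classification metric and is not informative for a continuous target
model.compile(optimizer='adam',
              loss='mse',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50000)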
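Because the target here is a continuous amount, an accuracy metric, which in effect counts predictions that match the target, tends to stay near zero even when predictions are close. The sketch below contrasts it with regression metrics (MAE, RMSE and R square) using only the standard library. The spend figures are hypothetical and serve only to illustrate the calculation, and the function names are assumptions rather than part of the study's code.

```python
import math

def exact_match_accuracy(y_true, y_pred):
    """Share of predictions that equal the target exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between target and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared gap (same units as the target)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_square(y_true, y_pred):
    """Share of the variance in the target explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical annual spends (INR) and model predictions, for illustration only
y_true = [4000.0, 6500.0, 12000.0, 3000.0]
y_pred = [4500.0, 6000.0, 10500.0, 3600.0]

accuracy = exact_match_accuracy(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_square_error(y_true, y_pred)
r2 = r_square(y_true, y_pred)
print(accuracy, mae, round(rmse, 1), round(r2, 3))
```

In this illustration the predictions track the targets closely yet the accuracy is zero, so a regression metric evaluated on held-out data would be a more informative way to judge a model of annual spend.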
Multiple combinations of features were tried, but the accuracy in predicting annual purchase from a potential customer could not be ascertained precisely. This suggests that factors that go beyond the customer’s immediate family, income, tastes and social commitments also play an important role in determining annual spending on footwear purchases by customers. These would, typically, be the macro-economic parameters or company driven parameters not directly within the customer’s control.
Conclusions and Recommendations:
The study highlights some key aspects as regards footwear purchase. The overriding feature appears to be affluence as measured by number of cars at home and frequency of social visits/interaction. Perception of reasonable price plays a very important role too. While a study of underlying factors through factor analysis allowed explaining close to 70 percent of the variance in total purchase spend during the year, the deep learning model deployed suggests that extraneous factors (like macro-economic parameters or company driven parameters like advertising budget or number of outlets that sell the footwear) play a more important role in the amount that students annually spend on footwear purchases. Predicting customer value in terms of footwear purchases in any given year appears to be largely driven by those factors. This enables the study to conclude that increasing footwear production and annual spending (i.e. purchases) in India requires greater focus on the macro-economic environment and on the producers (i.e. manufacturers). All support should be extended to them so as to capitalize on the immense inherent strengths of this product and on the skills already available in companies for its manufacture on account of an extended period of indigenous production.
Limitations of the Study:
The study used purposive sampling, which carries the risk of bias. The findings need to be further validated on a larger scale before they can be considered as a basis for decision making. The use of factor analysis along with a Machine Learning model brings out the fact that while variation in purchases can be largely explained, predicting customer value is more challenging and requires a deeper study that takes into account macroeconomic variables as well as the micro variables. This study implicitly assumes that buyer behaviour as regards purchase of footwear is relatively stable. While that may be true within a region, the variables for other regions could be different.
References: