Analyzing Hotel Bookings Data




Metadata:

  1. hotel: The type of hotel, either City Hotel or Resort Hotel.
  2. is_canceled: Whether the booking was canceled (1) or not (0).
  3. arrival_date_year: The year of the guest's arrival.
  4. arrival_date_month: The month of the guest's arrival.
  5. arrival_date_week_number: The week number of the guest's arrival.
  6. lead_time: The booking or reservation lead time, that is, the period of time (most typically measured in calendar days) between when a guest makes the reservation and the actual check-in/arrival date.
  7. arrival_date_day_of_month: The day of the month of the guest's arrival.
  8. stays_in_weekend_nights: The number of weekend nights covered by the booking.
  9. stays_in_week_nights: The number of week nights covered by the booking.
  10. adults: The number of adults on the booking.
  11. children: The number of children on the booking.
  12. babies: The number of babies on the booking.
  13. meal: The type of meal booked: BB (bed and breakfast), HB (half board: bed, breakfast, evening meal; no drinks included in the evening), FB (full board: bed, breakfast, lunch, evening meal; no drinks included in the evening), and SC (self-catering: no meals included).
  14. country: The guest's country of origin, given as a three-letter code such as PRT or GBR.
  15. market_segment: The market segment of the booking (for example Direct, Corporate, or Online TA). A market segment is a group of customers that share characteristics such as price sensitivity, booking channel, purpose of travel, booking lead time, geographical region, or length of stay.
  16. is_repeated_guest: Whether the booking was made by a guest who has stayed at the hotel before (1) or not (0).
  17. previous_cancellations: The number of previous bookings that the guest canceled.
  18. previous_bookings_not_canceled: The number of previous bookings that the guest did not cancel.

Report content


0.1. Import Modules

  1. Data cleaning
    1.1 Searching the nulls
    1.2 Dealing with nulls
    1.3 Check
  2. Exploratory Data Analysis (E.D.A.)
    2.1 Filtering out count of Bookings being cancelled
    2.1.2 Visualisation on cancellation of hotel bookings.
    2.2 Top 10 countries of origin of guests
    2.2.1 Filtering of the top10 countries
    2.2.2 Visualisation for the top 10 countries.
    2.2.3 Visualization of top10 countries(map).
    2.3 Monthly guest visits to hotels
    2.3.1 Filtering out Months with No. of guests visits.
    2.3.2 Visualisation for monthly visits
    2.4 Counts of guest revisiting hotels
    2.4.1 Filtering out the guest revisits to hotels
    2.4.2 Visualisation for the guest revisits
    2.5 Analyzing type of customer visiting hotels.
    2.5.1 Filtering out the customer_type
    2.5.2 Visualization for the type of customer visiting.
    2.6 Analyzing Most reserved room_type
    2.6.1 Filtering out the most reserved room_type.
    2.6.2 Visualisation for the room type reservations.
    2.7 Comparison of reserved room type against assigned room.
    2.8 Correlation

0.1 Importing modules

  • This project uses several libraries: pandas, matplotlib, seaborn, and plotly.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

import plotly.express as px


pd.set_option('display.max_columns',32)
In [2]:
df=pd.read_csv("C:/Users/user/Downloads/hotel booking project/hotel_booking_data.csv")
df.head(35)
Out[2]:
hotel is_canceled lead_time arrival_date_year arrival_date_month arrival_date_week_number arrival_date_day_of_month stays_in_weekend_nights stays_in_week_nights adults children babies meal country market_segment distribution_channel ... assigned_room_type booking_changes deposit_type agent company days_in_waiting_list customer_type adr required_car_parking_spaces total_of_special_requests reservation_status reservation_status_date name email phone-number credit_card
0 Resort Hotel 0 342 2015 July 27 1 0 0 2 0.0 0 BB PRT Direct Direct ... C 3 No Deposit NaN NaN 0 Transient 0.00 0 0 Check-Out 2015-07-01 Ernest Barnes Ernest.Barnes31@outlook.com 669-792-1661 ************4322
1 Resort Hotel 0 737 2015 July 27 1 0 0 2 0.0 0 BB PRT Direct Direct ... C 4 No Deposit NaN NaN 0 Transient 0.00 0 0 Check-Out 2015-07-01 Andrea Baker Andrea_Baker94@aol.com 858-637-6955 ************9157
2 Resort Hotel 0 7 2015 July 27 1 0 1 1 0.0 0 BB GBR Direct Direct ... C 0 No Deposit NaN NaN 0 Transient 75.00 0 0 Check-Out 2015-07-02 Rebecca Parker Rebecca_Parker@comcast.net 652-885-2745 ************3734
3 Resort Hotel 0 13 2015 July 27 1 0 1 1 0.0 0 BB GBR Corporate Corporate ... A 0 No Deposit 304.0 NaN 0 Transient 75.00 0 0 Check-Out 2015-07-02 Laura Murray Laura_M@gmail.com 364-656-8427 ************5677
4 Resort Hotel 0 14 2015 July 27 1 0 2 2 0.0 0 BB GBR Online TA TA/TO ... A 0 No Deposit 240.0 NaN 0 Transient 98.00 0 1 Check-Out 2015-07-03 Linda Hines LHines@verizon.com 713-226-5883 ************5498
5 Resort Hotel 0 14 2015 July 27 1 0 2 2 0.0 0 BB GBR Online TA TA/TO ... A 0 No Deposit 240.0 NaN 0 Transient 98.00 0 1 Check-Out 2015-07-03 Jasmine Fletcher JFletcher43@xfinity.com 190-271-6743 ************9263
6 Resort Hotel 0 0 2015 July 27 1 0 2 2 0.0 0 BB PRT Direct Direct ... C 0 No Deposit NaN NaN 0 Transient 107.00 0 0 Check-Out 2015-07-03 Dylan Rangel Rangel.Dylan@comcast.net 420-332-5209 ************6994
7 Resort Hotel 0 9 2015 July 27 1 0 2 2 0.0 0 FB PRT Direct Direct ... C 0 No Deposit 303.0 NaN 0 Transient 103.00 0 1 Check-Out 2015-07-03 William Velez Velez_William@mail.com 286-669-4333 ************8729
8 Resort Hotel 1 85 2015 July 27 1 0 3 2 0.0 0 BB PRT Online TA TA/TO ... A 0 No Deposit 240.0 NaN 0 Transient 82.00 0 1 Canceled 2015-05-06 Steven Murphy Steven.Murphy54@aol.com 341-726-5787 ************3639
9 Resort Hotel 1 75 2015 July 27 1 0 3 2 0.0 0 HB PRT Offline TA/TO TA/TO ... D 0 No Deposit 15.0 NaN 0 Transient 105.50 0 0 Canceled 2015-04-22 Michael Moore MichaelMoore81@outlook.com 316-648-6176 ************9190
10 Resort Hotel 1 23 2015 July 27 1 0 4 2 0.0 0 BB PRT Online TA TA/TO ... E 0 No Deposit 240.0 NaN 0 Transient 123.00 0 0 Canceled 2015-06-23 Priscilla Collins PhD PhD.Priscilla74@att.com 833-887-7898 ************4642
11 Resort Hotel 0 35 2015 July 27 1 0 4 2 0.0 0 HB PRT Online TA TA/TO ... D 0 No Deposit 240.0 NaN 0 Transient 145.00 0 0 Check-Out 2015-07-05 Laurie Smith Smith.Laurie@att.com 804-383-4080 ************5450
12 Resort Hotel 0 68 2015 July 27 1 0 4 2 0.0 0 BB USA Online TA TA/TO ... E 0 No Deposit 240.0 NaN 0 Transient 97.00 0 3 Check-Out 2015-07-05 Casey Thomas Casey_T78@outlook.com 211-071-2173 ************8518
13 Resort Hotel 0 18 2015 July 27 1 0 4 2 1.0 0 HB ESP Online TA TA/TO ... G 1 No Deposit 241.0 NaN 0 Transient 154.77 0 1 Check-Out 2015-07-05 Rachel Friedman Rachel.F@protonmail.com 435-075-8409 ************9767
14 Resort Hotel 0 37 2015 July 27 1 0 4 2 0.0 0 BB PRT Online TA TA/TO ... E 0 No Deposit 241.0 NaN 0 Transient 94.71 0 0 Check-Out 2015-07-05 Edward Torres Edward.T@zoho.com 790-746-7471 ************7719
15 Resort Hotel 0 68 2015 July 27 1 0 4 2 0.0 0 BB IRL Online TA TA/TO ... E 0 No Deposit 240.0 NaN 0 Transient 97.00 0 3 Check-Out 2015-07-05 Samuel Zavala Zavala_Samuel46@xfinity.com 649-384-5387 ************7612
16 Resort Hotel 0 37 2015 July 27 1 0 4 2 0.0 0 BB PRT Offline TA/TO TA/TO ... E 0 No Deposit 8.0 NaN 0 Contract 97.50 0 0 Check-Out 2015-07-05 Dr. Victor Martin Dr.Martin@xfinity.com 331-430-8824 ************6279
17 Resort Hotel 0 12 2015 July 27 1 0 1 2 0.0 0 BB IRL Online TA TA/TO ... E 0 No Deposit 240.0 NaN 0 Transient 88.20 0 0 Check-Out 2015-07-02 Sara Lee Sara.L@hotmail.com 573-306-9938 ************8950
18 Resort Hotel 0 0 2015 July 27 1 0 1 2 0.0 0 BB FRA Corporate Corporate ... G 0 No Deposit NaN 110.0 0 Transient 107.42 0 0 Check-Out 2015-07-02 Curtis Rodriguez CRodriguez@verizon.com 466-424-2102 ************1179
19 Resort Hotel 0 7 2015 July 27 1 0 4 2 0.0 0 BB GBR Direct Direct ... G 0 No Deposit 250.0 NaN 0 Transient 153.00 0 1 Check-Out 2015-07-05 Stephanie Schmidt StephanieSchmidt@hotmail.com 896-642-1049 ************1110
20 Resort Hotel 0 37 2015 July 27 1 1 4 1 0.0 0 BB GBR Online TA TA/TO ... F 0 No Deposit 241.0 NaN 0 Transient 97.29 0 1 Check-Out 2015-07-06 John Matthews John.Matthews@aol.com 952-496-4398 ************1019
21 Resort Hotel 0 72 2015 July 27 1 2 4 2 0.0 0 BB PRT Direct Direct ... A 1 No Deposit 250.0 NaN 0 Transient 84.67 0 1 Check-Out 2015-07-07 Robert Chung Robert.Chung47@yandex.com 382-465-6552 ************8524
22 Resort Hotel 0 72 2015 July 27 1 2 4 2 0.0 0 BB PRT Direct Direct ... A 1 No Deposit 250.0 NaN 0 Transient 84.67 0 1 Check-Out 2015-07-07 Mark Garcia MGarcia16@comcast.net 784-675-4921 ************4371
23 Resort Hotel 0 72 2015 July 27 1 2 4 2 0.0 0 BB PRT Direct Direct ... D 1 No Deposit 250.0 NaN 0 Transient 99.67 0 1 Check-Out 2015-07-07 Brandon Taylor BrandonTaylor@hotmail.com 227-329-7167 ************8470
24 Resort Hotel 0 127 2015 July 27 1 2 5 2 0.0 0 HB GBR Offline TA/TO TA/TO ... I 0 No Deposit 115.0 NaN 0 Contract 94.95 0 1 Check-Out 2015-07-01 Angie Sanchez Angie_Sanchez@att.com 211-889-2476 ************8871
25 Resort Hotel 0 78 2015 July 27 1 2 5 2 0.0 0 BB PRT Offline TA/TO TA/TO ... D 0 No Deposit 5.0 NaN 0 Transient 63.60 1 0 Check-Out 2015-07-08 Alexis King King_Alexis70@hotmail.com 103-516-5853 ************6809
26 Resort Hotel 0 48 2015 July 27 1 2 5 2 0.0 0 BB IRL Offline TA/TO TA/TO ... D 0 No Deposit 8.0 NaN 0 Contract 79.50 0 0 Check-Out 2015-07-08 Michael Davidson MichaelDavidson82@att.com 336-525-2460 ************8662
27 Resort Hotel 1 60 2015 July 27 1 2 5 2 0.0 0 BB PRT Online TA TA/TO ... E 0 No Deposit 240.0 NaN 0 Transient 107.00 0 2 Canceled 2015-05-11 Jaime Flynn JaimeFlynn29@gmail.com 549-866-3721 ************9660
28 Resort Hotel 0 77 2015 July 27 1 2 5 2 0.0 0 BB PRT Online TA TA/TO ... A 0 No Deposit 240.0 NaN 0 Transient 94.00 0 0 Check-Out 2015-07-08 Mrs. Alicia Williams Mrs..W61@yandex.com 427-564-4927 ************4445
29 Resort Hotel 0 99 2015 July 27 1 2 5 2 0.0 0 BB PRT Online TA TA/TO ... D 0 No Deposit 240.0 NaN 0 Transient 87.30 1 1 Check-Out 2015-07-08 Heather Hart Heather.H@xfinity.com 431-329-6663 ************2780
30 Resort Hotel 0 118 2015 July 27 1 4 10 1 0.0 0 BB NaN Direct Direct ... A 2 No Deposit NaN NaN 0 Transient 62.00 0 2 Check-Out 2015-07-15 Diamond Wilson Wilson.Diamond@comcast.net 870-563-6202 ************8017
31 Resort Hotel 0 95 2015 July 27 1 4 11 2 0.0 0 BB GBR Offline TA/TO TA/TO ... D 0 No Deposit 241.0 NaN 0 Transient 63.86 0 0 Check-Out 2015-07-16 Paul Williams Williams_Paul@xfinity.com 789-736-8837 ************7006
32 Resort Hotel 1 96 2015 July 27 1 2 8 2 0.0 0 BB PRT Direct Direct ... E 0 No Deposit NaN NaN 0 Transient 108.30 0 2 Canceled 2015-05-29 Reginald Cunningham Reginald_C57@outlook.com 800-249-2144 ************5699
33 Resort Hotel 0 69 2015 July 27 2 2 4 2 0.0 0 BB IRL Offline TA/TO TA/TO ... C 0 No Deposit 175.0 NaN 0 Transient 65.50 0 0 Check-Out 2015-07-08 Willie Sims Willie_S@yahoo.com 790-830-7635 ************7682
34 Resort Hotel 1 45 2015 July 27 2 1 3 3 0.0 0 BB PRT Online TA TA/TO ... D 0 No Deposit 241.0 NaN 0 Transient 108.80 0 1 Canceled 2015-05-19 Alex Brown Alex.B@zoho.com 956-737-1944 ************4084

35 rows × 36 columns

1. Data Cleaning

In [3]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 119390 entries, 0 to 119389
Data columns (total 36 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   hotel                           119390 non-null  object 
 1   is_canceled                     119390 non-null  int64  
 2   lead_time                       119390 non-null  int64  
 3   arrival_date_year               119390 non-null  int64  
 4   arrival_date_month              119390 non-null  object 
 5   arrival_date_week_number        119390 non-null  int64  
 6   arrival_date_day_of_month       119390 non-null  int64  
 7   stays_in_weekend_nights         119390 non-null  int64  
 8   stays_in_week_nights            119390 non-null  int64  
 9   adults                          119390 non-null  int64  
 10  children                        119386 non-null  float64
 11  babies                          119390 non-null  int64  
 12  meal                            119390 non-null  object 
 13  country                         118902 non-null  object 
 14  market_segment                  119390 non-null  object 
 15  distribution_channel            119390 non-null  object 
 16  is_repeated_guest               119390 non-null  int64  
 17  previous_cancellations          119390 non-null  int64  
 18  previous_bookings_not_canceled  119390 non-null  int64  
 19  reserved_room_type              119390 non-null  object 
 20  assigned_room_type              119390 non-null  object 
 21  booking_changes                 119390 non-null  int64  
 22  deposit_type                    119390 non-null  object 
 23  agent                           103050 non-null  float64
 24  company                         6797 non-null    float64
 25  days_in_waiting_list            119390 non-null  int64  
 26  customer_type                   119390 non-null  object 
 27  adr                             119390 non-null  float64
 28  required_car_parking_spaces     119390 non-null  int64  
 29  total_of_special_requests       119390 non-null  int64  
 30  reservation_status              119390 non-null  object 
 31  reservation_status_date         119390 non-null  object 
 32  name                            119390 non-null  object 
 33  email                           119390 non-null  object 
 34  phone-number                    119390 non-null  object 
 35  credit_card                     119390 non-null  object 
dtypes: float64(4), int64(16), object(16)
memory usage: 32.8+ MB

1.1 Searching the null values

In [4]:
df.isnull().sum()
Out[4]:
hotel                                  0
is_canceled                            0
lead_time                              0
arrival_date_year                      0
arrival_date_month                     0
arrival_date_week_number               0
arrival_date_day_of_month              0
stays_in_weekend_nights                0
stays_in_week_nights                   0
adults                                 0
children                               4
babies                                 0
meal                                   0
country                              488
market_segment                         0
distribution_channel                   0
is_repeated_guest                      0
previous_cancellations                 0
previous_bookings_not_canceled         0
reserved_room_type                     0
assigned_room_type                     0
booking_changes                        0
deposit_type                           0
agent                              16340
company                           112593
days_in_waiting_list                   0
customer_type                          0
adr                                    0
required_car_parking_spaces            0
total_of_special_requests              0
reservation_status                     0
reservation_status_date                0
name                                   0
email                                  0
phone-number                           0
credit_card                            0
dtype: int64
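
The counts above can also be read as a share of rows, which makes it easier to judge how much of a column is actually usable. The following is only a minimal sketch: the small frame is made-up sample data that borrows a few column names from the dataset, and `null_share` is a hypothetical helper name, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample that reuses a few of the dataset's column names.
sample = pd.DataFrame({
    "children": [0.0, 1.0, np.nan, 0.0],
    "country": ["PRT", None, "GBR", "PRT"],
    "company": [np.nan, np.nan, np.nan, 110.0],
    "adults": [2, 2, 1, 2],
})

def null_share(frame):
    """Return the percentage of missing values per column, largest first."""
    return (frame.isnull().mean() * 100).sort_values(ascending=False)

shares = null_share(sample)
print(shares)
```

On the real data, calling `null_share(df)` before any filling would give the same information as `df.isnull().sum()`, expressed in percent.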

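1.2 Dealing with null values

In [5]:
## Fill nulls in float columns (children, agent, company) with 0.
## Assigning the result back avoids chained inplace calls, which newer pandas versions warn about.
for i in df.columns:
    if df[i].dtype==float:
        df[i]=df[i].fillna(0)

## Fill nulls in object (string) columns, here only country, with a placeholder.
for i in df.columns:
    if df[i].dtype==object:
        df[i]=df[i].fillna('Not Available')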

1.3 Checking that no nulls remain

In [6]:
df.isnull().sum()
Out[6]:
hotel                             0
is_canceled                       0
lead_time                         0
arrival_date_year                 0
arrival_date_month                0
arrival_date_week_number          0
arrival_date_day_of_month         0
stays_in_weekend_nights           0
stays_in_week_nights              0
adults                            0
children                          0
babies                            0
meal                              0
country                           0
market_segment                    0
distribution_channel              0
is_repeated_guest                 0
previous_cancellations            0
previous_bookings_not_canceled    0
reserved_room_type                0
assigned_room_type                0
booking_changes                   0
deposit_type                      0
agent                             0
company                           0
days_in_waiting_list              0
customer_type                     0
adr                               0
required_car_parking_spaces       0
total_of_special_requests         0
reservation_status                0
reservation_status_date           0
name                              0
email                             0
phone-number                      0
credit_card                       0
dtype: int64
In [7]:
df.describe()
Out[7]:
is_canceled lead_time arrival_date_year arrival_date_week_number arrival_date_day_of_month stays_in_weekend_nights stays_in_week_nights adults children babies is_repeated_guest previous_cancellations previous_bookings_not_canceled booking_changes agent company days_in_waiting_list adr required_car_parking_spaces total_of_special_requests
count 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000 119390.000000
mean 0.370416 104.011416 2016.156554 27.165173 15.798241 0.927599 2.500302 1.856403 0.103886 0.007949 0.031912 0.087118 0.137097 0.221124 74.828319 10.775157 2.321149 101.831122 0.062518 0.571363
std 0.482918 106.863097 0.707476 13.605138 8.780829 0.998613 1.908286 0.579261 0.398555 0.097436 0.175767 0.844336 1.497437 0.652306 107.141953 53.943884 17.594721 50.535790 0.245291 0.792798
min 0.000000 0.000000 2015.000000 1.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 -6.380000 0.000000 0.000000
25% 0.000000 18.000000 2016.000000 16.000000 8.000000 0.000000 1.000000 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 7.000000 0.000000 0.000000 69.290000 0.000000 0.000000
50% 0.000000 69.000000 2016.000000 28.000000 16.000000 1.000000 2.000000 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000000 0.000000 0.000000 94.575000 0.000000 0.000000
75% 1.000000 160.000000 2017.000000 38.000000 23.000000 2.000000 3.000000 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 152.000000 0.000000 0.000000 126.000000 0.000000 1.000000
max 1.000000 737.000000 2017.000000 53.000000 31.000000 19.000000 50.000000 55.000000 10.000000 10.000000 1.000000 26.000000 72.000000 21.000000 535.000000 543.000000 391.000000 5400.000000 8.000000 5.000000

2.0 Exploratory Data Analysis (E.D.A.)


2.1 Filtering out count of Bookings being cancelled.

In [8]:
can=df[df['is_canceled']==1].groupby('hotel')['is_canceled'].count().reset_index()
can
Out[8]:
hotel is_canceled
0 City Hotel 33102
1 Resort Hotel 11122

2.1.2 Visualisation on cancellation of hotel bookings.

In [9]:
plt.figure(figsize=(5,4), dpi=100)
plt.title('Number of booking cancellations by hotel')
sns.barplot(data=can, x='hotel', y='is_canceled')
plt.xlabel('Hotel',size=20)
plt.ylabel('Cancelled bookings',size=14)
plt.show()

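Conclusion:

  • The graph shows that city hotels record about three times as many cancellations as resort hotels (33,102 vs 11,122).

Reason

  • City hotels also receive roughly twice as many bookings overall (79,330 vs 40,060, from the customer-type counts in section 2.5.1), so part of the gap simply reflects volume. One possible explanation for the higher volume is that city hotels are located where most of the population lives, whereas resort hotels sit on the outskirts. The cancellation rate is still higher for city hotels: about 42% of their bookings are cancelled, compared with about 28% for resort hotels.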
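
To compare hotels on an equal footing, the raw counts can be turned into a cancellation rate per hotel. The following is a minimal sketch on made-up sample data (the frame and the name `cancel_summary` are assumptions for illustration); on the real data, `df` would take the place of `sample`.

```python
import pandas as pd

# Hypothetical sample; the notebook's df has the same two columns.
sample = pd.DataFrame({
    "hotel": ["City Hotel"] * 5 + ["Resort Hotel"] * 4,
    "is_canceled": [1, 1, 0, 0, 0, 1, 0, 0, 0],
})

# is_canceled is 0/1, so its mean within a group is the cancellation rate.
cancel_summary = (
    sample.groupby("hotel")["is_canceled"]
    .agg(bookings="count", cancellations="sum", cancel_rate="mean")
    .reset_index()
)
print(cancel_summary)
```

Plotting `cancel_rate` instead of the raw count removes the effect of one hotel simply having more bookings.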

2.2 Top 10 countries of origin of guests:


2.2.1 Filtering of the top 10 countries.

In [10]:
top10=df[df['is_canceled']==0]['country'].value_counts().reset_index().head(10)
top10.columns=['Country','No. of guests']
top10
Out[10]:
Country No. of guests
0 PRT 21071
1 GBR 9676
2 FRA 8481
3 ESP 6391
4 DEU 6069
5 IRL 2543
6 ITA 2433
7 BEL 1868
8 NLD 1717
9 USA 1596

2.2.2 Visualisation for the top 10 countries.

In [11]:
plt.figure(figsize=(5,4), dpi=100)
plt.title('Top 10 countries')
sns.barplot(data=top10, x='Country', y='No. of guests')
plt.xlabel('COUNTRY',size=16)
plt.ylabel('NO. OF GUESTS',size=16)
plt.show()

2.2.3 Visualization of top 10 countries (map).

In [12]:
guest_visit_map=px.choropleth(top10,locations=top10['Country'],color=top10['No. of guests'],hover_name=top10['Country'])
guest_visit_map.show()

Conclusion

  • Most guests come from Portugal (country code PRT), which accounts for more than 21,000 non-cancelled bookings, over twice as many as the next country, GBR (9,676).

2.3 Monthly guest visits to hotels


2.3.1 Filtering out Months with No. of guests visits.

In [13]:
monthly=df['arrival_date_month'].value_counts().reset_index()
monthly.columns=['arrival date month','No. of guest']
monthly
Out[13]:
arrival date month No. of guest
0 August 13877
1 July 12661
2 May 11791
3 October 11160
4 April 11089
5 June 10939
6 September 10508
7 March 9794
8 February 8068
9 November 6794
10 December 6780
11 January 5929

2.3.2 Visualisation for monthly visits.

In [14]:
plt.figure(figsize=(9,4))
plt.title('Monthly Guest visits')
sns.barplot(data=monthly,x='arrival date month',y='No. of guest')
plt.xticks(rotation=50)
plt.xlabel('Arrival date month',size=15)
plt.ylabel('No. of guest',size=15)
plt.show()

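Conclusion:

  • August has the most bookings (13,877), followed by July (12,661); January has the fewest (5,929). These counts include cancelled bookings, because the filter in 2.3.1 does not exclude them.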
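
Because `value_counts()` sorts by count, the months appear in order of popularity rather than calendar order, which hides the seasonal shape. One possible way to restore calendar order is an ordered categorical, sketched below. The frame `monthly_sample` is a shortened stand-in for `monthly` that reuses four of the counts shown above, and `month_order` is an assumed helper name.

```python
import pandas as pd

# Shortened stand-in for the `monthly` frame, with four of its rows.
monthly_sample = pd.DataFrame({
    "arrival date month": ["August", "July", "May", "January"],
    "No. of guest": [13877, 12661, 11791, 5929],
})

# Explicit calendar order, written out to avoid locale-dependent month names.
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# An ordered categorical sorts by category position, not alphabetically.
monthly_sample["arrival date month"] = pd.Categorical(
    monthly_sample["arrival date month"], categories=month_order, ordered=True
)
monthly_sorted = monthly_sample.sort_values("arrival date month").reset_index(drop=True)
print(monthly_sorted)
```

An alternative with the same effect on the chart is to pass `order=month_order` to `sns.barplot`.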

2.4 Counts of guests revisiting hotels:


2.4.1 Filtering out the guest revisits to hotels.

In [15]:
revisit=df[df['is_repeated_guest']==1]['hotel'].value_counts().reset_index()
revisit.columns=['Hotel','Guests Revisited']
revisit
Out[15]:
Hotel Guests Revisited
0 City Hotel 2032
1 Resort Hotel 1778

2.4.2 Visualisation for the guest revisits.

In [16]:
plt.figure(figsize=(5,4), dpi=100)
plt.title('No. of guests Revisited to hotels')
sns.barplot(data=revisit, x='Hotel', y='Guests Revisited',palette='muted')
plt.xlabel('Hotel',size=20)
plt.ylabel('Guests Revisited',size=14)
plt.show()

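Conclusion:

  • There is no large difference in the number of bookings from repeated guests between the two hotels, although the City Hotel records slightly more (2,032 vs 1,778). Relative to total bookings, however, repeated guests make up a larger share at the Resort Hotel (about 4.4% of 40,060 bookings) than at the City Hotel (about 2.6% of 79,330).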

2.5 Analyzing type of customer visiting hotels.


2.5.1 Filtering out the customer_type

In [17]:
## Named cust_type rather than type to avoid shadowing the built-in type().
cust_type=df[['hotel','customer_type']].value_counts().reset_index()
cust_type.columns=['hotels','customer_type','count_of_guests']
cust_type
Out[17]:
hotels customer_type count_of_guests
0 City Hotel Transient 59404
1 Resort Hotel Transient 30209
2 City Hotel Transient-Party 17333
3 Resort Hotel Transient-Party 7791
4 City Hotel Contract 2300
5 Resort Hotel Contract 1776
6 City Hotel Group 293
7 Resort Hotel Group 284

2.5.2 Visualization for the type of customer visiting.

In [18]:
plt.figure(figsize=(5,4), dpi=100)
plt.title('Type of customer visits to hotels')
sns.barplot(data=cust_type,x='hotels',y='count_of_guests',hue='customer_type')
plt.xlabel('Hotel',size=20)
plt.ylabel('Count_Of_Guests',size=14)
plt.show()

Conclusion

  • Most customers are of the Transient type, which accounts for roughly 75% of bookings at both hotels. Transient guests are one of the major market segments and consist of individuals or groups occupying fewer than 10 rooms per night, so hotels can arrange special offers for this type of guest.
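
Since the two hotels differ in size, shares within each hotel are easier to compare than raw counts. A possible sketch using `pd.crosstab` follows; the sample frame is made up for illustration and `type_share` is an assumed name.

```python
import pandas as pd

# Hypothetical sample with the same two columns as the notebook's df.
sample = pd.DataFrame({
    "hotel": ["City Hotel"] * 4 + ["Resort Hotel"] * 4,
    "customer_type": ["Transient", "Transient", "Transient", "Contract",
                      "Transient", "Transient", "Transient-Party", "Group"],
})

# normalize="index" divides each row by its total, so every hotel sums to 1.
type_share = pd.crosstab(sample["hotel"], sample["customer_type"], normalize="index")
print(type_share.round(2))
```

Each row of `type_share` gives the mix of customer types for one hotel, independent of how many bookings that hotel has.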

2.6 Analyzing the most reserved room_type


2.6.1 Filtering out the most reserved room_type.

In [54]:
room=df[['hotel','reserved_room_type']].value_counts().reset_index()
room.columns=['Hotel','reserved_room_type','count_of_reservation']
room
Out[54]:
Hotel reserved_room_type count_of_reservation
0 City Hotel A 62595
1 Resort Hotel A 23399
2 City Hotel D 11768
3 Resort Hotel D 7433
4 Resort Hotel E 4982
5 City Hotel F 1791
6 Resort Hotel G 1610
7 City Hotel E 1553
8 City Hotel B 1115
9 Resort Hotel F 1106
10 Resort Hotel C 918
11 Resort Hotel H 601
12 City Hotel G 484
13 City Hotel C 14
14 City Hotel P 10
15 Resort Hotel L 6
16 Resort Hotel B 3
17 Resort Hotel P 2

2.6.2 Visualisation for the room type reservations.

In [20]:
plt.figure(figsize=(8,4), dpi=100)
plt.title('Most room type reserved by customers')
sns.barplot(data=room,x='Hotel',y='count_of_reservation',hue='reserved_room_type',palette='muted')
plt.xlabel('Hotel',size=20)
plt.ylabel('Count Of Reservations',size=14)
plt.show()

Conclusion

  • The visualisation above shows that room type A is by far the most reserved at both hotels (62,595 reservations at the City Hotel and 23,399 at the Resort Hotel). Since type A is the most in demand, the hotels should consider increasing the number of type A rooms.

2.7 Comparison of reserved room type against assigned room.

In [79]:
r=list(df.reserved_room_type)
a=list(df.assigned_room_type)
In [81]:
c=0
for i,j in zip(r,a):
    if i!=j:
        c=c+1
print(f'Total number of bookings where the assigned room type differs from the reserved one: {c}')
Total number of bookings where the assigned room type differs from the reserved one: 14917

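Conclusion

  • In 14,917 bookings (about 12.5% of the 119,390 total) the assigned room type differs from the reserved one. A mismatch does not necessarily mean the guest was worse off, since it may also reflect an upgrade. If the mismatches turn out to be caused mostly by shortages of the reserved type, the hotels should increase the number of rooms of those types.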
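
The loop above can also be written as a vectorised comparison, and a cross-tabulation of the mismatches shows which reserved types were swapped for which assigned types. This is a minimal sketch on made-up sample data; `mismatch`, `n_mismatch`, and `changes` are assumed names.

```python
import pandas as pd

# Hypothetical sample with the same two columns as the notebook's df.
sample = pd.DataFrame({
    "reserved_room_type": ["A", "A", "D", "E", "A"],
    "assigned_room_type": ["A", "D", "D", "F", "A"],
})

# Element-wise comparison gives a boolean Series; True counts as 1 in sum().
mismatch = sample["reserved_room_type"] != sample["assigned_room_type"]
n_mismatch = int(mismatch.sum())

# Rows are the reserved type, columns the assigned type, for mismatches only.
changes = pd.crosstab(
    sample.loc[mismatch, "reserved_room_type"],
    sample.loc[mismatch, "assigned_room_type"],
)
print(n_mismatch)
print(changes)
```

On the real data, the largest cells of `changes` would indicate which room types are most often substituted.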

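2.8 Correlation

In [82]:
## Restrict to numeric columns; recent pandas versions raise an error if text columns are passed to corr().
sns.heatmap(df.select_dtypes(include='number').corr())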
Out[82]:
<AxesSubplot:>
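
A heatmap of every pair of columns can be hard to read. If the main interest is cancellations, one option is to look only at how each numeric column correlates with `is_canceled`. The sketch below uses made-up sample data, so its values say nothing about the real dataset; `target_corr` is an assumed name.

```python
import pandas as pd

# Hypothetical sample; values are invented purely to make the code runnable.
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel",
              "Resort Hotel", "City Hotel", "Resort Hotel"],
    "is_canceled": [1, 1, 0, 0, 1, 0],
    "lead_time": [200, 150, 10, 30, 120, 5],
    "total_of_special_requests": [0, 0, 2, 1, 0, 3],
})

# Keep numeric columns only, then take the is_canceled column of the matrix.
numeric = sample.select_dtypes(include="number")
target_corr = (
    numeric.corr()["is_canceled"]
    .drop("is_canceled")                     # drop the trivial self-correlation
    .sort_values(key=abs, ascending=False)   # strongest relationships first
)
print(target_corr)
```

The same expression with `df` in place of `sample` would rank the real columns by the strength of their linear relationship with cancellation.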



Thank you