Project Overview
- This project aims to determine which factors influence clients to apply for and accept a personal loan from the bank.
 - We found that an individual’s annual income and the number of family members are the most important factors in determining whether an individual accepts a personal loan from the bank.
 - The dataset can be obtained from Kaggle.
 - The libraries used in the project are Pandas, Matplotlib, Seaborn, and Plotly.
 - Link to this project on GitHub
 
Objectives
What are the most influential factors in receiving a personal loan from the bank?
Data Cleaning and Organization
- Some nominal variables are eliminated (such as “ID” and “Zip Code”).
 - No missing data is found.
 - Anomalous values, such as negative values in “Experience”, are replaced with the mean.

 - As depicted in the graph, there are a large number of outliers in “income”.
 - The correlations among the attributes are explored.
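
The cleaning steps above could be sketched in Pandas roughly as follows. This is a minimal illustration on made-up rows, not the project's actual notebook code: the exact column spellings, the choice to average only the non-negative “Experience” values, and the 1.5 × IQR rule for flagging outliers are all assumptions.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions for illustration.
raw = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Zip Code": [91107, 90089, 94720, 94112, 91330, 92121],
    "Experience": [1, 19, -1, 9, 8, -2],
    "Income": [49, 34, 11, 100, 45, 224],
    "Family": [4, 3, 1, 1, 4, 2],
    "Personal Loan": [0, 0, 0, 0, 0, 1],
})

# Drop identifier-like columns that carry no predictive meaning.
cleaned = raw.drop(columns=["ID", "Zip Code"])

# Count missing values across all columns.
missing_total = int(cleaned.isna().sum().sum())

# Replace negative experience with the mean of the non-negative values.
# Assumption: the project may instead have used the mean of the whole column.
negative = cleaned["Experience"] < 0
valid_mean = cleaned.loc[~negative, "Experience"].mean()
cleaned["Experience"] = cleaned["Experience"].astype(float).mask(negative, valid_mean)

# Flag income outliers with the 1.5 * IQR rule (an assumed criterion).
q1, q3 = cleaned["Income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (cleaned["Income"] < q1 - 1.5 * iqr) | (cleaned["Income"] > q3 + 1.5 * iqr)
outliers = cleaned[is_outlier]

# Pairwise Pearson correlation between the remaining attributes.
corr = cleaned.corr()
```

With the real dataset, `raw` would come from `pd.read_csv` on the downloaded Kaggle file, and `corr` could be passed to a Seaborn heatmap for inspection.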

Analyzing The Data
 - There are more people with an “undergraduate” degree; however, the distribution of education level is roughly even among the people who received a loan.


 - Most individuals who receive a personal loan do not hold any securities or bank deposits in their investment account.

 - Most people with a higher income are approved for a personal loan from the bank.

 - Individuals with a family size of 3 or greater and a higher income, between 100k and 200k, are more likely to apply for a loan.
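
One way to check a pattern like the last one is to compare loan rates across income and family-size groups. The sketch below uses made-up rows and assumed column names (“Income” in thousands, “Family”, and a 0/1 “Personal Loan” flag); it only illustrates the grouping technique and does not reproduce the project's figures.

```python
import pandas as pd

# Made-up sample rows; column names and units are assumptions.
df = pd.DataFrame({
    "Income": [40, 60, 85, 120, 150, 180, 110, 45, 130, 95],
    "Family": [1, 2, 4, 3, 4, 3, 1, 3, 2, 4],
    "Personal Loan": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})

# Label each row by income band and family-size group.
df["Income band"] = df["Income"].between(100, 200).map(
    {True: "100k-200k", False: "other"}
)
df["Family size"] = (df["Family"] >= 3).map({True: "3+", False: "1-2"})

# Share of customers with a personal loan in each group
# (rows: income band, columns: family-size group).
loan_rate = (
    df.groupby(["Income band", "Family size"])["Personal Loan"]
    .mean()
    .unstack()
)
```

Because the target is a 0/1 flag, the group mean is the fraction of customers in that group who have a loan, so `loan_rate.loc["100k-200k", "3+"]` reads directly as a rate.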

 
Interpreting the Results
- “Income” and “number of family members” are positively correlated with accepting a personal loan.
 - “Education level” and investments have the least influence on whether an individual decides to accept the loan.
 - Further investigation is recommended to determine the correlation between “income” and other factors such as “mortgage” or “CCAvg”.
 - You can access the Jupyter notebook here.
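
As a starting point for the follow-up suggested above, the attributes can be ranked by their correlation with the loan flag, and “income” can be compared against “mortgage” and “CCAvg” directly. This is a hedged sketch on made-up rows with assumed column names; Pearson correlation is only one possible measure and is a rough fit for a 0/1 target.

```python
import pandas as pd

# Made-up sample rows; column names are assumptions for illustration.
df = pd.DataFrame({
    "Income": [40, 60, 85, 120, 150, 180, 110, 45],
    "Family": [1, 2, 2, 3, 4, 4, 3, 1],
    "CCAvg": [1.0, 1.5, 2.0, 4.0, 5.5, 6.0, 3.0, 0.8],
    "Mortgage": [0, 80, 0, 150, 0, 300, 100, 0],
    "Education": [1, 2, 3, 1, 2, 3, 2, 1],
    "Personal Loan": [0, 0, 0, 1, 1, 1, 0, 0],
})

corr = df.corr()

# Rank every attribute by its correlation with the loan flag.
loan_corr = (
    corr["Personal Loan"]
    .drop("Personal Loan")
    .sort_values(ascending=False)
)

# Correlation of income with the two attributes named for follow-up.
income_corr = corr.loc["Income", ["Mortgage", "CCAvg"]]
```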