Highly motivated, detail-oriented candidate passionate about using data to improve business performance and customer experience. Skilled at turning data into actionable solutions to business challenges, using data mining and data visualization to produce meaningful insights. Strong technical aptitude and knowledge of programming languages, data analytics, and data visualization.
Melbourne Housing Dataset
Tools: Python, pandas, matplotlib, scikit-learn, Tableau
Conducted in-depth analysis of the Melbourne Housing Market dataset. Produced descriptive statistics and visualizations to understand property price distributions, trends, and outliers. Developed predictive models for property prices using regression techniques and evaluated model performance with metrics such as MAE and RMSE. Explored geographic factors affecting property prices and investigated seasonality and trends in prices over time. Extracted insights into Melbourne's housing market, including regional price variations and influential factors, and communicated findings through reports and visualizations.

College Acceptance Dataset
Loaded the "College" dataset and checked it for missing values. Split the data into training and test sets (70% training, 30% test) using stratified sampling on the "Private" variable. Created predictor matrices (X) and response vectors (y) for both sets. Fit a Ridge regression on the "Outstate" variable using the glmnet package, with cross-validation to estimate optimal lambda values (lambda.min and lambda.1se). Extracted and interpreted the Ridge regression coefficients, calculated Root Mean Squared Error (RMSE) on both the training and test sets, and checked for overfitting by comparing the two. Repeated the same workflow with LASSO regression: performed cross-validation to estimate optimal lambda values, extracted and interpreted the coefficients, calculated RMSE on the training and test sets, and checked for overfitting by comparing training and test RMSE.