Anish P

Vancouver, BC, Canada

Summary

● Offering 7+ years of experience; seeking a lead-level position across functional sectors within a reputable IT organization.

● Experience migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.

● Experience developing Spark applications using Spark SQL in Databricks to extract, transform, and aggregate data from multiple file formats, uncovering insights into customer usage patterns.

● Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.

● Good understanding of big data Hadoop and YARN architecture, including Hadoop daemons such as JobTracker, TaskTracker, NameNode, DataNode, and Resource/Cluster Manager, as well as Kafka (distributed stream processing).

● Experience in Database Design and development with Business Intelligence using SQL Server 2014/2016, Integration Services (SSIS), DTS Packages, SQL Server Analysis Services (SSAS), DAX, OLAP Cubes, Star Schema and Snowflake Schema.

● Excellent communicator and proactive team player with a strong work ethic and a positive attitude.

● Domain Knowledge of Finance, Logistics and Retail.

● Strong skills in visualization tools Power BI and Excel (formulas, pivot tables, charts, and DAX commands).

● Expertise in all phases of the project life cycle (design, analysis, implementation, and testing).

● Led database administration and performance tuning efforts to deliver scalability, 24/7 data availability, and timely resolution of end-user reporting and access problems.

Overview

8 years of professional experience

Work History

Technology Lead / Data Engineer

Infosys Ltd.
05.2022 - Current
  • Leveraged Azure Data Engineering expertise to design and implement data solutions specifically for the retail domain, utilizing Azure PaaS offerings for robust data visualization tools
  • Conducted evaluations of existing applications to assess and ensure seamless integration of new implementations with established business processes within the supply chain context
  • Implemented ETL processes using Azure Data Factory, T-SQL, Spark SQL, and U-SQL in Azure Data Lake Analytics for efficient data ingestion and processing in supply chain management
  • Developed and deployed ETL pipelines in Azure Data Factory, facilitating data movement from various sources like Azure SQL Database, Blob Storage, and Azure Synapse Analytics, enhancing data integrity for supply chain decisions
  • Designed Spark applications using PySpark and Spark-SQL for complex data processing tasks, enabling deeper insights into supply chain operations
  • Optimized Spark Databricks clusters to ensure high performance and reliability, critical for data processing tasks in supply chain management
  • Fine-tuned Spark applications to enhance data processing efficiency, crucial for analytics in the supply chain sector
  • Created custom UDFs in Scala and PySpark to solve unique business challenges within the supply chain domain (a minimal PySpark sketch follows this role's environment list)
  • Automated data pipeline deployments in Azure Data Factory using JSON scripts, improving operational efficiency in supply chain management
  • Developed and optimized SQL scripts for automation, enhancing scalability and efficiency of supply chain data management processes
  • Managed deployment and maintenance of data engineering projects in production environments using Azure DevOps, ensuring reliable data solutions for supply chain management
  • Engineered SSIS packages for data integration from various sources into SQL Server, facilitating complex data processing within the supply chain context
  • Conducted data cleansing, enrichment, mapping, and validation to ensure accurate and relevant supply chain data
  • Developed Power BI dashboards and reports for actionable supply chain insights, enabling data-driven decision-making
  • Participated in the full SDLC for supply chain management solutions, ensuring high-quality data engineering projects from requirements analysis to deployment
  • Designed and implemented data warehousing solutions using star schema, optimizing data storage for efficient retrieval and analysis in supply chain applications
  • Led end-to-end supply chain management projects, leveraging Power BI for advanced reporting and visualization capabilities
  • Implemented data governance frameworks and MDM strategies with tools like Informatica and Collibra to enhance data quality and compliance in supply chain operations
  • Developed and deployed AI and ML models for predictive analytics in supply chain management, ensuring industry standards compliance
  • Managed database cleanup and standardization projects, supporting effective supply chain management practices
  • Translated business requirements into technical specifications, facilitating clear communication between supply chain stakeholders and data engineering teams
  • Supported strategic decision-making with KPI dashboards and performance analysis reports, utilizing Python and Power BI for data-driven insights in supply chain management.

Environment: Azure Data Factory, T-SQL, Spark SQL, U-SQL, Azure Data Lake Analytics, Azure Databricks, Informatica Data Explorer (IDE), Informatica Data Quality (IDQ), SQL, SAS, Collibra, Rochade, AI/ML models, Cognos, WebSphere Process Server, HL7 FHIR messaging, Crystal Reports, Power BI, Oracle, Teradata, Python, SSAS Tabular cubes, Windows Server, Windows 10, C# .Net, Azure SQL, SQL Server 2012, SQL Server 2016, Visual Studio, ETL, Excel, Macros, Snowflake, Bitbucket, Git, Confluence, JIRA, Smartsheet, MySQL, MySQL Workbench, SSMS, AWS, JupyterHub, Slack.
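To make the UDF bullet above concrete, here is a minimal PySpark sketch of the kind of custom UDF described in this role; the column names, classification rule, and storage paths are illustrative assumptions, not details from the actual project.

```python
# Minimal PySpark UDF sketch (hypothetical columns, rule, and paths).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("supply-chain-etl").getOrCreate()

@F.udf(returnType=StringType())
def stock_status(on_hand, reorder_point):
    # Row-level business rule; illustrative only.
    if on_hand is None or reorder_point is None:
        return "unknown"
    return "reorder" if on_hand < reorder_point else "ok"

inventory = spark.read.parquet("/mnt/raw/inventory")  # hypothetical path
result = inventory.withColumn(
    "status", stock_status("on_hand", "reorder_point"))
result.write.mode("overwrite").parquet("/mnt/refined/inventory")
```

For a rule this simple, a built-in expression (e.g., F.when) would outperform a Python UDF; a UDF earns its overhead only when the logic cannot be expressed with native Spark functions.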

MSBI Developer

Citi Bank
02.2020 - 04.2022
  • Led the implementation of Azure Data Factory (ADF) for the seamless ingestion of both structured and unstructured data from diverse banking systems, ensuring alignment with the bank's functional requirements for data management and analysis
  • Spearheaded the design and development of both batch and real-time data processing solutions tailored to the banking sector, leveraging ADF, Azure Databricks clusters, and Azure Stream Analytics to optimize financial data flows and analytics
  • Engineered and maintained numerous data pipelines within Azure Data Factory v2, facilitating the integration and transformation of financial data from varied sources, including banking transactions, customer data, and market feeds, using Azure services like Data Movement, Transformation, Copy, and Databricks, ensuring data integrity and accessibility for analytics and reporting
  • Automated data processing workflows in ADF using event-based, schedule-based, and tumbling window triggers, enhancing the efficiency of data management tasks in banking operations
  • Provisioned and managed Azure Databricks clusters and notebooks for scalable data processing and analysis, implementing autoscaling to efficiently handle varying loads of banking data
  • Utilized Polybase for efficient data loading into Azure Synapse, streamlining the analytics processes for large-scale banking datasets
  • Employed Azure's self-hosted and cloud-based integration runtimes within ADF to ensure secure and efficient data integration and processing across the bank's on-premises and cloud environments
  • Optimized the performance of real-time financial data processing by fine-tuning Databricks cluster configurations, reducing latency in fraud detection and market data analysis
  • Designed and implemented a novel solution for near-real-time (NRT) financial data processing using Azure Stream Analytics, Event Hubs, and Service Bus Queues, improving the bank's responsiveness to market changes and customer activities (a minimal streaming sketch follows this role's environment list)
  • Established linked services in ADF to facilitate secure and reliable connections to external data sources, including financial market feeds and regulatory databases, ensuring comprehensive data coverage for analysis and reporting
  • Collaborated with database administrators and developers in the creation and optimization of SQL-based data structures and procedures for complex financial data, enhancing the bank's data warehousing and business intelligence capabilities
  • Leveraged Azure DevOps and Jenkins for the continuous integration and delivery (CI/CD) of banking data solutions, ensuring rapid deployment and high availability of data services
  • Played a key role in financial data lake management using Azure Data Lake Storage (ADLS), implementing strategies for secure data storage, access, and integration with other Azure services to support advanced analytics and reporting
  • Directed the migration of critical banking data from legacy systems (Oracle, Teradata) to Azure Data Lake Store (ADLS) using Azure Data Factory, facilitating a seamless transition to cloud-based data management, and enabling advanced analytics capabilities
  • Developed and enforced data engineering best practices within the bank, collaborating with IT support and solution architecture teams to address complex data integration challenges and ensure the reliability and scalability of data platforms
  • Designed and executed comprehensive data pipelines on the Azure platform, integrating data from a wide array of banking systems to support centralized analytics and decision-making
  • Implemented sophisticated ETL processes using Talend Open Studio for Data Integration, enhancing the bank's capability to ingest, transform, and analyze data from various sources, including core banking systems, customer relationship management (CRM) platforms, and financial markets
  • Conducted extensive data profiling and quality assessments using tools like Informatica Data Quality (IDQ), ensuring the accuracy and integrity of financial data used in reporting, compliance, and strategic analysis
  • Developed and maintained enterprise-grade data warehousing and business intelligence solutions, employing Microsoft SQL Server Integration Services (SSIS), Power BI, and Reporting Services (SSRS) to deliver actionable insights to bank executives and stakeholders
  • Applied Master Data Management (MDM) principles to standardize and cleanse critical banking data, establishing a robust framework for ensuring data quality and consistency across the bank's operational and analytical systems.

Environment: SQL Server, SSIS, SSRS, Windows Server, Windows 10, Unix, Shell Scripting, Power BI, Azure SQL, SQL Server 2012, SQL Server 2016, SSAS, Visual Studio, C#, Azure, ETL, Excel, Macros.
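As a hedged illustration of the near-real-time processing flagged above, here is a minimal Spark Structured Streaming sketch. The built-in rate source stands in for Event Hubs, and the windowed per-account aggregation is an illustrative stand-in for the actual fraud-detection and market-analysis rules.

```python
# Structured Streaming sketch: windowed per-key aggregation over a demo source.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("nrt-demo").getOrCreate()

# The "rate" source emits (timestamp, value) rows; an Event Hubs stream
# would be substituted here in a real deployment.
events = (spark.readStream.format("rate").option("rowsPerSecond", 10).load()
          .withColumn("account_id", F.col("value") % 5)  # hypothetical key
          .withColumn("amount", F.rand() * 100))         # hypothetical metric

# One-minute tumbling windows per account, a simple velocity-check analogue.
stats = (events
         .withWatermark("timestamp", "2 minutes")
         .groupBy(F.window("timestamp", "1 minute"), "account_id")
         .agg(F.count("*").alias("txn_count"), F.sum("amount").alias("total")))

query = (stats.writeStream.outputMode("update")
         .format("console").option("truncate", False).start())
query.awaitTermination()  # blocks; stop with query.stop() or Ctrl-C
```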

Azure Data Engineer

IGM Financials
08.2018 - 01.2020
  • Engineered and maintained state-of-the-art data pipeline architectures within Microsoft Azure, leveraging Azure Data Factory and Azure Databricks, tailored to the retail finance sector's unique data management and analysis needs
  • Designed Azure-based architectural solutions optimizing analytics tools for financial retail scenarios, significantly enhancing data-driven decision-making processes
  • Translated complex technical solutions into easily understandable concepts for retail finance stakeholders, ensuring buy-in and facilitating informed decision-making
  • Guided retail clients through the advantages and limitations of various Azure PaaS and SaaS offerings, focusing on cost-efficient solutions that meet the dynamic needs of the finance domain
  • Developed self-service reporting capabilities in Azure Data Lake Store Gen2, employing an ELT approach to democratize data access for retail finance analysts
  • Innovated with Spark vectorized pandas UDFs to streamline data manipulation and wrangling, addressing the specific challenges of financial data analysis in retail (see the first sketch after this role's environment list)
  • Strategically transferred data through logical stages (from system of records to raw, refined, and produce zones), optimizing for the retail finance sector's requirements for data translation and denormalization
  • Configured Azure infrastructure (storage accounts, integration runtimes, service principal IDs, app registrations) to support scalable analytics for retail banking and financial services
  • Implemented complex transformations using PySpark and Spark SQL in Azure Databricks, directly supporting retail finance business rule applications
  • Automated bulk data transfers from relational databases to Azure Data Lake Gen2, enhancing data management efficiency in retail finance operations
  • Designed a custom logging framework for ELT and performance insights for retail financial services data pipelines
  • Led proof of concept projects from ideation to production, delivering Azure Data Factory pipelines that add tangible business value to retail finance
  • Ensured data privacy and security across international borders, adhering to regulatory requirements specific to the retail finance industry
  • Adopted CI/CD best practices using Azure DevOps for financial data pipeline development, maintaining high standards of code quality and version control
  • Facilitated denormalized data access for PowerBI, enhancing data modeling and visualization capabilities for retail financial analysts
  • Embedded in a Scaled Agile Framework (SAFE) team environment, contributing to agile development cycles with a focus on retail finance data solutions
  • Extracted and organized competitor data using web scraping techniques, providing a competitive edge in the retail banking sector
  • Applied time-based triggers to pipelines in Data Factory, improving monitoring and operational transparency in financial data processing
  • Enabled comprehensive monitoring with Azure Log Analytics, ensuring high availability
  • Performed time series analysis on sales data to forecast trends and inform promotional strategies within the retail finance domain
  • Developed predictive models (using XGBoost and Random Forest) to forecast sales, directly supporting strategic planning in retail finance (see the second sketch after this role's environment list)
  • Transitioned data storage from Cloudera Hadoop Hive to Azure Data Lake Store as part of a digital transformation strategy in the finance retail sector
  • Implemented IoT streaming and Delta Lake solutions for real-time financial transaction logging, enhancing data integrity and accessibility
  • Exposed data in efficient formats (e.g., parquet) via Azure Spark Databricks, optimizing storage and access for financial analytics in retail
  • Architected scalable cloud solutions for diverse analytics requirements in retail finance, employing SQL databases and ELT techniques for efficient data management
  • Assembled and managed large datasets to meet both functional and non-functional retail finance requirements, enhancing customer insight and operational efficiency
  • Integrated advanced analytics tools with the data platform, providing actionable insights into customer behavior, sales performance, and other key metrics in the retail finance domain.

Environment: Azure Data Lake / Data Lake Gen2, Azure Data Factory, Spark, Databricks, Azure DevOps, Scaled Agile (SAFe) team environment, Power BI, Python, R, SQL, Hadoop, Hive, Dremio, Tableau, Knime, Docker, SQL Server 2012/2014, SSIS, SSRS, SSAS, SQL Profiler, SQL Sentry Plan Explorer, Microsoft Office, Microsoft Visual Studio, Team Foundation Server, SVN, C#.Net, Jira, Confluence, VBA, T-SQL, Macros.
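The vectorized pandas UDF work mentioned in this role might look like the following minimal sketch; the currency-conversion rule and column names are hypothetical (requires pyarrow alongside PySpark).

```python
# Vectorized (pandas) UDF sketch: operates on Arrow batches rather than rows.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("pandas-udf-demo").getOrCreate()

@pandas_udf(DoubleType())
def to_cad(amount: pd.Series, rate: pd.Series) -> pd.Series:
    # Whole-column arithmetic, avoiding per-row Python call overhead.
    return amount * rate

df = spark.createDataFrame(
    [(100.0, 1.35), (250.0, 1.36)], ["amount_usd", "fx_rate"])
df.withColumn("amount_cad", to_cad("amount_usd", "fx_rate")).show()
```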
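And a hedged sketch of the sales-forecasting approach: a scikit-learn Random Forest trained on synthetic data, since the real features and datasets are not described here; everything below is illustrative.

```python
# Random Forest regression sketch on synthetic weekly-sales-style features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(1, 53, n),    # week of year (hypothetical feature)
    rng.integers(0, 2, n),     # promotion flag (hypothetical feature)
    rng.normal(100, 20, n),    # prior-week sales (hypothetical feature)
])
y = 50 + 30 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(0, 5, n)  # synthetic target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```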

Systems Engineer / ETL Developer

Infosys Ltd., India
05.2016 - 07.2018
  • Developed various types of SQL Server Reports (SSRS), including Tabular, Cross Tab, and SubReports, and scheduled them for automatic execution
  • Managed data source identification and definition for building data source views
  • Engaged in debugging, testing, and deploying reports to enhance data accuracy and functionality
  • Established and maintained data model/architecture standards, focusing on Master Data Management (MDM) to streamline data sharing and support strategic decision-making
  • Utilized MDM tools for data standardization, deduplication, and filtering to ensure data quality and compliance with organizational requirements
  • Created SQL tables with referential integrity and developed comprehensive queries using SQL, SQL*Plus, and PL/SQL for database management and data manipulation
  • Conducted GAP Analysis between current and future business process models to identify necessary functionalities for meeting supply chain management record standards such as EDI, EMR, EHR, HL7
  • Participated in FHIR implementations within the Provider and Payer domains, utilizing FHIR STU3 and R4 standards for supply chain management data exchange
  • Designed test scripts for claims testing across development, integration, and production environments
  • Validated HL7 file formats from various vendors for accurate parsing and loading (a minimal validation sketch follows this role's environment list)
  • Played a key role in the implementation phases of the Facets Extended Enterprise administrative system, including planning, designing/building/validation, and Go-live support, while following Agile methodology for project management
  • Contributed to the design of data warehouse solutions and business intelligence dashboards, ensuring the delivery of actionable insights through effective data visualization
  • Addressed the impacts of HIPAA 5010 on enrollment and claims processes, maintaining open communication with developers to ensure all modifications met project requirements
  • Performed compatibility testing with software, hardware, operating systems, and network environments to ensure system interoperability
  • Developed and enhanced ETL processes using SQL Server, including the creation of complex SSIS packages for data migration from heterogeneous sources (Oracle, XML, etc.) with advanced transformations
  • Designed high-level ETL architecture for data transfer to the Enterprise Data Warehouse, detailing server, database, accounts, tables, data flow, column mapping, data dictionary, and metadata
  • Utilized PerformancePoint Server for creating interactive reports with drill-down capabilities
  • Managed data migration from text files to SQL Server tables using BCP
  • Designed and implemented triggers, stored procedures, functions, and error handling mechanisms to enforce data integrity and meet business requirements
  • Optimized queries through index implementation and tuning to improve SQL Server database and application performance
  • Monitored server performance using tools like SQL Profiler and SQL Sentry Plan Explorer.

Environment: T-SQL, SQL Sentry Plan Explorer, Visual Studio, SSMS, SQL Profiler, SSIS, SSRS, SQL Server 2012, SQL Server 2008 R2, Excel, Macros, CSV, PowerShell scripting, C#, VBA.
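The HL7 validation work flagged above could be sketched as follows; the accepted segment set and rules are illustrative assumptions, not the vendor-specific checks actually used.

```python
# Minimal HL7 v2 format check: MSH header first, known segment IDs per line.
KNOWN_SEGMENTS = {"MSH", "EVN", "PID", "PV1", "NK1", "OBR", "OBX"}

def validate_hl7(path: str) -> list[str]:
    """Return a list of format errors found in an HL7 v2 message file."""
    errors = []
    with open(path, encoding="utf-8") as f:
        lines = [ln.rstrip("\r\n") for ln in f if ln.strip()]
    if not lines or not lines[0].startswith("MSH|"):
        errors.append("file must begin with an MSH segment")
    for i, line in enumerate(lines, 1):
        segment_id = line.split("|", 1)[0]
        if segment_id not in KNOWN_SEGMENTS:
            errors.append(f"line {i}: unrecognized segment '{segment_id}'")
    return errors

if __name__ == "__main__":
    import sys
    for err in validate_hl7(sys.argv[1]):
        print(err)
```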

Education

Bachelor of Technology in Information Technology

Andhra University

Skills

  • Azure Data Lake
  • Azure Data Factory
  • Azure Databricks
  • Azure SQL Database
  • Azure SQL Data Warehouse
  • Programming: Scala, Python, Spark SQL
  • MSBI (SSIS, SSAS, SSRS)
  • Data Visualization
  • Data Migration
  • SQL Server programming
  • Power BI
  • Analytic Problem-Solving
  • Data Analytics: Data Cleaning, Visualization & Normalization
  • Data Science: Artificial Intelligence, Regression, Machine Learning
  • Math & Stats: Probability, Correlation & Mathematical Modelling
  • Reporting Tools: Power BI & Tableau
  • Languages: SQL, TSQL & Basic Python
  • IDE: SQL Developer, SQL Server Management Studio, Jupyter
  • Python: Pandas, NumPy, Matplotlib, Seaborn, SciPy, scikit-learn
  • Databases: MySQL, SQL Server, Oracle
  • Cloud Technologies: AWS RDS, Azure
  • ETL: SQL Server Integrated Services (SSIS)
  • Operating System: Windows
  • Repository System: GitHub
  • ETL/Middleware Tools: Talend, SSIS, Azure Data Factory, Azure Databricks
  • Big Data: Cosmos, Hadoop, Azure Data Lake
  • Azure: Azure Data Factory, Azure Databricks, PolyBase, Azure DW, ADLS, Azure Synapse Analytics, Blob Storage, Azure SQL Server
  • RDBMS: Oracle, Netezza, Teradata, Redshift, MS SQL Server, MySQL
  • Programming Skills: T-SQL, Java, Python, MS-SQL, SOQL
  • Tools: TOAD, SQL Developer, Azure Data Studio, SoapUI, SSMS, GitHub, SharePoint, Visual Studio, Teradata SQL Assistant

