
SAI KRISHNA MOVVA

Summary

Over seven years of IT experience, adept in both Azure and AWS technologies, including Azure Data Lake, Azure Data Factory, Azure Databricks, Amazon EMR, Apache Airflow, Python, Snowflake, and PySpark. Work with data pipelines using SQL, Python, Airflow, PySpark, AWS (EMR, Athena), and Hadoop. Expertise in data processing, automation of workflows, and cloud-based data solutions. Proven track record of transforming data into actionable insights for improved business decision-making. Actively involved in a data engineering project on AWS, focusing on Amazon EMR, Apache Airflow, Python, Snowflake, and PySpark, to deliver efficient data processing workflows and orchestration. Proficient in implementing data processing workflows on both Azure and AWS platforms using tools like Amazon EMR, Apache Airflow, and Azure Data Factory, ensuring streamlined data orchestration and resource optimization. Design and optimize data models within Snowflake on both Azure and AWS, guaranteeing optimal support for data retrieval, analysis, and reporting requirements across cloud environments. Extract and manipulate data from diverse source systems using PySpark within Snowflake, demonstrating proficiency in scalable data transformation and analysis across Azure and AWS. Develop and deploy PySpark scripts on both Azure and AWS for data transformation and analysis, ensuring scalability, performance, and consistency across cloud platforms. Configure Snowflake on both Azure and AWS for seamless connectivity with various systems and data sources, including on-premises databases and cloud platforms, facilitating smooth data integration and interoperability. Implement best practices for data security and access control within Snowflake on both Azure and AWS, ensuring data integrity, confidentiality, and compliance with regulatory standards.
Proficient in designing, implementing, and optimizing data pipelines and solutions on both Azure and AWS platforms, catering to diverse client requirements and industry standards. Translate complex business requirements into technical solutions on both Azure and AWS, collaborating effectively with cross-functional teams to deliver high-quality results and meet project objectives. Skilled in all phases of the reporting life cycle on both Azure and AWS, proficiently crafting diverse dashboards and reports using SSRS, Power BI, Tableau, and other visualization tools. Extensive experience in SQL Server Integration Services (SSIS) and Reporting Services (SSRS) on both Azure and AWS, coupled with solid expertise in crafting DAX in Power BI for enhanced reporting capabilities. Successfully integrate Snowflake with various systems and data sources on both Azure and AWS, including on-premises databases, cloud storage platforms, and third-party APIs, ensuring seamless data flow and interoperability. Engage in report automation and extensive dataset handling using Microsoft Power BI, Microsoft SQL Server, Azure SQL, SSIS, and Azure Data Factory on both Azure and AWS, ensuring efficient data processing and reporting capabilities. Contribute to migration projects utilizing Azure and AWS services and tools for data ingestion, egress, and transformation from diverse sources, showcasing adaptability and expertise in cloud-based data engineering solutions.

Overview

8 years of professional experience

Work History

Desjardins
08.2022 - Current
  • Designed and implemented robust data solutions utilizing Azure technologies, with expertise in Azure Synapse, Azure Data Factory, and Azure Databricks.
  • Developed and maintained data pipelines using Python, ensuring reliable, efficient data collection, transformation, and storage from multiple sources.
  • Leveraged Azure Data Lake Storage (ADLS), Azure Functions, and Azure SQL Database for efficient data processing and storage, enhancing overall data management capabilities.
  • Developed comprehensive data marts within Azure SQL Data Warehouse, encompassing dimension tables, fact tables, stored procedures, and user-defined functions to bolster data analysis capabilities.
  • Implemented data processing workflows with Python libraries such as Pandas, PySpark, and Dask to handle large datasets and perform real-time analytics.
  • Orchestrated data extraction from on-premises SQL Server and Oracle databases into Azure SQL Data Warehouse using Azure Data Factory and SQL Server Integration Services (SSIS), ensuring seamless data integration.
  • Built scalable ETL pipelines using Python.
  • Wrote performant data transformations in Python.
  • Interfaced Python with SQL and NoSQL databases.
  • Configured Azure platform components for streamlined data pipelines, Azure Blob Storage, and Data Lakes, leveraging Azure Data Factory to automate workflows and optimize data flow.
  • Translated intricate business requirements into actionable insights through the creation of intuitive dashboards and reporting solutions using Microsoft Power BI Desktop, contributing to comprehensive Azure Data Analytics solutions.
  • Collaborated closely with stakeholders and end-users throughout the software development lifecycle, liaising with project managers, business analysts, and testing teams to ensure project success.
  • Performed batch and real-time data processing using Python and Apache Spark (PySpark).
  • Created custom data connectors in Python.
  • Integrated Azure Synapse Analytics with Power BI for immersive data visualization, producing interactive dashboards and reports that provided valuable insights to stakeholders.
  • Automated reporting pipelines with Python scripts.
  • Unit tested and debugged Python data workflows.
  • Improved agility in data access and accelerated decision-making by implementing metadata-driven data virtualization frameworks.
  • Tuned the performance of Python-based data jobs.
  • Project 1: Azure Data Lake-Driven Business Intelligence

Cvent
08.2017 - 05.2022
  • Built a real-time data pipeline using PySpark for stream processing and integrated with Apache Airflow for scheduling and monitoring jobs.
  • Utilized AWS EMR to scale the processing of high-volume data streams and Amazon S3 for cost-effective storage.
  • Designed a data lake on AWS, utilizing Hadoop for distributed data storage and Athena for ad-hoc querying.
  • Migrated legacy systems to a modern cloud-based architecture, reducing infrastructure costs by 30%.
  • Designed and fine-tuned data models within Snowflake on AWS, ensuring seamless support for data retrieval, analysis, and reporting requirements.
  • Executed data extraction and manipulation from diverse source systems using PySpark within Snowflake, facilitating agile data transformation and analysis.
  • Developed and deployed PySpark scripts for comprehensive data transformation and analysis, adhering to best practices for scalability and performance optimization.
  • Configured Snowflake for seamless connectivity with various systems and data sources, including on-premises databases and cloud platforms, ensuring robust data integration.
  • Implemented stringent best practices for data security and access control within Snowflake, safeguarding data integrity and compliance with regulatory standards.
  • Collaborated with project teams to analyze client requirements, employing data mining techniques for predictive analytics.
  • Produced high-quality ETL/ELT code within Snowflake, emphasizing performance and maintainability, crafting Extract, Transform, Load (ETL) processes from diverse source systems using SQL and Snowflake-specific functionalities.
  • Contributed to ETL mappings and OLAP report development tasks, supporting core data aggregation, modeling, and algorithm implementation, utilizing various SQL Server constraints and complex T-SQL queries.
  • Visualized data, authored reports, and scheduled automated refreshes in Power BI Desktop, facilitating dashboard creation and sharing, and managed the migration of on-premises SQL Server schemas to Azure SQL Server.
  • Developed SSIS packages using Visual Studio components, incorporating control flow and tasks such as data flow, execute SQL, and expression, and applied data cleansing and transformations, including Slowly Changing Dimension (SCD), derived columns, and conditional split.
  • Conducted performance tuning of stored procedures and T-SQL queries, created logins, configured permissions, and implemented indexes in SQL Server Management Studio for optimized query performance.
  • Leveraged SQL Azure for database needs, including data assessments, migrations, and improved application performance using Azure Search and SQL query optimization.
  • Optimized query performance within Snowflake by analyzing execution plans, identifying bottlenecks, and implementing tuning techniques such as indexing strategies and partitioning data.
  • Designed Power BI reports and dashboards using Power BI Desktop and Service, developed drill-down and drill-through reports in SSRS, and utilized SSIS for ETL packages to validate, extract, transform, and load data into warehouse and data mart databases.
  • Performed administrative tasks within Snowflake, including managing user accounts, monitoring system performance, and configuring resource allocation, and administered SSRS interface for organizing reports and data sources, scheduling report execution, and tracking reporting history.
  • Collaborated with teams and subject matter experts to validate results and visualize data analysis, implemented notifications based on scheduled jobs and database monitoring, and designed and implemented data models within Snowflake, ensuring efficient data retrieval, analysis, and reporting.
  • Project 1: AWS EMR Data Processing Optimization
  • Project 2: ETL Data Pipeline on Azure for Retail Analytics

Education

B. TECH - Information & Technology

JNTU UNIVERSITY
01.2012

COMPUTER SOFTWARE DEVELOPMENT STUDIES (CSDL)

CORNELL COLLEGE
01.2016

Skills

  • Languages: SQL, Python
  • Big Data Technologies: PySpark, Hadoop (HDFS, MapReduce)
  • Cloud Services: AWS EMR, Athena, S3, EC2
  • Workflow Orchestration: Apache Airflow
  • Data Warehousing: Amazon Redshift, Snowflake
  • Databases: MySQL, PostgreSQL, Oracle, MongoDB
  • Version Control: Git, GitHub
  • Operating Systems: Linux, Windows
  • Data Visualization: Power BI, Tableau
