
Amit Tyagi

Winnipeg, Canada

Summary

  • Over 12 years of experience in the IT industry, with more than 10 years in Big Data technologies
  • Experience working as a Solutions Architect, Data Architect, and Platform Architect, primarily in the Big Data domain
  • Sound knowledge of the Big Data ecosystem, DWH, BI, ETL/ELT, IoT, DevOps, AWS, and Azure
  • Experience in presales, deal solutioning, sales enablement, and client presentations
  • Migration planning of tools and technologies from legacy systems to microservices/cloud architectures
  • Demonstrated ability to lead and manage teams effectively, motivating team members and driving successful project delivery
  • Proven track record of managing and delivering complex projects within scope, timeline, and budget
  • Proficient in setting up and working with large-scale infrastructure and data platforms
  • Proficient in designing self-service architecture for internal teams, enabling them to use data platforms as PaaS
  • Good experience in data lake/data warehouse operations, technology roadmapping, and stakeholder management
  • Good knowledge of optimizing performance through software-hardware co-design, including CPU, GPU, DPU, and ASIC
  • Proficient in evaluating emerging technologies and trends, and in advising clients on technology selection, design improvements, and implementations

Overview

12 years of professional experience

Work History

Resident Solutions Architect

Starburst Canada
01.2021 - Current

Description: Starburst is an analytics engine for all types of data. It provides a fast, efficient query engine for data warehouses, data lakes, and data mesh architectures, and unlocks the value of distributed data by making it easy to access no matter where it lives. Starburst queries data across any database, making it instantly actionable for data-driven organizations.


Role:

  • Advise multiple clients on technology selection, design improvements, and implementations
  • Grew combined Annual Recurring Revenue (ARR) from under 1 million to over 2.4 million in 15 months by expanding the customer base and improving customer retention
  • Design the architecture for integrating Trino into customers' existing platforms
  • Guide clients through onboarding and the rollout of new use cases
  • Help tune the platforms for maximum performance and usability
  • Hold regular meetings with customers to discuss issues, expansion plans, and training needs


Tools Used: Trino, Multi Cloud, Kubernetes, Ansible, Kafka, Hive, Teradata, Postgres, Git, Docker, Ranger
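A minimal sketch of the kind of federated query Trino enables, using the open-source trino Python client; the coordinator host, catalogs, schemas, and table names are hypothetical placeholders, not details from any client engagement.

```python
# Federated-query sketch with the open-source "trino" client (pip install trino).
# Host, catalogs, schemas, and tables below are hypothetical.
from trino.dbapi import connect

conn = connect(
    host="trino.example.internal",  # hypothetical coordinator host
    port=8080,
    user="analyst",
    catalog="hive",                 # default catalog for the session
    schema="sales",
)
cur = conn.cursor()

# Join a Hive table in the data lake with a Postgres dimension table in a
# single statement -- the point of a federated engine like Trino.
cur.execute("""
    SELECT c.region, sum(o.amount) AS revenue
    FROM hive.sales.orders o
    JOIN postgresql.public.customers c ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY revenue DESC
""")
for region, revenue in cur.fetchall():
    print(region, revenue)
```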

Senior Big Data Architect

MothersonSumi Project
01.2020 - 01.2021

Description: Worked with multiple clients and teams to identify improvement areas in their projects. Designed and developed scalable and robust data engineering and data insight solutions using NiFi, Dremio, Python, and a Big Data platform.


Role:

  • Handled pre-sales and deal solutioning for the account
  • Advised multiple clients on technology selection, design improvements, and implementations
  • Built software and systems responsible for the infrastructure and data pipelines
  • Planned migration of tools and technologies from legacy systems to microservices architecture
  • Led the design of the lakehouse and data pipelines/ETL using existing and emerging technologies
  • Managed two separate projects with a 10-member team of data engineers
  • Communicated project status to different stakeholders


Tools Used: Linux, Hadoop, Impala, Python, Nifi, Dremio, Infor, Docker, Grafana, Snowflake, AWS
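Since Snowflake and AWS appear in this role's tool list, here is a hypothetical sketch of the kind of final load step such a pipeline might end with, using the Snowflake Python connector; the account, warehouse, stage, and table names are invented for illustration.

```python
# Sketch of a pipeline's final load step: copy curated files from a cloud
# stage into Snowflake. Assumes snowflake-connector-python is installed;
# account, warehouse, stage, and table names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    user="etl_user",
    password="***",                  # in practice, pulled from a secrets store
    account="xy12345.us-east-1",     # hypothetical account identifier
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="CURATED",
)
cur = conn.cursor()

# COPY INTO pulls curated CSVs from an external (e.g., S3-backed) stage
# into the target table.
cur.execute("""
    COPY INTO CURATED.ORDERS
    FROM @CURATED.S3_CURATED_STAGE/orders/
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")
conn.commit()
```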

Lead Data Engineer

DBS Singapore
11.2018 - 09.2020

Description: Delivering Big Data platforms as a service to internal customers in a self-service environment. In addition to maintaining and automating the existing platform, the team explores other useful technologies and integrates them into the platform to support analytics, data science, and other projects.


Role:

  • Explored and evaluated new technologies and solutions to push our capabilities forward and take on tomorrow's problems, not just today's
  • Built software and systems responsible for the infrastructure and applications through automation, using an Infrastructure as Code model
  • Platform capacity planning and management
  • Developed ETL pipelines using NiFi and Python (a simplified pipeline step is sketched after this role's tool list)
  • Helped improve reliability and stability and tackled scalability challenges with engineering teams
  • Responsible for availability, latency, performance, efficiency, change management, monitoring, and emergency response
  • Optimization of the Big Data platform


Tools Used: Linux, Hadoop, HIVE, Python, Cloudera Manager, Docker, Nifi, Ansible, Graylog, Grafana, ELK, Impala, Spark, Presto
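A simplified, standalone Python version of the kind of cleanse-and-stage step these pipelines perform: clean a daily CSV extract and push it into HDFS for a Hive external table to pick up. All paths and names are hypothetical, and the sketch assumes the hdfs CLI is available on PATH.

```python
# Simplified daily-ingest step: cleanse a CSV extract and stage it in HDFS.
# All paths, columns, and table locations are hypothetical.
import csv
import subprocess
from datetime import date

SRC = "/data/landing/transactions.csv"           # hypothetical source extract
OUT = f"/tmp/transactions_{date.today():%Y%m%d}.csv"
HDFS_DIR = "/warehouse/staging/transactions"     # hypothetical Hive external location

def cleanse(src: str, out: str) -> None:
    """Drop rows with a missing key and normalise whitespace."""
    with open(src, newline="") as fin, open(out, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if not row.get("txn_id"):
                continue  # skip records without a primary key
            writer.writerow({k: (v or "").strip() for k, v in row.items()})

def stage_to_hdfs(local_path: str, hdfs_dir: str) -> None:
    """Push the cleansed file to HDFS (assumes the hdfs CLI is on PATH)."""
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)

if __name__ == "__main__":
    cleanse(SRC, OUT)
    stage_to_hdfs(OUT, HDFS_DIR)
```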

Tech Lead

Royal Bank Of Scotland, RBS
11.2017 - 11.2018

Description: EDW is the enterprise-level data warehouse with a business use case for analytics. It is an OLAP system used mainly for regulatory analysis and reporting, and as a single ledger across the organization. As part of the Big Data solution, the team handled development of four different use cases and was actively involved in POCs for upcoming use cases. The production cluster size is 150 nodes.


Role:

  • Working with multiple teams to understand requirements and shape the overall architecture
  • Developing and implementing the use cases
  • Managing the Big Data Platform team for RBS
  • Maintenance and optimization of the cluster
  • Automation of manual tasks using Python and shell scripting (a hypothetical health-check script is sketched after this role's tool list)
  • Resolving job failures and performance issues
  • Involved in change, release, and incident management
  • Cluster storage planning
  • Implementation of various tools and frameworks in the Hadoop ecosystem
  • Resolving issues within SLAs


Tools Used: Linux, Hadoop, Hive, Python, Cloudera Manager, Sqoop, Kafka, Tableau, HBase, Ansible
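One example of the kind of manual task that lends itself to Python automation in such an environment: polling the YARN ResourceManager REST API for recently failed applications so they can be triaged against SLA. The /ws/v1/cluster/apps endpoint and its filters are standard ResourceManager REST API features; the RM address and queue handling are hypothetical.

```python
# Sketch of a small health-check: list YARN applications that failed in the
# last hour via the ResourceManager REST API. The RM host is hypothetical.
import time
import requests

RM_URL = "http://resourcemanager.example.internal:8088"  # hypothetical RM address

def failed_apps_last_hour():
    since_ms = int((time.time() - 3600) * 1000)
    resp = requests.get(
        f"{RM_URL}/ws/v1/cluster/apps",
        params={"states": "FAILED", "finishedTimeBegin": since_ms},
        timeout=30,
    )
    resp.raise_for_status()
    apps = (resp.json().get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["queue"], a.get("diagnostics", "")) for a in apps]

if __name__ == "__main__":
    for app_id, name, queue, diag in failed_apps_last_hour():
        print(f"{app_id}\t{queue}\t{name}\n  {diag[:200]}")
```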

Tech Lead

Tech Mahindra
05.2016 - 11.2017

Description: Project: Big Data Platform (for Adobe). Data is generated through multiple source systems and fed into HDFS using different data-importing tools such as Kafka, Sqoop, and scripts. The data is processed using Hive and Pig and later used by different BI teams for reporting; Kyvos cubes, along with Tableau, are primarily used for reporting. The production cluster size is 180 nodes.


Role:

  • Working directly with the client at the client's location
  • Managing the Big Data Platform for Adobe
  • Maintenance and optimization of the cluster
  • Automation of manual tasks using Jenkins, Python and shell scripting
  • Resolving job failures/performance issues
  • Handling user queries and issues
  • Cluster storage planning
  • Implementation of various tools and frameworks in the Hadoop ecosystem
  • Resolving issues within SLAs


Tools Used: Linux, Hadoop, Hive, Python, Cloudera Manager, Sqoop, Kafka, Kyvos, Tableau, HBase, Docker, Kubernetes
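An illustrative sketch of the kind of Hive processing step that feeds a Kyvos/Tableau reporting layer. It assumes the PyHive client is available; the HiveServer2 host, databases, tables, and partition values are all hypothetical.

```python
# Illustrative Hive step: build a daily aggregate that a BI/reporting layer
# can consume. Assumes PyHive is installed (pip install pyhive); host,
# database, and table names are hypothetical.
from pyhive import hive

conn = hive.Connection(
    host="hiveserver2.example.internal",  # hypothetical HiveServer2 host
    port=10000,
    username="etl_user",
    database="analytics",
)
cur = conn.cursor()

# Overwrite one partition of the reporting table with aggregated raw events.
cur.execute("""
    INSERT OVERWRITE TABLE analytics.daily_event_counts
    PARTITION (event_date = '2017-06-01')
    SELECT event_type, count(*) AS events
    FROM raw.events
    WHERE event_date = '2017-06-01'
    GROUP BY event_type
""")
```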

Consultant

Morgan Stanley
06.2015 - 05.2016

Description: Data is generated through calls and web usage. The call transactions are used as the source for populating the data warehouse, and further data is generated by users' web activity. The Hadoop team receives daily feeds as CSV files, which are stored on HDFS; Pig jobs are then run on that data for ETL, after which Tableau is used for graphical analysis.


Role:

  • Major role in developing the platform architecture
  • Optimizing the tuning parameters of cluster
  • Resolving job performance issues
  • Implementing security in cluster using Kerberos
  • Cluster set up for Hadoop and queue monitoring
  • Implementation of various tools and frameworks in the Hadoop ecosystem
  • Implementing the YARN framework in Hadoop


Tools Used: Linux, Hadoop, HUE, PIG, Ambari, Sqoop, Kerberos, Tableau
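A minimal sketch of how one of the daily Pig ETL runs could be submitted from a Python wrapper; the script path and parameter name are hypothetical, and the sketch assumes the pig CLI is on PATH.

```python
# Minimal sketch: kick off a parameterised Pig ETL run for one day's CSV feed.
# Assumes the "pig" CLI is on PATH; script path and parameter are hypothetical.
import subprocess
from datetime import date, timedelta

feed_date = (date.today() - timedelta(days=1)).strftime("%Y-%m-%d")

result = subprocess.run(
    [
        "pig",
        "-param", f"FEED_DATE={feed_date}",      # substituted into the .pig script
        "-f", "/opt/etl/scripts/daily_calls_etl.pig",
    ],
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    # Surface the Pig log so the failure can be triaged quickly.
    raise RuntimeError(f"Pig job failed for {feed_date}:\n{result.stderr[-2000:]}")
print(f"Pig ETL completed for {feed_date}")
```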

Sr. Hadoop Admin

Micron Technologies
10.2012 - 04.2015

Description: Micron is a world leader in semiconductors. Data is generated in the process of forming a semiconductor chip from wafers and is ultimately stored in CSV files. The Hadoop team receives daily feeds as CSV files, which are stored on HDFS; Hive jobs are then run on that data for transformations, after which Neo4j is used for graph analysis.


Role:

  • Major role in developing the platform architecture
  • Optimizing the tuning parameters of cluster
  • Resolving job performance issues
  • Implementing security in cluster using Kerberos
  • Cluster set up for Hadoop and queue monitoring
  • Implementation of various tools and frameworks in the Hadoop ecosystem
  • Implementing the YARN framework in Hadoop
  • Used Sqoop to transfer data between Hadoop and other relational databases (a hypothetical import command is sketched after this role's tool list)


Tools Used: Linux, Hadoop, Hive, Ambari, Sqoop, Kerberos, Neo4j
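A sketch of the kind of Sqoop import used to pull a relational table into HDFS, wrapped in Python; the JDBC URL, credentials file, table, and target directory are hypothetical placeholders.

```python
# Sketch of a Sqoop import pulling a relational table into HDFS.
# JDBC URL, password file, table, and target directory are hypothetical.
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//dbhost.example.internal:1521/MESDB",
    "--username", "etl_user",
    "--password-file", "/user/etl_user/.sqoop.pwd",  # stored on HDFS, not on the CLI
    "--table", "WAFER_TEST_RESULTS",
    "--target-dir", "/data/raw/wafer_test_results",
    "--num-mappers", "4",
]
subprocess.run(cmd, check=True)
```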

Team member

AXA
03.2011 - 09.2012

Description: This was a data migration project covering both development and production support, based on an onsite/offshore delivery model. Data is loaded into Siebel base tables after applying business logic, using Informatica, to the raw data received from ODS. The work also includes monitoring and fixing production bugs, creating new Informatica mappings, and scheduling workflows for loading new data.


Role:

  • Handled all kinds of development and support activities
  • Conducted status meetings with the client using a traceability matrix
  • Designed ETL processes using Informatica to load data from the file system and Oracle into the target Oracle 11g database
  • Designed Informatica mappings using basic transformations such as filters, routers, source qualifiers, and lookups, and advanced transformations such as aggregators, normalizers, and sorters
  • Developed 16 mappings/sessions, including complex mappings, using Informatica PowerCenter for data loading
  • Used Informatica Workflow Manager and Monitor to create, schedule, and monitor workflows and to send messages in case of process failures


Tools Used: Informatica, PL/SQL, Unix, Cognos, Essbase
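Informatica mappings are built in a GUI, but the underlying logic of a typical lookup-plus-filter mapping can be illustrated as SQL. This is a purely hypothetical sketch using the cx_Oracle client, not a reproduction of any actual mapping; connection details, tables, and columns are invented.

```python
# Purely hypothetical SQL equivalent of a lookup-plus-filter style mapping:
# enrich staged ODS rows against a reference table and load the target.
# Connection details, table, and column names are invented.
import cx_Oracle

conn = cx_Oracle.connect("etl_user", "etl_password", "dbhost.example.internal/ORCLPDB1")
cur = conn.cursor()

cur.execute("""
    INSERT INTO stg_contacts_target (contact_id, full_name, region_code, load_date)
    SELECT s.contact_id,
           s.first_name || ' ' || s.last_name,
           r.region_code,                 -- lookup transformation equivalent
           SYSDATE
    FROM   ods_contacts_stage s
    JOIN   ref_regions r ON r.country = s.country
    WHERE  s.contact_id IS NOT NULL       -- filter transformation equivalent
""")
conn.commit()
```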

Education

Bachelor of Technology

KEC College
2010

Skills

Distributed Computing, Big Data, Microservices, All major Clouds, Datalakes/Lakehouse, TOGAF, Snowflake

Python, SQL, Shell scripting, Starburst, CDP, Ansible, Docker, Kubernetes, Archi, Informatica, Trino, Druid, Dremio, Jenkins, NiFi, Kafka, ELK Stack, Grafana, Prometheus, Birst, Impala, Git, JMeter, LucidChart, Airflow, MySQL, NoSQL databases, Neo4j, Postgres

Solutions Architect, Technical Architect, Technical Account Manager

Accomplishments

  • Cloudera Certified Hadoop Administrator, certified in Dec 2012 (CCAH License No. 100-002-018)

Work Availability

Available Monday through Sunday: morning, afternoon, and evening
