
Igor Chizhov

Toronto, Canada

Summary

A highly competent and methodical professional with an illustrious career history of championing success in creating outstanding technological solutions for complex problems and providing strategic direction to large teams. Looking for a technical position to utilize exemplary skills and working experience to spearhead ambitious projects.

Overview

30 years of professional experience
1 Certification
Certification

Work History

Sr. Technology Architect, Product and Data Management Team

Government of Ontario
10.2024 - 09.2025
  • Designed and deployed scalable ML pipelines across Azure Databricks, leveraging Azure Machine Learning for automated model training, deployment, and monitoring under enterprise-grade MLOps.
  • Architected AI solutions in Azure AI Foundry and Databricks, integrating GenAI, MLOps, and agentic AI workflows to accelerate enterprise AI adoption across data science and analytics teams.
  • Spearheaded the integration of the MIPROv2 optimizer in DSPy, improving agentic AI KPIs and reducing computation time by 33% through optimized intelligent-assistant interfaces.
  • Developed reusable cloud infrastructure templates for scalable AI workloads on Azure Databricks using Python, PySpark, Docker, and AKS/Kubernetes, enabling rapid deployment for AI solutions.
  • Implemented distributed parallelization using Ray across Databricks, offloading heavy workloads from Spark’s JVM layer to worker-based Ray clusters, eliminating executor bottlenecks and boosting throughput.
  • Engineered a modular LangChain framework to streamline LLM integration and pattern-based reasoning, enhancing predictive model accuracy and reusability across AI workflows.
  • Used Azure Cognitive Services and Copilot Studio, enabling AI pipelines with natural-language interfaces and contextual GenAI response generation.
  • Designed an agent-based RAG solution on Azure Databricks, leveraging vector databases and empowering intelligent agents to deliver context-aware customer insights.
  • Implemented an AI caching gateway on GCP, optimizing retrieval of LLM responses, reducing duplicate token queries, and improving latency for conversational AI applications.
  • Optimized ML pipelines with MIPROv2 and Ray, automating data preprocessing, model fine-tuning, and evaluation using MLflow for experiment tracking and reproducibility.
  • Built automated ML edge-case detection pipelines in Python, ensuring continuous monitoring, validation, and refinement of deployed models under an enterprise MLOps framework.

Technical Lead Architect, BBM Business Intelligence Department

Bell Canada
02.2022 - 06.2024
  • Delivered multiple AI/ML and LLM data science projects end-to-end, architecting and deploying MLOps and RAG pipelines across AWS Cloud and on-premise environments to support scalable generative AI workloads.
  • Delivered multiple production-grade machine learning systems end-to-end, including regression and forecasting models for financial risk and operational analytics, covering experimentation, cross-validation, deployment, and continuous retraining on Azure and AWS SageMaker.
  • Designed and deployed scalable ML pipelines using Python, MLflow, Docker, and Kubernetes, implementing time-series feature engineering, quantile modeling, and model safeguard techniques to ensure robustness under data drift and seasonal variability.
  • Built and maintained ML lifecycle management frameworks, including model versioning, performance monitoring, and automated retraining strategies, supporting reliable promotion of models from experimentation to production.
  • Engineered a cloud-native ML platform on Azure and AWS, integrating Azure ML, AWS SageMaker, containerized inference services, and Kubernetes for model training, deployment, and scalable online and batch inference.
  • Implemented CI/CD pipelines for ML workflows using Git-based branching strategies, peer code reviews, automated testing, and infrastructure-as-code, reducing model release cycles and improving deployment reliability.
  • Developed a fraud detection and risk scoring system for Bell Accounts Receivable analytics, combining tree-based models and embedding-enhanced classifiers to improve prediction accuracy, explainability, and audit readiness in production environments.
  • Designed event-driven ML architectures using Azure Service Bus and AWS SQS/EventBridge to support near-real-time inference, batch retraining triggers, and integration with downstream enterprise systems.
  • Automated data ingestion, feature generation, and retraining pipelines using workflow orchestration, ensuring consistent data lifecycle management across training, validation, and inference stages.

Sr. Technical Architect, Digital Intelligence Department

TELUS Corporation
02.2019 - 01.2022
  • Completed data platform architecture and design using Python, Docker, and Linux OS. Defined business and technical requirements for OSS tools and a software diagnostic workflow that identifies the root cause of network performance degradations through artificial intelligence and AIOps functionality.
  • Built a scalable analytics environment that ingested multi-source operational data into a unified BI layer, enabling data teams to run complex reporting and ad-hoc analysis on a consistent, high-volume dataset.
  • Completed deployment of a virtual HLR in the AWS cloud and used CloudWatch to monitor 5G virtual switches, routers, and the HLR.
  • Migrated the on-premise Splunk implementation to a GCP cloud solution using Kafka and BigQuery. Collaborated with a team to build a real-time analytics platform in Python that provides actionable insights. Implemented secure and efficient API integrations with third-party services, enhancing system interoperability and expanding functionality for end users.
  • Contributed to developing an AI/ML platform, utilizing PyTorch and TensorFlow frameworks for training and deploying machine learning models.

Technical Lead, Manager, OSS Department

TELUS Corporation
02.2008 - 01.2019
  • Designed the entire fault management platform (Netcool) from scratch, along with its operational support based on DevOps and CI/CD principles. The system proved very reliable and highly efficient, with three-nines (99.9%) uptime. Implemented a multi-year technology roadmap for fault management.
  • Explored AI approaches to network fault management using Python; studied structured and unstructured raw alarm and network performance data, applying Neural Networks, Random Forests, K-Means, Decision Tree, and Support Vector Machine algorithms.
  • Implemented supervised ML algorithms in Python to automate actionable Remedy trouble ticketing. The algorithm helped to reduce the cost of manual labour in the NOC.
  • Completed several AIOps predictive-analytics PoC projects to reduce alarm noise and surface only actionable alarms, providing strategic guidance, roadmaps, and technical leadership on selecting and adopting appropriate technologies.
  • Optimized fault management database performance and reduced response times by configuring and maintaining MySQL clusters, achieving a 30% reduction in query response times.
  • Automated cloud Splunk queries and data ingestion pipelines using log aggregation, improving incident response times.
  • Operated and managed a data warehouse based on a Spark cluster (Hadoop framework) in the TELUS private cloud, feeding its results into KNIME analytics for machine learning and neural network training; headed a team of software engineers designing a prototype of a containerized, microservice-based in-memory fault management platform for the 5G wireless network.
  • Orchestrated the integration of Azure Active Directory with MLOps workflows, ensuring secure and seamless authentication and access control for ML systems and data.
  • Leveraged Spark Databricks to implement advanced analytics and machine learning algorithms, optimizing performance and scalability for large-scale data processing tasks.
  • Completed anomaly detection based on network performance data and used results for proactive network fault management.
  • Developed a custom optimization library for model compression and pruning techniques, allowing the team to optimize and deploy AI models on resource-constrained edge devices.
  • Designed and implemented a high-performance database engine for a distributed in-memory system, leveraging parallel computing techniques and efficient memory management.
  • Implemented ITSM service desk (ITIL) for network trouble processes.
  • Led the creation of a network failure prediction solution by analyzing log data and applying unsupervised learning techniques such as anomaly detection and clustering to uncover hidden network failure patterns.
  • Implemented RESOLVE algorithms to automate business processes for IoT customers; integrated the RESOLVE system with the Netcool Fault Management platform (PaaS) and developed software run-books that brought automation, orchestrated diagnostics, and triage across heterogeneous networks, reducing incident volume and delivering financial benefits of $3.6M in 2018. The results were presented to the TELUS CEO and received very positive feedback.

Application Engineer (Contract)

Rogers Communications Inc.
04.2007 - 02.2007
  • Simultaneously spearheaded over 20 Messaging, OSS (IP/MPLS Cisco Systems), Switch Engineering, and Wireless projects, with budgets ranging from $50,000 to $1.4M; managed Ericsson and Cisco expansion and upgrade projects.
  • Coordinated with various vendors such as Ericsson, Alcatel, Cisco, Flash Networks (software data compression project) and many others for their deliverables, such as equipment installation, scheduling, and hardware and software deployment.
  • Supported the company’s critical overnight network cutovers and tracked project deadlines, spending, and planning.

Application Engineer

AMT Group
01.2004 - 01.2007
  • Managed large IP/MPLS Cisco Systems multi-service WAN (22 Cisco 76-routers and PIX 525 IPSEC VPN firewalls) and handled several IBM Netcool software monitoring projects, the budget for one of which was USD 1.7M; led and collaborated with a team of network engineers providing software and hardware configuration.
  • Deployed contact centers based on Genesys software and Nortel Networks solution; managed OPEX, contract budget, workflow, and NGN development, based on soft-switch solutions such as Cisco PGW 2200 and Comverse (IP Centrex).

Network Engineer (Contract)

DeepMetrix Corporation
01.2002 - 01.2003
  • Successfully organized and implemented a project to design a Web analytics TCP/IP Internet log analyzer, which was purchased by Microsoft in 2006.
  • Created innovative solutions for the customers of DeepMetrix, and developed and implemented plans to strengthen and increase the market share.
  • Delivered technical support for IP network monitoring software, advised DeepMetrix customers, solved technical problems, handled customer objections, and assisted in closing sales.

Application Engineer

VIAVI (formerly JDS Uniphase Corporation)
01.1999 - 01.2001
  • Made technical presentations, advised customers on products, and completed several fibre-optics design improvement projects; performed ray-tracing and optical performance analysis of LC-based optical switching and LC and MEMS-based Dynamic Gain Equalization modules.
  • Provided engineering analysis of telecommunications components and modules; ran network computer simulations for passive components (OC3, OC48, and OC192 standards) and assisted senior management with product development.
  • Prepared over 450 technical proposals, each exceeding USD 0.5M, for active and passive fibre-optic components; led the development and engineering processes and collaborated with various functional departments.
  • Developed passive DWDM components and optoelectronic modules for Nortel Networks (OC-48 and OC-192 standards).

Software Developer

Farah USA, Inc.
01.1996 - 01.1999
  • Completed projects in the IT Department that supported senior management and marketing teams, and had a substantial positive impact on corporate operations and profit; reported directly to the CFO, and later to the VP of Operations.
  • Developed software to forecast four major financial disaster scenarios for the company; ran different forecast scenarios for company-wide applications; reduced OPEX by 4-18%.

Education

MBA in Finance

New Mexico State University
Las Cruces, NM, USA
01.1995

MSc. in Physics - minor in CS

University of Physics and Technology
Dolgoprudny
01.1986

Skills

  • Machine Learning Tools: Octave, MATLAB & Simulink, Jupyter Notebook, MLflow, Kubeflow, Open-Source ML libraries
  • Libraries: TensorFlow, Keras, PyTorch, NumPy, PyCharm, and Scikit-learn
  • Big Data: Hadoop (MapReduce & HDFS), Spark, Snowflake, Yellowbrick, BigQuery, BigTable, Dataproc, PySpark, Storm, Yarn, Kafka, Elastic, PostgreSQL, Splunk, and Teradata
  • Data Platforms: Apache Ignite, Oracle Coherence, MEAN, and Redis
  • RDB: Oracle 10g, 11g, 12.2, and SAP
  • Cloud Experience: AWS, Azure, and GCP; Docker, APIs, Git and GitLab CI/CD, Terraform & Kubernetes
  • Open-Source Environment: Java Eclipse
  • Network Protocol Knowledge: OSPFv2/v3, Integrated IS-IS, BGPv4, LDP, RSVP-TE, MP-BGP, CDMA, and HSPA
  • Layer 2: Ethernet (IEEE 802.3), POS, ITU-T G.709, PPP, xPON (ITU-T G.984.x, IEEE 802.3ah)
  • Applications: Netcool, Splunk, SAP, Jira, MS VISIO, MS Office 365, MS Visual Studio, MS Project, SharePoint, Primavera, Jenkins, RESOLVE, and Ansible
  • Languages: Python, SQL, C, Java, Visual Basic, FORTRAN, Assembler
  • Reporting: Cognos 10.2 and Tableau
  • Knowledge of Inventory Software Systems: NetCracker and MetaSolv (Objectel)
  • Systems: Linux Ubuntu Mate, UNIX, PC DOS, Linux RH, Windows
  • NoSQL DB: Hive, HBase, MongoDB, ScyllaDB and Cassandra
  • Cloud Monitoring: IBM’s ICAM and ASM
  • Incident Management Platforms: Remedy ARS and ITSM
  • ETL: Informatica 9.6 and SAAS
  • Knowledge of Network/OSS Software: IBM (Tivoli Netcool, ITM, TNPM, and TNCM), HP (TeMIP), Nokia/Siemens EMS (NetAct & @vantage Commander), and RESOLVE
  • Telecommunications: VoIP, data, Wi-Fi, WiMAX, SONET (SDH), IP/MPLS, H.323, SIP, ISDN, MGCP, J2EE, Remedy ARS, WebLogic, hosting & wireless services

Accomplishments

  • Invented and patented a Distributed Fault Management System for communications networks, introducing automated fault detection and resolution mechanisms that significantly improved network reliability and operational efficiency.
  • Led the architecture and delivery of large-scale AI, ML, and agentic AI platforms across telecommunications and public-sector environments, deploying production-grade solutions on Azure, GCP, and AWS with enterprise MLOps, DevOps, and governance standards.
  • Designed and implemented end-to-end machine learning pipelines on Azure Databricks and cloud-native platforms, integrating Ray-based parallelization, DSPy optimizers, MLflow tracking, and automated retraining to reduce inference latency and computation time by up to 33%.
  • Established reusable cloud and data architecture patterns using Python, PySpark, Docker, Kubernetes, and infrastructure-as-code, accelerating AI solution delivery and enabling consistent, repeatable deployments across teams.
  • Delivered multiple generative AI and RAG solutions using LangChain, vector databases, and foundation models, improving contextual reasoning, reducing hallucinations, and enabling explainable, production-ready LLM applications.
  • Built and governed AI/ML platforms with full observability, implementing monitoring for model drift, latency, accuracy, and reliability using Prometheus, Grafana, and cloud-native tooling.
  • Led successful migrations of mission-critical analytics, AI, and fault management systems from on-premise environments to cloud platforms, reducing infrastructure complexity and improving scalability, resilience, and time-to-market.
  • Designed and operationalized AI-driven fraud detection, anomaly detection, and predictive analytics systems combining classical ML and deep learning models, delivering measurable gains in detection accuracy and operational insight.
  • Architected and operated high-availability fault management and analytics platforms achieving three-nines uptime, supporting large-scale telecom networks and enabling proactive, AI-driven network operations.
  • Delivered AIOps and automation initiatives that reduced alarm noise, automated incident triage, and eliminated manual NOC processes, including a flagship automation program that generated $3.6M in annual financial benefits and executive-level recognition.
  • Provided long-term technical leadership across 20+ years, defining multi-year technology roadmaps, mentoring engineers and data scientists, and aligning advanced analytics, AI, and cloud strategies with business objectives in highly regulated, large-scale environments.

Languages

English
Full Professional

Affiliations

Member of PMI (Project Management Institute)

Certification

  • Google Cloud Platform Fundamentals | 2021
  • Reinforcement Learning | 2021; TensorFlow – Coursera.org | 2021
  • Certificate of Data Science – BrainStation | 2019
  • Juniper Networks – JNCIA | 2006
  • ML with GCP | 2021

Interests

Strong interest in emerging technologies and innovation, with a particular focus on exploring new tools, architectures, and practical applications through hands-on experimentation. Actively engaged with open-source communities via GitHub, contributing to and learning from real-world software projects. Maintains a long-standing appreciation for literature and creative pursuits, supporting analytical thinking and clear communication. Pursues swimming as a disciplined physical activity that reinforces focus and endurance, and practices chess to sharpen strategic reasoning, pattern recognition, and problem-solving skills.

Highlights of Qualifications

  • Extensive knowledge of and experience in the telecommunications industry, with specialized expertise in data science and governance.
  • Inventor of a Distributed Fault Management System for Communications Networks, as described in a US Provisional Patent, enhancing network reliability through automated fault detection and resolution.
  • Proven expertise in IT, Software Development, Big Data, Machine Learning and AI, AIOps, E-commerce, Web Analytics, Web Technologies, and Data Warehousing; well versed in software programs, applications, programming languages, and data platforms.
  • Track record of migrating existing on-premise solutions to GCP, AWS, and MS Azure cloud.
  • In-depth knowledge of various MLOps and DevOps practices.
  • Over 20 years of software and network system integration experience in the telecommunications industry.
  • Thorough knowledge of Agile methodology.