Lead Systems Engineer
Company: Deloitte
Location: Reno
Posted on: May 15, 2022
Job Description:
Deloitte is hiring to scale up the next generation of its AI
infrastructure. Deloitte recently launched the Deloitte Center for
AI Computing, a first-of-its-kind center designed to accelerate the
development of innovative AI solutions at Deloitte. This position
will help harness the value of NVIDIA DGX and industry-leading
software and GPU technologies used in AI workloads and HPC.
This is an opportunity to join the team and be fully hands-on:
to help drive technical strategy, cluster management, and AI
workload design, and to make such resources available to a broad base of
engineering and consulting groups. Work within the Deloitte Center
for AI Computing to deliver and optimize the accelerated platform
that allows Deloitte to co-innovate with clients and expedite the
development of new AI applications. The candidate will help define
solution integration practices and represent emerging technology
needs inside the company with emphasis on the latest GPU
technologies for high performance computing.
Work you'll do
- Work within the Deloitte data center group to design and
optimize architecture and technical specifications for the NVIDIA
hardware-enabled data center
- Collaborate with internal stakeholders to understand future
NVIDIA deployments and data center requirements to support their
project exigencies and improve DGX mini-pod efficiency in a
Kubernetes-based platform
- Utilize your capabilities in DevOps, DataOps, and IT
architecture design to streamline infrastructure processes and
optimize scheduling and performance
- Understand business priorities and future goals and translate
them into optimized infrastructure tools and improved service
delivery
- Coordinate with data science and business teams to identify
future AI needs and requirements
- Manage system operations for data processing and AI
workloads
- Provide technical architecture leadership with the
responsibility to ensure the efficient use of resources, the
selection of appropriate technology, and the use of appropriate
design methodologies
- Provide leadership and prioritization for business/product
stakeholders in understanding requirements and translating them to
engineering requirements
- Analyze current data designs to optimize and provide structural
improvements to handle the growth of workloads
- Support and enhance data architecture and data instrumentation,
define database schemas, and create ETL pipelines as required to
meet internal customer needs
- Work in MLOps/DevOps/GitOps to bring up new systems and
support existing orchestration and data services
- Influence domain-driven design and architecture principles of
data lineage, data security, data privacy, data uptime, and
reliability
- Define and grow HPC/AI infrastructure and best practices
- Set up alerting, reports, and dashboards for monitoring overall
system health
- Provide reports and updates on usage, maintenance, and
debugging in an agile environment
Requirements
- Bachelor's degree in Computer Science (CS), Computer Engineering
(CSEE), or related STEM field and/or equivalent professional
experience
- Expert programming skills in Linux Shell/CLI, Bash, Python, and
Go
- Strong knowledge and understanding of CI/CD processes and
deployment tools, including ArgoCD, Kubernetes, Helm, and
Docker
- Experience with resource management systems and job scheduling,
including running and debugging parallel programs
- Experience developing large-scale data management systems (PB+)
and serving dozens of users/data scientists
- Experience with provisioning and configuration management
tools such as Puppet, Ansible, Chef, and Terraform
- Excellent critical thinking, verbal communication, and
problem-solving skills
- Must be legally authorized to work in the United States without
the need for employer sponsorship, now or at any time in the
future
Preferred Qualifications
- MS/Ph.D. in Computer Science (CS), Computer Engineering (CSEE),
Electrical Engineering (EE), or related/relevant STEM degree
- Familiarity with GPU computing (CUDA, OpenCL) and HPC system
software stack
- Experience with enabling modern deep learning software
architectures and frameworks such as TensorFlow, PyTorch, or other
frameworks
- Ability to engage executive audiences to lead AI engineering
and data science engagements
The Team
Information Technology Services (ITS) helps power Deloitte's
success. ITS is the engine that drives Deloitte, which serves many
of the world's largest, most respected organizations. We develop
and deploy cutting-edge internal and go-to-market solutions that
help Deloitte operate effectively and lead in the market. Our
reputation is built on a tradition of delivering with
excellence.
The ~2,200 professionals in ITS deliver services including:
- Security, risk & compliance
- Technology support
- Infrastructure
- Applications
- Relationship management
- Strategy
- Deployment
- PMO
- Financials
- Communications
Technology & Infrastructure
The Technology and Infrastructure Organization works together to
transform how ITS deploys technologies and services that meet the
dynamic needs of Deloitte professionals and help increase their
productivity.
Service lines:
- Unified Communications
- Infrastructure Operations
- Office of Technology and Infrastructure
- Service Management
- Solutions Delivery
- Technology Support Services
- Visual Technology and Solutions
- Cloud Solutions Center
Keywords: Deloitte, Reno, Lead Systems Engineer, Other, Reno, Nevada