Today
Top Secret
Los Alamos, NM (On-Site/Office)
HPC Systems Site Lead needed for a Direct Hire opportunity with SOC's client to work onsite in Los Alamos, NM.
*Candidates must have an active (DOE) Q Clearance or have held one in the past 3 years to be considered for this role. Also open to candidates with a TS.
Job Description
Contributions regularly and sustainably impact technical components of Client products, solutions, or services. Applies advanced subject matter knowledge to solve complex business issues and is regarded as a subject matter expert. Provides expertise and partnership to functional and technical project teams and may participate in cross-functional initiatives. Exercises significant independent judgment to determine the best method for achieving objectives. May provide team leadership and mentoring to others.
Responsibilities:
Service delivery
- Maintain the HPC systems availability to the customer
- Lead technical output of on-site client HW technicians, system admins, and system analysts
- Serve as the primary customer focal point for system support and on-site activities
- Full-time 100% presence on customer site for standard business hours.
- Routine face-to-face and group interaction with site team to organize tasks, follow up, and assist with challenges they encounter
- Track system health and Cases; review them weekly with customers and HPC leadership
- Maintain availability reports for tracking SLAs
- Pre-plan system upgrades: review plans with team and customers, arrange for staffing and equipment, and pre-arrange open lines of communication in case of issues
- Escalate Cases and assist team members escalating Cases to next-tier support, and follow up to drive closure via escalation processes
- Manage on-site parts inventory using business tools
- Manage site tools and equipment
- Maintain the on-call schedule to support our 24x7x365 contracts
- Assist with hardware and system installation activities for new systems
- Build strong working relationships with teammates, leadership, and customers
- Maintain awareness of upcoming training and prompt team members to complete trainings
- Maintain a team calendar of planned leave including on-call schedule for operational issues
- Provide performance review input to the District Service Manager (DSM) and suggestions for team member performance and development.
- Escalate to the DSM any personnel issues, risks of missing an SLA, or customer satisfaction concerns
- Maintain a clean and safe working environment
- Support DSM in on-boarding new team members by providing site-specific details (e.g. customer network accounts, badge, parking, etc.)
Required Qualifications & Experience:
- 8+ years of professional experience and a Bachelor of Arts/Science or equivalent degree in computer science or related area of study; without a degree, three additional years of relevant professional experience (11+ years in total).
- In-depth knowledge of high-performance computing (HPC) systems.
- Proficiency in managing and optimizing HPC environments, including system configuration, performance tuning, and troubleshooting.
- Strong understanding of parallel computing, cluster management, and distributed computing technologies.
- Experience with HPC workload managers and schedulers such as SLURM, PBS, or similar.
- Advanced knowledge of Linux operating systems.
- Familiarity with software development tools and environments commonly used in HPC, including compilers, debuggers, and performance analysis tools.
- Experience with various scripting languages such as Python or Bash.
- Proven experience in system administration, including hardware and software installation, maintenance, and upgrades.
- Knowledge of network architecture, storage solutions, and data management within HPC environments.
- Ability to implement and manage security protocols and best practices in a high-performance computing context to maintain customer security posture.
- Strong project management skills, including planning, execution, and monitoring of HPC projects.
- Ability to lead and coordinate a team of technical professionals, ensuring timely and successful project delivery.
- Experience in resource allocation, budgeting, and performance metrics tracking for HPC projects.
- Excellent problem-solving abilities, with a focus on identifying root causes and implementing effective solutions.
- Strong analytical skills to assess system performance and make data-driven decisions for optimization.
- Ability to troubleshoot complex technical issues in a high-stakes HPC environment.
- Exceptional communication skills, both written and verbal, to effectively interact with team members, stakeholders, and clients.
- Ability to convey complex technical information in a clear and concise manner to non-technical audiences.
- Strong collaboration skills to work effectively within a multidisciplinary team and across organizational boundaries.
- Extensive experience in HPC system management and administration, with a track record of successful project and team leadership.
- Willingness to participate in ongoing professional development and training opportunities which may require travel.
Preferred Qualifications:
- CompTIA A+ or Server+ Certification
- Security+ Certification
- Linux+ Certification
- PMP or Project+
- Vendor Certifications
- Experience with ticket-tracking software (e.g., Salesforce, Smartsheet; any ticket-tracking system is acceptable)
Employment Prerequisites
The following requirements must be met to be eligible for this position: successful completion of a background investigation and drug urinalysis.
SOC, a Day & Zimmermann company, is an Equal Opportunity Employer, EOE AA M/F/Vet/Disability.
Estimated Min Rate: $87500.00
Estimated Max Rate: $125000.00