HPC Systems Site Lead - Los Alamos, NM - DOE Q Clearance Needed

SOC/Day & Zimmermann Federal Services

Clearance Level: Top Secret
Los Alamos, NM (On-Site/Office)

HPC Systems Site Lead needed for a Direct Hire opportunity with SOC's client to work onsite in Los Alamos, NM.

*Candidates must have an active DOE Q Clearance, or have held one within the past 3 years, to be considered for this role. Candidates with a Top Secret (TS) clearance will also be considered.

Job Description
Contributions regularly and sustainably impact technical components of Client products, solutions, or services. Applies advanced subject matter knowledge to solve complex business issues and is regarded as a subject matter expert. Provides expertise and partnership to functional and technical project teams and may participate in cross-functional initiatives. Exercises significant independent judgment to determine the best method for achieving objectives. May provide team leadership and mentoring to others.

Responsibilities:
Service delivery
  • Maintain HPC system availability for the customer
  • Lead technical output of on-site client HW technicians, system admins, and system analysts
  • Serve as the primary customer focal point for system support and on-site activities
  • Full-time 100% presence on customer site for standard business hours.
  • Routine face-to-face and group interaction with site team to organize tasks, follow up, and assist with challenges they encounter
  • Track system health and Cases; review them weekly with customers and HPC leadership
  • Maintain availability reports for tracking SLAs
  • Pre-plan system upgrades: review plans with team and customers, arrange for staffing and equipment, and pre-arrange open lines of communication in case of issues
  • Escalate Cases, assist team members in escalating Cases to next-tier support, and follow up to drive closure via escalation processes
  • Manage on-site parts inventory using business tools
  • Manage site tools and equipment
  • Maintain the on-call schedule to support our 24x7x365 contracts
  • Assist with hardware and system installation activities for new systems
Team support
  • Build strong working relationships with teammates, leadership, and customers
  • Maintain awareness of upcoming training and prompt team members to complete trainings
  • Maintain a team calendar of planned leave including on-call schedule for operational issues
  • Provide performance review input to the District Service Manager (DSM) and suggestions for team member performance and development.
  • Escalate to the DSM any personnel issues, risks of missing an SLA, or customer satisfaction concerns
  • Maintain a clean and safe working environment
  • Support the DSM in on-boarding new team members by providing site-specific details (e.g., customer network accounts, badge, parking)

Required Qualifications & Experience:
  • 8+ years of professional experience and a Bachelor of Arts/Science or equivalent degree in computer science or related area of study; without a degree, three additional years of relevant professional experience (11+ years in total).
  • In-depth knowledge of high-performance computing (HPC) systems.
  • Proficiency in managing and optimizing HPC environments, including system configuration, performance tuning, and troubleshooting.
  • Strong understanding of parallel computing, cluster management, and distributed computing technologies.
  • Experience with HPC workload managers and schedulers such as SLURM, PBS, or similar.
  • Advanced knowledge of Linux operating systems.
  • Familiarity with software development tools and environments commonly used in HPC, including compilers, debuggers, and performance analysis tools.
  • Experience with various scripting languages such as Python or Bash.
  • Proven experience in system administration, including hardware and software installation, maintenance, and upgrades.
  • Knowledge of network architecture, storage solutions, and data management within HPC environments.
  • Ability to implement and manage security protocols and best practices in a high-performance computing context to maintain customer security posture.
  • Strong project management skills, including planning, execution, and monitoring of HPC projects.
  • Ability to lead and coordinate a team of technical professionals, ensuring timely and successful project delivery.
  • Experience in resource allocation, budgeting, and performance metrics tracking for HPC projects.
  • Excellent problem-solving abilities, with a focus on identifying root causes and implementing effective solutions.
  • Strong analytical skills to assess system performance and make data-driven decisions for optimization.
  • Ability to troubleshoot complex technical issues in a high-stakes HPC environment.
  • Exceptional communication skills, both written and verbal, to effectively interact with team members, stakeholders, and clients.
  • Ability to convey complex technical information in a clear and concise manner to non-technical audiences.
  • Strong collaboration skills to work effectively within a multidisciplinary team and across organizational boundaries.
  • Extensive experience in HPC system management and administration, with a track record of successful project and team leadership.
  • Willingness to participate in ongoing professional development and training opportunities which may require travel.

Preferred Qualifications:
  • CompTIA A+ or Server+ Certification
  • Security+ Certification
  • Linux+ Certification
  • PMP or Project+
  • Vendor Certifications
  • Experience with ticket-tracking software (e.g., Salesforce or SmartSheets; any ticket-tracking system is acceptable)

Employment Prerequisites
The following requirements must be met to be eligible for this position: successful completion of a background investigation and drug urinalysis.

SOC, a Day & Zimmermann company, is an Equal Opportunity Employer, EOE AA M/F/Vet/Disability.

Estimated Min Rate: $87,500.00
Estimated Max Rate: $125,000.00

Diversity is one of our core values as a Company, and it’s also something very personal and unique to each employee. Who better to tell our story of diversity than the people who are part of that story? “The Many Diverse Voices of Betterment” shares how our unique backgrounds and perspectives make us stronger, together, both as a Company and as individuals. Our diverse and inclusive culture, and what diversity means at SOC and Day & Zimmermann, is told through personal, unscripted first-person narratives.

About Us
SOC is an experienced mission support provider with a reputation for delivering responsive and agile solutions in support of national security interests in high-threat environments. SOC is an integrated provider of mission support solutions through our global security, operations and maintenance, architecture and engineering, and staffing services to the U.S. Government and commercial clients. We work side-by-side with our customers, including the U.S. Departments of State, Energy, and Defense, the Intelligence Community, other federal agencies, and non-governmental organizations, providing and helping create safe and secure environments in which they can perform their best work.
