POSITION/TITLE: HPC System Administrator
EMPLOYMENT STATUS: Regular/Full Time
REPORTS TO: Project Manager
CLEARANCE: Secret or High ADP
POSITION SUMMARY: Responsible for advanced installation, configuration, and management of Unix/Linux operating systems and related toolsets, supporting complex enterprise-level applications, including HPC. The HPC administrator will work closely with other professionals to help set direction and strategy; evaluate and test emerging HPC technologies; design, deploy, and support HPC solutions (software, middleware, hardware, etc.); and tune and optimize software and algorithms for the HPC infrastructure.
ESSENTIAL DUTIES & RESPONSIBILITIES:
Help direct HPC optimization efforts across R&D, Software Development, and Production Infrastructure.
Provide expert advice to HPC users and specialists.
Monitor industry developments in HPC hardware and software.
Assist in coordinating HPC activities and events.
Assist in managing relationships with HPC vendors.
ADDITIONAL DUTIES & RESPONSIBILITIES:
Manage the above areas in accordance with the contract’s scope and any new customer requirements.
Implement tools to test, deploy, measure, monitor and scale the HPC infrastructure.
Continuously improve processes and tools to ensure the best possible experience for users, from uptime to performance and reliability.
Monitor and maintain Linux systems, including being available after-hours and on weekends as required due to planned outages or emergency situations.
Administer the various Unix/Linux systems that make up the HPC/storage and its support infrastructure.
Build and maintain automation systems to reduce time for builds and deployments and to enable the team to scale its output exponentially over time.
Work closely with the Enterprise Storage, Network, and Server team members to ensure any issues are resolved in a timely manner.
EXPERIENCE & COMPETENCIES:
10+ years of experience without a degree; 6+ years of experience with an undergraduate degree; 5+ years of experience with a graduate degree
Extensive experience building, managing, and administering computational clusters
At least five years of experience in Red Hat Linux/SuSE Linux administration, including server builds, OS installation and configuration, patch management, package updates, and performance tuning
VMware Virtualization technology implementation experience.
Proficient with the configuration, deployment and troubleshooting of various distributed file systems such as GPFS, GlusterFS, Lustre
Proficient with the configuration, deployment and troubleshooting of various SAN fabrics and related technologies (iSCSI, Fibre Channel, Multipathing)
Experience with high-performance computational interconnects (NUMA5, InfiniBand), and experience using and tuning high-performance, high-bandwidth networks
Familiarity with building and using monitoring and trend analysis tools (such as Nagios, Ganglia, Cacti), and familiarity with scientific libraries in support of C, C++, Perl, Python, R, etc.
Experience desired in high-level scripting languages (Perl, Python) and UNIX shell tools
Proficient with the configuration, deployment and troubleshooting of various network file systems and protocols
Knowledge of version control systems and large-scale SAN environments
Experience supporting Active Directory/LDAP/NIS.
Knowledge of large NUMA/SMP hardware.
Knowledge of TeraGrid technologies
Bachelor's degree in a related field or equivalent combination of education and experience
CERTIFICATES & LICENSES REQUIRED:
Relevant certification a plus