High Performance Computing Manager

about 8 hours ago
Full time role
Hybrid · New York, NY, US

Who we are: First Street is the industry standard for physical climate risk data. We use transparent and peer-reviewed methodologies to calculate the past, present, and future climate risk for every property in the world. We started eight years ago by working with the world’s leading climate scientists to create groundbreaking, climate-adjusted, property-specific models and haven’t stopped.

Our mission: We exist to connect climate change to financial risk.

Our data: We create physics-based, deterministic models of flooding, wildfire and hurricanes, and advanced statistical models of extreme heat, air quality, drought, hail, severe convective storms, winter storms, and more. All of this data is used to create property-level financial risk metrics and macroeconomic variables to quantify the impacts of climate change. 

Our customers: We empower governments at the highest levels to make smart regulations, businesses to avoid bad investments, and everyday Americans to understand their personal risk from climate change. We are relied on every day by:

  • Agencies ranging from the U.S. Department of the Treasury to Fannie Mae

  • The world's biggest banks such as Bank of America and Wells Fargo 

  • Institutional investors like Nuveen and Blackstone 

  • Millions of users on Redfin, Realtor.com, Homes.com, and more 

We believe: Our work needs to match the pace and scope of the climate problem. This is why we have invested tens of millions of dollars into our science, data, people, and products and have raised tens of millions more to move even faster.

Come join us and use your talents to create solutions to address humanity's biggest problem.

Position Overview: The High Performance Computing Manager will administer and optimize research, development, and production workloads on our on-premises Linux cluster and manage computational workloads across other platforms, including AWS and other cloud services. The role involves maintaining the Linux-based compute environment, installing and maintaining compute libraries and software packages, using Docker and related technologies, deploying and managing compute jobs with Slurm, developing and maintaining Bash and Python scripts, and keeping our GitHub repositories running efficiently for collaborative development.

Key Responsibilities:

  • Cluster Administration: Administer and maintain an on-premises Linux cluster running Ubuntu, including system updates, performance tuning, and troubleshooting.

  • Cloud Compute Management: Deploy, manage, and optimize compute jobs on AWS and other cloud platforms, ensuring seamless integration with existing workflows.

  • Job Management: Utilize Slurm for job scheduling and resource management, optimizing job queues and ensuring efficient use of computational resources.

  • Scripting and Automation: Develop and maintain Bash and Python scripts to automate tasks, streamline workflows, and enhance computational efficiency.

  • Repository Maintenance: Oversee and manage GitHub repositories, including version control, branching strategies, and collaborative code development.

  • Collaboration: Work closely with scientists, researchers and developers to understand computational needs, provide technical support, and ensure that computational resources align with project requirements.

  • Documentation: Maintain comprehensive documentation for system configurations, processes, and best practices.

Qualifications:

  • Education: Bachelor’s degree in Computer Science, Environmental Sciences, Applied Mathematics, or a related field. Advanced degrees or relevant certifications are a plus.

  • Experience: Proven experience managing Linux clusters and commercial cloud computing platforms. Hands-on experience with Slurm job scheduling and with Bash and Python scripting is essential.

  • Technical Skills:

    • Proficiency in administering Linux-based systems, specifically Ubuntu.

    • Experience with cloud computing platforms such as AWS, Azure, and/or Google Cloud.

    • Strong knowledge of Slurm for job scheduling and resource management.

    • Proficiency in Linux utilities and in Bash and Python scripting for automation and workflow optimization.

    • Experience managing GitHub repositories, including version control and collaboration tools.

  • Soft Skills:

    • Strong problem-solving skills and attention to detail.

    • Excellent communication skills and the ability to work collaboratively with interdisciplinary teams.

    • Ability to manage multiple tasks and projects simultaneously in a dynamic environment.

  • Nice-to-have skills:

    • Experience with massively parallel, cloud-based high performance computing.

    • Knowledge of very large datasets and of formats and tools such as HDF, netCDF, Zarr, and Xarray.

    • Experience with running large physics-based models, including weather forecasting (e.g. WRF) and hydrology (e.g. HEC-RAS) applications.

How we work: 

  • Drive: We are driven by the role we play in connecting climate change to financial risk 

  • Impact: We only focus on things that move the needle 

  • Urgency: We move quickly because the world depends on it 

  • Resilience: We have a growth mindset in all that we do

What we offer: 

  • Competitive salary commensurate with experience 

  • Ownership interest in the company via Employee Stock Option Plan 

  • Hybrid schedule with in-office work days on Monday, Wednesday, and Thursday

  • 15 vacation days along with 13 company holidays and 10 sick days 

  • Health benefits covered at 100% for employee or a significant contribution for family plans 

  • Vision and dental benefits with partial employee contribution

  • 12 weeks of paid parental leave 

  • Access to One Medical, Teladoc, Health Advocate, Kindbody, and Talkspace

  • Company 401(k) program

  • Commuter benefits 

  • Life insurance

  • Tech startup environment 

  • Weekly team meals and an office stocked with coffee and snacks 

  • Working on the world’s biggest issue with other passionate professionals 

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.