Staff Network Engineer

about 2 months ago
Full time role
Hybrid · San Francisco, CA, US

Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to power their most advanced AI applications.
Crusoe is redefining AI cloud infrastructure, with a mission to align the future of computing with the future of the climate. Our AI platform is recognized as the "gold standard" for reliability and performance. Our data centers are optimized for AI workloads and are powered by clean, renewable energy.

Be part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

About This Role:

The Crusoe Cloud Network Engineering team is actively looking for an ambitious, experienced Staff Network Operations Engineer to join our Network Operations team. The Staff Network Operations Engineer will help lead a talented team of network engineers across the globe that implements and operates the global edge, backbone, and data center network for high-performance compute (HPC) clusters with GPUs. The ideal individual will be highly motivated, hands-on, self-directed, and eager to work on cutting-edge technologies related to the environment. You must possess excellent analytical and communication skills and be a great team player.

You will play a pivotal role in ensuring the seamless operation, security, and scalability of our global network infrastructure. You will help the network engineers and specialists implement and manage network solutions supporting Crusoe Cloud. In this key position, you will also manage the Network Operations Center (NOC) to provide 24/7 monitoring and management of our network infrastructure. This will ensure continuous operational coverage, rapid incident response, and high availability of network services to meet the demands of our customers.

A Day In The Life:

  • Collaborate with other teams and departments to ensure the network meets the needs of the business.

  • Monitor network performance and troubleshoot issues as needed.

  • Manage vendor relationships and contracts related to network services and equipment.

  • Stay up to date with industry trends and emerging technologies.

  • Provide training and mentorship to team members to support their professional development.

  • Balance immediate needs versus long-term goals across a diverse collection of stakeholders.

  • Respond to and address outages and escalations and provide status updates on each.

  • Manage day-to-day activities, prioritizing workloads across the team and overseeing daily and weekly activities.

  • Ensure that the responsibilities and accountability of all direct reports are defined and understood.

  • Report department metrics and statistics to the Director of Network Engineering weekly.

  • Act as a sounding board for the team as they provide expert-level troubleshooting to quickly identify and resolve network-related incidents for Crusoe Cloud.

  • Make recommendations to senior management on network infrastructure modifications and version upgrades while controlling costs and scaling for future growth.

You Will Thrive In This Role If:

  • 10+ years of related experience operating at scale in a production environment

  • Strong knowledge of network protocols, including TCP/IP, QoS, BGP, OSPF/IS-IS, EVPN, VXLAN, and MPLS-related technologies like RSVP-TE, LDP, etc.

  • Strong understanding of network monitoring protocols and tools like SNMP, IPFIX, sFlow/NetFlow, and telemetry.

  • Familiar with data center network architecture, such as fat-tree architecture, Clos, BGP-TE, and peering for edge.

  • Hands-on experience with major network devices like Mellanox, Cisco, Arista, Juniper, and other mainstream vendors.

  • Familiar with mainstream commercial switch/router chipsets, such as Broadcom, Barefoot, etc.

  • Familiarity with technologies like RDMA, InfiniBand, and RoCE is a plus.

  • In-depth knowledge of public cloud architecture connectivity options to AWS, GCP, Azure, Ali Cloud, OCI, etc.

  • Good understanding of IPv6 and IPv4-IPv6 coexistence technologies.

  • Programming/scripting in Python, or experience with automation tools such as Ansible, Puppet, or Chef, is a plus.

  • Team player willing to participate in the global Cloud network on-call rotation.

  • Bachelor's in Computer Science, Information Science, Engineering, Mathematics, or a related field, or experience equivalent to a Bachelor's degree based on three or more years of work experience

  • Must be able to pass a background check

  • Embody the Company values

Benefits: 

  • Hybrid work schedule

  • Competitive Paid Time Off

  • Industry competitive pay

  • Retirement benefits

  • Healthcare benefits including Medical, Dental, and Vision

  • Short and Long-Term Disability Insurance

  • Life Insurance

  • Paid Parental Leave

  • Subscription to Calm App

Compensation Range:

Compensation will be paid as salary. Restricted Stock Units are included in all offers. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.

Crusoe Energy is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/ orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
