NERSC 2017 Summer Student Project Descriptions
Lawrence Berkeley National Lab, Berkeley, CA
http://nersc.gov
Are you an exceptional engineer who likes working on truly challenging projects? Are you passionate about learning and open-minded about the way that networks are built? Do you have a passion for organizing and visualizing data to aid in the understanding and development of scientific solutions? Consider spending your summer with the research and development team for Berkeley Lab’s NERSC Division.
More than 5,000 scientists use NERSC to perform basic scientific research across a wide range of disciplines, including climate modeling, research into new materials, simulations of the early universe, analysis of data from high energy physics experiments, investigations of protein structure, and a host of other scientific endeavors.
NERSC is known as one of the best-run scientific computing facilities in the world. It provides some of the largest computing and storage systems available anywhere, but what distinguishes the center is its success in creating an environment that makes these resources effective for scientific research. NERSC systems are reliable and secure, and provide a state-of-the-art scientific development environment with the tools needed by the diverse community of NERSC users. NERSC offers scientists intellectual services that empower them to be more effective researchers.
Summer Student Projects
FILLED: Performance analysis of the NERSC burst buffer workload to accelerate data-intensive discovery
NERSC's flagship system, Cori, is presently the fifth fastest supercomputer in the world, with over 700,000 CPU cores and over 1 PB of memory. To support the data needs of a system this fast, Cori's burst buffer is also one of the world's fastest file systems, capable of more than 1.5 TB/second of bandwidth. Because flash is a relatively new technology in supercomputing, NERSC's users are still discovering new ways to use this burst buffer to accelerate their data-intensive science, and NERSC is still fine-tuning the configuration of the burst buffer to deliver the best performance benefit.
To this end, the student assistant will explore the performance data being collected from the burst buffer hardware and identify opportunities for optimization. The main duties include:
Working closely with the NERSC burst buffer team to develop tools that interface with NERSC's ElasticSearch-based Data Collect
Applying machine learning and other statistical analyses to build an understanding of how NERSC's 6,000 users interact with the burst buffer today
Translating this understanding into recommendations on how users can modify their workflows to most effectively utilize the burst buffer
Providing feedback and guidance to NERSC staff on how to configure the burst buffer's default settings to best suit the needs of its users
The qualifications include:
Familiarity with Linux environments is strongly preferred
An interest in statistical analysis techniques (including machine learning) or parallel I/O is essential
Familiarity with (or interest in learning) Python and libraries relevant to data analytics (including scikit-learn, pandas, and matplotlib) is highly beneficial
Undergraduate or graduate student in computer science, mathematics, or a related field. Ambitious high school students will also be considered.
––––––––––
FILLED: Understanding application performance on manycore processors
NERSC's flagship system, Cori, is presently the fifth fastest supercomputer in the world and uses 68-core, 272-thread Intel Knights Landing processors. Optimizing applications to effectively utilize such a large degree of parallelism has been a multi-year effort that has resulted in a suite of applications that are now running at scale on Cori. NERSC's Advanced Technologies Group (ATG) will begin using these modernized applications to define the performance targets for the next generation of supercomputers, and it is critical that we develop an understanding of what architectural features will be most important on NERSC's next system. Performance analysis of these extreme-scale applications requires extreme-scale profiling tools and insightful data analysis.
The student assistant will work with NERSC ATG to use and develop the Integrated Performance Monitoring (IPM) profiling suite to improve our understanding of application performance. The project will focus on one or more of the following areas according to the assistant's strengths and interests:
Project 1. IPM modernization and feature development. The IPM library collects performance data from many sources, including intercepting MPI, OpenMP and POSIX I/O function calls, accessing the /proc file system and using PAPI to measure hardware performance counters. For this project area, the student assistant's primary duties include:
Enhancing IPM to ensure it provides reliable and complete coverage of modern applications that may use new API calls (from MPI-3, OpenMP-3 and OpenMP-4), multi-threaded MPI (MPI_THREAD_MULTIPLE), and mixed-language (C+Fortran) MPI
Exploring the addition of new data sources to IPM, e.g. monitoring model-specific registers (MSRs) on Intel Knights Landing to obtain power usage, or improving the usage of current data sources, e.g. supporting PAPI multiplexing to measure more performance counters in a single application run.
Project 2. Tools to analyze IPM data. The IPM library writes its performance data to an XML file, and at present, researchers produce performance plots using a Perl script and perform custom analysis using ad-hoc Bash and Python scripts. For this project area, the student assistant's primary duties include:
Developing a new Python-based analysis package which will make use of modern analytical packages such as Matplotlib and NumPy
Demonstrating this package as a tool to assist in exploratory data analysis on IPM outputs and generating performance plots and summary statistics from the data
Designing an extensible interface to enable custom analysis using higher-level data analytics libraries including scikit-learn and Caffe.
Project 3. Analysis of exemplar applications. The performance of the NERSC Exascale Science Applications Program (NESAP) applications will be studied to give insight about how NERSC's next system should be architected to ensure scientific productivity. For this project area, the student assistant's primary duties include:
Compiling and running the applications on Cori and collecting performance data with IPM
Comparing and contrasting aspects of performance, such as load imbalance, optimal MPI/OpenMP balance per node, memory footprint, fraction of serial work, and use of vector instructions to inform which architectural features would be most beneficial on NERSC's next system
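Two of the performance aspects listed above, load imbalance and fraction of serial work, can be estimated from simple timing data. The following is only a minimal sketch under stated assumptions: the function names are hypothetical and are not part of IPM, the load imbalance definition (maximum over mean, minus one) is one common convention among several, and the serial fraction uses the Karp-Flatt metric. The per-rank times are made-up illustrative numbers, not measurements from Cori.

```python
from statistics import mean


def load_imbalance(per_rank_seconds):
    """Return max/mean - 1 for per-rank times; 0.0 means perfectly balanced.

    Hypothetical helper: one common definition of load imbalance.
    """
    if not per_rank_seconds:
        raise ValueError("need at least one per-rank time")
    avg = mean(per_rank_seconds)
    if avg <= 0:
        raise ValueError("mean time must be positive")
    return max(per_rank_seconds) / avg - 1.0


def serial_fraction(speedup, n_workers):
    """Karp-Flatt estimate of the serial fraction from a measured speedup.

    e = (1/S - 1/p) / (1 - 1/p), where S is the speedup on p workers.
    """
    if n_workers < 2:
        raise ValueError("need at least two workers")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / n_workers) / (1.0 - 1.0 / n_workers)


# Illustrative (made-up) numbers: time spent by each of four MPI ranks.
times = [10.0, 10.0, 10.0, 30.0]
print("load imbalance:", load_imbalance(times))

# Illustrative (made-up) numbers: speedup of 3.0 measured on 4 workers.
print("serial fraction:", serial_fraction(3.0, 4))
```

Both quantities are dimensionless, which makes them convenient for comparing applications that run for very different lengths of time.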
For all project areas, the desired qualifications include:
Familiarity with C/Python and software engineering practices
Strong foundational knowledge of computer architecture
Experience with MPI and related profiling tools
Senior undergraduate or graduate student in computer science or a related field
Please indicate which project area(s) are of greatest interest in your cover letter.
––––––––––
Project Title: Compression of neurophysiology data
This would be a collaborative project with UCSF to explore the compressibility of neurophysiological data. Labs are generating on the order of 6 TB per day. For this project area, the student assistant's primary duties include:
- Running different compression algorithms and then assessing feature extraction quality as a function of sampling rate (30 kHz, 20 kHz, etc.) and bit depth (16-bit, 12-bit, etc.)
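As a rough illustration of the kind of experiment described above, the sketch below measures how the compressed size of a signal changes with sampling rate and bit depth. Everything in it is an assumption made for illustration: the signal is a synthetic sine wave plus noise rather than real neurophysiology data, zlib stands in for whichever compression algorithms the project would actually evaluate, and the helper names are hypothetical. It does not assess feature extraction quality, which would need real recordings and a real analysis pipeline.

```python
import math
import random
import struct
import zlib


def synthetic_signal(n_samples, rate_hz, seed=0):
    """Return a toy recording: a 300 Hz sine plus Gaussian noise, clipped to [-1, 1]."""
    rng = random.Random(seed)
    out = []
    for i in range(n_samples):
        t = i / rate_hz
        v = 0.6 * math.sin(2 * math.pi * 300 * t) + rng.gauss(0, 0.05)
        out.append(max(-1.0, min(1.0, v)))
    return out


def quantize(samples, bits):
    """Map floats in [-1, 1] to signed integers with `bits` bits of resolution."""
    top = 2 ** (bits - 1) - 1
    return [int(round(s * top)) for s in samples]


def compression_ratio(samples, bits):
    """Quantize, store each sample in a 16-bit container, and return raw/zlib size.

    Lower bit depths are still packed as int16 here, so the ratio is
    relative to a 16-bit-per-sample file layout.
    """
    q = quantize(samples, bits)
    raw = struct.pack("<%dh" % len(q), *q)
    return len(raw) / len(zlib.compress(raw, 9))


# One second of synthetic data at each sampling rate and bit depth.
for rate_hz in (30000, 20000):
    signal = synthetic_signal(rate_hz, rate_hz)
    for bits in (16, 12):
        ratio = compression_ratio(signal, bits)
        print("%d Hz, %d-bit: compression ratio %.2f" % (rate_hz, bits, ratio))
```

A real study would replace the synthetic signal with recorded data and compare the features extracted before and after each reduction, since a high compression ratio is only useful if the scientific content survives.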
––––––––––
Project Title: Usage and Performance Monitoring and Plotting with Elastic
Project description: Build tools to monitor and plot usage and performance data from large databases, networks, security systems, storage systems, web sites, and other critical NERSC systems using the open-source Elastic Stack. This is a great opportunity to work directly with large systems in a supercomputing center and build practical experience in data analysis and visualization.
Desired skills/background: data parsing and processing, JSON, programming and scripting, text indexing and mining
––––––––––
Project Title: Automating Linux Installation with The Foreman
Project description: With a fully automated process for installing Linux, new servers can be unboxed and put into service in under an hour instead of taking days or even weeks. The same process can speed recovery from accidental failures or major disasters. This project will involve learning about the existing Linux installation system (The Foreman) and adding improvements to make it truly automatic and turnkey. This is a great opportunity to learn more about server and Linux internals and build practical experience in systems administration for large-scale server deployments.
Desired skills/background: Linux installation and configuration, networking (basic concepts, DHCP, VLANs), programming and scripting, virtual machines
––––––––––
Project Title: Deploying Scalable Web Services with Mesos and Kubernetes
Project description: Docker containers are an innovative new technology to run applications within miniature isolated environments, similar to virtual machines. Mesos and Kubernetes are frameworks that allow these containers to be assembled together to create full software systems, such as a web application with a database backend. This project will involve researching Mesos and/or Kubernetes and learning how to build small test systems using these tools. This is a great opportunity to learn more about Docker and to get experience with application development with modern methods that are becoming very popular in industry.
Desired skills/background: Docker containers, networking (basic concepts), programming and scripting, web development and web servers (Apache, nginx)
––––––––––
Project Title: Building and Enhancing REST APIs to High-Performance Computing (HPC) Management Systems
Project description: Researchers who use the giant supercomputers at NERSC have to manage their use very carefully; they need to keep track of millions of compute hours and terabytes of storage to make sure these resources are not wasted. The web-based tools they use to track these details access central systems at NERSC via APIs. This project will involve coding enhancements to these APIs to fix bugs or provide better management capabilities to users. This is a great opportunity to learn more about the internal operations of a supercomputing center and to apply coding skills to practical problems.
Desired skills/background: web programming and scripting (JavaScript, Perl, PHP, Python)
––––––––––
How to Apply
Students interested in the program must apply online. Due to the high level of interest in our program, applications will be accepted only through the online application process.
Complete an online profile, and please provide the following:
- Your skills and relevant experience
- Your interest in the program
- Educational information (note: you must be enrolled in a full-time academic program at an accredited college or university)
- Your references (name, contact information, relationship to you)
If selected as a finalist, you will be invited to complete a separate job submission that includes reference, citizenship, and voluntary EEO information.
You will be contacted only if you are being considered for selection for this program. We hope to hear from you soon!
Equal Employment Opportunity: Berkeley Lab is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status. Berkeley Lab is in compliance with the Pay Transparency Nondiscrimination Provision under 41 CFR 60-1.4. Click here to view the poster and supplement: "Equal Employment Opportunity is the Law."