Blue Waters is a supercomputer at the National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana–Champaign. It uses AMD-based Cray XE nodes.

Blue Waters provides several compiler suites by default: Intel, GNU, PGI, and Cray. Although PGI comes with the best GPU acceleration support (OpenACC) and is recommended for Cray-based systems in the Quantum ESPRESSO user guide, the Intel compiler suite still excels at computing speed.

This note describes how to build a parallel-enabled Quantum ESPRESSO on Blue Waters using the Intel compilers.

Building Environment & Compilers


Blue Waters uses the Environment Modules system to manage the programming environment and compilers. To switch to the Intel compiler environment, use module swap and module load. Using swap instead of load avoids certain dependency conflicts.

module swap PrgEnv-cray PrgEnv-intel/5.2.82
module load intel/
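
To confirm that the swap took effect, the list of loaded modules can be inspected. The following is a minimal sketch, not part of the original procedure; the helper name active_prgenv is hypothetical, and it assumes that module list writes to stderr, as Environment Modules typically does.

```shell
# Print the PrgEnv-* module that is currently loaded, if any.
active_prgenv() {
    # Environment Modules typically writes `module list` output to stderr,
    # so it is redirected to stdout before filtering.
    module list 2>&1 | grep -o 'PrgEnv-[a-z]*' | head -n 1
}

# Only query the module system where it actually exists.
if command -v module >/dev/null 2>&1; then
    echo "Active programming environment: $(active_prgenv)"
fi
```

After the swap above, this should report PrgEnv-intel rather than PrgEnv-cray.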

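Blue Waters wraps the compilers, and using the wrappers is the recommended way to invoke them. The wrappers are cc (C), CC (C++), and ftn (Fortran); the C preprocessor cpp is called directly. The Cray programming environment sets the wrappers up so that they call the compiler selected by the loaded PrgEnv module.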
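
Before configuring the build, it can help to check that the wrappers are actually on the PATH. This is a small sketch under the assumption that only cc and ftn are needed, since those are the wrappers used in the make.inc further down; the function name check_wrappers is hypothetical.

```shell
# Report which of the given tools are visible on PATH.
# Returns non-zero if at least one is missing.
check_wrappers() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_wrappers cc ftn || echo "Wrapper not found; is a PrgEnv module loaded?"
```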

Prebuilt Scientific Calculation Libraries

Blue Waters already provides a number of mathematical libraries, including LibSci, which comprises BLAS (Basic Linear Algebra Subprograms), LAPACK (Linear Algebra PACKage), ScaLAPACK (Scalable LAPACK), and others, as well as FFTW. Using these optimized libraries avoids the laborious process of reinventing the wheel by building them from source.


To see the available versions of FFTW, use

module avail fftw
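
The raw listing can be long. As a sketch, the output can be reduced to a sorted list of name/version entries; the helper list_versions is hypothetical, and the redirection assumes that module avail prints to stderr, which is typical for Environment Modules.

```shell
# Extract "<name>/<version>" entries from `module avail`-style text on stdin.
list_versions() {
    grep -o "$1/[0-9][0-9.]*" | sort -u
}

# Only query the module system where it actually exists.
if command -v module >/dev/null 2>&1; then
    module avail fftw 2>&1 | list_versions fftw
fi
```

One of the listed entries can then be passed to module load to select that version explicitly.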


When a compiler environment module is loaded, the Cray LibSci library is loaded automatically.

Alternative versions of the LibSci package on Blue Waters are listed under the cray-libsci category.

module avail cray-libsci
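
To find the library directory of whichever LibSci version is loaded, instead of typing the path by hand, something like the following may work. It is a sketch that assumes the cray-libsci module exports the environment variable CRAY_LIBSCI_PREFIX_DIR (running module show cray-libsci should reveal what is actually set); the function name libsci_libdir is hypothetical.

```shell
# Print the -L flag for the currently loaded LibSci, or fail if none is loaded.
libsci_libdir() {
    # Assumption: the cray-libsci module exports CRAY_LIBSCI_PREFIX_DIR.
    if [ -n "${CRAY_LIBSCI_PREFIX_DIR:-}" ]; then
        echo "-L${CRAY_LIBSCI_PREFIX_DIR}/lib"
    else
        echo "CRAY_LIBSCI_PREFIX_DIR is not set; is cray-libsci loaded?" >&2
        return 1
    fi
}

libsci_libdir || true
```

The printed flag has the same shape as the BLAS_LIBS, LAPACK_LIBS and SCALAPACK_LIBS entries in the make.inc below.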

Parallel Computing



# make.inc.  Generated from make.inc.in by configure.

# compilation rules

.SUFFIXES : .o .c .f .f90

# most fortran compilers can directly preprocess c-like directives: use
# 	$(MPIF90) $(F90FLAGS) -c $<
# if explicit preprocessing by the C preprocessor is needed, use:
# 	$(CPP) $(CPPFLAGS) $< -o $*.F90
#	$(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
# remember the tabulator in the first column !!!

.f90.o:
	$(CPP) $(CPPFLAGS) $< -o $*.F90
	$(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o

.f.o:
	$(F77) $(FFLAGS) -c $<

.c.o:
	$(CC) $(CFLAGS)  -c $<

# Top QE directory, useful for locating libraries,  linking QE with plugins
# The following syntax should always point to TOPDIR:
TOPDIR = $(dir $(abspath $(filter %make.inc,$(MAKEFILE_LIST))))
# if it doesn't work, uncomment the following line (edit if needed):

# TOPDIR = /u/sciteam/luo3/build/qe-6.3

# DFLAGS  = precompilation options (possible arguments to -D and -U)
#           used by the C compiler and preprocessor
# To use libxc (v>=3.0.1), add -D__LIBXC to DFLAGS
# See include/defs.h.README for a list of options and their meaning
# With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
# For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas

# MANUAL_DFLAGS  = additional precompilation option(s), if desired
#                  BEWARE: it does not work for IBM xlf! Manually edit FDFLAGS

# IFLAGS = how to locate directories with *.h or *.f90 file to be included
#          typically -I$(TOPDIR)/include -I/some/other/directory/
#          the latter contains .e.g. files needed by FFT libraries
# for libxc add -I/path/to/libxc/include/

IFLAGS         = -I$(TOPDIR)/include -I$(TOPDIR)/FoX/finclude -I$(TOPDIR)/S3DE/iotk/include/

# MOD_FLAG = flag used by f90 compiler to locate modules

MOD_FLAG      = -I

# BASEMOD_FLAGS points to directories containing basic modules,
# while BASEMODS points to the corresponding module libraries
# Each Makefile can add directories to MODFLAGS and libraries to QEMODS

BASEMOD_FLAGS= \
               $(MOD_FLAG)$(TOPDIR)/Modules \
               $(MOD_FLAG)$(TOPDIR)/FFTXlib \
	       $(MOD_FLAG)$(TOPDIR)/LAXlib \
	       $(MOD_FLAG)$(TOPDIR)/UtilXlib

# Compilers: fortran-90, fortran-77, C
# If a parallel compilation is desired, MPIF90 should be a fortran-90
# compiler that produces executables for parallel execution using MPI
# (such as for instance mpif90, mpf90, mpxlf90,...);
# otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
# If you have a parallel machine but no suitable candidate for MPIF90,
# try to specify the directory containing "mpif.h" in IFLAGS
# and to specify the location of MPI libraries in MPI_LIBS

MPIF90         = ftn
F90           = ftn
CC             = cc
F77            = ftn

# GPU architecture (Kepler: 35, Pascal: 60, Volta: 70 )

# CUDA runtime (Pascal: 8.0, Volta: 9.0)

# CUDA F90 Flags

# C preprocessor and preprocessing flags - for explicit preprocessing,
# if needed (see the compilation rules above)
# preprocessing flags must include DFLAGS and IFLAGS

CPP            = cpp
CPPFLAGS       = -P -traditional $(DFLAGS) $(IFLAGS)

# compiler flags: C, F90, F77
# C flags must include DFLAGS and IFLAGS
# F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax

CFLAGS         = -O3 -mavx -fopenmp -fast -no-ipo $(DFLAGS) $(IFLAGS)
F90FLAGS       = $(FFLAGS) -nomodule -fopenmp -fpp $(FDFLAGS) $(CUDA_F90FLAGS) $(IFLAGS) $(MODFLAGS)
FFLAGS         = -O3 -mavx -fopenmp -fast -no-ipo

# compiler flags without optimization for fortran-77
# the latter is NEEDED to properly compile dlamch.f, used by lapack

FFLAGS_NOOPT   = -O0 -assume byterecl -g -traceback

# compiler flag needed by some compilers when the main program is not fortran
# Currently used for Yambo

FFLAGS_NOMAIN   = -nofor_main

# Linker, linker-specific flags (if any)
# Typically LD coincides with F90 or MPIF90, LD_LIBS is empty
# for libxc, set LD_LIBS=-L/path/to/libxc/lib/ -lxcf90 -lxc

LD             = ftn
LDFLAGS        = -fopenmp -parallel
LD_LIBS        = 

# External Libraries (if any) : blas, lapack, fft, MPI

# If you have nothing better, use the local copy via "--with-netlib" :
# BLAS_LIBS = /your/path/to/espresso/LAPACK/blas.a
# BLAS_LIBS_SWITCH = internal

BLAS_LIBS      = -L/opt/cray/libsci/16.11.1/INTEL/15.0/x86_64/lib

# If you have nothing better, use the local copy via "--with-netlib" :
# LAPACK_LIBS = /your/path/to/espresso/LAPACK/lapack.a
# For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
# remember that LAPACK_LIBS precedes BLAS_LIBS in loading order

LAPACK_LIBS    = -L/opt/cray/libsci/16.11.1/INTEL/15.0/x86_64/lib

SCALAPACK_LIBS = -L/opt/cray/libsci/16.11.1/INTEL/15.0/x86_64/lib

# nothing needed here if the internal copy of FFTW is compiled
# (needs -D__FFTW in DFLAGS)

FFT_LIBS       = -L/opt/cray/fftw/

# HDF5
FOX_LIB  = -L$(TOPDIR)/FoX/lib  -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common\
            -lFoX_utils -lFoX_fsys 
# For parallel execution, the correct path to MPI libraries must
# be specified in MPI_LIBS (except for IBM if you use mpxlf)

MPI_LIBS       = -L/opt/platform_mpi/lib/linux_amd64

# IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS

MASS_LIBS      = 

# CUDA libraries

# ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv

AR             = ar
ARFLAGS        = ruv

# ranlib command. If ranlib is not needed (it isn't in most cases) use
# RANLIB = echo

RANLIB         = ranlib

# all internal and external libraries - do not modify


LIBOBJS        = $(TOPDIR)/clib/clib.a  $(TOPDIR)/iotk/src/libiotk.a

# wget or curl - useful to download from network
WGET = wget -O

# Install directory - not currently used
PREFIX = /usr/local
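
With a make.inc like the one above placed in the top-level Quantum ESPRESSO directory, the build itself is a plain make invocation. The wrapper below is only a sketch: the function name build_qe is hypothetical, and the path in the example comment is an assumption that should be replaced with the actual source directory.

```shell
# Run make in a QE source tree, refusing to start if make.inc is missing.
build_qe() {
    top="$1"
    shift
    if [ ! -f "$top/make.inc" ]; then
        echo "no make.inc found in $top" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    (cd "$top" && make "$@")
}

# Example (hypothetical path): build_qe "$HOME/build/qe-6.3" pw
```

Here pw is the Quantum ESPRESSO make target that builds the pw.x executable; other targets can be passed the same way.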

Other References