Brock Palen and Jeff Squyres speak with the creators of Academic Torrents, a distributed system for sharing enormous datasets - for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing-fast download speeds.
Joseph Paul Cohen is a postdoctoral fellow at the Montreal Institute for Learning Algorithms (MILA) at the University of Montreal. He obtained a Ph.D. in computer science from the University of Massachusetts Boston in 2016. His research interests include machine learning, computer vision, ad-hoc networking, and cybersecurity. Joseph received a U.S. National Science Foundation Graduate Fellowship in 2013. He is the founder and director of the Institute for Reproducible Research (a U.S. 501(c)(3) non-profit), which produces tools for researchers such as AcademicTorrents.com and ShortScience.org. He is also the creator of BlindTool (a mobile application that uses artificial intelligence to provide a sense of vision to the blind) and Blucat (netcat for Bluetooth). He has worked in industry for small startups, large corporations, government research labs, and educational museums, and has been involved in projects sponsored by NASA and the DOE.
Henry Z. Lo is currently a Senior Data Analyst at McKinsey and Company working on deep learning solutions. He obtained a PhD from the University of Massachusetts Boston Computer Science department. During his studies he received a McNair fellowship, a Sanofi Genzyme fellowship, the Randall G. Malbone award for academic achievement, and an NSF EAPSI award to study in Shanghai, China. Before graduating, Henry partnered with Joseph Paul Cohen to start a non-profit to promote accessibility in science.
Brock Palen and Jeff Squyres speak with the creators of Julia. Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s Base library, largely written in Julia itself, also integrates mature, best-of-breed open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.
Jeff Bezanson, Alan Edelman, Stefan Karpinski and Viral Shah are all co-creators of the Julia language. They are also co-founders of Julia Computing, Inc., a company that builds products for data scientists to accelerate the cycle of innovation, from discovery to production. Their first blog post announcing Julia to the world captures the essence of what they set out to do.
Julia is a modern, easy-to-use, high-performance programming language. Parallel computing is fundamental to Julia rather than an afterthought. It is a vibrant open source project with a diverse community of 500 contributors around the world. Research on Julia is anchored at Alan Edelman’s Julia Lab at MIT. The Julia community has contributed over 1,000 open source packages to date. A number of universities and MOOCs use Julia for teaching and research. It is also used by businesses in areas as diverse as finance, engineering, aerospace, automotive, robotics, healthcare, and e-commerce, to name a few. All of these applications and research results have been presented over time in four JuliaCons held over the last several years in the US and India.
Prof. Alan Edelman: http://www-math.mit.edu/~edelman/
Alan Edelman is a professor of applied mathematics and a member of the Computer Science and AI Laboratories at MIT. He has won numerous prizes, including the Gordon Bell Prize, the Householder Prize, and various SIAM and AMS prizes, and is a fellow of SIAM and the AMS. He was CTO of Interactive Supercomputing, a startup in the area of software for high performance and big data computing, which was later acquired by Microsoft. He has consulted or worked for companies such as Microsoft, Akamai, Pixar, and IBM, most recently working on numerical verification. Before that he worked on “big data” analysis tools, even before “big data” became a household term. He currently leads the MIT group on the Julia project, as well as working on practical algorithms and theoretical mathematics.
Dr. Viral B. Shah: https://www.linkedin.com/in/viralbshah
Viral Shah is a computer scientist with a keen interest in the interaction of technology with public policy. He has a long-term track record of building open-source software. Apart from Julia, he is also co-creator of Circuitscape, an open-source program that borrows algorithms from electronic circuit theory for ecological conservation. Prior to founding Julia Computing, he founded FourthLion Technologies in India to build India’s first data-driven political campaigns. In the Government of India, he was an early member of the country’s national ID project, Aadhaar, where his work on re-architecting India’s social security systems led to a significant increase in social and financial inclusion, while simultaneously saving the exchequer over a billion dollars in slippage. The experiences of implementing technology at such scale for a billion people are collected in his book, Rebooting India. Viral has a Ph.D. in computer science from the University of California, Santa Barbara.
Stefan Karpinski: https://www.linkedin.com/in/stefankarpinski
Prior to founding Julia Computing, Stefan worked as a software engineer and data scientist at Akamai, Citrix Online, and Etsy. In addition to running Julia Computing, he has a part-time appointment as a Research Engineer at New York University as part of the Moore-Sloan Data Science Initiative. Stefan received a B.A. in mathematics from Harvard University in 2000.
Dr. Jeff Bezanson: https://www.linkedin.com/in/jeffbezanson
Jeff Bezanson is a serial programming language designer. Prior to designing Julia, Jeff wrote compilers at Interactive Supercomputing. He is also the author of a particularly tiny Scheme implementation called femtolisp. He is an alumnus of the Massachusetts Institute of Technology and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), where his thesis centered on building high-performance dynamic languages for technical computing. He received a B.A. in computer science from Harvard in 2004, and a PhD from MIT in 2015.
Brock Palen and Jeff Squyres speak with Gregory Kurtzer about Singularity, a container solution for HPC and research environments. Singularity allows a non-privileged user to "swap out" the operating system on the host for one they control. So if the host system is running RHEL6 but your application runs in Ubuntu, you can create an Ubuntu image, install your applications into that image, copy the image to another host, and run your application on that host in its native Ubuntu environment.
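As a rough sketch of the workflow described above (the image and application names are hypothetical, and the `build`/`exec` subcommands reflect recent Singularity releases; earlier versions used `create` and `bootstrap` instead):

```shell
# Build an Ubuntu-based image; the Docker Hub source is one of several options
singularity build myapp.sif docker://ubuntu:20.04

# Alternatively, build from a definition file that installs your application
# singularity build myapp.sif myapp.def

# Copy the image to another host (an RHEL6 cluster, say)...
scp myapp.sif user@cluster:~/

# ...and run the application there, inside its native Ubuntu environment
singularity exec myapp.sif /opt/myapp/bin/run
```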
Gregory Kurtzer has created many open source initiatives related to HPC, namely CentOS Linux, Warewulf, Perceus, and most recently Singularity. He currently serves as a member of the OpenHPC Technical Steering Committee and is the IT HPC Systems Architect and Software Developer for Lawrence Berkeley National Laboratory.
Brock Palen and Jeff Squyres speak with Marcel Kornacker about Impala. Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries to data stored in HDFS and Apache HBase without requiring data movement or transformation. Impala is integrated with Hadoop to use the same file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other Hadoop software.
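As an illustration of the low-latency SQL access described above, a query can be issued through `impala-shell`, the command-line client that ships with Impala (the daemon host and the table and column names below are made up for illustration):

```shell
# Query data already stored in HDFS, with no data movement or ETL step;
# 'flights' is a hypothetical table registered in the shared Hive metastore
impala-shell -i impalad-host:21000 \
  -q "SELECT carrier, COUNT(*) AS num_flights
      FROM flights
      GROUP BY carrier
      ORDER BY num_flights DESC;"
```

Because Impala reads the same files and metadata as Hive and other Hadoop tools, the same table could be populated by a MapReduce or Pig job and queried here without transformation.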
Marcel Kornacker is the Chief Architect for database technology at Cloudera and creator of the Cloudera Impala project. Following his graduation in 2000 with a PhD in databases from UC Berkeley, he held engineering positions at several database-related start-up companies. Marcel joined Google in 2003 where he worked on several ads serving and storage infrastructure projects, then became tech lead for the distributed query engine component of Google's F1 project.
Brock Palen and Jeff Squyres speak with Denny Dahl about D-Wave and quantum computing. Founded in 1999, D-Wave Systems is the world's first quantum computing company. Its mission is to integrate new discoveries in physics, engineering, manufacturing, and computer science into breakthrough approaches to computation that help solve some of the world’s most complex challenges.
Edward (Denny) Dahl is a Ph.D. physicist who has been at D-Wave Systems for over four years. He works with customers to help them understand the principles of adiabatic quantum computing as implemented in the D-Wave 2X System. He is currently on assignment at the Los Alamos National Laboratory, which recently purchased a one-thousand qubit system from D-Wave. His interests are quantum programming, playing the guitar and exploring the high deserts of north central New Mexico.
Brock Palen and Jeff Squyres speak with Kenneth Hoste about EasyBuild. EasyBuild is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way.
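A minimal sketch of that workflow, assuming EasyBuild's `eb` command is available on the system (the software package and toolchain names below are examples):

```shell
# Search for available easyconfig files for a given software package
eb --search GROMACS

# Build and install from an easyconfig, letting EasyBuild resolve and
# install any missing dependencies automatically (--robot)
eb GROMACS-2021.5-foss-2021b.eb --robot
```

Each easyconfig file pins a specific version and compiler toolchain, which is what makes installations reproducible across HPC sites.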
Kenneth Hoste received his Master's degree and Ph.D. in Computer Science from Ghent University in Belgium in 2005 and 2010, respectively. His research consisted of applying machine learning techniques to various problems in the analysis, estimation, and optimization of computer system performance. Particular topics include the characterization of microarchitecture-independent workload behavior, and applying evolutionary search algorithms to optimizing static and JIT compilers.
Since October 2010, he has been working in the HPC support team of Ghent University, focusing on user support. As a direct result of this, he has taken on the role of lead developer and release manager of EasyBuild, a community-powered framework written in Python that aims to tackle the ubiquitous problem of automating the tedious task of building and installing (scientific) software.
Brock Palen and Jeff Squyres speak with Todd Gamblin about Spack. Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputing centers, where many users and application teams share common installations of software on clusters with exotic architectures, using libraries that do not have a standard ABI. Spack is non-destructive: installing a new version does not break existing installations, so many configurations can coexist on the same system.
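As a rough sketch of that non-destructive model, Spack's spec syntax lets several versions and configurations of the same package coexist (the package name, version, and compiler below are examples):

```shell
# Install a package; Spack builds it and its dependencies from source
spack install hdf5

# Install a different configuration alongside it: a pinned version,
# a specific compiler (%), and an enabled variant (+mpi)
spack install hdf5@1.10.7 %gcc@11.2.0 +mpi

# List installed configurations; -l shows the hash that keeps them distinct
spack find -l hdf5
```

Because every installation is keyed by a hash of its full configuration, installing a new variant never overwrites or breaks an existing one.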
Todd is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. His research focuses on scalable tools for measuring, analyzing, and visualizing the performance of massively parallel simulations. Todd works closely with production simulation teams at LLNL, and he likes to create tools that users can pick up easily.
Frustrated with the complexity of building HPC performance tools, Todd started developing Spack two years ago to allow users to painlessly install software on big machines. Spack has since been adopted by Livermore Computing, other HPC centers, and LLNL application teams. The open source project now includes several core developers at LLNL and a rapidly growing community on GitHub. A 1.0 release is coming soon.
Cyrus is a computer scientist and group leader in the Applications, Simulations, and Quality (ASQ) division of LLNL's Computation directorate. He is the software architect of the VisIt open source visualization tool and leads major aspects of the technical direction of the project. Cyrus also provides custom data analysis solutions for large scale scientific simulations in WCI's WSC and WPD programs.
The Fasterdata Knowledge Base provides proven, operationally sound methods for troubleshooting and solving performance issues. For over 25 years, ESnet has operated an advanced research network with the goal of enabling the highest levels of performance for the Department of Energy (DOE) scientific community. During this time, our engineers have identified a common set of issues that hinder performance and we would like to share our experiences and findings in this knowledge base.
Eli Dart is a network engineer in the ESnet Science Engagement Group, which seeks to use advanced networking to improve scientific productivity and science outcomes for the DOE science facilities, their users, and their collaborators. Eli is a primary advocate for the Science DMZ design pattern, and works with facilities, laboratories, universities, science collaborations, and science programs to deploy data-intensive science infrastructure based on the Science DMZ model. Eli also runs the ESnet network requirements program, which collects, synthesizes, and aggregates the networking needs of the science programs ESnet serves.
Eli has over 15 years of experience in network architecture, design, engineering, performance, and security in scientific and research environments. His primary professional interests are high-performance architectures and effective operational models for networks that support scientific missions, and building collaborations to bring about the effective use of high-performance networks by science projects.
As a member of ESnet's Network Engineering Group, Eli was a primary contributor to the design and deployment of two iterations of the ESnet backbone network, ESnet4 and ESnet5. Prior to ESnet, Eli was a lead network engineer at NERSC, DOE's primary supercomputing facility, where he co-led a complete redesign and several years of successful operation of the high-performance network infrastructure there. In addition, Eli spent 14 years, from 1997 through 2010, as a member of SCinet, the group of volunteers that builds and operates the network for the annual IEEE/ACM Supercomputing conference series. He served as Network Security Chair for SCinet for the 2000 and 2001 conferences and was a member of the SCinet routing group from 2001 through 2010. Eli holds a Bachelor of Science degree in Computer Science from the Oregon State University College of Engineering.