Cloud Computing for Students
by Phil Windley
August 2010
For the past ten years, I've taught a class at Brigham Young University
aimed at teaching students how to build large-scale distributed systems.
The goal of the class has always been for students to learn how to
build modern software applications that utilize multiple computers. The
focus has been on assembling pieces of software rather than writing
code.
The problem was that, until recently, providing each
student in the class with even one machine that they could use for the
entire semester was difficult. Yet without control of the machine, they
couldn't experience having root access, installing packages, messing it
all up, and starting over.
Lately, however, I've been running the
class project the way I've always wanted to, thanks to Amazon's Elastic
Compute Cloud (EC2), Simple Queue Service (SQS), and Simple Storage
Service (S3). Using EC2, students can create as many machines as they
need, and SQS gives them the infrastructure to hook these machines to
each other, and to other services on the Web, asynchronously.
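To make "asynchronously" concrete, here is a minimal sketch of the pattern a queue enables. It is not SQS code: Python's standard-library queue.Queue stands in for a hosted queue, a thread stands in for a second machine, and the function name run_pipeline and the sample jobs are invented for illustration.

```python
import queue
import threading


def run_pipeline(jobs):
    """Pass jobs from a producer to a worker through a queue; return the results."""
    work = queue.Queue()  # assumption: in-process stand-in for a hosted queue such as SQS
    results = []

    def worker():
        # The worker only knows about the queue, never about the producer.
        while True:
            job = work.get()
            if job is None:  # sentinel value: no more work is coming
                break
            results.append(job.upper())  # placeholder for real processing

    consumer = threading.Thread(target=worker)
    consumer.start()

    for job in jobs:
        work.put(job)  # the producer enqueues and moves on without waiting
    work.put(None)

    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["resize image", "index page"]))
```

The point of the design is that neither side calls the other directly, so either one can be restarted, replaced, or rewritten in another language as long as the message format stays the same.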
The project comprises multiple layers that provide
different parts of the overall application, including a data layer.
Students write each part using any programming language they like. They
are encouraged to put each on a different machine. We run a load
balancer for each layer to provide a single URL for that layer’s API. In
the end, the entire system runs on dozens of machines, with components
written in multiple programming languages by many people. Carefully
specified RESTful interfaces and a TA-enforced test suite keep it all
working together smoothly.
Without
opportunities like this, many CS students graduate without ever writing
anything more sophisticated than an application written in one
programming language that runs on a single machine. Modern applications
are rarely that simple, and training students in new technologies
requires access to the right computing platforms.
This need was
recognized several years ago in a joint research initiative commissioned
by Google and IBM ("Google and I.B.M. Join in 'Cloud Computing'
Research," New York Times, October 8, 2007). In the article, Randy
Bryant, the chair of Carnegie Mellon's CS program, is quoted: "We in
academia and the government labs have not kept up with the times ...
universities really need to get on board."
The fact is that Amazon's
service and others, including the Eucalyptus project from the University
of California, Santa Barbara, have taken a big step toward solving that
problem. The cost for EC2 is $0.10 per hour of compute time. With some
careful management of the EC2 cloud (like making sure machines aren't
left running when they don't need to be), I'm able to run the class
project for less than $40 per student. That's cheaper than the textbooks
for many classes.
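The arithmetic behind that figure is worth spelling out. The sketch below uses only the two numbers given above ($0.10 per hour and $40 per student); the 15-week semester and the function names are assumptions added for illustration. Amounts are kept in whole cents to avoid floating-point rounding.

```python
RATE_CENTS_PER_HOUR = 10   # $0.10 per instance-hour, from the text
BUDGET_CENTS = 4000        # $40 per student, the upper bound from the text


def max_instance_hours(budget_cents=BUDGET_CENTS, rate_cents=RATE_CENTS_PER_HOUR):
    """Instance-hours a student can use before exceeding the budget."""
    return budget_cents // rate_cents


def cost_cents(machines, hours_each, rate_cents=RATE_CENTS_PER_HOUR):
    """Cost of running `machines` instances for `hours_each` hours apiece."""
    return machines * hours_each * rate_cents


if __name__ == "__main__":
    semester_hours = 15 * 7 * 24  # assumption: a 15-week semester
    print("Hours within budget:", max_instance_hours())
    print("One machine left on all semester: $%.2f"
          % (cost_cents(1, semester_hours) / 100))
```

Under these assumptions, $40 buys 400 instance-hours, while a single machine left running for the whole semester would cost $252. That gap is why shutting down idle machines matters so much.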
This leads to the second problem: There aren't
many good texts in this area. In fact, there aren't any that I know of
that are written as college texts! Most are "how-to" books that
emphasize specific technologies over general principles. Students need
to understand principles even as they experiment with the technologies
of today. That way they'll easily adjust to the technologies of
tomorrow.
The final problem is that there aren't a lot of
professors who fully understand these technologies. Many understand the
ideas, but have never actually used them. We need summer training that
can help faculty get up to speed. Maybe IBM, Google, or Amazon would
like to help with that.
Years ago, the computer chip business
faced a similar dilemma. Students had a tough time learning about chip
design and fabrication for all the reasons we've discussed above. The
solution was industry and government working together to put programs in
place that made it practical to teach the subject. The MOSIS Integrated
Circuit Fabrication Service, Mead and Conway's "Introduction to VLSI
Systems," and National Science Foundation summer camps for faculty were
vital components. The same model could have a dramatic impact for the
future of distributed systems.
Nevertheless, training students to
write distributed applications and to manage the various pieces that
they are made from is not only doable using services like Amazon's, but
also vital in giving students the proper grounding for much of what
they'll be asked to do as they join the workforce.
Biography
Phil Windley is an adjunct professor of computer science at Brigham Young University and chief technology officer at Kynetx, Inc.