CS228A: Topics in Distributed Systems
Fall 2009: Web-based (Cloud) Computing

NEWS:
  • Project proposals are due on September 25.
  • Details about projects can be found here.
  • The first three lectures have been uploaded to LATTE.
  • Check the lecture schedule for updates.
Course Content
 
Recent developments in web technology have created a new paradigm for storing, accessing and processing massive data. Web companies like Amazon, Google, Yahoo and Microsoft provide web-based infrastructures that offer on-line storage and computing facilities as a service. Such services are typically hosted in large data centers, known as computing clouds, using clusters of commodity machines that are shared among users. The applications developed in these clouds differ from traditional high-performance systems: they are data-intensive systems that acquire and maintain continuously changing data sets, in addition to performing large-scale computations over the data. These systems have the potential to achieve major breakthroughs in science, health care, and information access, as well as to open new research directions in topics like resource and data management, parallel programming, parallel algorithms and system design.
 
This course will investigate the state of the art in web-based, data-intensive computing in cloud environments. We will study the existing programming models, platforms and tools for data-intensive computing, as well as active research projects at academic institutions. We will also examine several well-known industrial applications as case studies. The course will primarily consist of technical readings and discussions.
 
Course Information

Time: Tuesday-Friday 1:40-3:00pm
Location: Volen 106
Instructor: Olga Papaemmanouil
Prerequisite: Background in database systems and distributed systems is recommended but not required. The first lectures will provide the relevant database and networking background. Graduate students should email olga@cs.brandeis.edu to get a permission code. Undergraduates must obtain the instructor's consent.
 
Course Objective
 
COSI 228 is a reading-intensive seminar course at the intersection of databases and distributed systems. The theme of the course for Fall 2009 is Cloud Computing. The objective of this class is to teach you to:
 
  1.  Grasp basic concepts in distributed and parallel computing.
  2.  Understand the basic principles of web services.
  3.  Understand the resource sharing and storage approaches used by leading web companies (Yahoo, Google, Amazon, etc.).
  4.  Understand the state of the art in data-intensive processing languages.
  5.  Learn how to develop data-intensive applications on Amazon’s Web Services.  
 
Course Structure
 
We will meet twice a week, and each meeting will include the presentation of research papers. The first class will cover the basic logistics of the course, and the following three will be lectures on the basic principles of databases and distributed systems. All remaining meetings will be seminar style (40-50 minutes of presentation followed by discussion).
 
Readings
 
The course will be reading-intensive, with readings drawn from numerous papers that will be available on this web site. Students are required to read the papers and submit their reviews before each meeting. The list of readings is available here.
 
Evaluation
 
Details on student evaluation are available here.
 
Course Skills
 
This course requires (and will also help you hone) two important skills: a) reading a research paper and b) giving good presentations. Below are some useful tips on these: