CIS-4620/5620 (Big Data Processing) Home Page

This is the home page for Peter Chapin's CIS-4620/5620 course notes for the Fall 2016 semester. Here you will find class handouts, slides used during the lectures, homework assignments, and links to other references of interest.

Jumbo-2017-01-01.ova 7,521,746,432 bytes 4375fea8a268249e6fae9dfab14f9df5

Adobe Connect

All live lectures will be accessed from the same URL.

The list below gives the lecture-by-lecture topics for this course. Topics with links to Adobe Connect lectures are for this (Fall 2016) edition of the course. Topics without links are approximate and subject to change.




  1. Homework #01 (Due: 2016-09-02) Average Temperatures (Awk)
  2. Homework #02 (Due: 2016-09-09) Average Temperatures (Hadoop)
  3. Homework #03 (Due: 2016-09-16) Hadoop Pseudo-Distributed Mode
  4. Homework #04 (Due: 2016-09-24) Analyzing the Ga Data Set
  5. Homework #05 (Not Assigned!) Ga on the Cluster
  6. Homework #06 (Due: 2016-09-26) Ga with Spark
  7. Homework #10 (Due: 2016-11-11) Ga on the Cluster
  8. Homework #13 (Due: 2016-12-09) Kafka

The following are links to relevant resources for this class.

Last Revised: 2016-12-30
