CIS-4620/5620 (Big Data Processing) Home Page

This is the home page for Peter Chapin's CIS-4620/5620 course notes for the Fall 2016 semester. Here you will find class handouts, slides used during the lectures, homework assignments, and links to other references of interest.

File                    Size                   MD5
Jumbo-2017-01-01.ova    7,521,746,432 bytes    4375fea8a268249e6fae9dfab14f9df5
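
The size and MD5 listed above can be used to check that a download of the file is complete and uncorrupted. The following is only a minimal sketch in Python, not part of the course materials; the function names `md5_of_file` and `verify_download` are hypothetical, and the file is assumed to be saved locally under its listed name.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing above.
EXPECTED_SIZE = 7_521_746_432
EXPECTED_MD5 = "4375fea8a268249e6fae9dfab14f9df5"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a multi-gigabyte file never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_size=EXPECTED_SIZE, expected_md5=EXPECTED_MD5):
    """Hypothetical helper: check the size first (cheap), then the MD5
    (which requires reading the whole file)."""
    if Path(path).stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Calling `verify_download("Jumbo-2017-01-01.ova")` from the directory holding the download should return `True` only if both the size and the checksum match the listing.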

Adobe Connect

All live lectures will be accessed from the same URL.

The list below gives approximate lecture-by-lecture topics for this course. The topics with links to Adobe Connect lectures are for this (Fall 2016) edition of the course. The topics without links are tentative and subject to change.




  1. Homework #01 (Due: 2016-09-02) Average Temperatures (Awk)
  2. Homework #02 (Due: 2016-09-09) Average Temperatures (Hadoop)
  3. Homework #03 (Due: 2016-09-16) Hadoop Pseudo-Distributed Mode
  4. Homework #04 (Due: 2016-09-24) Analyzing the Ga Data Set
  5. Homework #05 (Not Assigned!) Ga on the Cluster
  6. Homework #06 (Due: 2016-09-26) Ga with Spark
  7. Homework #10 (Due: 2016-11-11) Ga on the Cluster
  8. Homework #13 (Due: 2016-12-09) Kafka

Homework assignments below this point are subject to change.


The following are links to relevant resources for this class.

Last Revised: 2016-12-30
© Copyright 2016 by Peter C. Chapin