This course helps students build big data systems using an architecture that combines clustered hardware with new tools designed specifically to capture and analyze web-scale data. We will start by looking at the theory of big data systems. Then we will learn how to implement big data systems in practice and how to deploy and operate them once they are built.
Topics include: scalable computing models; large-scale non-traditional data storage frameworks, including graph, key-value, and column-family storage systems; data stream analysis; scalable prediction models; in-memory storage systems; and Hadoop, Storm, and NoSQL databases.
Knowledge of statistics.
Upon completion of this course, students will be able to:
To Be Determined
To Be Determined by the Instructor
Nathan Marz and James Warren: Big Data: Principles and best practices of scalable realtime data systems, First Edition (Manning Publications, 2015). ISBN: 1617290343, ISBN-13: 978-1617290343.
Anil Maheshwari: Data Analytics Made Accessible (Amazon Digital Services, May 2014). ASIN: B00K2I2JL8.
Class Participation. Students must bring their laptops to class, participate in course learning activities, and contribute fully to the completion of group projects.
Homework. Homework is assigned daily. Students should expect to spend about 2 hours on reading and homework assignments for every hour of in-class time.
Assignments/Midterms/Final Exam/Group Project. Assignments, projects, and exams will be determined by the instructor.
Remember: Programs will be graded based on completeness and correctness. No credit will be given for programs that do not compile. No credit will be given for late assignments.
If a student is aware of extenuating circumstances that warrant an extension, a request for an extension can be made to the instructor. An extension request must have a good reason to be considered, and should be made well in advance if possible. Extensions requested after the time the assignment is due will be considered only in extreme circumstances. If an extension is granted, the normal penalties will not be applied until the granted extension period has lapsed.
Exceptions can generally be made for serious illness, family emergencies, and the like. Exceptions will not be granted for poor planning, poor time management, or a heavy workload.
Students are to show respect to the instructor, teaching assistants, and fellow students.
Students are expected to attend all class meetings; to be on time to class and stay until the class is complete; to silence cell phones; to be alert and attentive; and to use laptops only for class purposes.
Absences may be excused only in special cases, such as university-approved events or a death in the immediate family. Absences require proper documentation and instructor approval. Unexcused absences may result in a significant penalty.
Policies on cheating, plagiarism, incomplete grades, attendance, discrimination, sexual harassment, and student grievances are described in the Student Handbook.
For up-to-date course requirements, readings, assignments, and announcements, please refer to the course website.
IT IS YOUR RESPONSIBILITY TO CHECK