This course introduces students to the challenges of managing large data sets, focusing on modeling, storing, searching, and transforming large data collections for analysis. The course covers database management and information retrieval systems, including relational database systems, massively parallel and distributed computation models (e.g., MapReduce/Hadoop), and various NoSQL systems (e.g., key-value, document, column, and graph) designed to handle extremely large-scale and complex data collections. Emphasis is placed on applying large-scale data management techniques to particular domains. Programming projects are required.
Introduction to Data Science: Management
This course is not being offered at this time. Please contact the concierge for more information.