Course Description
This course introduces the student to the analytics of big data across an analytical ecosystem. Whether the data sets come from a data warehouse, where the schema is defined before the data is stored in the database (schema on write), or the data is simply landed without a schema, which is defined later when the data is retrieved (schema on read), each source must produce data sets that can be related to each other so that they can be joined to produce an analytical result that generates new insights for the business. This course explores the big data phenomenon and the impact of big data on traditional relational databases. You examine big data technologies such as Hadoop and NoSQL and revisit data warehousing to redefine the analytical ecosystem as a combination of programming languages (polyglot programming) and a combination of storage solutions (polyglot persistence). You review information management across the analytical ecosystem and the importance of distributing analytics across the ecosystem. In the second half of the course, you apply the concepts of the analytical ecosystem individually within specific data analytic technologies and across analytic technologies.
Recommended prerequisites: COM SCI X 414.51 Relational Database Management, COM SCI X 450.1 Introduction to Data Science, COM SCI X 450.3 Hadoop and Managing Big Data, or consent of instructor.