The volume of data being produced and stored by organizations is increasing. In fact, IDC has projected that it will reach 40 zettabytes by 2020. Organizations understand that being able to extract value and gain actionable insights from this big data can give them a tremendous competitive advantage. In this course, students learn about Hadoop, an open-source framework consisting of tools for distributed storage and data processing, and about its evolution. The course addresses distributed storage and large data set processing, focusing on architectures and technologies, specifically Hadoop. Additionally, students learn about other elements in the Hadoop ecosystem, NoSQL databases, and competing technologies. Students also install, set up, and use Hadoop on a single node.
Prerequisites
X 450.1 Introduction to Data Science or prior knowledge of R and Python recommended, or consent of instructor.
Applies Towards the Following Certificates
- Applications Programming : Electives
- Applications Programming in C# .NET : Electives
- Data Science : Required
- Database Management : Electives
- Linux/Unix : Electives
- Operating System Administration : Electives
- Study Abroad at UCLA Program : Required
- Systems Analysis : Electives
- Web Technology : Electives