Course Details

CS4830 - Big Data Laboratory

Course Data :

This course is meant for the interdisciplinary dual degree students.

Description: This course introduces students to the practical aspects of analytics at large scale, i.e., big data. It starts with a basic introduction to big data concepts spanning hardware, systems and software, and then delves into the following topics.

Course Content:
1. Introduction to Big Data concepts: divide-and-conquer, parallel algorithms, distributed virtualized storage, distributed resource management, orchestration and scheduling, lambda architecture, data flow paradigm, real-time event processing.
2. Big Data Technology: Map-Reduce using Python, Spark for batch processing, Spark SQL, data flow processing libraries (Beam, Spark Streaming, Flink).
3. Hardware Concepts: shared-nothing MPP architecture, cloud architecture, GPU-based acceleration and processing.
4. Analytics at Large Scale: libraries of algorithms including Spark MLlib and H2O; integrations with TensorFlow and PyTorch; ML on cloud; use of Zeppelin and Databricks Notebooks.

Textbooks: None

Reference Books:
1. Mining of Massive Datasets - Jure Leskovec, Anand Rajaraman and Jeff Ullman. Second Edition. Cambridge University Press, 2014.
2. Big Data Analytics using Spark - https://www.edx.org/course/big-data-analytics-using-spark-0
3. Developing Big Data Solutions using Azure Machine Learning - https://www.edx.org/course/developing-big-data-solutions-azure-machine-learning-0

Prerequisite: EE4708: Data Analytics Laboratory

Pre-Requisites

EE4708: Data Analytics Laboratory

Parameters

Credits	Type	Date of Introduction
	Core	Nov 2018

Previous Instances of the Course

Jan 2023 - May 2023
Instructor(s) : Balaraman Ravindran.

Jan 2022 - Apr 2022
Instructor(s) : Balaraman Ravindran.
Teaching Assistants : Gudivada Harsha Vardhan, Karthikeyan S, N Kausik.

Feb 2021 - May 2021
Instructor(s) : Arun Rajkumar.
Teaching Assistants : Depen Morwani, Mahendra Lacheta, Vakada Naveen, Arun Kumar A, Shivangi Shreya, Arup Das, Sahil.

Jan 2020 - May 2020
Instructor(s) : Balaraman Ravindran.
Teaching Assistants : B Krishnanjali, Pranshu Malviya, Rahul Vashisht.

Jan 2019 - May 2019
Instructor(s) : Balaraman Ravindran.
Teaching Assistants : Abdul Hafeez Kozhithodi, Beeram Akshay Kumar Reddy.

Department of Computer Science & Engineering

Indian Institute of Technology Madras, Chennai, India.
