About this Course
47,318 recent views

100% online

Start right away and learn at your own pace.

Flexible deadlines

Reset deadlines to fit your schedule.

Beginner level

Approx. 19 hours to complete

Suggested: 8 hours/week...

English

Subtitles: English

Skills you will gain

Big Data, MongoDB, Splunk, Apache Spark

Syllabus - What you will learn in this course

1 hour to complete

Welcome to Big Data Integration and Processing

Welcome to the third course in the Big Data Specialization. This week you will be introduced to basic concepts in big data integration and processing. You will be guided through installing the Cloudera VM, downloading the data sets to be used for this course, and learning how to run the Jupyter server.

3 videos (Total 18 min), 6 readings
3 videos
Summary of Big Data Modeling and Management (7 min)
Why is Big Data Processing Different? (8 min)
6 readings
Slides: Summary & Why Is Big Data Processing Different (10 min)
Downloading and Installing the Cloudera VM Instructions (Windows) (10 min)
Downloading and Installing the Cloudera VM Instructions (Mac) (10 min)
Software Installation Frequently Asked Questions (FAQ) (10 min)
Instructions for Downloading Hands On Datasets (10 min)
Instructions for Starting Jupyter (10 min)
1 hour to complete

Retrieving Big Data (Part 1)

This module covers the various aspects of data retrieval and relational querying. You will also be introduced to the Postgres database.

5 videos (Total 40 min), 2 readings
5 videos
What is Data Retrieval? Part 2 (7 min)
Querying Two Relations (8 min)
Querying Relational Data with Postgres (6 min)
2 readings
Slides: What is Data Retrieval? (10 min)
Querying Relational Data with Postgres (20 min)
2 hours to complete

Retrieving Big Data (Part 2)

This module covers the various aspects of data retrieval for NoSQL data, as well as data aggregation and working with data frames. You will be introduced to MongoDB and Aerospike, and you will learn how to use Pandas to retrieve data from them.

5 videos (Total 50 min), 3 readings, 2 quizzes
5 videos
Aggregation Functions (9 min)
Querying Aerospike (6 min)
Querying Documents in MongoDB (11 min)
Exploring Pandas DataFrames (5 min)
3 readings
Slides: Querying Data Part 2 (10 min)
Querying Documents in MongoDB (10 min)
Exploring Pandas DataFrames (20 min)
2 practice exercises
Retrieving Big Data Quiz (20 min)
Postgres, MongoDB, and Pandas (20 min)
3 hours to complete

Big Data Integration

In this module you will be introduced to data integration tools including Splunk and Datameer, and you will gain some practical insight into how information integration processes are carried out.

11 videos (Total 83 min), 4 readings, 2 quizzes
11 videos
A Data Integration Scenario (13 min)
Integration for Multichannel Customer Analytics (6 min)
Big Data Management and Processing Using Splunk and Datameer (1 min)
Why Splunk? (3 min)
Connected Cars with Ford's OpenXC and Splunk (3 min)
Big Data Management and Processing using Datameer (15 min)
Installing Splunk Enterprise on Windows (2 min)
Installing Splunk Enterprise on Linux (4 min)
Exploring Splunk Queries (5 min)
Optional: Creating Pivot Reports in Splunk (8 min)
4 readings
Slides: Information Integration (10 min)
Downloading Splunk Enterprise (10 min)
Exploring Splunk Queries (20 min)
Optional: Instructions for Splunk Pivot Tutorial (10 min)
2 practice exercises
Information Integration - Quiz (14 min)
Hands-On With Splunk (15 min)
3 hours to complete

Processing Big Data

This module introduces learners to big data pipelines and workflows, as well as to processing and analyzing big data with Apache Spark.

9 videos (Total 74 min), 4 readings, 2 quizzes
9 videos
Some High-Level Processing Operations in Big Data Pipelines (8 min)
Aggregation Operations in Big Data Pipelines (5 min)
Typical Analytical Operations in Big Data Pipelines (10 min)
Overview of Big Data Processing Systems (7 min)
The Integration and Processing Layer (8 min)
Introduction to Apache Spark (8 min)
Getting Started with Spark (10 min)
WordCount in Spark (8 min)
4 readings
Big Data Processing Pipelines Slides (10 min)
Big Data Workflow Management (10 min)
Slides for Big Data Processing Tools and Systems (10 min)
WordCount in Spark (20 min)
2 practice exercises
Pipeline and Tools (18 min)
WordCount in Spark (8 min)
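
For readers unfamiliar with the WordCount exercise named in this module, the idea can be sketched in plain Python. This is not the course's Spark code and does not use Spark; it only mirrors the usual split, map, and reduce-by-key steps on a small in-memory list. The function name `word_count` and the sample lines are assumptions made for illustration.

```python
from collections import Counter
import re


def word_count(lines):
    """Count word occurrences across an iterable of text lines.

    Hypothetical helper for illustration; it is not part of the course material.
    """
    # Split step (comparable to a flatMap): break every line into lowercase words.
    words = (w for line in lines for w in re.findall(r"[a-z']+", line.lower()))
    # Map and reduce-by-key steps: Counter adds one per occurrence of each word.
    return Counter(words)


if __name__ == "__main__":
    sample = ["to be or not to be", "that is the question"]
    for word, count in word_count(sample).most_common(3):
        print(word, count)
```

In Spark the same steps would run in parallel across partitions of a large file; the single-process version here is only meant to show the shape of the computation.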
301 reviews


started a new career after completing these courses


got a tangible career benefit from this course


got a pay increase or promotion

Top reviews from Big Data Integration and Processing

by AA, Mar 6th 2018

It was a good course, it could have been better if some examples of Spark were also provided in other Languages like Java, people without having background of python may find it difficult.

by SK, Dec 8th 2016

The assessments are quite tough compared to the course examples. Moreover, some programming basics should be given or made to understand, especially in Spark, as these are very



Ilkay Altintas

Chief Data Science Officer
San Diego Supercomputer Center

Amarnath Gupta

Director, Advanced Query Processing Lab
San Diego Supercomputer Center (SDSC)

About University of California San Diego

UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory....

About the Big Data Specialization

Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions. Do you need to understand big data and how it will impact your business? This Specialization is for you. You will gain an understanding of what insights big data can provide through hands-on experience with the tools and systems used by big data scientists and engineers. Previous programming experience is not required! You will be guided through the basics of using Hadoop with MapReduce, Spark, Pig and Hive. By following along with provided code, you will experience how one can perform predictive modeling and leverage graph analytics to model problems. This specialization will prepare you to ask the right questions about data, communicate effectively with data scientists, and do basic exploration of large, complex datasets. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned to do basic analyses of big data....

Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you enroll in a course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.