About this Course
3.6
121 ratings
33 reviews
Important note: The second assignment in this course covers the topic of Graph Analysis in the Cloud, in which you will use Elastic MapReduce and the Pig language to perform graph analysis over a moderately large dataset, about 600 GB. To complete this assignment, you will need to use Amazon Web Services (AWS). Amazon has generously offered to provide up to $50 in free AWS credit to each learner in this course so that you can complete the assignment. Further details on how to receive this credit are available in the welcome message for the course, as well as in the assignment itself. Please note that Amazon, University of Washington, and Coursera cannot reimburse you for any charges if you exhaust your credit. While we believe that this assignment contributes an excellent learning experience in this course, we understand that some learners may be unable or unwilling to use AWS. We are unable to issue Course Certificates to learners who do not complete the assignment that requires use of AWS. As such, you should not pay for a Course Certificate in Communicating Data Results if you are unable or unwilling to use AWS, as you will not be able to successfully complete the course without doing so.

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Just because you can make a prediction and convince others to act on it doesn’t mean you should. In this course you will explore the ethical considerations around big data and how these considerations are beginning to influence policy and practice. You will learn the foundational limitations of using technology to protect privacy and the codes of conduct emerging to guide the behavior of data scientists. You will also learn the importance of reproducibility in data science and how the commercial cloud can help support reproducible research even for experiments involving massive datasets, complex computational infrastructures, or both.

Learning Goals: After completing this course, you will be able to:
1. Design and critique visualizations
2. Explain the state of the art in privacy, ethics, and governance around big data and data science
3. Use cloud computing to analyze large datasets in a reproducible way
100% online courses

Start right away and learn at your own pace.

Flexible deadlines

Reset deadlines to fit your schedule.

Approx. 10 hours to complete

Suggested: 4 hours/week

English

Subtitles: English

Syllabus - What you will learn in this course

Week 1
3 hours to complete

Visualization

Statistical inferences from large, heterogeneous, and noisy datasets are useless if you can't communicate them to your colleagues, your customers, your management and other stakeholders. Learn the fundamental concepts behind information visualization, an increasingly critical field of research and increasingly important skillset for data scientists. This module is taught by Cecilia Aragon, faculty in the Human Centered Design and Engineering Department....
14 videos (Total: 49 min), 1 quiz
02 Introduction: Motivating Examples (3 min)
03 Data Types: Definitions (3 min)
04 Mapping Data Types to Visual Attributes (3 min)
05 Data Types Exercise (2 min)
06 Data Types and Visual Mappings Exercises (4 min)
07 Data Dimensions (3 min)
08 Effective Visual Encoding (3 min)
09 Effective Visual Encoding Exercise (2 min)
10 Design Criteria for Visual Encoding (2 min)
11 The Eye is not a Camera (4 min)
12 Preattentive Processing (4 min)
13 Estimating Magnitude (3 min)
14 Evaluating Visualizations (3 min)
Week 2
1 hour to complete

Privacy and Ethics

Big Data has become closely linked to issues of privacy and ethics: as the limits on what we *can* do with data continue to evaporate, the question of what we *should* do with data becomes paramount. Working through case studies, you will learn the core principles of codes of conduct for data science and statistical analysis. You will also learn the limits of current theory on protecting privacy while still permitting useful statistical analysis. ...
14 videos (Total: 85 min)
Barrow Study Problems (4 min)
Reifying Ethics: Codes of Conduct (6 min)
ASA Code of Conduct: Responsibilities to Stakeholders (4 min)
Other Codes of Conduct (6 min)
Examples of Codified Rules: HIPAA (3 min)
Privacy Guarantees: First Attempts (6 min)
Examples of Privacy Leaks (6 min)
Formalizing the Privacy Problem (7 min)
Differential Privacy Defined (9 min)
Global Sensitivity (5 min)
Laplacian Noise (4 min)
Adding Laplacian Noise and Proving Differential Privacy (5 min)
Weaknesses of Differential Privacy (7 min)
Week 3
4 hours to complete

Reproducibility and Cloud Computing

Science is facing a credibility crisis because many results cannot be reliably reproduced, and as research becomes increasingly computational, the problem paradoxically seems to be getting worse. But reproducibility is not just for academics: data scientists who cannot share, explain, and defend their methods for others to build on are dangerous. In this module, you will explore the importance of reproducible research and how cloud computing is offering new mechanisms for sharing code, data, environments, and even costs that are critical for practical reproducibility....
17 videos (Total: 71 min), 2 quizzes
Reproducibility Gold Standard (5 min)
Anecdote: The Ocean Appliance (4 min)
Code + Data + Environment (3 min)
Cloud Computing Introduction (2 min)
Cloud Computing History (5 min)
Code + Data + Environment + Platform (3 min)
Cloud Computing for Reproducible Research (3 min)
Advantages of Virtualization for Reproducibility (5 min)
Complex Virtualization Scenarios (3 min)
Shared Laboratories (3 min)
Economies of Scale (4 min)
Provisioning for Peak Load (2 min)
Elasticity and Price Reductions (5 min)
Server Costs vs. Power Costs (2 min)
Reproducibility for Big Data (5 min)
Counter-Arguments and Summary (4 min)
Quiz: 1 practice exercise
AWS Credit Opt-in Consent Form (2 min)

Instructor

Bill Howe

Director of Research
Scalable Data Analytics

About University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Data Science at Scale Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

Have more questions? Visit the Learner Help Center.