Let's move on to the different types of troubles seen in Dataflow pipelines. Typically, a failed or problematic Apache Beam pipeline run can be attributed to one of the following causes: one, graph or pipeline construction errors when building the job graph; two, errors in job validation by the Dataflow service; three, exceptions during pipeline execution; and finally, slow-running pipelines or lack of output, which affect the performance of the pipeline.

Let's start by looking at the first type of trouble. These errors occur while Apache Beam is building your Dataflow pipeline and validating the Beam aspects as well as the input and output specifications of your pipeline. These errors can typically be reproduced using the direct runner and can be tested against using unit tests. Keep in mind that no job will be created on the Dataflow service if there is an error while building the pipeline. The example shown here depicts an error where the pipeline code is performing an illegal operation that is checked and caught while the graph is being built on the Beam side. This message should be visible in the console or terminal window where you ran your Apache Beam pipeline.

Let's move on to the second type of job troubles. Once the Dataflow service has received your pipeline's graph, the service will attempt to validate your job. This validation includes the following: making sure the service can access your job's associated Cloud Storage buckets for file staging and temporary output, checking the required permissions in your Google Cloud project, and making sure the service can access input and output sources such as files. If your job fails the validation process, you'll see an error message in the Dataflow monitoring interface, as well as in your console or terminal window if you are using blocking execution. The error displayed is an example of a situation in which the pipeline code passed construction, but the pipeline was rejected by Dataflow due to a lack of permissions in the project where the job tried to run. These errors cannot be reproduced with the direct runner; they require the Dataflow runner and potentially the Dataflow service. To iterate quickly and protect against regression, build a small test that runs your pipeline, or a fragment of it, on a tiny amount of data; since the error does not depend on scale, this isn't costly.

Let's move on to the third type of job troubles. While your job is running, you may encounter errors or exceptions in your worker code. These errors generally mean that the DoFns in your pipeline code have generated unhandled exceptions, which result in failed tasks in your Dataflow job. Exceptions in your user code are reported in the Dataflow monitoring interface. You can investigate these exceptions using the general troubleshooting workflow described in the beginning of this module. The screenshot above shows that using Cloud Logging to view the error from the Dataflow interface gives us a more detailed stack trace for the exception. Consider guarding against errors in your code by adding exception handlers. For example, if you'd like to drop elements that fail some custom input validation done in a ParDo, handle the exception within your DoFn and drop the element or output it separately. More details on this can be found in the best practices module of the Developing Pipelines with Dataflow course. You can also track failing elements in a few different ways: you can log the failing elements and check the output using Cloud Logging; you can check the Dataflow worker and worker startup logs for warnings or errors related to work item failures; and finally, you can have your ParDo write the failing elements to an additional output for later inspection, as sketched below.
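Here is a minimal sketch of that pattern, assuming the Beam Python SDK. The DoFn name, the JSON parsing and user_id check, the 'failed' tag, and the output path are illustrative placeholders rather than anything taken from the course.

```python
# Sketch of exception handling in a DoFn with a tagged "dead letter" output.
import json
import logging

import apache_beam as beam
from apache_beam import pvalue


class ParseWithDeadLetter(beam.DoFn):
    """Parses JSON records; routes elements that raise to an extra output."""

    FAILED_TAG = 'failed'  # illustrative tag name

    def process(self, element):
        try:
            # Hypothetical custom input validation: parse and require a key.
            record = json.loads(element)
            if 'user_id' not in record:
                raise ValueError('missing user_id')
            yield record  # Good elements go to the main output.
        except Exception as err:
            # Log the failing element so it is visible in Cloud Logging,
            # then emit it to the tagged output for later inspection.
            logging.warning('Routing bad element %r: %s', element, err)
            yield pvalue.TaggedOutput(self.FAILED_TAG, (element, str(err)))


with beam.Pipeline() as p:
    results = (
        p
        | 'Read' >> beam.Create(['{"user_id": 1}', 'not json'])
        | 'Parse' >> beam.ParDo(ParseWithDeadLetter()).with_outputs(
            ParseWithDeadLetter.FAILED_TAG, main='parsed'))

    parsed = results.parsed                            # main output
    failed = results[ParseWithDeadLetter.FAILED_TAG]   # dead-letter output

    # Persist the failures somewhere durable (path is a placeholder).
    _ = (
        failed
        | 'FormatFailures' >> beam.Map(repr)
        | 'WriteFailures' >> beam.io.WriteToText('/tmp/failed_records'))
```

Because the exception is handled inside the DoFn, a bad element no longer fails the work item; it lands in the failed output, where you can inspect or reprocess it later.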
It is important to note that batch and streaming pipelines have different behaviors and handle exceptions differently. In batch pipelines, the Dataflow service retries failed tasks up to four times. In streaming pipelines, failed work items are retried indefinitely, which may cause your job to stall; you will need to use other signals to troubleshoot it: high data freshness, job logs, Cloud Monitoring metrics for pipeline progress, and error counts.

Let's move on to the final type of job troubles. Multiple factors, such as pipeline design, data shape, and interactions with sources, sinks, and external systems, can affect the performance of a pipeline. These will be covered in further detail in the performance module of this course. The user interface provides useful information to debug performance problems at the step level; use it to identify expensive steps. The step info section can provide useful information, including wall time, input elements, input bytes, output elements, and output bytes. Wall time for a step provides the total approximate time spent across all threads in all workers on the following actions: initializing the step, processing data, shuffling data, and ending the step. The input element count is the approximate number of elements that the step received, and the estimated size provides the total volume of data that was received. Similarly, the output element count is the approximate number of elements produced by the step, and the estimated size provides the total volume of data that was produced.

This is the end of this module. You should now be able to use a structured approach to debug Dataflow pipelines and examine common causes for pipeline failures.