DOPS-242: Ingesting with Cloudera DataFlow

Duration: 4 Days (32 Hours)

DOPS-242: Ingesting with Cloudera DataFlow Course Overview:

In a data-centric organization, managing data ingestion and flow across complex ecosystems is critical. Does your team have the tools and expertise to do it?

Apache NiFi is the core focus of this four-day course. The training covers the foundational principles and hands-on practice needed to automate the full data lifecycle with NiFi: ingress, flow, transformation, and egress. The curriculum also covers tuning, troubleshooting, and monitoring dataflows, and shows how to integrate a dataflow with the Cloudera CDP Hybrid ecosystem and with external systems.

Intended Audience:

  • This course is designed for developers, data engineers, administrators, and others with an interest in learning NiFi’s innovative no-code, graphical approach to data ingest.

Learning Objectives of DOPS-242: Ingesting with Cloudera DataFlow:

During this course, you will learn how to:

  • Define, configure, organize, and manage dataflows 
  • Transform and trace data as it flows to its destination 
  • Track changes to dataflows with NiFi Registry 
  • Use the NiFi Expression Language to control dataflows 
  • Optimize dataflows for better performance and maintainability
  • Connect dataflows with other systems, such as Apache Kafka, Apache Hive, and HDFS
  • Utilize the Data Flow Service

Introduction to Cloudera Flow Management
  • Overview of Cloudera Data-in-Motion
  • The NiFi User Interface
  • DataFlow Catalog
  • ReadyFlows
  • Instructor-Led Demo: NiFi User Interface
  • Hands-On Exercise: Build Your First Dataflow
  • Overview of Processors
  • Processor Surface Panel
  • Processor Configuration
  • Hands-On Exercise: Start Building a Dataflow Using Processors
  • Overview of Connections
  • Connection Configuration
  • Connector Context Menu
  • Hands-On Exercise: Connect Processors in a Dataflow
  • Command and Control of a Dataflow
  • Processor Relationships
  • Back Pressure
  • Prioritizers
  • Labels
  • Hands-On Exercise: Build a More Complex Dataflow
  • Hands-On Exercise: Creating a Fork Using Relationships
  • Hands-On Exercise: Set Back Pressure Thresholds
  • Anatomy of Process Group
  • Input and Output Ports
  • Hands-On Exercise: Simplify Dataflows Using Process Groups
  • Data Provenance Events
  • FlowFile Lineage
  • Replaying a FlowFile
  • Hands-On Exercise: Using Data Provenance
  • Querying Record Data
  • QueryRecord Processor
  • Writing Record Data
  • Hands-On Exercise: TBD (Creating a function to read and write data?)
  • ETL Operations
  • Split and Join Processor
  • Update Record Processors
  • Wait and Notify Processors
  • NiFi Architecture Overview
  • Public Cloud Architecture
  • Private Cloud Architecture
  • Overview
  • Serverless functions
  • Demo: Deploying a Flow Definition as a Function
  • Parameter Contexts
  • Referencing Parameters
  • Managing Parameters
  • Migrating from Variables 
  • Hands-On Exercise: Creating, Using, and Managing Parameters
  • Flow Definition Overview
  • Creating a Flow Definition
  • Importing and Deploying a Flow
  • Using (migrating from) Templates
  • Hands-On Exercise: Creating, Using, and Managing Flow Definitions
  • Apache NiFi Registry Overview
  • Using the Registry
  • Hands-On Exercise: Versioning Flows Using NiFi Registry
  • FlowFile Attribute Overview
  • Routing on Attributes
  • Hands-On Exercise: Working with FlowFile Attributes
  • NiFi Expression Language Overview
  • Syntax
  • Expression Language Editor
  • Setting Conditional Values
  • Hands-On Exercise: Using the NiFi Expression Language
  • Controller Services Overview
  • Common Controller Services
  • Hands-On Exercise: Adding Apache Hive Controller
  • Record-oriented data
  • Record-based Processors
  • Avro Schema Registry
  • Schema Format
  • Dataflow Optimization
  • Control Rate
  • Managing Compute
  • Hands-On Exercise: Building an Optimized Dataflow
  • Monitoring from NiFi
  • Reporting
  • Examples of Common Reporting Tasks
  • Hands-On Exercise: Monitoring and Reporting
  • NiFi Security Overview
  • Securing Access to the NiFi UI
  • Metadata Management
  • NiFi Integration Architecture
  • Available ReadyFlows
  • A Closer Look at NiFi and Apache Hive

DOPS-242: Ingesting with Cloudera DataFlow Course Prerequisites:

  • Although programming experience is not required, basic experience with Linux is presumed, and previous exposure to big data concepts and applications is helpful.

Discover the perfect fit for your learning journey

Choose Learning Modality

Live Online

  • Convenience
  • Cost-effective
  • Self-paced learning
  • Scalability

Classroom

  • Interaction and collaboration
  • Networking opportunities
  • Real-time feedback
  • Personal attention

Onsite

  • Familiar environment
  • Confidentiality
  • Team building
  • Immediate application

Training Exclusives

This course comes with the following benefits:

  • Practice labs
  • Training by certified trainers
  • Access to the recordings of your class sessions for 90 days
  • Digital courseware
  • 24/7 learner support
