Data Engineering BPMN

[Diagram: BPMN pool with three lanes: Edge, Cleaning, Stream Processing]

Task labels in the diagram:

Transfer Data To Where It Is Needed (eg NiFi)
Determine if Event Is Notable (eg Temperature change > 0.1C)
Feed To Big Data Landing Zone (eg Kafka)
Local Models React To Event Using local data (eg H2O)
Send to Disaster Recovery Site / Audit / Logs / Other Downstream Systems
Streaming Data Cleaners Check Data (eg JSON matches Schema)
Feed To Manual Investigation Zone (eg Kafka)
Is there GDPR or Personal info?
Store Private Data To Special Secure DB (Possibly with tokens)
Feed To "Good" Landing Zone (eg Kafka)
Streaming System Responds to Event (eg Flink / Spark Streaming)
Alert, or other action
Feed To "Good" Landing Zone New Topic (eg Kafka)
Store "temporarily" for online reporting
Store for Long Term Analytics (eg HDFS / S3 / DataLake)
Feed to "other" Specialised databases

Event and data state labels:

Event detected
Event In "Raw" Landing Zone eg Kafka
Event In "Good" Landing Zone eg Kafka
Data Ready for future batch historic queries

Flow and gateway labels:

OK
Not Ok
Ignore
Private
tokenised or clean
other teams
organisation
local needs
n days
?

publish time: 2021-09-18
Kiraaaa

Swimlanes in BPMN convey which parts of a process are carried out by different people or roles. This data engineering BPMN (Business Process Model and Notation) diagram has three lanes. 'Edge' is where data is generated. It is typically not in the same location as the Big Data infrastructure and, in the case of IoT, may not even be in a data center. The next lane is 'Cleaning Etc.', where data events are checked for valid formats and for personal or private information. The last lane is 'Stream Processing', which works on the data that has passed through 'Cleaning Etc.' The diagram starts with the event at the top left: something has happened, and we have measured it. With streaming, rather than waiting for hours, we react to events almost as soon as they are generated.
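
The first two lanes can be illustrated with a minimal Python sketch. It is not taken from the template: the field names (`sensor_id`, `temperature_c`, `owner_email`), the destination names, and the hash-based token are all hypothetical stand-ins, and a real pipeline would publish to a system such as Kafka rather than return values. Only the 0.1C threshold and the "JSON matches schema" check come from the diagram's own examples.

```python
import hashlib
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"sensor_id": str, "temperature_c": float}
# Hypothetical set of fields treated as personal data.
PRIVATE_FIELDS = {"owner_email"}
NOTABLE_DELTA_C = 0.1  # threshold from the diagram's example


def is_notable(previous_c, current_c, delta=NOTABLE_DELTA_C):
    """Edge lane: notable if the temperature moved by more than delta.

    Treating the very first reading (no previous value) as notable
    is an assumption of this sketch.
    """
    return previous_c is None or abs(current_c - previous_c) > delta


def matches_schema(event):
    """Cleaning lane: every schema field is present with the right type."""
    return all(isinstance(event.get(name), kind) for name, kind in SCHEMA.items())


def tokenise(value):
    """Stand-in token: a truncated SHA-256 digest, not a production scheme."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def clean(raw):
    """Return (destination, event, private_record) for one raw JSON message."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return "manual_investigation", None, {}
    if not isinstance(event, dict) or not matches_schema(event):
        return "manual_investigation", event, {}
    private = {}
    for name in PRIVATE_FIELDS.intersection(event):
        token = tokenise(str(event[name]))
        private[token] = event[name]  # would be stored in the secure DB
        event[name] = token           # tokenised copy continues downstream
    return "good_landing_zone", event, private
```

An event that fails the schema check is routed to the manual investigation zone, while a valid event reaches the "good" landing zone with its private fields replaced by tokens. The token-to-value mapping is returned separately so that it could be written to the secure store.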
