Batch Processing
[It involves:]
- Collecting and storing data over a period of time (hours, days, or even weeks).
- Processing this data in bulk at scheduled intervals.
- Producing output in bulk (reports, aggregates, or transformed datasets).
[Full Process]
- Data Collection (File, Buffer, Warehouse, DB)
- Pre-Process
- Execution -> Usually in parts
- Post Process (mark ACK)
Can Use: Hadoop (MapReduce) | AWS Batch (AWS) | cron (custom scheduling)
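The batch flow above (collect -> pre-process -> execute in parts -> post-process/ACK) can be sketched minimally. This is a hypothetical illustration, not any framework's API; the function name, chunk size, and the "work" (computing lengths) are assumptions for demonstration.

```python
def run_batch(records, chunk_size=3):
    """Sketch of a batch job: pre-process, execute in chunks, mark ACK."""
    # Pre-process: drop empty records, normalize
    cleaned = [r.strip().lower() for r in records if r.strip()]

    results, acked = [], []
    # Execution -> usually in parts: iterate over fixed-size chunks
    for i in range(0, len(cleaned), chunk_size):
        chunk = cleaned[i:i + chunk_size]
        results.extend(len(r) for r in chunk)  # the "work": compute a value per record
        acked.extend(chunk)                    # post-process: mark chunk as processed (ACK)
    return results, acked
```

In a real system the ACK step would persist progress (e.g. to a DB), so a failed run can resume from the last completed chunk.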
Stream Processing
[Full Process]
- Ingestion (Kafka Topics, Buffer)
- Processing: Filter -> Aggregate -> Windowing (micro-batching within the stream)
Can Use: Apache Kafka / Flink | AWS Kinesis
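The filter -> aggregate -> windowing steps can be sketched with a tumbling window (a simplified stand-in for what Flink or Kafka Streams provide); the event format `(timestamp, value)`, the negative-value filter, and the sum aggregation are all assumptions for illustration.

```python
from collections import defaultdict

def window_stream(events, window_size=10):
    """Sketch of stream processing: filter -> aggregate into tumbling windows."""
    windows = defaultdict(int)
    for ts, value in events:
        if value < 0:                         # Filter: drop invalid (negative) readings
            continue
        bucket = ts - (ts % window_size)      # Windowing: tumbling-window start time
        windows[bucket] += value              # Aggregate: running sum per window
    return dict(windows)
```

Each window acts as a small batch over the stream, which is why windowing is often described as micro-batching.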