Stream Processing

Learned in SE464.

Stream processing is real-time data processing: continuous streams of data are processed as they arrive, rather than collected into a batch first.
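To make the contrast concrete, here is a minimal sketch of the same computation (a mean) done batch-style and stream-style. The function names and the sample readings are made up for illustration; they do not come from the course or from any particular streaming framework.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch style: needs the whole collection before it can answer."""
    return sum(values) / len(values)


def streaming_mean(values: Iterable[float]) -> Iterator[float]:
    """Stream style: emit an updated mean after every arriving value."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        # Only the running count and total are kept, not the full history.
        yield total / count


readings = [10.0, 20.0, 30.0, 40.0]  # hypothetical sensor readings
print(batch_mean(readings))            # 25.0
print(list(streaming_mean(readings)))  # [10.0, 15.0, 20.0, 25.0]
```

The streaming version produces a usable answer after each input and keeps only constant state, which is what makes it workable on an input that never ends.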

When stream processing is appropriate

  • Real-time analytics
  • Low latency requirements
  • Continuous inputs that can’t wait for a batch window

Compared to easy scalability problems (handling independent requests from millions of customers, where the data is already there), stream processing is closer to the hard side: the data keeps arriving and the time budget is tight, so the algorithms and infrastructure differ from batch.
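One way the algorithms can differ is that a streaming computation updates a small piece of state per event instead of rescanning stored data. Below is a rough sketch of a sliding-window event counter. The class name SlidingWindowCounter, the 10-second window, and the timestamps are all hypothetical, and the sketch assumes events arrive in timestamp order, which real streams do not always guarantee.

```python
from collections import deque
from typing import Deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return the count inside the current window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that have fallen out of the window (assumes ordered input).
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window_seconds=10.0)
counts = [counter.add(t) for t in [0.0, 3.0, 9.0, 12.0, 25.0]]
print(counts)  # [1, 2, 3, 3, 1]
```

Each event is appended once and evicted at most once, so the work per event stays small on average, and memory is bounded by the number of events in one window rather than by the whole stream.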