This week’s episode discusses the nature of batch vs. streaming analytical applications, and how each relates to balance creation and history requirements.
Analytical processes are moment-in-time processes: each one creates a balance that says, as of this point, here is how much has been done. These processes typically do not line up with transaction processing, because transactions occur at different times than when we analyze them.
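To make the moment-in-time idea concrete, here is a minimal sketch of a balance computed as of a cutoff. The `Transaction` record, the `balance_as_of` function, and the sample ledger are all hypothetical and exist only for illustration; they are not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout, assumed for this sketch.
    account: str
    amount: Decimal
    posted_at: datetime  # when the transaction occurred


def balance_as_of(transactions, account, cutoff):
    """Sum every transaction for `account` posted at or before `cutoff`."""
    return sum(
        (t.amount for t in transactions
         if t.account == account and t.posted_at <= cutoff),
        Decimal("0"),
    )


# Made-up sample data: transactions arrive at their own times.
ledger = [
    Transaction("cash", Decimal("100.00"), datetime(2024, 1, 5)),
    Transaction("cash", Decimal("-30.00"), datetime(2024, 1, 20)),
    Transaction("cash", Decimal("45.50"), datetime(2024, 2, 3)),
]

# The analysis picks its own moment: the end of January.
month_end = balance_as_of(ledger, "cash", datetime(2024, 1, 31, 23, 59, 59))
print(month_end)
```

The February transaction exists in the ledger but does not count toward the January balance, which is the sense in which the analysis moment and the transaction moments are independent.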
This means that our analytical systems have to deal with history in some way. If I change my analysis method, I either have to reprocess history under the new method or wait until enough new history has accumulated under it.
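As a rough illustration of the reprocessing option, the sketch below treats an analysis method as a function that assigns each transaction to a balance, and rebuilds every balance by replaying retained history. The record fields, the two methods, and `rebuild_balances` are assumptions made for this example.

```python
from collections import defaultdict
from decimal import Decimal

# Made-up retained history; the field names are assumed for this sketch.
history = [
    {"date": "2024-01-05", "store": "north", "category": "grocery", "amount": Decimal("100")},
    {"date": "2024-01-20", "store": "south", "category": "grocery", "amount": Decimal("40")},
    {"date": "2024-02-03", "store": "north", "category": "hardware", "amount": Decimal("25")},
]


def by_store(txn):
    """Original analysis method: one balance per store."""
    return txn["store"]


def by_category(txn):
    """New analysis method: one balance per product category."""
    return txn["category"]


def rebuild_balances(history, method):
    """Replay all of history under the given analysis method."""
    balances = defaultdict(Decimal)  # Decimal() is zero
    for txn in history:
        balances[method(txn)] += txn["amount"]
    return dict(balances)


old_view = rebuild_balances(history, by_store)
new_view = rebuild_balances(history, by_category)
print(old_view)
print(new_view)
```

Without the retained history, the category balances could only start from zero on the day the method changed, which is the "wait for history to accumulate" alternative.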
There are also certain business processes that are likely to continue to be done periodically, control processes in particular. Some control processes ensure that all the created balances match and are valid as of a point in time. The periodic nature of these processes does not align very well with a streaming, real-time application.
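One way to picture such a control process is a periodic check that compares incrementally maintained balances against balances recomputed from the ledger as of the same cutoff. This is only a sketch under that assumption; `RunningBalances`, `recompute`, and `control_check` are hypothetical names, and real controls are usually far more involved.

```python
from decimal import Decimal


class RunningBalances:
    """Balances updated one event at a time, as a streaming system might."""

    def __init__(self):
        self.balances = {}

    def apply(self, account, amount):
        self.balances[account] = self.balances.get(account, Decimal("0")) + amount


def recompute(ledger):
    """Batch-style recomputation of every balance from the full ledger."""
    totals = {}
    for account, amount in ledger:
        totals[account] = totals.get(account, Decimal("0")) + amount
    return totals


def control_check(streamed, ledger):
    """Return {account: (streamed, expected)} for every balance that does not match."""
    expected = recompute(ledger)
    accounts = set(streamed) | set(expected)
    return {
        a: (streamed.get(a), expected.get(a))
        for a in accounts
        if streamed.get(a) != expected.get(a)
    }


# Made-up ledger as of the control's cutoff.
ledger = [
    ("cash", Decimal("100")),
    ("cash", Decimal("-30")),
    ("loans", Decimal("500")),
]

# Case 1: the stream saw every event, so the control finds no breaks.
live = RunningBalances()
for account, amount in ledger:
    live.apply(account, amount)
clean = control_check(live.balances, ledger)

# Case 2: the stream missed the last event, so the control reports a break.
lossy = RunningBalances()
for account, amount in ledger[:-1]:
    lossy.apply(account, amount)
breaks = control_check(lossy.balances, ledger)
print(clean, breaks)
```

The check only makes sense against a fixed cutoff: both sides have to stop at the same moment, which is why this kind of process tends to stay periodic even when the balances themselves are maintained continuously.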
As compute capacity continues to grow and become more ubiquitous, streaming analytical systems will take on more of the work, maintaining balances and positions in real time. But this shift will be gradual, and it is unlikely to eliminate the need for applications to deal with history and to retain some level of batch, or periodic, functionality.
Our systems should be designed for streaming workloads, but also have the functionality required for periodic analysis.
Watch all episodes in order at the Conversations with Kip Playlist