In the next three weeks’ episodes of Conversations with Kip, I focus on Apache Spark, a technology that recognizes the power of scanning large transactional data sets to produce analytical outputs.

This week we focus on the input side of analytical processes. The more data we can scan quickly and efficiently, the greater our analytical potential. Yet the scan process can be separated from how the data is stored. If the data sits in a deeply encrypted format, then even though Spark may be faster than other methods of getting at the data, the overhead of unwinding that format reduces the amount of data we can process.
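To make that separation concrete, here is a minimal Spark sketch in Scala. The paths, the `accountId` and `amount` columns, and the balance roll-up are all hypothetical: the point is that the same scan logic runs unchanged whether the transactions sit in row-oriented CSV or columnar Parquet, while the storage choice governs how much data that scan can move per second.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

object ScanVersusStorage {
  // The analytical scan: roll every transaction up to an account balance.
  // Nothing here depends on how the data is stored on disk.
  def scanBalances(txns: DataFrame): DataFrame =
    txns.groupBy("accountId").agg(sum("amount").as("balance"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ScanVersusStorage").getOrCreate()

    // Same logical data set, two different storage choices.
    val fromCsv     = spark.read.option("header", "true").csv("/data/txns_csv")
    val fromParquet = spark.read.parquet("/data/txns_parquet")

    // Identical scans; the Parquet scan reads only the two columns it
    // needs, so the same hardware works through far more transactions.
    scanBalances(fromCsv).show()
    scanBalances(fromParquet).show()

    spark.stop()
  }
}
```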

Storage choices for data are important. Often storage choices are based not upon the access patterns for the data, but upon the skills of the developers involved. Certainly data stored in a format no one can use is of no use, but more often the usage patterns are simply never evaluated. Indexed access methods, such as those behind SQL databases, are often useful in transaction capture, and perhaps in presenting an individual balance, but infrequently in posting processes, as the sketch below illustrates.
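A small sketch of the two access patterns, again with hypothetical paths and column names: the keyed filter answers "what is this one account's balance," which an indexed store handles well, while the posting run must touch every transaction exactly once, so a straight sequential scan is the better fit.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

object AccessPatterns {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AccessPatterns").getOrCreate()
    val txns  = spark.read.parquet("/data/transactions")

    // Transaction capture / single-balance presentation: a keyed lookup.
    // Indexed access shines here because only a few rows are touched.
    val oneBalance = txns
      .filter(col("accountId") === "ACCT-0042")
      .agg(sum("amount").as("balance"))

    // Posting: every transaction must be applied once, so index-driven
    // row-at-a-time access only adds overhead; scan the whole set.
    val postedBalances = txns
      .groupBy("accountId")
      .agg(sum("amount").as("balance"))

    oneBalance.show()
    postedBalances.show()
    spark.stop()
  }
}
```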

Poor or lazy data storage choices perpetuate poor data availability today as much as the lack of compute resources did in decades past.

Watch the 95th episode of Conversations with Kip, the best financial system vlog there is, here.