In business computing, the work done falls into two broad categories: (1) Transaction Capture and (2) Reporting Processes. The amount of code supporting transaction capture is likely much more extensive than the code supporting reporting, but my work focuses much more on reporting and analytical processes. A Metric Engine is the idea of something much closer to the simplicity of a search engine, but one that calculates needed metrics, such as financial analyses.
The drivers of the computer resources needed for these two types of work differ. Transaction Capture processes are typically driven by the number of users and the number of clicks they perform (in prior days it would have been the number of times the enter key was pressed). The driver for Reporting Processes, though, is the number of business events or transactions that need to be accumulated to calculate the metric.
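As a rough illustration of these two drivers, the sketch below expresses each workload as a simple count. The function names and sample figures are hypothetical, chosen only to show the shape of the arithmetic, and the reporting function assumes each metric is calculated in its own pass over its events.

```python
def transaction_capture_work(users: int, clicks_per_user: int) -> int:
    """Interactions to service: driven by users and the clicks they perform."""
    return users * clicks_per_user


def reporting_work(events_per_metric: int, metrics: int) -> int:
    """Records to accumulate: driven by the events behind each metric.

    Assumption: every metric reads its events in a separate pass.
    """
    return events_per_metric * metrics


# Hypothetical figures, for illustration only (not taken from a real system).
capture = transaction_capture_work(users=500, clicks_per_user=200)
reporting = reporting_work(events_per_metric=10_000_000, metrics=12)

print(f"Transaction capture interactions: {capture:,}")
print(f"Reporting records accumulated:    {reporting:,}")
```

Under these made-up numbers the reporting workload is far larger than the capture workload even though it serves far fewer requests, which is why the two need to be estimated differently.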
I have found that estimating reporting processes requires a simple but clear understanding of the major resources in a computer. An easy way to understand these is to think of a computer as a business meeting. Computers have four main resources:
- CPUs, which are like participants in a meeting
- Memory, which is like a whiteboard
- Persistent storage, which is like a notebook or meeting minutes, and
- A network, which is like a telephone
CPUs or Meeting Participants
The whiteboard, telephone, or notebooks don’t do anything by themselves in a meeting. People do the work in a meeting. And in computers, CPUs do the work of taking in inputs and producing outputs. In the early days of computing, there was typically only one CPU in a computer, which was more like a person working at his or her desk; but today’s computers typically have multiple CPUs, making them more like a meeting. Large computer farms are like office buildings full of meeting rooms, each meeting conducting its own work. Each meeting room might be called a computer “node.” The meetings might be in the same building or in different buildings. Each meeting has an agenda, a detailed set of instructions as to what should be accomplished in the meeting; these “agendas” are like the computer programs that instruct the CPUs as to what work they should accomplish.
Memory or the Whiteboard
Meeting participants require data to analyze and use to produce outputs. Data in meetings is typically shown on a whiteboard. Data in a computer is presented to the CPUs via memory. There are different kinds of memory and caches, but they share a common set of attributes with a whiteboard: they are very fast to access and update, but they are volatile. Their contents are easy to change, and if a meeting is interrupted or adjourns for a while, the contents can be wiped out very quickly. Whiteboard space is also limited; adding more whiteboards to a meeting room is expensive compared to other ways of storing data, such as a notebook.
Persistent Storage or Notebooks
To overcome the shortcomings of memory, we use persistent storage: in the early days tape, then disk, and now often SSD (which starts to narrow the gap between memory and permanent storage, but is still slower to access and update than memory). Adding these types of storage to the meeting can be very inexpensive, but it comes at the cost of speed of access. Getting data from this storage onto the whiteboard, and then writing it back into storage after an update, takes a long, long time compared to access and update on the whiteboard.
Network or Telephone
The last major computer resource is the network, which is like a telephone or conference call in a business meeting. Networks allow CPUs to speak with each other or, like fax machines or even mail, to transmit data for storage and use in the meeting. But the network is the slowest of the four resources, taking the longest time to communicate and requiring the clearest protocols and speech.
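The relative speeds of the resources described above can be sketched as a back-of-the-envelope estimate. The throughput figures below are placeholders chosen only to preserve the ordering in the text (memory fastest, then persistent storage, then network); they are not measurements of any real hardware, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    analogy: str
    records_per_second: float  # assumed placeholder, not a measured value


# Illustrative throughputs only; real values depend heavily on the system.
RESOURCES = [
    Resource("memory", "whiteboard", 100_000_000),
    Resource("persistent storage", "notebook", 1_000_000),
    Resource("network", "telephone", 100_000),
]


def seconds_to_move(records: int, resource: Resource) -> float:
    """Estimate the time to pass a number of records through one resource."""
    return records / resource.records_per_second


records = 10_000_000  # hypothetical count of business events to accumulate
estimates = {r.name: seconds_to_move(records, r) for r in RESOURCES}

for r in RESOURCES:
    print(f"{r.name:<20} ({r.analogy}): {estimates[r.name]:.1f} s")
```

Swapping in throughputs measured on an actual system turns this into a first-pass estimate of where a reporting process will spend its time.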
We’ll use these resources in the following episodes as we estimate what it takes to “do the math” of calculating the computing resources needed to produce analytical outputs from our business systems. This is Episode 132 of Conversations with Kip, the best financial system vlog there is.
Next in the series: Computers Explained: Big Data Implications