How do you build a reporting system that (1) doesn't take multiple days to produce daily reports, (2) doesn't require a boatload of compute capacity, (3) doesn't require a master file for each report with the associated reconciliation costs, and (4) still allows new questions to be asked of the system after it is initially built? These requirements point to needing a Metric Engine, as opposed to a search engine. This series of videos provides the steps.
A theoretical way of doing this would be to have no posting processes at all, and instead produce every report from the business events themselves. The problem is that such a system will either take a long time to produce the information or require a large amount of hardware. In this set of videos, we'll defer solving those problems until the last steps. Start with the objective of doing the work entirely from the transactional data.
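To make the "no posting processes" idea concrete, here is a minimal sketch (my illustration, not code from the videos) that computes a customer P&L directly from raw business events each time it runs. The `Event` shape and the category names are assumptions for the example:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    customer: str
    kind: str      # "revenue", "direct_cost", or "allocated_overhead" (illustrative)
    amount: float  # signed amount in the reporting currency

def daily_customer_pnl(events):
    """Aggregate raw business events into a per-customer P&L.

    Nothing is posted or stored as a balance; the full detail is
    re-read on every run, which is exactly why volume and hardware
    become the problems to solve later in the series.
    """
    pnl = defaultdict(lambda: defaultdict(float))
    for e in events:
        pnl[e.customer][e.kind] += e.amount
    return {customer: dict(lines) for customer, lines in pnl.items()}

events = [
    Event("Acme", "revenue", 1200.0),
    Event("Acme", "direct_cost", -450.0),
    Event("Acme", "allocated_overhead", -120.0),
]
print(daily_customer_pnl(events))
# {'Acme': {'revenue': 1200.0, 'direct_cost': -450.0, 'allocated_overhead': -120.0}}
```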
If one wants to produce daily customer profit and loss statements, the needed transactional data becomes clear: customer revenue, directly attributed costs, and allocated overhead transactions. Gather data about the record counts, the kinds of attributes, and the required allocation steps.
It may require some estimation, such as multiplying the number of customers by the average number of products per customer, and by the average transactions per product per day, as in the sketch below. Do the best you can in gathering this information, from business requirements or, failing that, from existing systems (estimate the associated volumes as if the aggregation steps were removed from the legacy systems; we want transactional data if we can get it).
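As an illustration of that multiplication, here is a back-of-the-envelope sketch; every figure in it is a made-up placeholder to be replaced with your own numbers:

```python
# Placeholder inputs -- substitute values from business requirements
# or measured from legacy systems.
customers = 2_000_000
avg_products_per_customer = 3
avg_transactions_per_product_per_day = 2

daily_transactions = (customers
                      * avg_products_per_customer
                      * avg_transactions_per_product_per_day)
yearly_transactions = daily_transactions * 365

print(f"{daily_transactions:,} transactions/day")    # 12,000,000 transactions/day
print(f"{yearly_transactions:,} transactions/year")  # 4,380,000,000 transactions/year
```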
The reference, static, or categorizing data also needs to be understood, particularly where it has high-volume aspects, such as customer, instrument, or time-series attribution. One also needs to gather data about the frequencies of the reports to be produced. All of this will then be used, together with the predictable nature of computer processes, to estimate what kind of system will be needed.
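Putting those pieces together, a rough sizing sketch like the one below shows how record volumes, reference-data lookups, and report frequency combine into a processing-window estimate. The throughput figure and the cost model (one lookup costing about one record read) are assumptions for illustration, not measured numbers:

```python
# Rough daily processing-window estimate under stated assumptions.
records_per_day = 12_000_000   # from the volume estimate above
lookups_per_record = 3         # e.g. customer, instrument, time-series joins
reports_per_day = 10           # how many report runs re-read the same detail
records_per_second = 250_000   # assumed single-stream throughput (placeholder)

seconds = (records_per_day * (1 + lookups_per_record) * reports_per_day
           / records_per_second)
print(f"~{seconds / 60:.0f} minutes of processing per day")  # ~32 minutes
```

Even this crude model makes the trade-off visible: every additional report that re-reads the detail multiplies the work, which is why the series eventually has to address time and hardware.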
This is Episode 134 of Conversations with Kip, the best financial system vlog there is.
[…] combine, if performed at the detail level, to produce very high data volumes. The vlog series on estimating performance can help you understand this […]