Chapter 40. Extract Only View

Similar to a copy view, our extract only view will not require the Reference and Format phases. But now, instead of copying the input records to the output files, we will only select those fields we wish to use and write those to the output file. In this example, we’ll also still execute the copy view at the same time, demonstrating how the single pass architecture works.

The only change we have to make to the Select phase is to ask it to execute two views, so the input parameter to the VDP Builder now includes both 3261 and 3262.

 ****** ************** Top of Data ***************
 000001 3261                                      
 000002 3262                                      
 ****** ************** Bottom of Data ************

The VDP that is created after running the VDP Builder contains two views, plus all the metadata needed for both.

Logic Table

The following is the logic table created from the VDP which contains both these views.

Figure 97. Extract Only View Logic Table

The HD (Header), RENX (Read Next), ES (End of String) and EN (End of Logic Table) functions are essentially the same as in the copy view. The rows for view 3261, logic table rows 3, 4 and 5, are also unchanged.

The Extract Only view, 3262, also reads file 1284, and so it is placed immediately after view 3261. When a record is read from that file, it is passed first to view 3261 and then to view 3262. View 3262 is more complex. The Logic Table functions it performs are as follows:

  • Output Columns 1, 2, and 3: Rows 7, 8 and 9 are DTE functions, which tell the extract program to perform three “Data Transformations”, moving data to the extract file. The view asks that data at position 10, for a length of 5 be moved, next the data at position 1 for a length of 9, then the data at position 15 for a length of 3. The sequence number shows the column of the view requesting these functions.
  • If Statement: Rows 10, 11, and 12 are generated based upon the following view logic text in column 4:
    If ({ACCOUNT} = 111 Or {ACCOUNT} = 121 Or {ACCOUNT} = 123  )
      Then 
        COLUMN = "AssetAcct"
      Else
        COLUMN = "LiabAcct"
    EndIf

    Similar to the copy only view, the row 10 CFEC function tests data in the event file at position 15 for a length of 3 (the same data put in column 3) to see if it is equal to “111”. If the value at position 15 is “111”, then the next row executed is row 13, the DTC. If it is not, then the next row executed is row 11. Row 11 tests for a value of “121” in a similar way. Then row 12 tests for a value of “123”. If the value is not “123”, then the next row that executes is row 15. In other words, all “or” conditions failed.

  • Output Column 4: Row 13 is a DTC, a Data Transformation Constant. This row, the “then” condition, places a constant of “AssetAcct” into column 4. Row 14 then instructs the program to skip row 15. The DTC in row 15, the “else” condition, places a value of “LiabAcct” in that same column. The only way to reach row 15 is to fail every “or” condition, the last of which is tested on row 12.
  • Output Column 5: Another DTE function is executed on row 16 to place the value contained in the event file at position 18 for a length of 8 in the output for column 5.
  • Numeric Test: Row 17 is generated based upon this logic text:
    If ISNUMERIC({AMOUNT}) 
      Then 
       COLUMN = {AMOUNT}
      Else
        COLUMN = 0
    EndIf

    The CNE is a Class test, Numeric against Event file field. The value in the event file at position 26 for a length of 5 with a format of 4 (which is packed) is tested to see if it contains a numeric value. If the value is numeric, then row 18 is executed. Otherwise, row 20 is executed.

  • Output Column 6: The DTE places the Amount field, which is in the event file at position 26 for a length of 5, in the output file. The DTC function places a zero in the same column if the value from the event file is not numeric.
  • Write: Instead of the WRIN function used in the copy view which wrote the Input record, this view contains a WRDT function, which writes the results of the Data Transforms, namely any of the data placed in the extract record by DT type functions.
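The row-by-row flow described above for view 3262 can be sketched as a small interpreter. This is an illustrative Python model only; GVBMR95 generates machine code rather than interpreting rows like this. Several details are assumptions made for the sketch: the name GOTO for the skip on row 14, the existence of a second skip on row 19 and of the WRDT on row 21, the use of ASCII rather than EBCDIC for character data, the sample record contents, and the packed-decimal zero used for the constant on row 20.

```python
# Illustrative model of logic table rows 7-21 for view 3262.
# Not the GVBMR95 implementation; row 14/19 "GOTO" and row 21 are assumed.

def field(rec: bytes, pos: int, length: int) -> bytes:
    """Return `length` bytes starting at 1-based position `pos`."""
    return rec[pos - 1 : pos - 1 + length]

def is_packed_numeric(data: bytes) -> bool:
    """Packed decimal: every nibble is a digit except the last, which is a sign (A-F)."""
    nibbles = [n for byte in data for n in (byte >> 4, byte & 0x0F)]
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign >= 0x0A

PACKED_ZERO = bytes.fromhex("000000000C")  # assumed form of the zero constant

LOGIC_TABLE = {
    7: ("DTE", 1, 10, 5),                  # column, position, length
    8: ("DTE", 2, 1, 9),
    9: ("DTE", 3, 15, 3),
    10: ("CFEC", 15, 3, b"111", 13, 11),   # pos, len, constant, row if true, row if false
    11: ("CFEC", 15, 3, b"121", 13, 12),
    12: ("CFEC", 15, 3, b"123", 13, 15),
    13: ("DTC", 4, b"AssetAcct"),          # "then" branch
    14: ("GOTO", 16),                      # skip the "else" row
    15: ("DTC", 4, b"LiabAcct"),           # "else" branch
    16: ("DTE", 5, 18, 8),
    17: ("CNE", 26, 5, 18, 20),            # pos, len, row if numeric, row if not
    18: ("DTE", 6, 26, 5),
    19: ("GOTO", 21),
    20: ("DTC", 6, PACKED_ZERO),
    21: ("WRDT",),
}

def run_view_3262(rec: bytes) -> list:
    """Execute the rows against one event record; return the extract columns in order."""
    columns = {}
    row = 7
    while True:
        op, *args = LOGIC_TABLE[row]
        if op == "DTE":
            col, pos, length = args
            columns[col] = field(rec, pos, length)
            row += 1
        elif op == "DTC":
            col, const = args
            columns[col] = const
            row += 1
        elif op == "CFEC":
            pos, length, const, if_true, if_false = args
            row = if_true if field(rec, pos, length) == const else if_false
        elif op == "CNE":
            pos, length, if_true, if_false = args
            row = if_true if is_packed_numeric(field(rec, pos, length)) else if_false
        elif op == "GOTO":
            (row,) = args
        elif op == "WRDT":
            return [columns[c] for c in sorted(columns)]

# Hypothetical 30-byte event records laid out to match the positions above.
good = b"CUSTOMER1" + b"ABCDE" + b"111" + b"20240101" + bytes.fromhex("000012345C")
bad = b"CUSTOMER2" + b"FGHIJ" + b"200" + b"20240102" + b"alpha"

asset_row = run_view_3262(good)
liab_row = run_view_3262(bad)
```

With these sample records, the first record takes the “then” branch and keeps its packed amount, while the second takes the “else” branch and has its non-numeric amount replaced by the zero constant.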

The following table shows how the different areas in memory in GVBMR95 are used by the different function codes.

Figure 98. Logic Table Functions and File Buffers

The Logic Table is used to generate the machine code. The RENX brings a record from the event file into the input data buffer. The CFEE (Compare Field Event File Field to Event File Field) and CNE (Class Numeric Event) functions work against only the input record. The DTE (Data Transform Event file field) moves data from the input buffer to the output record buffer. The constants in either the CFEC or DTC are in the logic table and in the generated machine code. Which record is written to the output file is determined by the Write functions, either WRIN or WRDT.
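The relationship between the functions and the buffers can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the GVBMR95 design: the class name ExtractThread, its method names, and the choice to clear the output buffer after each WRDT are all hypothetical.

```python
class ExtractThread:
    """Toy model of one extract thread: one input buffer, one output record buffer."""

    def __init__(self, event_records):
        self._events = iter(event_records)
        self.input_buffer = b""
        self.output_buffer = bytearray()
        self.written = []  # stands in for the extract files

    def renx(self) -> bool:
        """Read Next: bring the next event record into the input buffer."""
        record = next(self._events, None)
        if record is None:
            return False
        self.input_buffer = record
        self.output_buffer.clear()
        return True

    def dte(self, pos: int, length: int) -> None:
        """Move bytes from the input buffer (1-based position) to the output buffer."""
        self.output_buffer += self.input_buffer[pos - 1 : pos - 1 + length]

    def dtc(self, constant: bytes) -> None:
        """Move a constant (held with the logic, not the data) to the output buffer."""
        self.output_buffer += constant

    def wrin(self) -> None:
        """Write the input record unchanged, as the copy view does."""
        self.written.append(bytes(self.input_buffer))

    def wrdt(self) -> None:
        """Write whatever the DT functions placed in the output buffer."""
        self.written.append(bytes(self.output_buffer))
        self.output_buffer.clear()


thread = ExtractThread([b"ABCDEFGHIJ"])
thread.renx()
thread.wrin()        # copy view: input record goes out as is
thread.dte(3, 2)     # extract view: select bytes 3-4
thread.dtc(b"X")     # then append a constant
thread.wrdt()
```

After this run, thread.written holds the full input record from WRIN followed by the three-byte record built by the DTE and DTC and written by WRDT.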

Extract Process

The trace from running the same event used in the last chapter is as follows:

Figure 99. Extract Only Logic Table Trace

Note that event file record 1 is first processed by view 3261, and then serially (not in parallel) processed by view 3262. The trace shows two WRIN functions for view 3261, against event record 1 and event record 2. But three records are written for view 3262 by the WRDT function.

The records output for view 3262 are shown below in hex mode. The record copied by view 3261 had the value “alpha” in it, but the second record written by view 3262 no longer has “alpha” where the number should be, because of the numeric test in view 3262. Instead, logic table row 20, a DTC function, was executed and placed zeros in that location.

Figure 100. Extract Only View Sample Output

Extract Program Control Report

A key control report in the process is the Extract Engine report, shown below:

Figure 101. GVBMR95 Control Report

  • Because we only read one event file, only one thread was executed (see footnote 1).
  • There were two views reading this file. If one of these views had been reading another event file as well, the count by “views per file processed” would have been 3, even though there were only two outputs.
  • The “logic table rows” is equal to the row number on the EN row, i.e., the total number of rows in the logic table.
  • The machine code generated from those logic table rows is 424 bytes. For example, in the prior chapter we noted that the CFEC created CLC and BC assembler instructions, which required 20 bytes of the 424 generated.
  • The record count of the records read from the Event file (with a DD Name of Event) is 3. Because we only read one event file, the total event records read is the same on the next line. Our process used no “pipes” which are virtual event files.
  • The next section shows the results of each view against each event file.
    • View 3261 extracted two records, and wrote them to DD Name F0003261.
    • View 3262 extracted three records, and wrote them to DD Name F0003262.
    • The counts next to “F” and “NF” are for lookups “Found” and “Not Found”. They are zero because these views require no joins.
  • Multiple views can write to an extract file. Thus the next set of rows shows the total of records written to each extract file, the total to all extract files, and the total to all pipes.
  • The Elapsed time is the total wall-clock time for the extract process.
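The “views per file processed” count in the report can be illustrated with a short calculation. This is a sketch of the counting rule as described above, not the report's actual code; the function name and the second DD name EVENT2 are hypothetical.

```python
def views_per_file_processed(view_sources: dict) -> int:
    """Count (view, event file) pairs: each view is counted once per file it reads."""
    return sum(len(set(files)) for files in view_sources.values())

# This chapter's run: both views read the single event file.
this_run = {3261: ["EVENT"], 3262: ["EVENT"]}

# Hypothetical variation: view 3262 also reads a second event file.
what_if = {3261: ["EVENT"], 3262: ["EVENT", "EVENT2"]}

count_now = views_per_file_processed(this_run)     # 2
count_what_if = views_per_file_processed(what_if)  # 3, though there are still two views
```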

If the outputs are not what was expected, this report is perhaps the first place to look.

Analysis

The Extract Engine is significantly more efficient here than alternative methods of producing these two outputs: in a single pass of the file, one IO to bring the event file into memory for processing serves both outputs. Programs can certainly be written to do the same thing, but that requires a programmer to design the program that way. Here, two people can create views independently, and the tool will resolve them efficiently.

Remember, though, that this process does not include any parallelism. View number 2 is executed after view number 1 has seen the event record.
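The single-pass, serial behavior can be sketched as a loop in which each event record is read once and then handed to each view in turn. This is a minimal Python model, not the engine itself. The record fields, the selection rule for view 3261, and the function name single_pass are assumptions chosen only so that the counts match the control report (three records read, two written by view 3261, three by view 3262).

```python
def single_pass(event_records, views):
    """Read each record once; pass it to every view in order (serially)."""
    reads = 0
    trace = []  # (record number, view id) in execution order
    outputs = {view_id: [] for view_id, _, _ in views}
    for record_no, record in enumerate(event_records, start=1):
        reads += 1  # one read serves all views
        for view_id, select, transform in views:
            trace.append((record_no, view_id))
            if select(record):
                outputs[view_id].append(transform(record))
    return reads, trace, outputs


# Hypothetical event records; field names are illustrative only.
events = [
    {"account": "111", "amount": "100"},
    {"account": "121", "amount": "alpha"},
    {"account": "999", "amount": "300"},
]

views = [
    # Copy view: assumed filter, writes the whole input record.
    (3261, lambda r: r["account"] != "999", lambda r: dict(r)),
    # Extract only view: no filter, writes only selected fields.
    (3262, lambda r: True, lambda r: {"account": r["account"]}),
]

reads, trace, outputs = single_pass(events, views)
```

The trace shows view 3262 seeing each record only after view 3261 has finished with it, while the read count stays at three no matter how many views are added.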


The next step in building our understanding is to see how lookups work.

1 See Multi-Threading for a discussion of reading multiple event files.