Phase-III Macro System
Revision as of 28 June 2013, 16:03
General
The Phase-III Macro System is a flexible, data-independent, parameter-controlled set of SAS macros.
The Phase-III Macro System is not an end-to-end reporting tool.
- It is a collection of closely interacting macro modules that transform study-emergent datasets, making use of all the information available in the descriptor part (header) of the dataset being processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
- The Phase-III Macro System provides subroutines that take care of data types, formats, labels, headers, missing values, loops and more. Information generated at runtime to control processing is kept in standardized data structures: macro-variable lists (mlists), SAS formats and datasets.
- Input data structures may need some pre-processing, and output data structures some post-processing, to fully meet requirements. The Phase-III Macro System supports these steps to some extent through its condense, struct and missline functions.
Objective
The Phase-III Macro System is intended as the base for an extensible system that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.
Scope
The Phase-III Macro System interacts with and makes use of other available programs, modules, systems and datasets. Communication and information interchange use SAS macro variables, operating-system environment variables and data structures compatible with the SAS System.
Input data will generally require preprocessing, namely the assignment of formats and labels. Output datasets will need postprocessing, mainly through merge and set operations.
Characteristics
Modules are kept small (no more than three screen pages) for maintainability and avoid hard-coded references to application-related information such as data types, labels and formats. The coding style makes broad use of automatic documentation and of metadata and lookup tables generated at runtime.
Approach
# Avoid dependency of programs on data scope, study characteristics or personal style.
# Implement modules so that they operate in any emerging environment.
# Be prepared to add new output structures without substantial delay.
# Produce a wide variety of output with a minimal set of modules.
# Minimize maintenance effort through self-documenting and limited program code.
# Maximize validation throughput by adopting a non-mutual-impact architecture.
Architecture
Info Modules
Provide information about datasets and variables for correct processing.
Service Modules
Perform frequently requested tasks in a standard format with a limited parameter set.
Core Modules
Perform input transformation, calculations and output transformation.
User Modules
Generate datasets carrying subtables controlled by user-supplied parameters.
Module Details
Info Modules
%GET_ATTR()
Function
Return single attributes such as label, format, etc.
Description
Reads the dataset header and returns attributes as undeclared macro variables named after the requested attributes. The information becomes available to the caller only if the respective variable has been declared in the calling environment with a %global or %local statement.
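The return mechanism described above relies on SAS macro scoping: a %LET inside a macro updates a variable that already exists in an enclosing scope, and otherwise creates it locally, where it is lost when the macro ends. The following is a minimal sketch of that mechanism, not the original %GET_ATTR source. The macro names demo_attr and caller, the parameters dsn and var, and the dataset work.demo are assumptions made for illustration only.

```sas
/* Hypothetical demo data: one variable with a label and a format */
data work.demo;
  visit = 1;
  label  visit = "Visit number";
  format visit z3.;
run;

/* Sketch only: demo_attr is NOT the original %GET_ATTR */
%macro demo_attr(dsn=, var=);
  %local dsid num rc;                       /* work variables stay private   */
  %let dsid = %sysfunc(open(&dsn,i));       /* open the dataset for input    */
  %if &dsid > 0 %then %do;
    %let num = %sysfunc(varnum(&dsid,&var));
    /* label and format are deliberately left undeclared in this macro */
    %let label  = %qsysfunc(varlabel(&dsid,&num));
    %let format = %sysfunc(varfmt(&dsid,&num));
    %let rc = %sysfunc(close(&dsid));
  %end;
%mend demo_attr;

%macro caller;
  %local label;              /* declared here, so the value arrives here     */
  %demo_attr(dsn=work.demo, var=visit);
  %put label=&label;
  /* format was not declared by the caller, so it is gone again */
  %put format exists: %symexist(format);
%mend caller;
%caller;
```

In this sketch the caller asks only for the label. The format is read as well but ends up in the local symbol table of demo_attr and is discarded, so %symexist(format) should report 0 as long as no variable of that name exists in an outer scope.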
%GRP_DESC()
Function
Provide information about a categorical variable.
Description
Investigates the given categorical variable and returns the results in undeclared macro variables: &n_grp - number of distinct values; &v_grp - structured list of distinct unformatted values; &l_grp - structured list of distinct formatted values.
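One way such counts and value lists can be obtained is with PROC SQL and an INTO clause. The sketch below is an assumption about the technique, not the original %GRP_DESC source: the macro name demo_grp, the dataset work.demo_grp and the blank-separated list layout are hypothetical, and the formatted list &l_grp is left out.

```sas
/* Hypothetical demo data */
data work.demo_grp;
  input arm $ sex $;
  datalines;
A F
A M
B F
C M
;
run;

/* Sketch only: demo_grp is NOT the original %GRP_DESC */
%macro demo_grp(dsn=, grp=);
  proc sql noprint;
    /* distinct non-missing values, collected into one blank-separated list */
    select distinct &grp
      into :v_grp separated by ' '
      from &dsn
      where &grp is not missing;
    %let n_grp = &sqlobs;    /* number of rows returned by the query above */
  quit;
%mend demo_grp;

%macro caller;
  %local n_grp v_grp i;      /* declare the result variables before the call */
  %demo_grp(dsn=work.demo_grp, grp=arm);
  %do i = 1 %to &n_grp;
    %put level &i = %scan(&v_grp,&i);
  %end;
%mend caller;
%caller;
```

A blank-separated list fits the way %TWO_CATV reads &V_GRP with %SCAN and its default delimiters, but values that themselves contain blanks would need a different separator.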
%CHK_LIST()
Function
Provide information about a list-type macro variable.
Description
Reads the supplied list of tokens and returns undeclared macro variables: &n_lst - number of list elements; &v_lst - structured list of the supplied elements. Input list elements may be separated by blanks or commas only.
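Counting and normalising such a list can be sketched with COUNTW and %SCAN, using blank and comma as the only delimiters. This is an illustration under assumptions, not the original %CHK_LIST source: the macro name demo_list, the parameter list and the blank-separated output layout are hypothetical.

```sas
/* Sketch only: demo_list is NOT the original %CHK_LIST */
%macro demo_list(list=);
  %local i item;
  /* count tokens; runs of blanks and commas act as a single separator */
  %let n_lst = %sysfunc(countw(%superq(list),%str( ,)));
  %let v_lst = ;
  %do i = 1 %to &n_lst;
    %let item  = %scan(%superq(list),&i,%str( ,));
    %let v_lst = &v_lst &item;     /* rebuild as a blank-separated list */
  %end;
%mend demo_list;

%macro caller;
  %local n_lst v_lst;              /* declare the result variables first */
  /* %str() protects the comma inside the parameter value */
  %demo_list(list=%str(age, sex race));
  %put n_lst=&n_lst v_lst=&v_lst;
%mend caller;
%caller;
```

For the mixed input above the sketch yields three elements and the normalised list age sex race. An empty list is not handled here and would need an extra check before the COUNTW call.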
User Modules
%TWO_CATV()
Function
Deliver a percentage/count table from two nested categorical variables.
Description
Performs nested processing of two categorical variables: the row_* modules are run once for each category of the outer variable, which is passed to them as the context variable.
Parameters
| Name | Description |
|---|---|
| dsn | input dataset name |
| row, row2 | categorical variable names (row2 = nested variable) |
| exclude | decode of the group to be excluded from &ROW |
| weight | Y/N (multiply percentages for &ROW and &ROW2) |
| col | categorical variable name used for columns |
| head2 | Y/N (block header for nested variable) |
| indent, indinc | n (number of indent columns, and indent increment for the nested variable) |
| num | n (sequence number of output) |
| stat | Y/N (column with statistics names) |
| space | 1/2/3 (blank line before or after output and between nesting levels) |
| struct, struct2 | names of the reference datasets used for the full decode structure (struct2 = nested variable) |
| condense | var#value (non-distinct variable and true value for &ROW) |
| misslin2 | Y/N (force missing line for nested variable) |
Source
Declarations and upper-level processing
%MACRO TWO_CATV(dsn=
               ,exclude=
               ,row=
               ,row2=
               ,col=
               ,indent=0
               ,num=
               ,stat=N
               ,weight=Y
               ,space=2
               ,condense=
               ,struct=
               ,struct2=
               ,head2=N,misslin2=
               ,indinc=2)
/ store des="" 
;
%LOCAL n_grp v_grp n name;
%LET name=TWO_CATV;
%IF &STRUCT eq %THEN %LET struct =&DSN;
%IF &STRUCT2 eq %THEN %LET struct2=&DSN;
%GRP_DESC(dsn=&DSN
         ,grp=&ROW
         ,miss=n)
;
%TOP_FILT(dsn=&DSN
         ,grp=&ROW
         ,by=&COL
         ,grplvl=&NUM
         ,var=
         ,condense=&CONDENSE)
;
%TOP_FREQ(dsn=top_filt
         ,struct=&STRUCT
         ,grp=&ROW
         ,by=&COL)
;
%TOP_OUTC(dsn=top_freq
         ,head=n
         ,total=n
         ,stat=&STAT
         ,indent=&INDENT
         ,grp=&ROW
         ,rev=n
         ,use=
         ,by=&COL
         ,missline=)
;
Loop for lower-level processing
%DO n=1 %TO &N_GRP;
  %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN %DO;
    %ROW_FILT(dsn=&DSN
             ,context=&ROW
             ,subgrp=&N
             ,grp=&ROW2
             ,by=&COL
             ,var=
             ,miss=n)
    ;
    %ROW_FREQ(dsn=row_filt
             ,sum=top_freq
             ,struct=&STRUCT2
             ,context=&ROW
             ,grp=&ROW2
             ,by=&COL
             ,weight=&WEIGHT)
    ;
    %ROW_OUTC(dsn=row_freq
             ,sum=main_3rd
             ,head=&HEAD2
             ,stat=&STAT
             ,indent=%EVAL(&INDENT+&INDINC)
             ,context=&ROW
             ,grp=&ROW2 
             ,by=&COL
             ,missline=&MISSLIN2)
    ;
  %END;
%END;
Output naming and completion mail
%IF &TAB_NAME ne %THEN %DO;
  data %SUBSTR(&TAB_NAME,1,3)&NUM%SUBSTR(&TAB_NAME,5,4);
   set
  %DO n=1 %TO &N_GRP;
    %IF &SPACE eq 1 %THEN dummy ;
    %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN row&NUM._&N ;
    %IF &SPACE eq 2 %THEN dummy ;
  %END;
  %IF &SPACE eq 3 %THEN dummy ;
   ;
  run;
%END;
%GEN_MAIL(name=&NAME);
%MEND TWO_CATV;
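The final step of the listing builds the output dataset name from &TAB_NAME and &NUM and generates the SET list from the group loop. The stand-alone sketch below replays only that text generation so that the result can be inspected in the log. It is not part of the original source: the value of tab_name, the macro demo_set and its parameters are assumptions, and the list is collected in a macro variable instead of being emitted into a DATA step.

```sas
/* Hypothetical inputs; the real &TAB_NAME is set outside %TWO_CATV */
%let tab_name = tab0_ae1;
%let num      = 2;

/* characters 1-3, then &NUM in place of character 4, then characters 5-8 */
%let out_name = %substr(&tab_name,1,3)&num%substr(&tab_name,5,4);
%put out_name=&out_name;

/* Sketch only: mirrors the %DO loop that generates the SET list */
%macro demo_set(n_grp=, v_grp=, exclude=, num=, space=);
  %local n list;
  %do n = 1 %to &n_grp;
    %if &space eq 1 %then %let list = &list dummy;
    %if %scan(&v_grp,&n) ne &exclude %then %let list = &list row&num._&n;
    %if &space eq 2 %then %let list = &list dummy;
  %end;
  %if &space eq 3 %then %let list = &list dummy;
  %put set &list;
%mend demo_set;

%demo_set(n_grp=3, v_grp=A B C, exclude=B, num=1, space=2);
```

With these inputs the name resolves to tab2_ae1 and the log shows set row1_1 dummy dummy row1_3 dummy. As the listing is written, the dummy dataset is added for an excluded category as well when space is 1 or 2, which explains the two consecutive dummy entries.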