Survey basic frequency macro

This coding can be used to run

Case 1: Count frequency of distribution for one question and exclude a category from the counting results (eg. exclude records from the question with 88 as the answer in the frequency distribution).

%macro count  (field=);

proc freq data=indata (where = (&field ne 88));
tables var1*(&field) ;
run;
%mend count;

%count (field=q1);

Case 2: Count frequency of distribution for one question conditional on the result of a different question. In the example, the where statement will delete records with q1 equals to 88 and then delete records with q10 not equal to 2 or 3 or 4.  For the rest of the records, proc freq will run the distribution of q1.

%macro count (field1=, field2=);

proc freq data=indata (where = (&field1 ne 88 and &field2 in (2,3,4)));
tables var1*(&field1) ;
run;
%mend count;

%count (field1=q1, field2=q10);

Merge data with Data step

Because usually the information for the subject of interest are scattered in different datasets, it is common to join the tables to gather all the variables of the subject.  There needs to be a common id field with the same data format in all the datasets that need to be merged.

Both dataset A and B have a variable id. Merge statement join the variables from A and B by the id variable in A.  All the records and fields from A will be in the merged table and those records with equal id field from B will be added to the records in the merged table.

Caution:

  1. Make sure id in A is unique
  2. id field in both A and B should be the same type of field (character or numeric)
  3. After merging the table, need to check if there is any duplicated information and why.

data merged;

merge A (in =a) B;

by id;

if a;

run;

 

SAS: data extraction and merge routine (2)

In the following data step, table 1 and table 2 have the same data structure. Setting multiple tables in the data step will stack the tables. First, the where statement will make the program more efficient by applying the conditions to all the datasets in the Set statement. If statements create subsets and apply action(s) to specific subsets.
Data temp;
set table1 table2;
where  var1 = ‘condition1’ and var2 = ‘condition2’ and var3 NE ‘condition3’;
if var4  = a and var5 = ‘3’ then do; output; end;
if var4 = b and var5 IN (‘1′,’2’) then do; output;end;
run;

SAS: Export to Excel with Label Option and 9.3/9.4 Difference

It is common to export SAS data tables or outputs to Excel spreadsheets.  The default Excel output will only contain field names in the first row of the spreadsheet.  The ‘label’ option will allow the SAS label names to be the column names on the first row of the exported table.

Method 1:

PROC EXPORT DATA=table1.
OUTFILE= “C:\Work\SAS\table1.xlsx”
label DBMS=EXCEL REPLACE;
SHEET=”table1″;
NEWFILE=YES;
RUN;

method 2:

libname excelout “X:\Work\excelfile.xlsx”;

data excelout.sheet1 (dblabel=yes) ;
set tablename;

run;

libname excelout clear;

9.4 SAS Difference

PROC EXPORT

PROC EXPORT DATA= SASHELP.CARS 
OUTFILE= "C:\Excel\car.xlsx" 
            DBMS=EXCELCS REPLACE;
     SHEET="car"; 
RUN;
  • Bug: a blank tab (_SAS_empty_) with A1 = “_empty_(CHECK_OTHER_SHEET) was generated in the xlsx file.

Solution: https://communities.sas.com/t5/SAS-Programming/Proc-Export-creating-a-blank-SAS-empty-sheet-in-the-exported/td-p/483623

  • Change DBMS=EXCEL to DBMS=XLSX and the blank tab will disappear.
PROC EXPORT DATA= SASHELP.CARS 
OUTFILE= "C:\Excel\car.xlsx" 
            DBMS=XLSX REPLACE;
     SHEET="car"; 
RUN;

Alternative 1: LIBNAME method

  • Add xlsx in the libname statement;
  • dblabel = option not work for the xlsx engine.
Libname exl xlsx "C:\Excel\car.xlsx";
data exl.cars;
set SASHELP.CARS ;
run;
libname exl clear;

Alternative 2: ODS EXCEL method

  • ods excel; ods excel close;
  • file =” “;
  • ods excel options (sheet_name = ” “);
  • proc print noobs label data=;run;
ods excel file= "C:\Excel\&fac._research.xlsx";
ods excel options (sheet_name = "&fac.");
proc print noobs label data =research; 
where faculty = "&fac.";
run; 
ods excel close;

SAS: Identfy and Remove Duplicate Records

Step 1: Check duplicate records by one field or a combination of fields.

proc freq data=temp noprint;
table var1 /out =dupl (keep =var1 count where = (count >1));
run;

proc freq data=temp noprint;
table var1*var2 /out =dupl (keep =var1 var2 count where = (count >1));
run;

Examine the output to see what circumstances can cause duplicate records.

Step 2: Remove duplicate records.

proc sort data=temp NODUPKEY;
by var1 ;
run;

proc sort data=temp NODUPKEY;
by var1 var2 ;
run;

Alternative method to output unique records and duplicated records in two separate datasets.

proc sort data =temp out=temp1;
by var1 var2;
run;
data unique dup;
set temp1;
by var1 var2;
if first.var2 and last.var2 then output unique;
else output dup;
run;

 

SAS:Output SAS Results to Excel Pivot Table

Purpose:

  • To create a pivot table from sas data sets. Apply specific configuration to the pivot table.

Download the tagset from SAS support website.

  • “http://support.sas.com/rnd/base/ods/odsmarkup/tableeditor/tableeditor.tpl”

Save the tagset to local directory

  • “C:\Users\xxxxxx\Documents\My SAS Files(32)\9.3\TableEditor\tableeditor.tpl”

Code Example:

filename temp  “C:\Users\xxxxxx\Documents\My SAS Files(32)\9.3\TableEditor\tableeditor.tpl”;
%include temp;
ods tagsets.Tableeditor file=”C:\Users\xxxxxx\Documents\My SAS Files(32)\9.3\AAPR\Excel\temp\temp.html”
options(BUTTON_TEXT=”Create PivotTable”
AUTO_EXCEL=”yes”
UPDATE_TARGET=”C:\\EXAMPLE.XLSX”
OPEN_EXCEL=”no”
PIVOT_SHEET_NAME= “Academic Program Report_UG”
PIVOTPAGE=”faculty”
PIVOTROW=”progname,category, level2, level3″
PIVOTCOL=”year”
PIVOTDATA=”value”
PIVOT_GRANDTOTAL=”no”
PIVOT_SUBTOTAL=”no”
PIVOTDATA_FMT=”#,###”
PIVOT_FORMAT=”medium14″
DELETE_SHEETS=”Sheet1, Sheet2, Sheet3, Table_1″
QUIT=”yes”
);

Proc print data = gr;
var faculty progname level2 level3 year value category;
run;
ods tagsets.tableeditor close;

Code Structure:

  1. set up tableedit targetset directory
  2. call ods statement
  3. config pivot table in options
  4. run proc print statement
  5. close ods

Notes for Options

  • Auto_Excel = “yes” (open Excel automatically)
  • Update_target=”C:\\example.xlsx” ( need to create the excelfile example in the directory first. Notice C:\\  in the directory)
  • Open_Excel= “no” (running excel at the backgroup)
  • Pivotpage= “faculty” (identify field for report filter)
  • Pivot_grandtotal=”no” (hide the grandtotal in the pivot table default view)
  • Pivot_subtotal= “no” (hide the sub-total in the pivot table default view)
  • Quit= “yes” (exit excel)

Problems Identified

  • format in the option setting not working
  • print procedure doesn’t take long (22 seconds) for 20,000 records table, but ods process take longer time to generate the pivot table.  There is no stop signal to show that the process has been completed.
  • The excel file doesn’t seem to be updated until you open the file and the prompt would ask if you would like to save the change and if you click yes the updated the pivot table will show.

 

SAS SQL: Merge dataset with multiple conditions

Assume all the macro variables are defined. There are two lists A and B. A is list of year n students and B is a list student id for year n+1 students in the same program.  Step 1 is to create table that includes all the students with study level 1 or 2 in A but not in B (1 year later).  This step identifies the student population of study level 1 and 2 who are lost to the program one year later. Notice A and B are used as table aliases to simplify the statement. Also the syntax for left join needs to be followed by  the on statement to specify the column field to join the two table.

proc sql;
create table nov&yr.not as
select A.* from nov&yr. as A
left join nov&nextyr.id as B
on A.sisid = B.id
where B.id is null and A.std_lvl in (’01’,’02’)
;
quit;

Second step is to identify the students who changed their major to Psychology from results of step 1.  Notice I omit As when I assign the aliases.
Proc sql;
create table nov&yr.chg as
select A.*, B.sisall, B.newm1 from nov&yr.not A, nov&nextyr.all B
where A.sisid = B.sisall
and A.major1 ne B.newm1
and B.newm1 eq ‘PSYC’
;
quit;

SAS: Set start and stop for macro variables

Sometimes I need to repeat the same exercise for multiple times, for example faculty want retention rate not just for one year but multiple year, it is easy to use %macro macroname (parameter1, parameter2) in conjunction with %do varname = &value1 &to &value2 to customize the year range in the do loop statement per clients request.

eg;

%macro step (strtyear, stopyear);
%do yr= &strtyear %to &stopyear;

….continue coding

%end;

%mend step;

%step (11, 14);

 

SAS: Convert Variable between Numeric and Character Format

A. Use Put or Input
Convert between numeric and character variable.

  • character to numeric (input)
old_char = "2018";
new_num = input(old_char, 8.);
new_num = 2018;
  • character to numeric (input) to character (put)
old_char = "2018";
new_char = put (input (substr(old_char , 1, 4 ), 8.) -1 , 4.);
new_char = "2017";

/* or */
new_char = put (old_char*1-1 , 4.);
new_char = "2017";
  • numeric to character (put), and with leading Zero.
old_num = 2018;
new_char = put(old_num, 4.); *new_char = "2018";
new_char1 = put(old_num, z8.); *new_char1 = "00002018";
  • numeric to character (put) to numeric (input)
old_num = 2018;
new_num = input(substr(put(old_num, 4.),3, 2) , 8.);
new_num = 18;

Use the following functions to check whether any digit or alphabetic character is in the character variable. The results will be the position of the first digit/alphabetic character in the field.

check1 = anydigit(var); *return position of first digit, 0 if not found in the string;
check2 = anyalpha(var); *return position of first alphabetic character, 0 if not found in the string;
check3 = notalpha(var); *return position of first non digit, 0 if not found in the string;
check4 = notdigit(var); *return position of first non alphabetic character, 0 if not found in the string;
  • Please note there should be no space in between % and statement eg. macro, let, mend, put, eval etc.
  • To apply the macro variable, use &varname. format.

CODE:

%macro setyr;
%let yr = 14;
%let nextyr =%eval(&yr. +1);  /* 15 */
%let next2yr=%eval(&yr. +2);  /* 16 */
%let fisyr = “20%substr(&yr., 1,2)/%substr(&nextyr., 1,2)”;   /* “2014/2015” */
%let fisnextyr = “20%substr(&nextyr., 1,2)/%substr(&next2yr., 1, 2)”;   /* “2015/2016” */
%let year = “20&yr.”; /* “2014” */
%put &yr;
%put &nextyr;
%put &next2yr;
%put &fisyr;
%put &fisnextyr;
%put &year;
%mend setyr;
%setyr;

OUTPUT:
14
15
16
“2014/15”
“2015/16”
“2014”

B. Use Vvalue fuction
Vvalue() returns the formatted value that is associate with the variable.

data want;
data have;
new_charvar1 = vvalue(formatted_numvar1);  *formatted means the variable has been applied with format;
new_numvar2= vvalue(formated_charvar2);
run;

SAS: Variables/Fields with same Prefix

I have a SAS file that stores students course selection (up to 20 fields) and grades for each course (up to 20 fields) for a certain semester. To set the sas working file, I use the follow data step. Both course variables and grades variables have the common prefix “crs” and “grade”.  It is more efficient to list the 20 course variables using “crs1-crs20” than list each and every one of them.

%let varlist = crs1-crs20 grade1-grade20 studentid major;

Data tem1;

Set  coursefile_fw16 (keep = &varlist.);

run;