SAS: Create SAS Table from List of Strings

Normally I need to compile a unique list of text strings from SAS table and run report for each item in the list through a %do loop. See details in the previous post. In this case, I need to convert the list of strings back to SAS table using space as delimiter.

Solution

%let lst = apple orange grape carrot banana watermelon peach;
%macro createtbl;
data lst; 
length fruit $15.;
%let i = 1;
%do %while (%scan(&lst., &i., ' ') ne );
  fruit = "%scan(&lst., &i. , ' ')"; 
  output;
%let i = %eval(&i. +1);
%end; 
run;
%mend createtbl;
%createtbl;

Notes

  • Have to use macro with %macro and %mend because %do, %end, and %eval are not valid in open code. Macros with %macro and %mend are closed marcos.
  • Use single quotation ‘ ‘ for space and use double quotation ” ” for string. The other way will not work. In the above example, if I change the equation for fruit to ‘%scan(&lst., &i. , ” “)’, then the string %scan(&lst., &i., ” “) becomes the values in the fruit field without interpreting the macro variable in the equation and applying the scan function.
  • Need %let i = %eval(&i. +1) with the %do % while statement

Wrong Code Demo and Error Diagnosis

data lst; 
length fruit $15.;
do i =1 to 7;
  fruit = "%scan(&lst., i , ' ')"; 
  output;
end;
run;

Error Message

  • ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric
  • operand is required. The condition was: i
  • ERROR: Argument 2 to macro function %SCAN is not a number.

Diagnosis

  • The second argument i is not a number. To solve this, have to use closed macro and %eval, so the value of i +1 can be calculated.

Other Methods that don’t work

  • Create empty table first and then add values by row.
  • Don’t know how to add values by row without using macro.
/* NOT USEFUL*/
data lst;
attrib 
    fruit length=$15 format=$15. label="fruit";
stop;
run;

Tableau User Experience Summary

Since our office got us both the Tableau and Power BI license, it is up to us to incorporate these data visualization tools in our analytical work. It takes a while to figure out how to perform particular tasks in both Tableau and Power BI, though the learning curve for Tableau is steeper, but once you figure out a way to accomplish what you suppose to do with the software, the efficiency and the output are quite amazing.

The Good

  • Perfect for presentation. Able to define the Size of the Dashboard to optimize the presentation results. For example, we can define the dashboard size to 1550 x768 for the wide screen presentation.
  • Easy to define data hierarchy. Once defined, the list result of the fields in the low hierarchy in the Filters pane will just show the filtered results determined by the fields in the higher hierarchy in the Filters pane. Details see example: Hierarchical Filter.
  • Abundant data connections. Able to link directly to statistical files, including SAS, SPSS, and R. Especially SAS dataset is the basic type of data that we maintain and update for all the reporting function. Tableau can also read different type of spatial files which is super convenient. Details see example of loading Census Shape Data.
  • Tableau prep can recognize tables in the PDF documents and combine different PDF tables into one data table. Details see example: Extract and Union Join Tables in PDF.

The Bad

  • The Individual visual has to be set up in each Worksheet tab first and then combined into Dashboard tab. It is kind of inefficient in terms of using the Worksheet/Dashboard/Story tab spaces, comparing to Power BI that you can directly create different visuals in one canvas.
  • Not able to union the datasets from multiple data sources in Tableau Desktop. You can only union datasheets in the same workbook, or files with similar names under same folder. The tables need to have the same number of fields with same field names and data type.
  • The name for the data connection doesn’t automatically update when you change the data source to a different spreadsheet. Have to use “Rename …” to manually change the name of the data connection.

The In-between

  • Sometimes it is not easy to understand what certain menu items are doing to the visualization, or where to click to change a particular aspect of the visual. eg. font size for title, label and legend; remove lines on the visual; show/hide title, label, legend, and axis etc.

The Particular

  • Ratio calculation: Need to understand the LOD expression to properly use the Fixed function and Include function. See example for ratio calculation using Fixed function.
  • Alias:
  • Set:
  • Entire View:
  • Size vs Angle:
  • Dual Axis:

VALIDVARNAM and VALIDMENNAME options

I have been using SAS for number of years, so I thought I have encountered enough cases for problem solving to cover the routine analysis work. This time I was trying to read some large SAS files and the SAS9.4 wouldn’t take it. I add the folder to the SAS library, but the SAS file in that folder doesn’t show up. Then I realized the SAS file name contains spaces and starts with numbers which violate the rules for SAS name. One of my colleague mentioned because he can view and open the SAS file through SAS enterprise guide, he think the file is fine. So I was wondering why SAS9.4 can’t open/view the SAS file but SAS enterprise guide can. It turns out SAS enterprise guide runs the options statements at the backend so it can take noncompatible filename and variable name.

OPTIONS VALIDVARNAME = ANY;
/* This option allows SAS9.4 to read variable names that contains space.  The default is V7.  */
/* V7 rules for SAS variable name:
- up to 32 char
- first char must begin with a letter or the underscore
- trailing blanks are ignored
- cannot contain blanks or special characters */ 

OPTIONS VALIDMEMNAME = EXTEND;
/* This allows SAS9.4 to view data set with name contains blanks and some special characters.  The default is COMPATIBLE. */

Once you run the options of VALIDVARNAME and VALIDMENNAME as shown above in SAS9.4. The SAS dataset will show up in the explorer window and you can click to see the column information. You still need to change the SAS dataset name for the data step to run properly.

Data: REST API

API: Application Program Interface

A. Operation

  • GET: Retrieve
  • POST: Create
  • PUT: Update
  • DELETE: Remove

B. Formats

  • HTML
  • XML
  • Plain text

C. Use

  • Social: Twitter, Facebook
  • Utilities: Dropbox, Google Maps
  • Commerce: Stripe, Mailchimp

D. Other services

  • HATEOAS: Hypermedia As The Engine Of Application State
  • Postman: API testing
  • Spring Data REST

E. Application

  • Python: request

SAS: Multiple Set statements to combine and summerize datasets

A. Code

  • Gather data
%let var = field1 field2 field3 field4 field5;
%let yr = 17;
%let nextyr = 18;
data data&yr.;
length group $1.; 
set Jun&yr. (keep=&var.)
jul&yr. (keep=&var.)
nov&yr. (keep=&var.)
feb&nextyr. (keep=&var.)
mar&nextyr. (keep=&var.);
where field1= "YES";
group = substr(field2, 1, 1);
count = field3;
run;
  • Summarize data
%macro sum (varname, cntvar, group, where, out);
%put &varname.;
%put &cntvar.;
%put &where.;
proc summary nway missing data =data&yr. (where= (field4=&where.));
class field1 &group.;
var &cntvar.;
output out=&out. (drop= field1 _freq_ _type_ ) sum (&cntvar.)=&varname.;
run;
%mend sum;
%sum ( var6, count, group, 'CAT1' , t1);
%sum ( var6, count, , 'CAT1' , sum1);
%sum ( var7, count, group, 'CAT2' , t2);
%sum ( var7, count, , 'CAT2' , sum2);
%sum ( var10, count, group, 'CAT1' and field5 = 'NOV' , t3);
%sum ( var10, count, , 'CAT1' and field5 = 'NOV', sum3);
%sum ( var11, count, group, 'CAT2' and field5 = 'NOV' , t4);
%sum ( var11, count, , 'CAT2' and field = 'NOV' , sum4);
  • Combine data
data data_new;
set t1 sum1 ;
set t2 sum2;
set t3 sum3;
set t4 sum4;
if group = '' then group = 'X';
run;

B. Output

T1 output
Sum1 output
combining T1, T2, T3, T4, Sum1, Sum2, Sum3, Sum4 by group alignment

PowerBI: App Access and Manage Roles

A. Manage roles – in PowerBI desktop

  • on the report, there is a “Language Course Subject Selection” Box
  • need to set up roles so user can access report information only for certain language.
  • under the “Modeling” tab, click “Manage roles”
  • In the “Manage roles” dialog box, click “Create” button to create new roles.
  • I have created 3 roles: “ESL”, “German”, and “Universal Access”.
  • Tables section listed the all tables in the powerBI file. I have two tables in this file: data1 and lang_dimension1. data1 is the student level demographic fact data and lang_dimenstion1 is the language course information.
  • “ESL” role will filter the results for language course description for only “English as a Second Language” courses.
  • “German” role will filter the results for only German language course.
  • “Universal Access” role has no filter.
  • The “Table filter DAX expression” section can be used to define the filter(s)
  • If the filter is applied to a particular role, the filter sign will appear next to the … (more) sign of the table .
  • Click … to access more options of the roles or tables.
The examples shows configuration of the “German” role with the DAX expression on the “lang_dimension1” table subjdesc field.

B. View as Roles – in PowerBI desktop

  • under the “Modeling” tab, click “View as”
  • check “German” and OK
View as “German” role
  • The visual will only show the results for German language courses and “Language Course Subject Selection” will only have “German” listed.
  • There is also a yellow warning message appeared on the top on the visual stating “Now viewing report as: German” and click the “Stop viewing” to stop viewing the report as German role.
  • after setting up the roles and viewing the roles to make sure the roles are functional, publishing the report to the PowerBI workspace.

C. App Access – in PowerBI service

  • Be careful with creating and assigning roles to the report. As soon as you create a role in report, all the end users of the report need to be assign to a role. If no role is assigned, even you add the end users to the app, they will not able to view the powerbi visuals, and all the visuals become blank boxes with a message that you don’t have permission to the underlying dataset.
  • In the PowerBI workspace where the PowerBI report was published, go to the “Datasets” tab and click … of the report and select “Security”
  • select row-level security and add the emails of the member and save.
  • update app.

Git: GitHub Desktop

A. Download

B. Steps

  • set up username and password in github.com
  • in github desktop, create a new repository
Provide name of the repository and local path.
  • Commit files to branch
Make the initial commit. The commit is in the master branch.
  • Publish repository to GitHub
Undo if there is anything wrong.
  • Review the commit in History tab
  • ignore files setup
    • in the same directory of the files being committed, set up a .gitignore text file.
    • *.* and !*.sas : ignore all files in the directory except for .sas files and files in the subfolders.
    • /foldername: ignore subfolders

PowerBI: Analysis of MS Human Resources sample PBIX

A. Sample Dowload: https://docs.microsoft.com/en-us/power-bi/sample-human-resources

B. Report Layout (pages)

  • Info
    • Text Box
    • Image
  • New Hires
    • 4 Text Boxes (page Header, chart footer, additional instructions)
    • New Hires and Actives by Region and Ethnicity (line and Stacked column chart, Employee[actives] is a calculated field)
  • Actives and Separations
  • Bad Hires
  • New Hires Scorecard

C. Tables

D. Measures

  • Employee[Actives]= CALCULATE([EmpCount], FILTER(Employee, ISBLANK(Employee[TermDate])))
  • Employee[actives SPLY]= CALCULATE([Actives],SAMEPERIODLASTYEAR(‘Date'[Date]))
  • Employee[Active YoY % change = Divide([Actives YoY Var], [Actives SPLY])
  • Employee[Actives YoY var]= [Actives]-[Actives SPLY]