PowerBI and R Integration – Connect SAS dataset

Our database infrastructure is in SAS. Although many data requests can be handled directly through Base SAS, but SAS doesn’t have good visualization capability.

Directly connect PowerBI with SAS raw datasets or analytical result in SAS datasets from PowerBI through R integration can efficiently demonstrate the insights of data.

Here is the R script to connect the SAS dataset to PowerBI.

SAS: Proc SQL Group By generate duplicate results and fix

I need to get a list of sum of award amount by year by student id and by award. The following codes generated duplicated results and a warning message.

Problem:

proc sql;
create table want as
select year, id, award, funding_type, type, sum(amount) as sum, 1 as count
from have
where (funding_type eq "A" and type = "Internal") or (funding_type eq "B")
group by year, funding_type, sisid, award;quit

“The query requires remerging summary statistics back with the original data. “

Fix: “In SAS SQL, in a query with a group by clause that includes extraneous columns on the select statement (i.e. columns not part of the group by and not derived from an aggregating function), SAS “remerges” the summary statistics back to the original data (with a note to that effect).”
Reference:
https://stackoverflow.com/questions/25538392/sas-proc-sql-returning-duplicate-values-of-group-by-order-by-variables

Because “type” variable is in the select statement but not in the group by statement, SAS needs to remerging the summary statistics back with the original data to get information on the “type” variable. In the following codes, I added “type” to the group by statement and the warning message was gone.

proc sql;
create table want as
select year, id, award, funding_type, type, sum(amount) as sum, 1 as count
from funding
where (funding_type eq "A" and type = "Internal") or (funding_type eq "B")
group by year, funding_type, type, id, award;
quit;

Proc SQL: Update with other table

Update statement in Data step:

Syntax:

Data master;
update master  transaction;
by key1 key2;
run;

Limitation:

  • The master data set should be sorted by the by group with no duplicated grouping id.
  • The transaction data set should be sorted by the by group with no duplicated grouping id.
  • Transaction data set should have just the column that contains the updated information.

Update using Proc SQL Update statement:

Syntax:

proc sql;
update table1 as t1
set major=(select major from table2 as t2 where t1.id = t2.id and t1.year=t2.year)
where exists (select 1 from table2 t2 where t1.id = t2.id and t1.year=t2.year);
quit;

Limitation:

  • Can’t use join statement. Can only use nested Select statement.
  • Code is a little complicated to comprehend.

SAS: Identify Lag and Lead Status by Row

For character variable.

data want;
set have;
by id;
set  have (firstobs =2 keep = index0 rename= (index0= index1))
     have (obs=1 drop = _all_ );
indexlag1 = ifc( first.id, '', lag1(index0));
index1 = ifc( last.id, '', index1);
run;

For numeric variable.

data want;
set have;
by id;
set  have (firstobs =2 keep = index0 rename= (index0= index1))
     have (obs=1 drop = _all_ );
indexlag1 = ifn( first.id, (.), lag1(index0));
index1 = ifn( last.id, (.) , index1);
run;

SAS: Identify Row Level Changes and Tagging

Sort the dataset first and identify the change using lag(var).
The example below shows if the student cumulative credits drops from previous term, the count of degree will increase by 1.

/* identify the row observation that the change took place */
data data2;
set data1;
by sisid term; 
if first.sisid then deg =1;
else if sisid = lag1(sisid) and cumm_cr < lag1(cumm_cr) then deg = 2;
run;
/* tagging the observations associated with the change */
data data3 ;
set data2;
by sisid;
retain temp;
if first.sisid then do; temp = 0; end;
if deg ne . then temp = deg;
else if deg eq . then deg= temp;
run;

Updated to

data data2;
set data1;
by sisid term; 
retain deg;
if first.sisid then deg =1;
else if sisid = lag1(sisid) and cumm_cr < lag1(cumm_cr) then deg = deg +1;
run;

SAS: Reading Census Data

Census data is exceptionally large. The 2016 census profile for Ontario is 4.5G and has more than 46,694,909 lines of records. To extract the data efficiently, StatsCan provides a csv file that identifies the starting row number for each geography. Using this file, you can compile the parameter list for the geographical area of interest at the selected geographic level, eg. province level, census division level, and census subdivision level.
Census file can be downloaded at link
Step 1: Compile parameter lists

%put &name.;
Canada Ontario Durham York Toronto Peel Halton
 %put &start.;
2 2249 7287023 9513800 12198965 20521853 25289987
 %put &end.;
2248 4495 7289269 9516046 12201211 20524099 25292233

Step 2: Extract and Compile data

%macro ext (namelst=, startlst=, endlst=);
proc datasets library=work noprint;
delete census;
quit;
%let i=1;
%do %while (%scan(&namelst., &i, ' ') ne );
%let parm1=%scan(&namelst., &i, ' ');
%let parm2=%scan(&startlst., &i, ' ');
%let parm3=%scan(&endlst., &i, ' ');
data census_&parm1.;
infile 'X:\Work\Stats Can\98-401-X2016044_ONTARIO_eng_CSV\98-401-X2016044_ONTARIO_English_CSV_data.csv'
delimiter = ',' firstobs=&parm2. obs=&parm3. TRUNCOVER  DSD LRECL=32767 ;
INFORMAT 
year 8.
geo_code  $13.
geo_level 8.
geo_name $80.
gnr 8.1
gnr_lf 8.1
quality_flag $5. 
alt_geo_code 8.
Item $50.
itemID 8.
Notes 8.
Total 8.
Male $8.
Female $8.
;
FORMAT 
year 8.
geo_code  $13.
geo_level 8.
geo_name $50.
gnr 8.1
gnr_lf 8.1
quality_flag $5. 
alt_geo_code 8.
Item $80.
itemID 8.
Notes 8.
Total 8.
Male $8.
Female $8.
;
input
year 
geo_code $
geo_level 
geo_name $
gnr
gnr_lf
quality_flag 
alt_geo_code 
Item $
itemID 
Notes 
Total
Male $
Female $
;
run;
proc append base = census data = census_&parm1.;
run;
%let i = %eval (&i +1);
%end;
%mend ext;
%ext (namelst=&name., startlst=&start., endlst=&end. );

The resulting dataset is only 4.1mb, which you can manipulate efficiently.

SAS: Convert format catalogs from 32bit to 64bit

Reference: http://support.sas.com/kb/44/047.html

Error message:
ERROR: File FORMATS.CATALOG was created for a different operating system.
Step 1: In windows 32-bit SAS, create a transport file (.cpt) with PROC CPORT and file option.

libname my32 'X:\Work\SAS\formats'; /* path where commonfmt.sas7bcat exists */
filename cat1 'X:\Work\SAS\formats\commonfmt.cpt';  /* transport file you are creating */

proc cport lib=my32 file=cat1 memtype=catalog;
   select commonfmt;
run;

The .cpt file will contain the format information of commonfmt.sas7bcat catalog file.

Step 2: In windows 64-bit SAS, unload the transport file (.cpt) using PROC CIMPORT and infile option.

libname my64 'X:\Work\SAS\format64';  /* path to store the new Formats.sas7bcat file */
filename trans1 'X:\Work\SAS\formats\commonfmt.cpt';  /* same as in Step 1 above */

proc cimport infile=trans1 lib=my64;
run;

Step 3: Check the formats in windows 64-bit SAS.

libname sasfmt  'X:\Work\SAS\format64';

PROC CATALOG CATALOG = SASFMT.COMMONFMT;
CONTENTS;
QUIT;

SAS 9.4 ODS RTF file corrupt

Recently, I updated from SAS 9.3 to 9.4. The code was working fine on 9.3. It contains a macro running do loops, producing graphs for each loop, and outputting all the graphs into a rtf file. When I ran the code in 9.4, I can’t open the file in Word. Google search recommends update the sasv9.cfg file to increase the -memsize and -memmaxsz to 6G (https://communities.sas.com/t5/ODS-and-Base-Reporting/SAS-ODS-RTF-file-corrupt/td-p/282779). Another alternative I find is to just change ODS RTF to ODS PDF without changing any code. When running the ODS PDF statements, I got the following warning, but the PDF file was generated without a problem.

Warning: Unsupported device "ACTIVEX" for LISTING destination.  Using device "ACTXIMG".

To get rid of the warning, in SAS EG, navigating to Tools/Options…/Results/Graph. In the “Graph Format” dropdown list, changing “ActiveX” to “ActiveX image”.

SAS Installation: ” Connect: Class not registered “

Issue: Installed SAS 9.4 on the machine where SAS 9.3 has already been installed. Not able to import Excel file in SAS 9.4.
Error message:
ERROR: Connect: Class not registered
ERROR: Error in the LIBNAME statement
Configuration:
– office 365 (32 bit)
– office 2010 (64 bit)
– Windows 7 (64 bit)
– SAS 9.3
– SAS 9.4 (64 bit)
– odbc 64bit: C:\Windows\System32\odbcad32.exe
– odbc 32bit: C:\Windows\SysWOW64\odbcad32.exe

Possible Reason: SAS PC Files Server not installed or not working properly. Need Excel ODBC driver in the same bitness as in SAS.
reference: http://support.sas.com/kb/60/356.html
Attempt 1: Stop SAS PC Files Server in Service (administrative tools); uninstal SAS PC Files Server; download SAS PC Files Server from (pcfilesrv__94200__wx6__en__web__1.zip) from https://support.sas.com/downloads/package.htm?pid=2167; unzip, install from setup.exe; get Java private environment error when installing PC Files Server.
Results 1: not working

Possible Reason: SAS 9.4 is not connected to the right ODBC. ODBC 32bit has Excel in the driver list but ODBC 64bit doesn’t have Excel in the driver list.
reference: https://support.microsoft.com/en-ca/help/942976/odbc-administrator-tool-displays-both-the-32-bit-and-the-64-bit-user-d
Attempt 2: run as administrator on both ODBC files; add excel driver as the system data source in the “System DSN” tab; change target from “\system32\odbcad32.exe” to \syswow64\odbcad32.exe” in the property of “Data Sources (ODBC) in Administrative Tools
Results 2: not working

Possible Reason: office 365 doesn’t have the match Access Engine for program external to Microsoft Office to access the Office 365 application data
Attempt 3:
— download Access Database engine 2016 distribution (32bit)
— install to cmd, cd to where the installation file is saved, run AccessDatabaseEngine.exe /quiet /passive
— open registry editor on window 7: windows key + r -> run regedit -> yes
— navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\16.0\Common\FilesPaths, delete mso.dll
reference: http://support.sas.com/kb/43/933.html
https://knowledge.autodesk.com/support/autocad-map-3d/learn-explore/caas/screencast/Main/Details/7132d430-dc09-40bc-9148-94ff9db41c24.html
Results 3: not sure why but WORKED, but doesn’t work on the Office 365 32bit installation.

Attempt 4:
— uninstall Microsoft Office 365 (32bits)
— install Microsoft Office 365 (64bits)
— uninstall Access Database engine 2016 distribution (32bits)
— install Access Database engine 2016 distribution (64bits)
— install to cmd, cd to where the installation file is saved, run AccessDatabaseEngine.exe /quiet /passive
— open registry editor on window 7: windows key + r -> run regedit -> yes
— navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\16.0\Common\FilesPaths, delete mso.dll
Results 4: worked. matched SAS 64bits with Office 365 64bits