The 27th annual SAS Users' Group International conference (SUGI27), April 14-17, fit right in with the magic and adventure of the Walt Disney World resorts where it was held. For those of us old enough to remember when E-tickets had nothing to do with airport check-in, the SUGI27 conference was a first-rate Disney "E-ticket ride." Indeed, I was happy to see a solidarity throughout the software product line that bespoke refinement rather than the radical revolutions we've seen in recent years. After whizzing from version 6.12 through an uneasy version 7 to a fledgling version 8.x, the software now appears to be setting a pace of measured improvements for version 9, characteristic of a mature product line.
Don't get me wrong: This has nothing to do with becoming "old fashioned". It has everything to do with making data management, analysis, and reporting easier, faster, and more like what we want it to be. The theme stressed by the company this year, "speed, flexibility, agility and strength," is a fair representation of what I saw throughout the conference, both in the software features and in SAS Institute's relationship with its clientele. SAS version 9, dubbed "Project Mercury," presumably for its speed enhancements (see "SAS Parallel Processing on your PC" below), is expected to be released later this year. The changes this time will be subtle for the most part, but powerful.
For news from SAS Institute announced at the SUGI27 conference, see: http://www.sas.com/news/feature/22apr02/sugiversion9.html
Point-and-Click SAS
The point-and-click wizard-based Enterprise Guide interface to SAS
has been such a success that it will find its way into base SAS software
in version 9. One might think of Enterprise Guide as "Visual SAS",
enabling the researcher or analyst to drag and drop data sets, actions,
and analyses in the process of building a data/analysis project. The
pathways you build to manipulate data and perform analyses are preserved
from one session to the next in the form of a project. And if you'd like to use
part of one project to create another, or replicate a series of steps within
a project, it's just a matter of copying and pasting.
Enterprise Guide has truly impressed me, despite my tradition as a programmer who likes to type everything in from the keyboard.
Sometimes it will be quicker, easier, or just more suitable to write out a SAS program the good old-fashioned way, but at other times, the point-and-click Enterprise Guide method will be just what you need. And you can tell it to show you the SAS code it generated to carry out the directions from the wizard.
In the current version of SAS (version 8.2), Enterprise Guide is still an add-on component, installed separately. It's available free to Penn State students as an individual annual license product through the software distribution program at Penn State. It's also included in the departmental site license package, and faculty and staff purchasing individual licenses can get it through the MOC Computer Store. For more information on Enterprise Guide, see: http://www.sas.com/products/guide/. A good way to get started with the software is the tutorial, which is included on the Enterprise Guide install CD and is also on the web at: http://www.sas.com/software/tutorialsv8/eg/
SAS Learning Edition
While at Penn State during the SAS software grant period,
students have free access to install a rather comprehensive set of SAS
products. However, once they leave Penn State, the software provided through
the grant program will expire and must be uninstalled. The good news is
that SAS Institute has announced a new "SAS Learning Edition",
which features all the SAS components available through Enterprise
Guide. This includes Base SAS, SAS/STAT, and SAS/GRAPH, among others.
The retail price will be $125 for a four-year license (expires December
31, 2006). It includes Enterprise Guide and an interactive tutorial as part
of the regular install, but the product will be limited to processing
no more than 1,000 observations. Maps and sample datasets are not
included, and technical support will be web-based only (i.e., no live
technical support). For more information about SAS Learning Edition, see:
http://www.sas.com/le/
SAS/Genetics
SAS software has long been a favorite among geneticists
working with huge arrays of data. However, it's never been enough. Most
genetic researchers who employ SAS in the management or analysis of their
data have found the software to be useful only in some intermediate
step of the process.
Once the data are organized nicely, packaged statistical analyses in SAS are performed easily, but the results are then usually exported into some other genetic analysis program, either commercial or homegrown, for further analyses and graphical presentation.
Naturally, a well-seasoned, advanced SAS programmer might frown and suggest that this all could be done within SAS anyway; all it takes is a LOT of fancy advanced programming. We've all heard this one before. Thankfully, SAS Institute finally has tailored a new set of tools designed for gene mapping.
Coming soon is a new set of SAS procedures, collectively called SAS/Genetics. SAS/Genetics procedures will include PROC ALLELE, PROC HAPLOTYPE, PROC CASECONTROL, and PROC FAMILY. In addition to these procedures, there will be a system-defined SAS macro, %TPLOT, which will simplify the process of graphing the results from genetic marker tests.
SAS/Genetics has not yet been released, but it will be added to the Penn State SAS license as soon as it becomes available.
Take a look at the SUGI27 paper I heard presented by Wendy Czika of SAS Institute at: http://www.sas.com/rnd/papers/sugi27/genetics.pdf
Another powerful genetics solution from SAS Institute is SAS Research Data Management (RDM), formerly called "G23". RDM focuses on web-accessible, scalable management of genomics data warehouses for multiple remote research groups. It's a pricey specialized feature that's beyond the regular university licensing program. However, interested researchers may apply for a matching grant from SAS Institute through the Campus Innovation Grants Program (see below). A fact sheet on RDM/G23 in PDF format can be found at: http://www.sas.com/industry/pharma/g23fact.pdf, and an MS PowerPoint product presentation is available at: http://www.sas.com/industry/pharma/g23.ppt
SAS Campus Innovation Grants Program
Although Penn State currently licenses a wide array of SAS
software products, a number of higher-level advanced products and solutions
are not included. SAS solutions are pre-packaged tools that the Institute
has developed using the underlying SAS software to make certain analytic
and data management procedures much easier and more seamless. With
these solutions, SAS Institute has done the high-level programming and
packaged it up into an easy-to-use interface. The Research Data
Management package (see above) is an example of one such solution.
SAS Institute has announced the Campus Innovation Grants Program to help researchers obtain such products and solutions for their teaching and/or research. Through this program, SAS will match funds on a 2-for-1 basis for academic, research and service initiatives and on a 1-for-1 basis for administrative initiatives.
For more information on the SAS Campus Innovation Grants Program, see: http://www.sas.com/industry/highered/grant/
SAS Parallel Processing on your PC
With version 9, SAS software finally will support threads.
This means those of you with more than one CPU in your computer will
be able to run SAS jobs that take advantage of multiple CPUs.
Depending on what you're doing, this may dramatically reduce the wall time*
it takes to run a SAS job. This comes with no changes to your existing
SAS code. The SORT procedure, along with SUMMARY, REG, and
GLM, among others, will be enhanced. This doesn't mean that all your
SAS programs will automatically run faster; speed improvements
will depend on a variety of specifics,
such as the exact nature of the analysis you're performing, your
hardware, and the operating system running on your computer.
Already available in version 8.2 is SAS/MPCONNECT, which gives us the ability to run entire DATA and PROC steps simultaneously on multiple CPUs, either within the same computer or across remote machines. This is different from thread support. Thread support enables parts of a single analysis to be run simultaneously, as may be the case when solving separate parts of a single matrix calculation within a procedure. MPCONNECT simply allows one to take chunks of a SAS program that can be run independently, and run them simultaneously. For example, one could run two DATA steps simultaneously to create two SAS data sets, sort them simultaneously, and then, after the two PROC SORTs are finished, MERGE them. After that, a PROC LOGISTIC and a PROC GLM could be run simultaneously to analyze the merged data set. Without MPCONNECT, each of these steps would be run sequentially.
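To make the shape of that workflow concrete, here is a small sketch of the same dependency pattern. It is written in Python rather than SAS, so it is only an analogy and not MPCONNECT syntax: every function name and the tiny sample data are made up for illustration, the merge keeps only matching IDs (a simplification of what a SAS MERGE does), and Python threads show the scheduling idea rather than a real multi-CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def build_and_sort(rows):
    """Stand-in for a DATA step followed by PROC SORT: copy the rows, sort by id."""
    return sorted(list(rows), key=lambda record: record[0])


def merge_by_id(left, right):
    """Stand-in for MERGE ... BY id (simplified: keeps only ids found in both)."""
    right_lookup = dict(right)
    return [(key, value, right_lookup[key])
            for key, value in left if key in right_lookup]


def mean_height(records):
    """Stand-in for one analysis on the merged data."""
    return sum(r[1] for r in records) / len(records)


def max_weight(records):
    """Stand-in for a second, independent analysis on the merged data."""
    return max(r[2] for r in records)


def run_pipeline():
    # Hypothetical sample data: (id, height) and (id, weight) records.
    heights = [(3, 170), (1, 182), (2, 165)]
    weights = [(2, 60), (3, 72), (1, 88)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stage 1: the two build-and-sort tasks do not depend on each other,
        # so both are submitted at once.
        sorted_h = pool.submit(build_and_sort, heights)
        sorted_w = pool.submit(build_and_sort, weights)
        # Stage 2: the merge needs both results, so wait for them here.
        merged = merge_by_id(sorted_h.result(), sorted_w.result())
        # Stage 3: the two analyses only read the merged data, so they
        # can again run side by side.
        mean_future = pool.submit(mean_height, merged)
        max_future = pool.submit(max_weight, merged)
        return merged, mean_future.result(), max_future.result()


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is the two "wait here" moments: the merge cannot start until both sorts are done, and nothing downstream can start until the merge is done. Everything else is free to overlap.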
Making use of SAS/MPCONNECT to speed up long-running SAS jobs is fairly easy. The first step is to figure out which parts of your SAS program can be run simultaneously, and which parts depend on the completion of earlier code (e.g., creating a data set before analyzing it). After that, it's simply a matter of adding some MPCONNECT directives to your code, perhaps rearranging some sections of your code, and running the job. For more information about SAS/MPCONNECT, see: http://www.sas.com/service/news/feature/21aug00/mpconnect.html
* "wall" time (i.e. the time as measured by the clock on the wall, in contrast to CPU time)
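For readers who would like to see the difference between the two clocks, here is a minimal sketch. It uses Python's standard timers rather than anything from SAS, and the function name and the 0.2-second pause are arbitrary choices for illustration. The pause stands in for time spent waiting (on disk, network, or another job), which counts toward wall time but uses almost no CPU time.

```python
import time


def measure_wall_and_cpu(pause_seconds=0.2):
    """Return (wall_seconds, cpu_seconds) for a task that mostly waits."""
    wall_start = time.perf_counter()   # clock-on-the-wall timer
    cpu_start = time.process_time()    # CPU time used by this process
    time.sleep(pause_seconds)          # waiting: the wall clock advances, the CPU sits idle
    wall_elapsed = time.perf_counter() - wall_start
    cpu_elapsed = time.process_time() - cpu_start
    return wall_elapsed, cpu_elapsed


if __name__ == "__main__":
    wall, cpu = measure_wall_and_cpu()
    print(f"wall time: {wall:.3f} s, CPU time: {cpu:.3f} s")
```

Note that the relationship can also run the other way: a job that keeps several CPUs busy at once may rack up more total CPU time than wall time, which is exactly why multi-CPU support is aimed at reducing wall time.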