01. In the example below, the input data set is a Hive table accessed using a SAS/ACCESS to Hadoop LIBNAME statement.
proc freq data=hivelib.myhivetable;
by year;
run;
Which statement is true about this program?
A. The procedure will fail unless the table HIVELIB.MYHIVETABLE is already stored ordered by YEAR.
B. SAS will generate a HiveQL query to return the data to SAS ordered by YEAR so that the procedure receives the data ordered by YEAR as required.
C. BY statements do not require the data be received by the procedure in any specific order.
D. BY statements are not supported for Hive tables because it is not possible to order data that is distributed on different nodes of the Hadoop cluster.
02. When working with data stored in Hadoop, which SAS function is NOT passed to Hive by default?
03. What is an advantage of using a LIBNAME statement to interact with your Hadoop cluster?
A. It enables you to submit user-written HiveQL code to Hive.
B. The GENERATE_PIG_CODE= option enables you to bypass Hive and generate Pig Latin code.
C. It enables some SAS procedures to push processing into Hive.
D. It ensures that Hive will handle all processing.
04. Refer to the log message shown below:
58 proc ds2;
59 data test;
60 dcl double date;
61 method run();
62 set work.one;
ERROR: Compilation error.
ERROR: Parse encountered type when expecting identifier.
ERROR: Parse failed on line 60: dcl double >>> date <<< ;
NOTE: PROC DS2 has set option NOEXEC and will continue to prepare statements.
Which of the following changes will fix the errors shown in the log?
A. Replace line 60 with dcl double 'date';
B. Replace line 60 with dcl double "date";
C. Replace line 60 with dcl string date;
D. Replace line 60 with dcl double 'date'n;
05. Which statement creates a temporary array within DS2?
A. vararray double a;
B. vararray double a(2);
C. array a(2) s1-s2;
D. dcl double a;
06. The following SAS program is submitted:
dcl timestamp order_timestamp;
dcl double order_datetime;
order_timestamp = to_timestamp(order_datetime);
What happens when the program is executed?
A. The variable order_timestamp is created and processed as an ANSI timestamp value in the DS2 program. The order_timestamp value is converted to a SAS datetime when it is written to the output SAS data set.
B. The variable order_timestamp is created and processed as an ANSI timestamp value in the DS2 program. The output data set stores the values as a SAS timestamp value.
C. The variable order_timestamp is converted to a SAS time value. The output data set stores this as the number of seconds since midnight.
D. The program does not execute because order_datetime is a SAS datetime value.
07. Which operator is NOT a diagnostic operator for testing a Pig program?
08. Web server logs are written in an HDFS directory. The following lines indicate the format and an example of the comma-separated values for one line in the log file.
# IP address, timestamp, request, status, size
192.168.12.41,24/Nov/2015:10:09:58 -0500, "GET /services/config.xml HTTP/1.1",200,816
Which CREATE TABLE statement enables a Hive query to access each of the fields?
A. create external table weblogs (ip string, dt string, req string, status int, sz string) row format delimited fields terminated by ',' location '/data/weblogs';
B. create external table weblogs (ip string, dt string, req string, status int, sz string) fields terminated by ',' location '/data/weblogs';
C. create external table weblogs (ip string, rest string) row format delimited fields terminated by ',' location '/data/weblogs';
D. create external table weblogs (ip string, dt string, req string, status int, sz string) fields delimited fields by ',' location '/data/weblogs';
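As a side illustration of the log format in question 08, the sample line can be split on commas to see how it breaks into the five listed fields. This is a minimal Python sketch, not SAS or HiveQL. It assumes straight quotation marks in the sample line and a plain delimiter split with no quote handling. The names in `names` are hypothetical labels taken from the format comment, not column names from any of the answer choices.

```python
# Sample log line from question 08 (straight quotes assumed).
line = '192.168.12.41,24/Nov/2015:10:09:58 -0500, "GET /services/config.xml HTTP/1.1",200,816'

# Hypothetical labels, taken from the "# IP address, timestamp, ..." comment.
names = ["ip", "timestamp", "request", "status", "size"]

# Plain split on the delimiter: quotes and spaces are kept as ordinary characters.
fields = line.split(",")

# Pair each label with the corresponding piece of the line.
record = dict(zip(names, fields))

for name in names:
    print(f"{name}: {record[name]!r}")
```

Because the split does not interpret quotation marks, the request field keeps its leading space and its surrounding quotes, and every value is a string until it is converted.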
09. Many temporary tables may be created in the LASR server by PROC IMSTAT analysis actions. What happens to temporary tables when a PROC IMSTAT session is terminated?
A. All temporary tables are saved to the SAS server WORK library.
B. All temporary tables are purged from the LASR server.
C. The last temporary table created is saved to the SAS WORK library.
D. The last temporary table created is saved to storage in the HDFS.
10. This question will ask you to provide a line of missing code.
Which line of code would you insert to get the mean and standard deviation of INCOME and AGE, calculated separately for GENDER variable values F (female) and M (male)?
<insert code here>
A. summary income age / groupby=gender;
B. crosstab income*gender age*gender;
C. summary income age / by=gender;
D. univariate income age / by=gender;