Question: 19

You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?

A. Decrease the number of clusters
B. Increase the number of clusters
C. Decrease the number of measures used
D. Identify additional measures to add to the analysis

Answer: A
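
Clusters that hold only 3 of 100,000 customers suggest that 8 is more clusters than the data supports, which is why reducing the number of clusters is the expected answer. The Python sketch below shows one way to flag such near-empty clusters from a list of cluster labels. The function name tiny_clusters, the min_share threshold, and the label counts are illustrative assumptions, not part of the question.

```python
from collections import Counter

def tiny_clusters(labels, min_share=0.001):
    """Return ids of clusters whose share of all points is below min_share."""
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, n in counts.items() if n / total < min_share)

# Hypothetical assignment of 100,000 customers to 8 clusters,
# two of which hold only 3 customers each.
labels = (
    [0] * 30000 + [1] * 25000 + [2] * 20000 + [3] * 15000
    + [4] * 5994 + [5] * 4000 + [6] * 3 + [7] * 3
)

print(tiny_clusters(labels))  # clusters 6 and 7 are flagged
```

If a check like this flags clusters, a common next step is to rerun k-means with a smaller k and compare the results.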

Question: 17

Which activity might be performed in the Operationalize phase of the Data Analytics Lifecycle?

A. Run a pilot
B. Try different analytical techniques
C. Try different variables
D. Transform existing variables

Answer: A

Question: 16

In data visualization, which type of chart is recommended to represent frequency data?

A. Line chart
B. Histogram
C. Q-Q chart
D. Scatterplot

Answer: B
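
A histogram groups values into bins and draws one bar per bin, with bar height equal to the number of values in that bin. The sketch below computes those bin counts in plain Python and prints a rough text histogram. The function name bin_counts, the bin width of 10, and the sample ages are made up for illustration.

```python
from collections import Counter

def bin_counts(values, width):
    """Count how many values fall in each [start, start + width) bin."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

ages = [21, 25, 34, 37, 38, 42, 45, 47, 49, 63]  # made-up sample

# One row per non-empty bin; the number of '#' marks is the frequency.
for start, n in bin_counts(ages, 10).items():
    print(f"{start:>3}-{start + 9:<3} {'#' * n}")
```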

Question: 15

For which class of problem is MapReduce most suitable?

A. Embarrassingly parallel
B. Minimal result data
C. Simple marginalization tasks
D. Non-overlapping queries

Answer: A
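
A problem is embarrassingly parallel when each input record can be processed without looking at any other record, which is the property the map step relies on. The sketch below imitates the pattern in a single Python process: it is not Hadoop code, and the function names map_phase and reduce_phase and the sample log lines are assumptions for illustration. The grouping by key that a real framework performs between the two phases is folded into the reduce function here.

```python
from collections import defaultdict

def map_phase(line):
    """Emit (token, 1) pairs for one input line, independently of other lines."""
    return [(token.lower(), 1) for token in line.split()]

def reduce_phase(pairs):
    """Sum the emitted counts for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

lines = ["GET /index.html", "GET /about.html", "POST /login"]  # made-up log lines

# Each line could be mapped on a different worker, since no line depends on another.
pairs = [pair for line in lines for pair in map_phase(line)]
print(reduce_phase(pairs))
```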

Question: 14

What does the R code nv <- v[v < 1000] do?

A. Selects the values in vector v that are less than 1000 and assigns them to the vector nv
B. Sets nv to TRUE or FALSE depending on whether all elements of vector v are less than 1000
C. Removes elements of vector v less than 1000 and assigns the elements >= 1000 to nv
D. Selects values of vector v less than 1000, modifies v, and makes a copy to nv

Answer: A
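
The question is about R, but the same idea can be shown with a rough Python analogue (a different language, used here only for illustration): filter the values below 1000 into a new collection and leave the original untouched. The sample values are made up.

```python
v = [250, 1000, 40, 3000, 999]  # made-up values

# Rough Python counterpart of the R expression nv <- v[v < 1000]:
# keep the elements below 1000 in a new list; v itself is not changed.
nv = [x for x in v if x < 1000]

print(nv)
print(v)
```

Note that 1000 itself is excluded because the comparison is strict, and that v still holds all five values afterwards, which rules out options C and D.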

Question: 11

The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in their massively parallel database. Which tool should they use to export the structured data from Hadoop?

A. Sqoop
B. Pig
C. Chukwa
D. Scribe

Answer: A

Question: 10

You are attempting to find the Euclidean distance between two centroids:

Centroid A’s coordinates: (X = 2, Y = 4)

Centroid B’s coordinates: (X = 8, Y = 10)

Which formula finds the correct Euclidean distance?

A. SQRT((2-8)^2 + (4-10)^2) or 8.49
B. SQRT(((2-8) x 2) + ((4-10) x 2)) or 12.17
C. ((2-8)^2 + (4-10)^2) or 72
D. ((2-8) x 2 + (4-10) x 2) or 148

Answer: A
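
The Euclidean distance is the square root of the sum of squared coordinate differences: (2-8)^2 = 36 and (4-10)^2 = 36, so the distance is SQRT(72), about 8.49. A minimal Python check of that arithmetic follows; the function name euclidean is an illustrative choice.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

centroid_a = (2, 4)
centroid_b = (8, 10)

print(round(euclidean(centroid_a, centroid_b), 2))  # 8.49
```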

