Unstructured Data Classification Fresco Play MCQs Answers

Disclaimer: These solutions are provided to help and support those who are unable to complete these courses because of some difficulty or a gap in knowledge. All material and information on this website is for knowledge and educational purposes only.

Try to understand these solutions and then solve your Hands-On problems yourself. (Copying and pasting these solutions is not encouraged.)

Course Path: Data Science/MACHINE LEARNING METHODS/Unstructured Data Classification

All the quiz questions are listed below. For ease of use, press Ctrl + F to find a question.

Suggestion: If you can't find a question, search by its options to get a more accurate result.

Classification Quiz

1.Identify the unstructured data from the following.

1. Image
2. Data from MySQL DB
3. Excel data

Quiz on Dataset

1.What kind of classification is our case study 'Spam Detection'?

1. Multi label
2. Binary
3. Multi class

Quiz on Pre-processing

1.Which pre-processing technique is used to remove the most commonly used words?

1. Stopword removal
2. Lemmatization
3. Tokenization
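
To make the pre-processing terms above concrete, here is a minimal sketch of tokenization followed by stop word removal in plain Python. The stop word list and the helper names `tokenize` and `remove_stop_words` are assumptions made for illustration; libraries such as NLTK ship much longer lists.

```python
# Illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "it", "is", "a", "an", "and", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into words on whitespace."""
    return text.lower().split()


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop word list."""
    return [tok for tok in tokens if tok not in stop_words]


tokens = tokenize("The computer is in the lab and it works")
filtered = remove_stop_words(tokens)
print(filtered)
```

Only the content-bearing words survive the filter; very common words such as "the" and "it" are dropped.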

Quiz on Cross validation

1.The cross-validation technique is used to evaluate a classifier by dividing the data set into a training set to train the classifier and a testing set to test the same.

1. False
2. True
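
The train/test division described in the question above can be sketched as a simple k-fold split. This is a minimal illustration in plain Python, not the course's implementation; the helper name `k_fold_indices` is hypothetical, and no shuffling or stratification is applied.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds receive one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in the test set exactly once, and the classifier would be trained on the remaining samples in every round.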

Quiz on Performance Evaluation Measures

1.True Negative is when the predicted instance and the actual instance are positive.

1. True
2. False

2.True Positive is when the predicted instance and the actual instance are not negative.

1. False
2. True
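
The terms in the two questions above come from the confusion matrix. As a minimal sketch, assuming binary labels where `1` is the positive class and using made-up data, the four counts can be tallied like this (the helper name `confusion_counts` is hypothetical):

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct only if the actual label is positive.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: correct only if the actual label is negative.
            counts["FN" if a == positive else "TN"] += 1
    return counts


actual = [1, 0, 1, 1, 0, 0]      # made-up ground truth
predicted = [1, 0, 0, 1, 1, 0]   # made-up classifier output
counts = confusion_counts(actual, predicted)
print(counts)
```

A true positive needs both the prediction and the actual label to be positive; a true negative needs both to be negative.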

Quiz-Final Assessment

1.The following are all classification techniques, except ___________

1. SGDClassifier
2. SVM
3. StratifiedShuffleSplit
4. Random Forest

2.The classification where each data is mapped to more than one class is called ___________

1. Binary Classification
2. Multi Label Classification
3. Multi Class Classification

3.The following are pre-processing methods used for unstructured data classification, except _________

1. Stop word removal
2. Lemmatization
3. Confusion_matrix
4. Stemming

4.In a Document Term Matrix (DTM), each row represents ___________

1. word
2. TF value
3. document
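
To show the layout of a Document Term Matrix, the sketch below builds one from three made-up messages in plain Python. The helper name `build_dtm` is hypothetical; in practice a library vectorizer would usually do this step.

```python
def build_dtm(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocabulary)  # one row per document
        for tok in doc:
            row[column[tok]] += 1    # one column per vocabulary term
        matrix.append(row)
    return vocabulary, matrix


docs = ["free prize now", "call me now", "free free call"]  # made-up messages
vocabulary, dtm = build_dtm(docs)
print(vocabulary)
for row in dtm:
    print(row)
```

The matrix has as many rows as there are documents and as many columns as there are distinct terms.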

5.Imagine you have just finished training a decision tree for spam classification, and it is showing abnormally bad performance on both your training and test sets. Assume that your implementation has no bugs. What could be the reason for this problem?

1. Your decision trees are too shallow
2. You need to increase the learning rate (X: incorrect)
3. You are overfitting
4. All the options

6.Identify the stop word(s) from the following.

1. Both "the" and "it"
2. "fragment"
3. "the"
4. "computer"
5. "it"

b) Give the column names as 'label' and 'message'.

c) Try out the code snippets and answer the questions.

What does the command sentiment_analysis_data['label'].value_counts() return?

1. The number of rows in the dataset
2. The total count of elements in the 'label' column
3. The count of unique values in the 'label' column
4. The number of columns in the dataset

Answer: 3)The count of unique values in the 'label' column
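
In pandas, `Series.value_counts()` returns one count per distinct value, sorted from most to least frequent. The sketch below mimics that behaviour with `collections.Counter` from the standard library, so it is an analogue rather than the pandas call itself, and the labels are made up rather than taken from the course dataset.

```python
from collections import Counter

labels = [1, 0, 1, 1, 0, 1]  # hypothetical contents of a 'label' column

# Count how often each distinct label occurs.
counts = Counter(labels)

# most_common() lists the values from most to least frequent.
for value, count in counts.most_common():
    print(value, count)
```

Comparing the per-class counts is also a quick way to check whether a dataset looks imbalanced.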

b) Give the column names as 'label' and 'message'.

c) Try out the code snippets and answer the questions.

Which of the following commands is used to view the dataset SIZE, and what is the value returned?

1. sentiment_analysis_data.shape, (6918, 2)
2. sentiment_analysis_data.shape, (6918, 3)
3. sentiment_analysis_data.size, (6918, 3)
4. sentiment_analysis_data.size(), (6918, 2)

b) Give the column names as 'label' and 'message'.

c) Try out the code snippets and answer the questions.

What is the output of the following command: print(sentiment_analysis_data['label'].unique())

1. [true false]
2. [1 0]
3. None of the options
4. [yes no]

10.Choose the correct sequence for classifier building from the following.

1. Initialize -> Train -> Predict -> Evaluate
2. Train -> Test -> Initialize -> Predict
3. Initialize -> Evaluate -> Train -> Predict
4. None of the options

Answer: 1)Initialize -> Train -> Predict -> Evaluate
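
The four steps in this sequence can be sketched with a toy classifier. `MajorityClassifier` and `accuracy` are hypothetical helpers written for this illustration, and the messages are made up; the `fit`/`predict` naming only mirrors the convention common in machine learning libraries.

```python
from collections import Counter


class MajorityClassifier:
    """Toy baseline that always predicts the most frequent training label."""

    def __init__(self):
        self.majority_label = None

    def fit(self, X, y):
        # "Training" here only records the most common label.
        self.majority_label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_label for _ in X]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


X_train = ["win cash now", "see you at lunch", "meeting at 5", "lunch tomorrow?"]
y_train = ["spam", "ham", "ham", "ham"]
X_test = ["free cash", "see you soon"]
y_test = ["spam", "ham"]

clf = MajorityClassifier()        # 1. Initialize
clf.fit(X_train, y_train)         # 2. Train
y_pred = clf.predict(X_test)      # 3. Predict
score = accuracy(y_test, y_pred)  # 4. Evaluate
print(y_pred, score)
```

A real classifier would replace the toy class, but the order of the calls stays the same.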

b) Give the column names as 'label' and 'message'.

c) Try out the code snippets and answer the questions.

Is there a class imbalance problem in the given data set?

1. Yes
2. No

b) Give the column names as 'label' and 'message'.

c) Try out the code snippets and answer the questions.

What kind of classification is the given case study (Sentiment Analysis dataset)?

1. Binary classification
2. Multi label classification
3. Multi class classification

13.Choose the correct sequence from the following.

1. Pre-Processing -> Predict -> Train
2. Data Analysis -> Pre-Processing -> Model Building -> Predict
3. Data Analysis -> Pre-Processing -> Predict -> Train
4. Pre-Processing -> Model Building -> Predict

Answer: 2)Data Analysis -> Pre-Processing -> Model Building -> Predict

14.Inverse Document frequency is used in the term-document matrix.

1. False
2. True

15.Pruning is a technique associated with __________

1. SVM
2. Decision tree
3. Logistic regression
4. Linear regression

16.Which of the given hyperparameters, when increased, may cause the random forest to overfit the data?

1. Number of Trees
2. Depth of Tree
3. Learning Rate

17.The higher value of which of the following hyperparameters is better for the decision tree algorithm?

1. Cannot say
2. Samples for leaf
3. Depth of tree
4. Number of samples used for split

18.TF-IDF is a feature extraction technique.

1. False
2. True
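
As a rough illustration of how TF-IDF turns text into numeric features, the sketch below uses one common textbook variant: term frequency is the term count divided by the document length, and inverse document frequency is `log(N / df)`. Libraries often apply smoothing and normalization on top of this, so their numbers will differ. The helper name `tf_idf` and the messages are assumptions.

```python
import math


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = {}
    for doc in tokenized:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1

    weights = []
    for doc in tokenized:
        doc_weights = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)     # relative frequency in this document
            idf = math.log(n_docs / df[term])   # rarer terms get a larger idf
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights


docs = ["free prize now", "call me now", "free free call"]  # made-up messages
weights = tf_idf(docs)
for doc_weights in weights:
    print(doc_weights)
```

Terms that appear in every document get a weight of zero under this variant, while terms concentrated in a few documents are weighted higher.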

19.What is the purpose of lemmatization?

1. To convert words into a proper base form
2. To split into sentences
3. To remove redundant words
4. To convert a sentence into words

Answer: 1)To convert words into a proper base form

20.Supervised learning differs from unsupervised learning as supervised learning requires __________

1. Unlabeled data
2. Labeled data
3. None of the options
4. Raw data

If you want answers to any of the Fresco Play courses, feel free to ask in the comment section, and we will surely help.