Question Bank - Fundamentals of Machine Learning and NLP
Unit 1: Introduction to Machine Learning
- A company is analyzing customer data to generate insights, build intelligent systems, and automate predictions. Explain how Data Science (data analysis), AI (intelligence), ML (learning from data), and DL (neural networks) are related in this scenario.
- Businesses rely on systems such as Netflix recommendations, Google Maps navigation, and fraud detection. Justify the need for ML in terms of automation, prediction, big-data handling, and real-time decisions.
- A project involves collecting raw data, cleaning it, building a model, training, testing, and deploying it. Describe the Machine Learning life cycle (data collection → preprocessing → training → testing → deployment).
- A company is building a product recommendation system like Amazon's. Identify the required skills, such as Python, statistics, ML algorithms, data handling, and domain knowledge.
- Consider three different applications:
  - Predicting marks using past data
  - Grouping customers without labels
  - Training a robot using rewards
  Differentiate them as Supervised (labeled), Unsupervised (no labels), and Reinforcement (reward-based) learning.
- A student marks prediction model performs very well on training data but poorly on test data. Explain using the terms features, labels, training, testing, overfitting, underfitting, bias, and variance, and suggest solutions.
- While building a model, matrices are used for data, probability for uncertainty, statistics for analysis, and calculus for optimization. Explain the role of Linear Algebra, Probability, Statistics, and Calculus.
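
For the marks-prediction question in this unit, the gap between training and test performance can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the hours-to-marks data is invented, and `predict` and `mse` are hypothetical helper names. The "model" simply memorizes the training set, which is one simple way to see what overfitting looks like.

```python
# Toy illustration of overfitting: a model that memorizes its training data.
# All numbers are invented for illustration (hours studied -> marks).
train = [(1.0, 35.0), (2.0, 48.0), (3.0, 50.0), (4.0, 70.0), (5.0, 68.0)]
test = [(1.4, 42.0), (3.6, 62.0), (4.4, 72.0)]


def predict(hours):
    """Return the marks of the closest training example (1-nearest neighbour)."""
    _, nearest_marks = min(train, key=lambda pair: abs(pair[0] - hours))
    return nearest_marks


def mse(data):
    """Mean squared error of predict() over (hours, marks) pairs."""
    return sum((marks - predict(hours)) ** 2 for hours, marks in data) / len(data)


train_mse = mse(train)  # every training point is its own nearest neighbour
test_mse = mse(test)    # unseen points are not reproduced exactly

print(f"training MSE: {train_mse:.2f}")
print(f"test MSE:     {test_mse:.2f}")
```

The training error is zero because each training point is looked up exactly, while the test error is not. That mismatch (low training error, higher test error) is the symptom the question asks you to explain.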
Unit 2: Supervised Machine Learning
- A real estate company wants to predict house prices based on size and location. Identify and justify the Regression technique (continuous output).
- A dataset shows the relationship between house size and price. Explain Simple Linear Regression (y = mx + c, slope, intercept).
- A dataset includes size, location, and number of rooms, all affecting price. Explain Multiple Linear Regression (multiple inputs) and its use.
- A dataset shows a curved relationship between variables. Explain when to use Polynomial Regression (non-linear, curve fitting).
- A model is evaluated using values like error and accuracy. Explain R², MSE, and RMSE, and interpret their meaning.
- A bank wants to classify customers into defaulters and non-defaulters. Identify and explain the Classification approach (binary output).
- A model predicts the probability of customer default as a value between 0 and 1. Explain Logistic Regression (sigmoid function, probability interpretation).
- A model classifies data based on the nearest points using distance. Explain KNN (K value, nearest neighbors, Euclidean distance).
- A model separates two classes using a boundary with the maximum gap. Explain SVM (hyperplane, margin, support vectors).
- A classification model gives results in terms of TP, TN, FP, and FN. Evaluate it using Accuracy, Precision, Recall, F1-score, and the Confusion Matrix.
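
The evaluation metrics named in this unit can be computed by hand, which may help when interpreting them. The sketch below uses only the standard definitions. The function names `regression_metrics` and `classification_metrics` and all sample numbers are assumptions made for illustration, not part of the question bank.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R² for paired lists of true and predicted values.

    Assumes y_true is not constant (otherwise R² is undefined).
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1 - ss_res / ss_tot}


def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented example values, for illustration only.
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.0])
clf = classification_metrics(tp=40, tn=45, fp=5, fn=10)

print(reg)
print(clf)
```

MSE and RMSE measure error size (RMSE in the same units as the target), R² measures the share of variance explained, and precision and recall separate the two kinds of classification mistakes (FP and FN) that accuracy alone hides.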
Unit 3: Unsupervised Machine Learning
- A company groups customers based on similar buying behavior without labeled data. Explain clustering (grouping, similarity, segmentation).
- A retail company wants to divide customers for targeted marketing. Explain the need for clustering in terms of segmentation and personalization, and justify choosing K-Means (fast, simple).
- While applying K-Means, different values of K are tested and plotted in a graph. Explain the Elbow Method (WCSS, optimal K, bend point).
- A dataset requires hierarchical grouping and visualization using a tree structure. Compare Hierarchical Clustering (dendrogram) and K-Means (centroid-based).
- In clustering, terms like center, distance, and similarity are used. Define cluster, centroid, distance, and similarity.
- A dataset contains noise and outliers, and clustering should ignore them. Explain DBSCAN (density-based, eps, min_samples, noise handling).
- A supermarket analyzes which products are bought together. Explain Association Rule Learning (market basket, relationships).
- A model finds frequent itemsets and rules like bread → butter. Explain Apriori (support, confidence, lift).
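
The three rule metrics in the Apriori question can be worked through on a small basket list. This is a minimal sketch, not the full Apriori algorithm: it only computes support, confidence, and lift for one rule. The transactions and the helper names `support` and `rule_metrics` are invented for illustration.

```python
# Invented market-basket data, for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent))
    confidence = both / support(antecedent)   # P(consequent | antecedent)
    lift = confidence / support(consequent)   # confidence relative to baseline
    return both, confidence, lift


rule_support, rule_confidence, rule_lift = rule_metrics({"bread"}, {"butter"})
print(f"support={rule_support:.2f} "
      f"confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if bought independently; a lift near 1 suggests no real association.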
Unit 4: Natural Language Processing
- A system processes human language such as text and speech. Explain NLP (text processing, language understanding, AI communication).
- Applications like chatbots, translation, and sentiment detection are used in real life. Explain NLP applications (chatbot, sentiment, translation, voice).
- A text dataset is cleaned by splitting it into words, removing stopwords, and reducing words to their root form. Explain tokenization, stopword removal, stemming, and normalization.
- Words like “running” are converted to “run” or to a meaningful base form. Differentiate stemming vs lemmatization.
- Text is converted into numbers using word frequency and importance. Explain BoW (frequency) and TF-IDF (importance).
- Customer reviews are classified as positive or negative. Explain sentiment analysis (classification techniques).
- A sentence is broken into sequences like single words, pairs, or triples. Explain n-grams (unigram, bigram, trigram).
- A model processes sequence data but fails to remember long-term patterns, while another handles long-term memory. Compare RNN vs LSTM.
- A modern model understands context using an attention mechanism and pre-trained models. Explain Transformers and BERT (self-attention, context).
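
The tokenization, n-gram, and Bag-of-Words questions in this unit can be tried out with the standard library alone. The sketch below assumes a very simple whitespace tokenizer; the sample sentence and the helper names `tokenize` and `ngrams` are made up for illustration, and real pipelines usually handle punctuation and stopwords as well.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "The cat sat on the mat"
tokens = tokenize(sentence)

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# Bag of Words: word order is discarded, only counts are kept.
bag_of_words = Counter(tokens)

print(bigrams)
print(trigrams)
print(bag_of_words)
```

A six-word sentence yields five bigrams and four trigrams, and the Bag-of-Words count records that "the" occurs twice while losing the word order that the n-grams preserve.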