Hello to all ML enthusiasts who bumped into this post. ML, or Machine Learning, is solving serious problems for many engineers, mathematicians, researchers, farmers, and I could go on with the list.
Here, I have put together some important terms which I think we all must know if we are in the world of ML.
Labels
A label is the thing we want to predict. It could be anything: for example, the estimated price of a house after 5 years, or what is in a picture (is it a fruit, an animal, or a plant?). In simple linear regression, the label is denoted by the variable y.
## Features
A feature is an entity that helps us to predict the label. This is the input that we give to the ML model. To explain this better, let us take an example.
Suppose we need to predict the price of a house after 5 years. Which factors would you consider?
- Area of the house.
- Age of the house.
- The number of rooms in the house.
Many more factors could be taken into account, but let us consider only the ones mentioned above. Here, these three factors are the features, or input variables, for the ML model. Features can be denoted by the variables x1, x2, x3, and so on.
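To make the notation concrete, here is a minimal Python sketch of one house turned into a feature vector. The field names (`area_sqft`, `age_years`, `num_rooms`), the helper `to_feature_vector`, and the numbers are all made up for illustration; they are not from any real dataset.

```python
# Hypothetical feature names, kept in a fixed order: x1, x2, x3.
FEATURE_NAMES = ["area_sqft", "age_years", "num_rooms"]


def to_feature_vector(house):
    """Return the features of one house as a list [x1, x2, x3]."""
    return [float(house[name]) for name in FEATURE_NAMES]


# A made-up house, only for illustration.
house = {"area_sqft": 1200, "age_years": 10, "num_rooms": 3}

x = to_feature_vector(house)
x1, x2, x3 = x
print(x)  # [1200.0, 10.0, 3.0]
```

Keeping the feature order fixed matters, because the model only sees positions (x1, x2, x3), not names.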
Examples
An example is one particular instance of the features, such as the values for a single house. Examples can be broken into two categories.
- Labeled Examples.
- Unlabeled Examples.
Labeled Examples
Labeled examples include both the feature(s) and the label. What does this even mean? An ML model needs training, and for that training it needs data. Labeled examples are features tagged with their labels. This is the data that tells the model what the output should be for a certain set of features, and it is the data used to train the model.
labeled examples: {features, label}: (x, y)
For example
The above figure shows whether an item will be bought or not based on three parameters: Country, Age, and Salary. The outcome is also given. So these parameters are the features, and the outcome is the label tagged to them.
Unlabeled Examples
Unlabeled examples have only the features and do not have the label.
unlabeled examples: {features, ?}: (x, ?)
Once the ML model is trained with the labeled data, the unlabeled data is given as input to the ML model, which then predicts the labels for it.
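The difference between (x, y) and (x, ?) can be sketched in a few lines of Python. This is only an illustration: the `Example` type, the `is_labeled` helper, and every row below are hypothetical, loosely following the Country, Age, and Salary features mentioned earlier.

```python
from typing import NamedTuple, Optional


class Example(NamedTuple):
    # Features (x): hypothetical columns Country, Age, Salary.
    country: str
    age: int
    salary: float
    # Label (y): True/False when known, None when the example is unlabeled.
    purchased: Optional[bool] = None


def is_labeled(example: Example) -> bool:
    """An example is labeled when its label is present."""
    return example.purchased is not None


# Made-up rows, for illustration only.
examples = [
    Example("France", 44, 72000.0, purchased=False),  # (x, y)
    Example("Spain", 27, 48000.0, purchased=True),    # (x, y)
    Example("Germany", 30, 54000.0),                  # (x, ?)
]

labeled = [e for e in examples if is_labeled(e)]
unlabeled = [e for e in examples if not is_labeled(e)]
print(len(labeled), len(unlabeled))  # 2 1
```

The labeled rows are what a model would train on; the unlabeled row is the kind of input it would later be asked to predict for.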
Models
A model defines the relationship between the labels and features. It can also be defined as the mathematical representation of any real-world process. To make an ML model, we have to pass the training data to an algorithm so that it can learn from it. Models have two important aspects. These are as follows:
- Training the algorithm with labeled data.
- Applying the ML model on the unlabeled examples.
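These two aspects can be sketched with a very small model. The code below assumes simple linear regression on a single feature, fitted with ordinary least squares; that choice is just for illustration because it fits in a few lines. The function names `train` and `predict` and the area/price numbers are made up.

```python
def train(xs, ys):
    """Fit y = w * x + b by ordinary least squares and return (w, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def predict(model, x):
    """Apply the trained model (w, b) to one unlabeled example x."""
    w, b = model
    return w * x + b


# Step 1: train on labeled examples (made-up areas and prices).
areas = [50.0, 80.0, 100.0, 120.0]     # features (x)
prices = [150.0, 240.0, 300.0, 360.0]  # labels (y)
model = train(areas, prices)

# Step 2: apply the model to an unlabeled example.
print(round(predict(model, 90.0), 2))  # 270.0
```

Here the made-up prices lie exactly on a straight line, so the prediction is exact; with real data the line would only approximate the labels.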
Regression vs Classification
Regression models predict continuous values. For example: what is the estimated cost of a house in Toronto, or what is the probability that an item will be purchased by a customer?
On the other hand, a classification model gives discrete label values, such as whether an email is spam or not, or whether a person is a smoker or a non-smoker.
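One way to see the difference is to look at what each kind of model returns. In the sketch below, the first function returns a continuous value (a probability) and the second turns it into a discrete label. The feature `num_suspicious_words`, the constants `WEIGHT` and `BIAS`, and the 0.5 threshold are all assumptions picked for illustration; a real model would learn its weights from labeled examples.

```python
import math

# Hypothetical weights; a real model would learn these during training.
WEIGHT = 0.8
BIAS = -2.0


def predict_spam_probability(num_suspicious_words: int) -> float:
    """Regression-style output: a continuous value between 0 and 1."""
    z = WEIGHT * num_suspicious_words + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def classify_email(num_suspicious_words: int, threshold: float = 0.5) -> str:
    """Classification-style output: one of two discrete labels."""
    p = predict_spam_probability(num_suspicious_words)
    return "spam" if p >= threshold else "not spam"


print(classify_email(1))  # not spam
print(classify_email(5))  # spam
```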
Don't worry
“A baby learns to crawl, walk and then run. We are in the crawling stage when it comes to applying machine learning.” ~Dave Waters
So don't worry if this is new to you. All this might seem overwhelming, and some of the terminology could be a bit confusing, but that is okay. We can all get the hang of it as we study and practice more. Even I could not get all of it just by studying once. Till then, take care and be happy!