
What exactly does fit() do here?

  • I want to know what the fit() function does in general, and especially in the pieces of code below.

    I'm taking the Machine Learning A-Z Course because I'm pretty new to Machine Learning (I just started). I know some basic conceptual terms, but not the technical part.

    CODE 1:

    import numpy as np  # needed for np.nan
    from sklearn.impute import SimpleImputer

    missingvalues = SimpleImputer(missing_values = np.nan, strategy = 'mean')

    missingvalues = missingvalues.fit(X[:, 1:3])

    X[:, 1:3] = missingvalues.transform(X[:, 1:3])

     

    Here is another example where I still have doubts:

    CODE 2:

    from sklearn.preprocessing import StandardScaler
    sc_X = StandardScaler()
    print(sc_X)
    X_train = sc_X.fit_transform(X_train)
    print(X_train)
    X_test = sc_X.transform(X_test)

     

    I think that if I knew the general use of this function and what exactly it does, I'd be good to go. But I'd certainly also like to know what it is doing in this code.

     
      September 13, 2021 1:58 PM IST
  • The scikit-learn tutorial is also a good place to check: https://scikit-learn.org/stable/tutorial/basic/tutorial.html
    In machine learning, the fit method always learns something from data.
    You normally have the following steps:
    1. Separate your data into two or three datasets.
    2. Pick one part of your data (normally X_train) to learn/train something with fit.
    3. Use what was learned to predict something on unseen data (normally X_test) with predict.
    In your first example, missingvalues.fit(X[:, 1:3]) trains SimpleImputer on your data X using only columns 1 and 2 (the slice 1:3 excludes index 3). With strategy = 'mean', it learns the mean of each of those columns. transform then uses what was learned to fill in the missing values, and you overwrite those columns with the result.
    In your second example, you train StandardScaler on X_train and use that training for both datasets, X_train and X_test. StandardScaler learns only from X_train, which means that if it learned that 10 has to be converted to 2, it will convert 10 to 2 in both sets.
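
    To make the first example concrete, here is a small sketch with made-up data (the array values are assumptions for illustration, not the course dataset). It shows that fit only stores the column means, and transform is what fills the gaps:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (made up for illustration): column 0 has no gaps,
# columns 1 and 2 each contain one missing value.
X = np.array([
    [1.0, 20.0, 100.0],
    [2.0, np.nan, 200.0],
    [3.0, 40.0, np.nan],
])

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# fit() only computes and stores the per-column means; X is not modified.
imputer.fit(X[:, 1:3])        # 1:3 selects columns 1 and 2 (index 3 is excluded)
print(imputer.statistics_)    # the learned means: 30 for column 1, 150 for column 2

# transform() fills each NaN with the mean stored for its column.
X[:, 1:3] = imputer.transform(X[:, 1:3])
print(X)
```

    The learned values live in the statistics_ attribute, so you can always inspect what fit picked up before you transform anything.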
      September 13, 2021 6:22 PM IST
  • Sklearn uses Classes. See the Python documentation for more info about Classes in Python. For more info about sklearn in particular, take a look at this sklearn documentation.

    Here's a short description of how you are using Classes in sklearn.

    First you instantiate your sklearn Classes with sc_X = StandardScaler() or missingvalues = SimpleImputer(...).

    The objects, sc_X and missingvalues, each have methods. You call a method by typing object_name.method_name(...). For example, you used the fit_transform() method of the sc_X instance when you typed sc_X.fit_transform(...). This method takes your data and returns a scaled version of it. It both fits (determines the scaling parameters, which for StandardScaler are the mean and standard deviation of each column) and transforms (applies the scaling to your data). The transform() method will transform new data, using the same scaling parameters it learned from your previous data.
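
    As a rough sketch of that behaviour, using made-up single-column data (the numbers are assumptions for illustration), you can check that fit_transform gives the same result as fit followed by transform, and that the test set is scaled with the parameters learned from the training set:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up single-feature data for illustration.
X_train = np.array([[0.0], [10.0], [20.0]])
X_test = np.array([[10.0], [30.0]])

sc_X = StandardScaler()

# fit_transform() = fit() (learn mean and std from X_train) + transform().
X_train_scaled = sc_X.fit_transform(X_train)
print(sc_X.mean_, sc_X.scale_)   # parameters learned from X_train only

# transform() reuses the stored parameters; nothing is re-learned from X_test.
X_test_scaled = sc_X.transform(X_test)
print(X_test_scaled)

# Doing the two steps separately on a fresh scaler gives the same result.
two_step = StandardScaler().fit(X_train).transform(X_train)
print(np.allclose(X_train_scaled, two_step))
```

    Here the value 10 in X_test maps to 0 because 10 is the mean of X_train, not because of anything in X_test itself.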

    In the first example, you have separated the fit and transform methods into two separate lines, but the idea is similar -- you first learn the imputation parameters with the fit method, and then you transform your data.

    By the way, I think missingvalues = missingvalues.fit(X[:, 1:3]) could be changed to missingvalues.fit(X[:, 1:3]), because fit stores what it learns on the object and returns that same object.
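
    A quick way to convince yourself of this, sketched with a tiny made-up array (the data and the name "returned" are just for illustration):

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, np.nan], [3.0, 4.0]])  # made-up data

missingvalues = SimpleImputer(strategy="mean")
returned = missingvalues.fit(X)

# fit() returns the very same object it was called on,
# so reassigning the result is harmless but unnecessary.
print(returned is missingvalues)

# Returning the object is also what makes chaining calls possible.
filled = SimpleImputer(strategy="mean").fit(X).transform(X)
print(filled)
```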
      September 13, 2021 11:49 PM IST
  • In the context of machine learning, fitting means training. ML libraries provide a fit function that trains a model on example data. For a model with weights, fit adjusts those weights according to the data values so that better accuracy can be achieved. After training, the model can be used for predictions by calling the .predict() method.
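
    For instance, here is a minimal sketch with scikit-learn's LinearRegression. The choice of model and the data are assumptions picked for illustration; the data follows y = 2x + 1 exactly, so you can see what fit learns:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data that follows y = 2x + 1 exactly.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X_train, y_train)             # learns the slope and the intercept
print(model.coef_, model.intercept_)    # close to 2 and 1

# After fitting, predict() applies the learned parameters to unseen data.
X_new = np.array([[10.0]])
y_pred = model.predict(X_new)
print(y_pred)                           # close to 21
```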

     

      September 16, 2021 1:41 PM IST