
What is the use of Sklearn preprocessing?

  • What is the use of Sklearn preprocessing
      August 16, 2021 3:32 PM IST
  • Firstly, from sklearn.preprocessing, StandardScaler is imported for standardizing the training dataset. Standardizing means computing the z-score, z = (x − µ) / σ, for each training data point x, where µ and σ are the mean and standard deviation of the feature. When you call fit(), the scaler learns µ and σ from the training dataset. Thereafter, you can call transform() to apply that standardization to data. There is also a convenience method, fit_transform(), which does the work of both fit() and transform() in one step (see the sketch below this reply).
      August 16, 2021 9:45 PM IST
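A minimal sketch of the fit()/transform()/fit_transform() pattern described in the reply above; the toy array and printed values are illustrative assumptions, not part of the original answer:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training data: 4 samples, 2 features (values are arbitrary)
X_train = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0],
                    [4.0, 40.0]])

scaler = StandardScaler()
scaler.fit(X_train)                   # learns the per-feature mean and std
X_scaled = scaler.transform(X_train)  # applies z = (x - mean) / std

# fit_transform() combines both steps in one call
X_scaled_2 = StandardScaler().fit_transform(X_train)

print(scaler.mean_)            # per-feature means learned by fit()
print(X_scaled.mean(axis=0))   # ~0 for each feature after scaling
print(X_scaled.std(axis=0))    # ~1 for each feature after scaling
```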
  • StandardScaler removes the mean and scales each feature/variable to unit variance. This operation is performed independently for each feature. StandardScaler can be influenced by outliers (if they exist in the dataset), since it estimates the empirical mean and standard deviation of each feature (see the sketch below this reply).
      August 17, 2021 1:03 PM IST
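A small sketch of the two points made above: scaling is learned per feature, and a single outlier shifts the learned mean and standard deviation. The data here are made-up values, chosen only to make the effect visible:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features; the second one contains an outlier (1000.0)
X = np.array([[1.0,    2.0],
              [2.0,    3.0],
              [3.0, 1000.0]])

scaler = StandardScaler().fit(X)
print(scaler.mean_)    # per-feature means: [2.0, 335.0] -- the outlier drags the second mean up
print(scaler.scale_)   # per-feature std devs; the outlier inflates the second one

# After scaling, the non-outlier values of the second feature are squeezed close together
print(scaler.transform(X))
```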
  • The Normalizer transformer rescales samples individually to unit norm.

    Each sample (i.e. each row of the data matrix) with at least one non-zero component is rescaled independently of the other samples so that its norm (l1, l2 or inf) equals one.

    This transformer works with both dense numpy arrays and scipy.sparse matrices (use the CSR format if you want to avoid the cost of a copy / conversion).

    Scaling inputs to unit norms is a common operation for text classification or clustering. For instance, the dot product of two l2-normalized TF-IDF vectors is the cosine similarity of the vectors, which is the base similarity metric for the Vector Space Model commonly used by the Information Retrieval community (see the sketch below this reply).

      December 14, 2021 11:54 AM IST
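A short sketch of the row-wise l2 normalization and the TF-IDF / cosine-similarity point made above. The two example sentences are arbitrary, and norm=None is passed to TfidfVectorizer only so that Normalizer visibly does the rescaling itself:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import Normalizer

docs = ["the cat sat on the mat", "the dog sat on the log"]  # toy corpus

# Raw (un-normalized) TF-IDF scores, returned as a sparse CSR matrix
tfidf = TfidfVectorizer(norm=None).fit_transform(docs)

# Rescale each row to unit l2 norm; works directly on the sparse matrix
normalized = Normalizer(norm="l2").fit_transform(tfidf)

# The dot product of two l2-normalized rows equals their cosine similarity
cosine = normalized[0].multiply(normalized[1]).sum()
print(cosine)
```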
  • The sklearn.preprocessing package provides several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. In general, learning algorithms benefit from standardization of the data set (see the pipeline sketch below this reply).
     
      December 15, 2021 12:39 PM IST
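As a rough illustration of that point, a preprocessing transformer is typically placed in front of the downstream estimator inside a Pipeline, so the same scaling is applied at fit and predict time. The dataset and classifier below are just example choices, not prescribed by the answer above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardization happens inside the pipeline, before the SVM sees the data
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```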