I have lots of Excel files (xlsx format) and want to read and handle them.
For example, the file names are ex201901, ex201902, ..., ex201912.
Each name follows the ex + YYYYMM format.
Anyway, importing these files into pandas in the usual way is easy.
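One common pattern is to glob for the file-name format and concatenate the results into a single DataFrame. This is a minimal sketch; the `data/` directory and the assumption that every workbook has the same column layout are mine, not the asker's.

```python
import glob

import pandas as pd


def load_monthly_files(pattern="data/ex??????.xlsx"):
    """Read every workbook matching the ex + YYYYMM pattern into one DataFrame.

    The directory and glob pattern are assumptions; adjust them to wherever
    the files actually live.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        df = pd.read_excel(path)
        df["source_file"] = path  # remember which month each row came from
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```

Sorting the paths keeps the months in chronological order, since the YYYYMM suffix sorts lexicographically.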
I am new to the AI/ML field and I need to solve the following problem using Python.
Basically, I have certain parameters that come in order, and I would like to use supervised techniques to discover the error.
I would like to figure out the error in a production process that has a sequential paradigm, with records of the form:
Product ID, Product type, Category type, Product Line, Result (Good, Bad).
Let's say the system takes the following training dataset
Product ID, Product type, Category type, Product Line, Result (Good, Bad).
ID1, PT, CT, , Good
ID2, PT, CT, , Good
ID3, PT, CT, , Good
And the given test dataset is
ID4, PT, CT, , Bad
What AI/ML techniques can detect the reason for getting "Bad", which is the L3 product line? Also, can I add this new data to the training set to predict the reason for the error later? How can it be implemented in Python?
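One common supervised approach for categorical records like these is a decision tree over one-hot-encoded features: the learned rules show which column separates Good from Bad, and the tree can be refit after appending newly labeled rows. The column names and values below are illustrative stand-ins, not the asker's real dataset.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative stand-in data: values here are assumptions for the sketch.
train = pd.DataFrame({
    "product_type": ["PT", "PT", "PT", "PT"],
    "category_type": ["CT", "CT", "CT", "CT"],
    "product_line": ["L1", "L2", "L1", "L3"],
    "result": ["Good", "Good", "Good", "Bad"],
})

# One-hot encode the categorical columns so the tree can split on them.
X = pd.get_dummies(train[["product_type", "category_type", "product_line"]])
y = train["result"]

clf = DecisionTreeClassifier().fit(X, y)

# The extracted rules show which feature (here, the product line)
# drives a "Bad" prediction.
print(export_text(clf, feature_names=list(X.columns)))
```

To incorporate a newly labeled test record, append it to `train` and refit; the tree will then reflect the updated evidence.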
I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this function:

import numpy as np

def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm

Is there something like that in sklearn or numpy? This function handles the situation where v is the zero vector.
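For reference, sklearn does ship an equivalent: `sklearn.preprocessing.normalize`. It operates on 2-D arrays (one row per sample), so a 1-D vector has to be reshaped first; it also leaves zero-norm rows untouched rather than dividing by zero.

```python
import numpy as np
from sklearn.preprocessing import normalize

v = np.array([3.0, 4.0])

# normalize expects a 2-D array of row vectors, so reshape and take row 0.
unit = normalize(v.reshape(1, -1))[0]
print(unit)  # [0.6 0.8]
```

For a plain-NumPy one-liner on a nonzero vector, `v / np.linalg.norm(v)` is the usual idiom.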
I'm trying to use scikit-learn's LabelEncoder to encode a pandas DataFrame of string labels. As the DataFrame has many (50+) columns, I want to avoid creating a LabelEncoder object for each column; I'd rather have one big LabelEncoder object that works across all my columns of data. Throwing the entire DataFrame into LabelEncoder produces the error below. Please bear in mind that I'm using dummy data here; in actuality I'm dealing with about 50 columns of string-labeled data, so I need a solution that doesn't reference any columns by name.
import pandas
from sklearn import preprocessing
Traceback (most recent call last):
  File "", line 1, in
  File "/Users/bbalin/anaconda/lib/python2.7/site-packages/sklearn/preprocessing/label.py", line 103, in fit
    y = column_or_1d(y, warn=True)
  File "/Users/bbalin/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 306, in ...
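The error arises because `LabelEncoder.fit` expects a 1-D array, not a whole DataFrame. A common workaround that never names a column is `DataFrame.apply`, which fits a fresh encoder per column. The dummy columns below are stand-ins for the real 50-column frame.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Dummy data standing in for the real 50-column DataFrame.
df = pd.DataFrame({
    "pets": ["cat", "dog", "cat", "fish"],
    "owner": ["Champ", "Ron", "Brick", "Champ"],
})

# Fit one fresh LabelEncoder per column, without referencing any column by name.
encoded = df.apply(lambda col: LabelEncoder().fit_transform(col))
```

Note that this discards the fitted encoders, so the integer codes cannot be inverted later; if round-tripping matters, keep a dict of per-column encoders instead.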
I ran a logistic regression model and made predictions of the logit values. I used this to get the points on the ROC curve:

from sklearn import metrics
fpr, tpr, thresholds = metrics.roc_curve(Y_test, p)
I know metrics.roc_auc_score gives the area under the ROC curve. Can anyone tell me what command will find the optimal cut-off point (threshold value)?
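There is no single built-in "optimal threshold" command, because the optimum depends on the cost model. One common choice is Youden's J statistic, which picks the threshold maximizing `tpr - fpr`. The toy labels and scores below stand in for `Y_test` and `p`.

```python
import numpy as np
from sklearn import metrics

# Toy labels and scores standing in for Y_test and p.
y_test = np.array([0, 0, 1, 1])
p = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = metrics.roc_curve(y_test, p)

# Youden's J statistic: the threshold where tpr - fpr is largest.
best_threshold = thresholds[np.argmax(tpr - fpr)]
```

If false positives and false negatives carry different costs, maximize a cost-weighted criterion over `thresholds` instead of `tpr - fpr`.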
Can I extract the underlying decision rules (or 'decision paths') from a trained decision tree as a textual list?
Something like:
if A > 0.4 then if B < 0.2 then if C > 0.8 then class = 'X'
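For scikit-learn trees, `sklearn.tree.export_text` (available since scikit-learn 0.21) renders exactly this kind of nested if/else rule list. A minimal sketch on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# export_text prints the fitted tree as an indented list of threshold rules.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

For per-sample paths rather than the whole tree, `clf.decision_path(X)` returns the node indices each sample traverses.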
I am trying to apply deep learning to a binary classification problem with high class imbalance between the target classes (500k vs 31k examples). I want to write a custom loss function which should be like: minimize(100 - ((predicted_smallerclass / total_smallerclass) * 100))
I'd appreciate any pointers on how I can build this logic.
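The stated objective (the percentage of minority-class examples caught) is not differentiable, so a common stand-in is class-weighted cross-entropy, where minority examples are up-weighted by roughly the inverse class frequency. This is a framework-agnostic NumPy sketch of that idea; in TensorFlow or Theano the same formula would be written with the framework's tensor ops.

```python
import numpy as np

def weighted_binary_cross_entropy(y_true, y_pred, pos_weight):
    """Cross-entropy where positive (minority) examples are up-weighted.

    pos_weight ~ n_negative / n_positive is a common starting point
    (roughly 500_000 / 31_000, about 16, for the counts in the question).
    """
    eps = 1e-7  # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(pos_weight * y_true * np.log(y_pred)
               + (1 - y_true) * np.log(1 - y_pred))
    return losses.mean()
```

With `pos_weight > 1`, misclassifying a minority example costs more than misclassifying a majority one, which pushes the optimizer toward the behavior the percentage formula was after.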
When I train my neural network with Theano or TensorFlow, they report a variable called "loss" per epoch.
How should I interpret this variable? Is higher loss better or worse, and what does it mean for the final performance (accuracy) of my neural network?
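In short, loss measures how far the network's predictions are from the targets, so lower is better. A tiny worked example with binary cross-entropy (one common loss) makes the direction concrete:

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    """Binary cross-entropy, averaged over examples."""
    eps = 1e-7  # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)).mean()

y = np.array([1.0, 0.0, 1.0])

good_preds = np.array([0.9, 0.1, 0.8])  # close to the labels
bad_preds = np.array([0.4, 0.6, 0.3])   # far from the labels

# The better predictions yield the smaller loss.
print(cross_entropy(y, good_preds) < cross_entropy(y, bad_preds))  # True
```

Loss and accuracy usually move together during training, but not in lockstep: loss is computed on continuous probabilities while accuracy only counts thresholded decisions.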
I am looking at working on an NLP project, in any programming language (though Python will be my preference). I want to take two documents and determine how similar they are.
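A standard baseline for document similarity is TF-IDF vectors compared with cosine similarity; scikit-learn provides both pieces. The example documents below are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "The cat sat on the mat.",
    "A cat was sitting on a mat.",
    "Quarterly revenue grew by ten percent.",
]

# Turn each document into a TF-IDF vector, then compare all pairs.
tfidf = TfidfVectorizer().fit_transform(docs)
sims = cosine_similarity(tfidf)

# sims[i, j] is the similarity between docs[i] and docs[j], in [0, 1].
```

This captures word overlap, not meaning; documents that paraphrase each other with different vocabulary need embedding-based approaches instead.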