I need to use TensorFlow for a project to classify items into a certain class (either 1, 2, or 3) based on their attributes.
The only problem is that almost every TF tutorial or example I find online is about image recognition or text classification. I can't find anything about classification based on numbers. I guess what I'm asking for is where to get started. Please let me know if you know of a relevant example, or if I'm just thinking about this completely wrong.
We are given 13 attributes for each item, and need to use a TF neural network to classify each item correctly (or report the margin of error). But nothing online shows me even how to start with this kind of dataset.
Example of dataset: (first value is class, other values are attributes)
2, 11.84, 2.89, 2.23, 18, 112, 1.72, 1.32, 0.43, 0.95, 2.65, 0.96, 2.52, 500
3, 13.69, 3.26, 2.54, 20, 107, 1.83, 0.56, 0.5, 0.8, 5.88, 0.96, 1.82, 680
3, 13.84, 4.12, 2.38, 19.5, 89, 1.8, 0.83, 0.48, 1.56, 9.01, 0.57, 1.64, 480
2, 11.56, 2.05, 3.23, 28.5, 119, ...
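For what it's worth, classifying numeric attributes is the same softmax setup the image tutorials use, just with the 13 attributes as the input vector. Here is a minimal sketch in plain NumPy (synthetic data, made-up shapes) of what a `Dense(3, activation="softmax")` layer learns under the hood:

```python
import numpy as np

# Tiny softmax classifier on numeric attributes, trained by batch gradient descent.
# The data here is synthetic: labels come from a random linear rule, so they are learnable.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 13))            # 150 items, 13 numeric attributes
true_W = rng.normal(size=(13, 3))
y = (X @ true_W).argmax(axis=1)           # synthetic class labels 0/1/2

W = np.zeros((13, 3))
for _ in range(500):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)     # softmax probabilities, one row per item
    W -= 0.1 * X.T @ (p - np.eye(3)[y]) / len(X)   # cross-entropy gradient step

acc = ((X @ W).argmax(axis=1) == y).mean()
print(acc)    # training accuracy on the toy data
```

In TensorFlow the same idea is a one-layer Keras model with 13 inputs and 3 softmax outputs, trained with a categorical cross-entropy loss.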
Out of curiosity, I've been reading up a bit on the field of Machine Learning, and I'm surprised at the amount of computation and mathematics involved. One book I'm reading through uses advanced concepts such as Ring Theory and PDEs (note: the only thing I know about PDEs is that they use that funny looking character). This strikes me as odd considering that mathematics itself is a hard thing to "learn."
Are there any branches of Machine Learning that use different approaches?
I would think that approaches relying more on logic, memory, construction of unfounded assumptions, and over-generalizations would be a better way to go, since that seems more like the way animals think. Animals don't (explicitly) calculate probabilities and statistics, at least as far as I know.
I'm following a course on edX on Programming with Python in Data Science. When using a given function to plot the results of my linear regression model, the graph seems very off, with all the scatter points clustered at the bottom and the regression line way up top.
I'm not sure whether the provided drawLine function is incorrect or something else is wrong in my modeling process.
Here is the defined function:
def drawLine(model, X_test, y_test, title, R2):
    # (title and R2 are unused in this excerpt)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(X_test, y_test, c='g', marker='o')
    ax.plot(X_test, model.predict(X_test), color='orange', linewidth=1, alpha=0.7)
    plt.show()
Here is the code I wrote:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import linear_model
from sklearn.model_selection import ...
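For comparison, here is a minimal end-to-end version of that workflow on synthetic data (NumPy's `polyfit` standing in for the sklearn model). If the scatter and the line don't overlap like they do here, common causes are fitting on scaled X while plotting raw X, or swapped feature/target columns:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")            # render off-screen so this sketch runs headless
import matplotlib.pyplot as plt

# Made-up data: y = 3x + 2 plus noise, then a straight-line fit plotted over the scatter.
x = np.linspace(0, 10, 50)
y = 3 * x + 2 + np.random.default_rng(0).normal(0, 1, 50)

slope, intercept = np.polyfit(x, y, 1)   # stand-in for model.fit / model.predict
fig, ax = plt.subplots()
ax.scatter(x, y, c='g', marker='o')
ax.plot(x, slope * x + intercept, color='orange')
fig.savefig("regression.png")
print(round(slope, 1))   # slope recovered near the true value of 3
```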
When we have to predict the value of a categorical (or discrete) outcome, we use logistic regression. I believe we use linear regression to also predict the value of an outcome given the input values.
Then, what is the difference between the two methodologies?
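One way to see the difference: both models share the same linear core w·x + b; what differs is how the output is interpreted. A tiny sketch with made-up weights:

```python
import math

# Same linear core, two interpretations (weights are made up for illustration).
w, b = 1.5, -4.0
x = 3.0

linear_prediction = w * x + b                      # any real number (regression)
probability = 1 / (1 + math.exp(-(w * x + b)))     # sigmoid squashes to (0, 1)
predicted_class = int(probability >= 0.5)          # threshold gives a class label

print(linear_prediction, round(probability, 3), predicted_class)   # 0.5 0.622 1
```

So linear regression predicts a continuous value directly, while logistic regression passes the same linear score through a sigmoid to get a class probability, and is fitted with a different loss (log-loss instead of squared error).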
I am facing a problem while downloading the 'caret' package in RStudio. The code below was taken from the caret documentation.
install.packages("caret", dependencies = c("Depends", "Suggests"))
It works fine while installing, but it gives errors and warnings while unpacking a few packages, as shown below:
ERROR: dependencies ‘eiPack’, ‘ei’, ‘MCMCpack’, ‘Zelig’ are not available for package ‘ZeligEI’
* removing ‘/home/shazil/R/x86_64-pc-linux-gnu-library/3.4/ZeligEI’
Warning in install.packages :
installation of package ‘ZeligEI’ had non-zero exit status
At the end when the whole installation process is finished it says:
The downloaded source packages are in
‘/tmp/RtmpeiP5GO/downloaded_packages’
After that, when I use the library() command, the following error appears:
> library(caret)
Error in library(caret) : there is no package called ‘caret’
I am using Ubuntu 16.04 on a Dell machine with a 7th-gen Core i5, 6 GB RAM, and AMD Radeon graphics.
Would really appreciate...
I'm a bit confused by the cross entropy loss in PyTorch.
Considering this example:
import torch
import torch.nn as nn
from torch.autograd import Variable
# (The output/target tensors were cut off in the post; reconstructed from the quoted 0.7437 loss.)
output = Variable(torch.FloatTensor([0, 0, 0, 1])).view(1, -1)
target = Variable(torch.LongTensor([3]))
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)
I would expect the loss to be 0. But I get:
Variable containing:
0.7437
As far as I know, cross entropy can be calculated as H(p, q) = -Σ p(x) log q(x).
But shouldn't the result then be -1 · log(1) = 0?
I tried different inputs like one-hot encodings, but that doesn't change anything, so it seems the input shape of the loss function is okay.
I would be really grateful if someone could help me out and tell me where my mistake is.
Thanks in advance!
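The catch is that nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax internally, so a "one-hot" input like [0, 0, 0, 1] is treated as logits, not as probabilities. Reproducing the 0.7437 with plain math:

```python
import math

# nn.CrossEntropyLoss = LogSoftmax + NLLLoss, so [0, 0, 0, 1] is first softmaxed.
logits = [0.0, 0.0, 0.0, 1.0]
target = 3

total = sum(math.exp(z) for z in logits)              # 3 + e
loss = -math.log(math.exp(logits[target]) / total)    # -log(softmax(logits)[target])
print(round(loss, 4))   # 0.7437
```

To get a loss of exactly 0 the logit for the target class would have to dominate the others by a large margin; probabilities that are already softmaxed should not be fed to this loss.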
It seems like R is really designed to handle datasets that it can pull entirely into memory. What R packages are recommended for signal processing and machine learning on very large datasets that cannot be pulled into memory?
If R is simply the wrong way to do this, I am open to other robust, free suggestions (e.g. SciPy, if there is some nice way to handle very large datasets).
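On the Python side, one simple pattern for arrays too large for RAM is a memory-mapped file processed in chunks, so only the pages being touched are loaded. A sketch with NumPy (file name and chunk size made up):

```python
import numpy as np

# Write a stand-in for a huge binary file, then stream over it without loading it whole.
data = np.arange(1_000_000, dtype=np.float64)
data.tofile("big.dat")

mm = np.memmap("big.dat", dtype=np.float64, mode="r")   # lazily paged, not read eagerly
total = 0.0
for start in range(0, len(mm), 100_000):                # process 100k elements at a time
    total += mm[start:start + 100_000].sum()
print(total == data.sum())   # True
```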
I'm new to TensorFlow and Data Science. I made a simple model that should figure out the relationship between input and output numbers, in this case x and x squared. The code in Python:
import numpy as np
import tensorflow as tf
# TensorFlow: only log error messages.
tf.logging.set_verbosity(tf.logging.ERROR)
# features/labels (the x and x**2 arrays) are defined elsewhere in the post.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])  # one input feature
])
model.compile(loss="mean_squared_error", optimizer=tf.keras.optimizers.Adam(0.0001))
model.fit(features, labels, epochs=50000, verbose=False)
print(model.predict(...))  # prediction input truncated in the original
I tried a different number of units, added more layers, and even used the relu activation function, but the results were always wrong. It works with other relationships, like x and 2x. What is the problem here?
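A single Dense(1) unit is just an affine function a·x + b, so the best it can ever do on y = x² is the best straight-line fit, which has a large irreducible error, while y = 2x is itself affine and fits exactly. A small least-squares sketch with made-up points makes the gap concrete:

```python
import numpy as np

x = np.arange(-5, 6, dtype=float)
y = x ** 2

# Best straight-line fit (what a single linear unit can represent): large residual.
A = np.column_stack([x, np.ones_like(x)])
_, res_lin, *_ = np.linalg.lstsq(A, y, rcond=None)

# Add a squared feature and the relationship becomes linear in the features: exact fit.
A2 = np.column_stack([x ** 2, x, np.ones_like(x)])
_, res_quad, *_ = np.linalg.lstsq(A2, y, rcond=None)

print(res_lin[0], res_quad[0])   # large vs. essentially zero
```

So to learn x² the network needs a nonlinearity with enough capacity (e.g. one or two hidden layers of tens of relu units, a larger learning rate than 0.0001, and inputs restricted to a bounded range), or an explicit x² feature.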
I'm interested in getting a connection from Python to the machine learning part of OpenCV 2.2. OpenCV 2.2 already includes Python bindings, but only to the computer vision (cv) part of it and not to the machine learning (ml) part.
Where could I get some third-party bindings to also have access to the machine learning part?
I'm looking for information on how a Python machine learning project should be organized. For usual Python projects there is Cookiecutter, and for R there is ProjectTemplate.
This is my current folder structure, but I'm mixing Jupyter Notebooks with actual Python code, and it does not seem very clear.
.
├── cache
├── data
├── my_module
├── logs
├── notebooks
├── scripts
├── snippets
└── tools
I work in the scripts folder and am currently adding all the functions in files under my_module, but that leads to errors loading data (relative/absolute paths) and other problems.
I could not find proper best practices or good examples on this topic besides this Kaggle competition solution and some notebooks that have all the functions condensed at the start of the notebook.
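One small fix for the relative/absolute path errors, whatever the final layout, is to anchor data paths to a known file instead of the current working directory, so the same code resolves the same files whether it runs from notebooks/ or scripts/. A sketch assuming the snippet lives one level below the project root (e.g. in my_module/):

```python
from pathlib import Path

# Resolve data paths from this file's location, not from os.getcwd(), so imports
# from notebooks/ and scripts/ agree on where data/ lives. (Layout assumed, not fixed.)
PROJECT_ROOT = Path(__file__).resolve().parent.parent
DATA_DIR = PROJECT_ROOT / "data"
print(DATA_DIR.name)   # "data"
```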
I recently started studying deep learning and other ML techniques, and I started searching for frameworks that simplify the process of building a net and training it; that's when I found TensorFlow. Having little experience in the field, it seems to me that speed is a big factor for making a big ML system, even more so when working with deep learning. So why was Python chosen by Google to build TensorFlow? Wouldn't it be better to use a language that can be compiled rather than interpreted?
What are the advantages of using Python over a language like C++ for machine learning?
Coming from a programming background where you write code, test, deploy, run... I'm trying to wrap my head around the concept of "training a model" or a "trained model" in data science, and deploying that trained model.
I'm not really concerned about the deployment environment, automation, etc. I'm trying to understand the deployment unit: a trained model. What does a trained model look like on a file system, and what does it contain?
I understand the concept of training a model, and splitting a set of data into a training set and a testing set, but let's say I have a notebook (Python/Jupyter) and I load in some data, split it between training and testing data, and run an algorithm to "train" my model. What is my deliverable under the hood? While I'm training a model, I'd think there'd be a certain amount of data being stored in memory, so how does that become part of the trained model? It obviously can't contain all the data used for training; so for instance, if I'm training a chatbot agent (retrieval-based),...
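To make the deliverable concrete: a trained model on disk is essentially the learned parameters plus enough metadata to rebuild the object that applies them; the training data itself is not stored. A deliberately tiny, framework-free sketch (numbers made up; real frameworks use formats like pickle files, SavedModel directories, or ONNX, but the idea is the same):

```python
import pickle

# "Training": a closed-form least-squares fit of y ≈ w*x produces one learned number.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]                 # roughly y = 2x
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

blob = pickle.dumps({"weight": w})        # this byte string is the "trained model";
model = pickle.loads(blob)                # the training data is not inside it

print(round(model["weight"] * 10, 1))     # predict for x = 10 -> 19.9
```

For something like a retrieval-based chatbot, the artifact is bigger (embeddings, vocabulary, indexed responses), but it is still parameters and lookup structures, not the raw training corpus.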
Well, basically I want to know what the fit() function does in general, but especially in the pieces of code below.
I'm taking the Machine Learning A-Z course because I'm pretty new to machine learning (I just started). I know some basic conceptual terms, but not the technical part.
CODE 1:
from sklearn.impute import SimpleImputer
Here is another example where I still have the same doubt:
CODE 2:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
print(sc_X)
X_train = sc_X.fit_transform(X_train)
print(X_train)
X_test = sc_X.transform(X_test)
I think that if I know the general use of this function and what exactly it does, I'll be good to go. But I'd certainly like to know what it is doing in that code.
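In general, fit() learns parameters from the data you pass it (SimpleImputer learns per-column fill statistics, StandardScaler learns the mean and standard deviation), and transform() applies those stored parameters; fit_transform() just does both in one call. That is why the code above fits on X_train but only transforms X_test: the test data must be scaled with the training statistics. A toy re-implementation of the scaler's behavior (not sklearn's actual source):

```python
# Minimal stand-in showing the fit/transform split that sklearn estimators follow.
class TinyScaler:
    def fit(self, X):
        # "Learning": compute and remember statistics of the data.
        self.mean_ = sum(X) / len(X)
        var = sum((x - self.mean_) ** 2 for x in X) / len(X)
        self.std_ = var ** 0.5
        return self

    def transform(self, X):
        # "Applying": reuse the remembered statistics on any data.
        return [(x - self.mean_) / self.std_ for x in X]

sc = TinyScaler()
sc.fit([1.0, 2.0, 3.0])          # learn mean/std from the TRAINING data only
print(sc.transform([2.0, 4.0]))  # test data standardized with the training stats
```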
I have a Django form which is collecting user responses. I also have a TensorFlow sentence classification model. What is the best/standard way to put these two together? Details:
The TensorFlow model was trained on the Movie Review data from Rotten Tomatoes.
Every time a new row is created in my response model, I want the TensorFlow code to classify it (+ or -).
Basically, I have a Django project directory and two .py files for classification. Before going ahead myself, I wanted to know what the standard way is to integrate machine learning algorithms into a web app.
It'd be awesome if you could suggest a tutorial or a repo. Thank you!
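A common pattern, independent of Django specifics, is to load the trained model once per process (not once per request) and call it from the view or a post_save signal handler. Sketched here with a stand-in classifier; all names are hypothetical:

```python
# Stand-in for the real TensorFlow model (hypothetical; loading one is expensive).
class DummySentimentModel:
    def predict(self, text):
        return 0.9 if "good" in text else 0.1

_model = None

def get_model():
    # Lazy singleton: build/load the model once per process, then reuse it.
    global _model
    if _model is None:
        _model = DummySentimentModel()
    return _model

def classify(text):
    # This is what a Django view or post_save handler would call per new row.
    return "+" if get_model().predict(text) > 0.5 else "-"

print(classify("good movie"))   # "+"
```

In a real deployment the model-loading line would deserialize your saved TensorFlow graph; keeping it behind a function like this avoids paying that cost on every request.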
This may not be the type of question to ask on SO, but I just wanted to hear what other people have to say regarding what factors to consider when implementing machine-learning algorithms in a large enterprise environment.
One of my goals is to research industry machine-learning solutions that can be tailored to my company's specific needs. Being pretty much the only person on my team who has a math background and who has done some background reading on machine-learning algorithms, I'm tasked with explaining/comparing machine-learning solutions in the industry. From what I've gleaned by googling around, it seems that:
a. Machine-learning and predictive analytics aren't exactly the same thing, so what's inherently different when a company offers predictive analytics software vs. machine-learning software? (e.g. IBM Predictive Analytics vs. Skytree Server)
b. A lot of popular terminology often gets entangled together, especially regarding Big Data, Hadoop, machine-learning, etc. Could...
I am attacking a combinatorial optimization problem similar to the multi-knapsack problem. The problem has an optimal solution, and I prefer not to settle for an approximate solution.
Are there any recommended tutorials regarding the quick prototyping and deployment of combinatorial optimization solutions (for senior software engineers that are also Big Data newbies)? I want to move quickly from prototype to deployment onto a docker cluster or AWS.
My background is in distributed systems (a focus on .NET, java, kafka, docker containers, etc...), thus I'm typically inclined to solve complex problems by parallel processing across a cluster of machines (via scaling on a docker cluster or AWS). However, this particular problem can NOT be solved in a brute force manner as the problem space is too large (roughly 100^1000 combinations are possible).
I have limited experience with "big data", but I'm studying up on knapsack solvers, genetic algorithms, reinforcement learning, and some other AI/ML...
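Since exactness matters here, a useful baseline before reaching for genetic algorithms or RL (which are approximate) is the textbook dynamic-programming solver for 0/1 knapsack, which is exact whenever capacities are integral. A minimal sketch:

```python
# Exact 0/1 knapsack by dynamic programming: O(n * capacity) time, exact optimum.
def knapsack(values, weights, capacity):
    # dp[c] = best total value achievable with total weight <= c
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities backwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

print(knapsack([60, 100, 120], [10, 20, 30], 50))   # 220
```

For multi-knapsack variants the state space grows quickly, which is where branch-and-bound or integer-programming solvers (and distributing the search tree across your cluster) become the standard exact approaches.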
This question came to my mind while working on two projects in AI and ML. What if I'm building a model (e.g. a classification neural network, k-NN, etc.) and this model uses some function that includes randomness? If I don't fix the seed, then I'm going to get different accuracy results every time I run the algorithm on the same training data. However, if I fix it, then some other setting might give better results.
Is averaging a set of accuracies enough to say that the accuracy of this model is xx%?
I'm not sure if this is the right place to ask such a question / open such a discussion.
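The usual convention is to report the mean plus a spread (standard deviation or a confidence interval) over several seeds, rather than a single number, so the seed-to-seed variability is visible. Sketched below with a stand-in for the train-and-evaluate step; the accuracies are simulated, not from a real model:

```python
import random
import statistics

def noisy_accuracy(seed):
    # Stand-in for "set the seed, train the model, evaluate accuracy".
    random.seed(seed)
    return 0.90 + random.uniform(-0.02, 0.02)

accs = [noisy_accuracy(s) for s in range(10)]   # same data, ten seeds
mean, spread = statistics.mean(accs), statistics.stdev(accs)
print(f"{mean:.3f} +/- {spread:.3f}")
```

Reporting mean ± std answers the question honestly: the average alone is "enough" only when the spread is also small, which is exactly what the second number shows.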
Is there any way I can add simple L1/L2 regularization in PyTorch? We can probably compute the regularized loss by simply adding the data_loss to the reg_loss, but is there any explicit way, any support from the PyTorch library, to do it more easily without doing it manually?
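For L2 specifically, PyTorch's built-in optimizers accept a weight_decay argument that adds the penalty for you (L1 still has to be added to the loss manually). The framework-free sketch below just shows what that penalty computes, with made-up numbers:

```python
# total loss = data loss + lambda * sum(w^2); all values below are illustrative.
weights = [0.5, -1.0, 2.0]
data_loss = 0.25              # stand-in for e.g. a cross-entropy value
lam = 0.01                    # regularization strength (the weight_decay knob)

reg_loss = lam * sum(w * w for w in weights)
total = data_loss + reg_loss
print(total)                  # 0.25 + 0.01 * 5.25 = 0.3025
```

An L1 version replaces `w * w` with `abs(w)`; in PyTorch that sum would run over `model.parameters()` and be added to the loss before `backward()`.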
Does TensorFlow have something similar to scikit-learn's one-hot encoder for processing categorical data? Would using a placeholder of tf.string behave as categorical data?
I realize I can manually pre-process the data before sending it to TensorFlow, but having it built in would be very convenient.
Suppose I have a TensorFlow tensor. How do I get the dimensions (shape) of the tensor as integer values? I know there are two methods, tensor.get_shape() and tf.shape(tensor), but I can't get the shape values as int32 integers.
For example, below I've created a 2-D tensor, and I need to get the number of rows and columns as int32 so that I can call reshape() to create a tensor of shape (num_rows * num_cols, 1). However, the method tensor.get_shape() returns values as Dimension type, not int32.
import tensorflow as tf
import numpy as np
How do I save a trained Naive Bayes classifier to disk and use it to predict data?
I have the following sample program from the scikit-learn website:
from sklearn import datasets
iris = datasets.load_iris()
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print("Number of mislabeled points : %d" % (iris.target != y_pred).sum())
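The standard answer is serialization: pickle from the standard library works, and scikit-learn's persistence docs also suggest joblib.dump for models holding large NumPy arrays. A sketch continuing the iris example (using an in-memory byte string; writing it to a file is the same idea):

```python
import pickle
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

# What gets persisted are the learned parameters (per-class means and variances),
# not the training data itself.
iris = datasets.load_iris()
gnb = GaussianNB().fit(iris.data, iris.target)

blob = pickle.dumps(gnb)          # bytes you could write to disk with open(..., "wb")
restored = pickle.loads(blob)     # later / in another process: load and predict

print((restored.predict(iris.data) == gnb.predict(iris.data)).all())   # True
```

With joblib the two calls become `joblib.dump(gnb, "model.joblib")` and `joblib.load("model.joblib")`; note that a pickled model should be loaded with the same scikit-learn version that saved it.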
I've trained a tree model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error:
Error in confusionMatrix.default(predictionsTree, testdata$catgeory) : the data and reference factors must have the same number of levels
prob <- 0.5 #Specify class split
singleSplit <- createDataPartition(modellingData2$category, p=prob,
                                   times=1, list=FALSE)
cvControl <- trainControl(method="repeatedcv", number=10, repeats=5)
traindata <- modellingData2
testdata <- modellingData2
treeFit <- train(traindata$category~., data=traindata,
                 trControl=cvControl, method="rpart", tuneLength=10)
predictionsTree <- predict(treeFit, testdata)
confusionMatrix(predictionsTree, testdata$catgeory)
The error occurs when generating the confusion matrix. The levels are the same on both objects. I can't figure out what the problem is. Their structure and levels are given below. They should be the same. Any help would be greatly appreciated, as it's...
There don't seem to be too many options for deploying predictive models in production, which is surprising given the explosion in Big Data.
I understand that the open-source PMML can be used to export models as an XML specification. This can then be used for in-database scoring/prediction. However, it seems that to make this work you need to use the PMML plugin by Zementis, which means the solution is not truly open source. Is there an easier, open way to map PMML to SQL for scoring?
Another option would be to use JSON instead of XML to output model predictions. But in this case, where would the R model sit? I'm assuming it would always need to be mapped to SQL...unless the R model could sit on the same server as the data and then run against that incoming data using an R script?
Any other options out there?
Given a vector of scores and a vector of actual class labels, how do you calculate a single-number AUC metric for a binary classifier in the R language or in simple English?
Page 9 of "AUC: a Better Measure..." seems to require knowing the class labels, and here is an example in MATLAB that I don't understand:
R(Actual == 1))
Because R (not to be confused with the R language) is defined as a vector but used as a function?
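In simple English, the AUC equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one (ties counting as half), which gives a direct way to compute it from the two vectors. A sketch with made-up scores (Python standing in for the R/MATLAB versions):

```python
# AUC as the rank statistic: fraction of (positive, negative) pairs ranked correctly.
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))   # 0.75
```

This is the same quantity the Wilcoxon/Mann-Whitney statistic computes, which is why the formula needs the actual class labels: they determine which scores count as positives and which as negatives.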