I'm using linear_model.LinearRegression from scikit-learn as a predictive model. It works perfectly. My problem is evaluating the predicted results using the accuracy_score metric. This is my true data:
array()
My predicted Data:
array()
My code:
accuracy_score(y_true, y_pred, normalize=False)
Error message:
ValueError: Can't handle mix of binary and continuous target
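One possible reading of this error: accuracy_score is a classification metric, while LinearRegression returns continuous values. The sketch below uses made-up arrays (the real data is not shown above) and illustrates two hedged options: score the continuous predictions with regression metrics, or, if the targets really are 0/1 classes, threshold the predictions first. The 0.5 cutoff is an assumption for illustration only.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score

# Hypothetical data: binary targets and continuous regression outputs.
y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.2, 0.4, 0.7])

# Option 1: evaluate the continuous predictions with regression metrics.
mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# Option 2: convert predictions to class labels, then count the correct ones.
# The 0.5 threshold is an assumed cutoff, not something from the question.
y_pred_labels = (y_pred >= 0.5).astype(int)
n_correct = accuracy_score(y_true, y_pred_labels, normalize=False)

print(mse, r2, n_correct)
```

If the task is genuinely classification, a classifier such as LogisticRegression may be a better fit than thresholding a linear regression.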
I am a beginner and I want to learn computer programming. For now, I have started learning Python by myself, with some knowledge of programming in C and Fortran.
Now, I have installed Python version 3.6.0 and I have struggled to find a suitable text for learning Python in this version. Even the online lecture series ask for versions 2.7 and 2.5.
I now have a book which, however, writes its code in version 2 and tries to keep it as close as possible to version 3 (according to the author); the author recommends "downloading Anaconda for Windows" for installing Python.
So, my question is: What is this 'Anaconda'? I saw that it was some open data science platform. What does that mean? Is it an editor, something like PyCharm or IDLE?
Also, I downloaded my Python (the one that I am using right now) for Windows from Python.org and I didn't need to install any "open data science platform". So what is happening here?
Please explain in easy language. I don't have much knowledge about these things.
Good afternoon.
I have this question I am trying to solve using pandas data structures and related Python syntax. I have already graduated from a US university and am employed; I am currently taking the Coursera.org course "Python for Data Science", offered online on Coursera's platform by the University of Michigan, just for professional development. I'm not sharing answers with anyone either, as I abide by Coursera's Honor Code.
First, I was given this pandas DataFrame of Olympic medals won by countries around the world:
# Summer Gold Silver Bronze Total # Winter Gold.1 Silver.1 Bronze.1 Total.1 # Games Gold.2 Silver.2 Bronze.2 Combined total ID
I am working on a problem for an Intro to Data Science course on Coursera, and I am struggling with adding data to a column in a dataframe.
This is the data set I'm working with:
   SUMLEV  REGION  DIVISION  STATE  COUNTY  STNAME   CTYNAME
1      50       3         6      1       1  Alabama  Autauga County
2      50       3         6      1       3  Alabama  Baldwin County
3      50       3         6      1       5  Alabama  Barbour County
4      50       3         6      1       7  Alabama  Bibb County
What I am trying to do is to insert a column called TotalCounties that has the total count of counties by state as a last column. I've done similar things in SQL, but it doesn't seem to work quite the same in Python.
I have tried the code below, but the column ends up displaying as NaN instead of a number like I want it to.
counties_only_df = census_df[census_df
x = counties_only_df.groupby('STNAME').count()
counties_only_df = x
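A plausible cause of the NaN values is index alignment: a groupby result is indexed by STNAME, while the original rows are indexed by row number, so a direct assignment finds no matching labels. The sketch below is one hedged way around that, using transform so the counts keep the original row index. The small census_df here is made up, and the assumption that SUMLEV == 50 marks county rows is mine, not confirmed by the truncated code above.

```python
import pandas as pd

# Hypothetical stand-in for census_df; SUMLEV 40 rows are assumed to be state totals.
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 40, 50],
    "STNAME": ["Alabama", "Alabama", "Alabama", "Alaska", "Alaska"],
    "CTYNAME": ["Alabama", "Autauga County", "Baldwin County", "Alaska", "Anchorage"],
})

# Keep only county-level rows (assumed filter), copying to avoid modifying a view.
counties_only_df = census_df[census_df["SUMLEV"] == 50].copy()

# transform("count") returns one value per original row, aligned on the row index,
# so it can be assigned directly as a new last column.
counties_only_df["TotalCounties"] = (
    counties_only_df.groupby("STNAME")["CTYNAME"].transform("count")
)

print(counties_only_df)
```

This is roughly the pandas counterpart of a SQL COUNT(*) OVER (PARTITION BY STNAME) window expression.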
I'm facing an issue with allocating huge arrays in numpy on Ubuntu 18 while not facing the same issue on MacOS.
I am trying to allocate memory for a numpy array with shape (156816, 36, 53806) with
np.zeros((156816, 36, 53806), dtype='uint8')
and while I'm getting an error on Ubuntu OS
>>> import numpy as np
>>> np.zeros((156816, 36, 53806), dtype='uint8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
numpy.core._exceptions.MemoryError: Unable to allocate array with shape (156816, 36, 53806) and data type uint8
I'm not getting it on MacOS:
>>> import numpy as np
>>> np.zeros((156816, 36, 53806), dtype='uint8')
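array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]],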
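For context, it may help to work out how much memory this array would need. The sketch below only does the arithmetic for the shape and dtype quoted above; it does not attempt the allocation. Whether a request of this size succeeds depends on how the operating system treats memory that has been reserved but not yet touched, which might explain the difference between the two systems, though that is an assumption rather than something shown in the question.

```python
import math

import numpy as np

shape = (156816, 36, 53806)
itemsize = np.dtype("uint8").itemsize  # bytes per element

# Total size is the number of elements times the size of each element.
n_bytes = math.prod(shape) * itemsize
n_gib = n_bytes / 2**30

print(f"{n_bytes} bytes is about {n_gib:.1f} GiB")
```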
Suppose I have a TensorFlow tensor. How do I get the dimensions (shape) of the tensor as integer values? I know there are two methods, tensor.get_shape() and tf.shape(tensor), but I can't get the shape values as integer int32 values.
For example, below I've created a 2-D tensor, and I need to get the number of rows and columns as int32 so that I can call reshape() to create a tensor of shape (num_rows * num_cols, 1). However, the method tensor.get_shape() returns values as Dimension type, not int32.
import tensorflow as tf
import numpy as np
I'm following a course on EdX on Programming with Python in Data Science. When using a given function to plot the results of my linear regression model, the graph seems very off, with all the scatter points clustered at the bottom and the regression line way up top.
I'm not sure whether the provided function drawLine is incorrect or something else is wrong with my modeling process.
Here is the defined function:
def drawLine(model, X_test, y_test, title, R2):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(X_test, y_test, c='g', marker='o')
    ax.plot(X_test, model.predict(X_test), color='orange', linewidth=1, alpha=0.7)
    plt.show()
here is the code I wrote
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import linear_model
from sklearn.model_selection import...
Suppose we are training a Keras model on 1000 images of 3 classes and the labels list is . How can we save these labels to a file and use them again during predictions to get the label name from the prediction array?
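One simple way to approach this, sketched under assumptions: write the label names to a JSON file in the same order the model was trained with, then read them back and index by the position of the largest predicted probability. The label names, the file name, and the prediction values below are all hypothetical, since the actual labels list is not shown in the question.

```python
import json
import os
import tempfile

# Hypothetical class names, in the same order as the model's output units.
labels = ["cat", "dog", "bird"]

# Save the labels next to the model (a temporary directory is used here).
path = os.path.join(tempfile.mkdtemp(), "labels.json")
with open(path, "w") as f:
    json.dump(labels, f)

# Later, at prediction time: load the labels again.
with open(path) as f:
    loaded_labels = json.load(f)

# Hypothetical prediction array for one image (one probability per class).
prediction = [0.1, 0.7, 0.2]

# Index of the largest probability, mapped back to its label name.
best_index = max(range(len(prediction)), key=prediction.__getitem__)
predicted_label = loaded_labels[best_index]
print(predicted_label)
```

The important design point is that the order of the saved list must match the class order used during training.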
I often want to quickly save some Python data, but I would also like to save it in a stable file format in case the data lingers for a long time. And so I have the question, how can I save my data?
In data science, there are three kinds of data I want to store: arbitrary Python objects, numpy arrays, and Pandas dataframes. What are the stable ways of storing these?
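As a rough sketch of one set of choices (not the only reasonable one): JSON for plain Python data, the .npy format for numpy arrays, and CSV for dataframes. Note that JSON only covers basic types such as dicts, lists, strings, and numbers, so it is not a full answer for arbitrary objects. All names and values below are made up for illustration.

```python
import json
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data of the three kinds mentioned above.
obj = {"name": "run-1", "scores": [0.1, 0.2]}
arr = np.arange(6, dtype=np.int64).reshape(2, 3)
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "obj.json")
    npy_path = os.path.join(tmp, "arr.npy")
    csv_path = os.path.join(tmp, "df.csv")

    # Write each kind of data in a text or documented binary format.
    with open(json_path, "w") as f:
        json.dump(obj, f)
    np.save(npy_path, arr, allow_pickle=False)  # refuse pickled object arrays
    df.to_csv(csv_path, index=False)

    # Read everything back to confirm the round trip.
    with open(json_path) as f:
        obj_loaded = json.load(f)
    arr_loaded = np.load(npy_path, allow_pickle=False)
    df_loaded = pd.read_csv(csv_path)
```

CSV does not preserve every dtype (dates and categoricals, for example), so a typed format may be preferable when that matters.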
Classification problems, such as logistic regression or multinomial logistic regression, optimize a cross-entropy loss. Normally, the cross-entropy layer follows the softmax layer, which produces a probability distribution.
In TensorFlow, there are at least a dozen different cross-entropy loss functions:
tf.losses.softmax_cross_entropy
tf.losses.sparse_softmax_cross_entropy
tf.losses.sigmoid_cross_entropy
tf.contrib.losses.softmax_cross_entropy
tf.contrib.losses.sigmoid_cross_entropy
tf.nn.softmax_cross_entropy_with_logits
tf.nn.sigmoid_cross_entropy_with_logits
Which one works only for binary classification and which are suitable for multi-class problems? When should you use sigmoid instead of softmax? How are sparse functions different from others and why is it only softmax?
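To make the distinction concrete, here is a NumPy sketch of what the three families compute. It is an illustration under my own function names, not TensorFlow's implementation. Sigmoid cross-entropy treats every output as an independent yes/no decision, softmax cross-entropy assumes exactly one class per example with one-hot labels, and the sparse variant takes integer class ids instead of one-hot vectors.

```python
import numpy as np

def sigmoid_cross_entropy(logits, labels):
    # Independent binary loss per output, in a numerically stable form:
    # max(x, 0) - x * z + log(1 + exp(-|x|))
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

def softmax_cross_entropy(logits, onehot_labels):
    # Classes are mutually exclusive; labels are one-hot (or a distribution).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(onehot_labels * log_probs).sum(axis=-1)

def sparse_softmax_cross_entropy(logits, class_ids):
    # Same loss as above, but labels are integer class indices.
    onehot = np.eye(logits.shape[-1])[class_ids]
    return softmax_cross_entropy(logits, onehot)

# Hypothetical logits for two examples and three classes.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
class_ids = np.array([0, 2])
onehot = np.eye(3)[class_ids]

dense_loss = softmax_cross_entropy(logits, onehot)
sparse_loss = sparse_softmax_cross_entropy(logits, class_ids)
multilabel_loss = sigmoid_cross_entropy(logits, onehot)  # one loss per output
```

In this sketch the dense and sparse softmax losses agree, which suggests why a sparse form only makes sense for softmax: integer ids can name a single class, while sigmoid labels may mark several outputs as positive at once.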
I'm learning object-oriented programming in a data science context.
I want to understand what good practice is in terms of writing methods within a class that relate to one another.
When I run my code:
import pandas as pd
pd.options.mode.chained_assignment = None

class MyData:
    def __init__(self, file_path):
        self.file_path = file_path

    def prepper_fun(self):
        '''Reads in an excel sheet, gets rid of missing values and sets datatype to numerical'''
        df = pd.read_excel(self.file_path)
        df = df.dropna()
        df = df.apply(pd.to_numeric)
        self.df = df
        return df

    def quality_fun(self):
        '''Checks if any value in any column is more than 10. If it is, the value is replaced with
        a warning 'check the original data value'.'''
        for col in self.df.columns:
            for row in self.df.index:
                # Index a single cell by row and column label.
                if self.df.loc[row, col] > 10:
                    self.df.loc[row, col] = str('check original data value')
        return self.df
I want to transform the string 'one two three' into one_two_three.
I've tried "_".join('one two three'), but that gives me o_n_e_ _t_w_o_ _t_h_r_e_e_...
How do I insert the "_" only at spaces between words in a string?
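The behaviour described above follows from str.join iterating over its argument: given a string, it joins the individual characters. Two small sketches that insert the underscore only between words, either by replacing spaces directly or by splitting into words first:

```python
s = "one two three"

# join() iterates over the string character by character.
joined_chars = "_".join(s)

# Option 1: replace each space with an underscore.
replaced = s.replace(" ", "_")

# Option 2: split into words, then join the words.
rejoined = "_".join(s.split())

print(replaced, rejoined)
```

The two options differ when there are repeated spaces: replace keeps one underscore per space, while split() collapses runs of whitespace.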
Trying to install xgboost is failing. The version is Anaconda 2.1.0 (64-bit) on Windows & enterprise. How do I proceed? I have been using R, and it seems quite easy to install a new package in R from RStudio, but not so in Spyder, as I need to go to a command window to do it, and in this case it fails.
import sys
print (sys.version)
2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Jul 2 2014, 15:12:11)
C:\anaconda\Lib\site-packages>pip install -U xgboost
Downloading/unpacking xgboost
Could not find a version that satisfies the requirement xgboost (from versions: 0.4a12, 0.4a13)
Cleaning up...
No distributions matching the version for xgboost
Storing debug log for failure in C:\Users\c_kazum\pip\pip.log
------------------------------------------------------------
C:\Users\c_kazum\AppData\Local\Continuum\Anaconda\Scripts\pip-script.py run on 08/27/15 12:52:30
Downloading/unpacking xgboost
Getting page https://pypi.python.org/simple/xgboost/
URLs to search for versions for xgboost:
*...