When we have to predict the value of a categorical (or discrete) outcome we use logistic regression. I believe we use linear regression to also predict the value of an outcome given the input values.
Then, what is the difference between the two methodologies?
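One way to see the difference: both models compute the same linear score from the inputs. Linear regression returns that score directly as a real-valued prediction, while logistic regression passes it through the sigmoid function to get a probability for a binary class. A minimal sketch in plain Python, where the weights `w`, `b` and the input `x` are made-up values for illustration, not fitted ones:

```python
import math

def linear_predict(w, b, x):
    """Linear regression: the prediction is the raw linear score (any real number)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_predict(w, b, x):
    """Logistic regression: squash the same linear score into (0, 1)."""
    score = linear_predict(w, b, x)
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical weights and a single input, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
x = [3.0, 1.0]

y_hat = linear_predict(w, b, x)     # estimate of a continuous outcome
p_hat = logistic_predict(w, b, x)   # estimated P(class = 1 | x)
label = 1 if p_hat >= 0.5 else 0    # thresholding gives the discrete outcome
```

The two models also differ in how they are fitted (least squares versus maximum likelihood), which this sketch does not show.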
I started learning the concepts of machine learning.
I would like to know some of the use case scenarios for below
I'm new to time series and used the monthly ozone concentration data from Rob Hyndman's website to do some forecasting. After doing a log transformation and differencing at lags 1 and 12 to get rid of the trend and seasonality respectively, I plotted the ACF and PACF shown. Am I on the right track, and how would I interpret this as a SARIMA? There seems to be a pattern every 11 lags in the PACF plot, which makes me think I should do more differencing (at 11 lags), but doing so gives me a worse plot. I'd really appreciate any help!
EDIT: I got rid of the differencing at lag 1 and just used lag 12 instead, and this is what I got for the ACF and PACF. From there, I deduced that SARIMA(1,0,1)x(1,1,1) (AIC: 520.098) or SARIMA(1,0,1)x(2,1,1) (AIC: 521.250) would be a good fit, but auto.arima gave me (3,1,1)x(2,0,0) (AIC: 560.7) normally and (1,1,1)x(2,0,0) (AIC: 558.09) without stepwise and approximation. I am confused about which model to use, but based on the lowest AIC, SAR(1,0,1)x(1,1,1) would be the...
I've compiled and linked an iOS app that uses a library (libclang) that calls stat(), with no errors. But I'm getting a runtime error:
The LLVM code that raises the error is (/Unix/Path.inc):
error_code status(const Twine &Path, file_status &Result) {
  SmallString<128> PathStorage;
  StringRef P = Path.toNullTerminatedStringRef(PathStorage);
  struct stat Status;
  int StatRet = ::stat(P.begin(), &Status); // failure here
  return fillStatus(StatRet, Status, Result);
}
How was I able to link the app without stat() being among the symbols? How can I fix or work around it?
I can't seem to find any Python libraries that do multiple regression. The only things I find only do simple regression. I need to regress my dependent variable (y) against several independent variables (x1, x2, x3, etc.). For example, with this data:
# texts: the asker's list of records with attributes y, x1, ..., x7 (not shown here)
print('y x1 x2 x3 x4 x5 x6 x7')
for t in texts:
    print("{:>7.1f}{:>10.2f}{:>9.2f}{:>9.2f}{:>10.2f}{:>7.2f}{:>7.2f}{:>9.2f}"
          .format(t.y, t.x1, t.x2, t.x3, t.x4, t.x5, t.x6, t.x7))
(output for above:)
      y        x1       x2       x3        x4     x5     x6       x7
   -6.0     -4.95    -5.87    -0.76     14.73   4.02   0.20     0.45
   -5.0     -4.55    -4.52    -0.71     13.74   4.47   0.16     0.50
  -10.0    -10.96   -11.64    -0.98     15.49   4.18   0.19     0.53
   -5.0     -1.08    -3.36     0.75     24.72   4.96   0.16     0.60
   -8.0     -6.52    -7.45    -0.86     16.59   4.29   0.10     0.48
   -3.0     -0.81    -2.36    -0.50     22.44   4.81   0.15     0.53
   -6.0     -7.01    -7.33    -0.33     13.93...
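For reference, multiple regression by ordinary least squares can be sketched with NumPy's `np.linalg.lstsq`. The helper name `fit_ols` and the synthetic data are assumptions made for illustration; the data is not the table above, whose visible rows are too few to fit seven predictors plus an intercept.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic data (hypothetical): y = 1 + 2*x1 - 3*x2 exactly, with no noise.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in X]

coef = fit_ols(X, y)  # intercept first, then one coefficient per column of X
```

This returns only the coefficients. Standard errors and p-values would need a statistics package or extra code.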
I have a problem involving a collection of continuous probability distribution functions, most of which are determined empirically (e.g. departure times, transit times). What I need is some way of taking two of these PDFs and doing arithmetic on them. E.g. if I have two values x taken from PDF X, and y taken from PDF Y, I need to get the PDF for (x+y), or any other operation f(x,y).
An analytical solution is not possible, so what I'm looking for is some representation of PDFs that allows such things. An obvious (but computationally expensive) solution is Monte Carlo: generate lots of values of x and y, and then just measure f(x, y). But that takes too much CPU time.
I did think about representing the PDF as a list of ranges where each range has a roughly equal probability, effectively representing the PDF as the union of a list of uniform distributions. But I can't see how to combine them.
Does anyone have any good solutions to this problem?
Edit: The goal is to create a mini-language (aka Domain...
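For the specific case of x+y, one common alternative to Monte Carlo is to discretize both PDFs onto a shared grid of equal-width bins and convolve the bin probabilities. The sketch below assumes X and Y are independent and covers only addition; the function name `convolve_pmf` and the bin probabilities are hypothetical.

```python
def convolve_pmf(px, py):
    """Bin probabilities of X + Y for independent X and Y.

    px[i] is P(X falls in bin i), py[j] likewise; both use the same bin width.
    The result has len(px) + len(py) - 1 bins, with bin k collecting all i + j = k.
    """
    out = [0.0] * (len(px) + len(py) - 1)
    for i, p in enumerate(px):
        for j, q in enumerate(py):
            out[i + j] += p * q
    return out

# Two hypothetical empirical distributions on a common grid.
px = [0.2, 0.5, 0.3]
py = [0.6, 0.4]

p_sum = convolve_pmf(px, py)  # still sums to 1, spread over 4 bins
```

The double loop costs O(n·m). On fine grids an FFT-based convolution computes the same result in O(n log n). A general f(x, y) does not reduce to a convolution and would need each pair of bins mapped to the bin containing f of their centers.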
I'm looking for some good tools/scripts that allow me to generate a few statistics from a git repository. I've seen this feature on some code hosting sites, and they contained information like...
1. commits per author
2. commits per day/week/year/etc.
3. lines of code over time
4. graphs
5. ... much more
Basically I just want to get an idea how much my project grows over time, which developer commits most code, and so on.
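As a starting point for the first item, commits per author can be tallied from `git log` output. This is a minimal sketch, not a full statistics tool; the function names are hypothetical, and the canned `sample` text stands in for real repository output so the snippet runs anywhere.

```python
import subprocess
from collections import Counter

def commits_per_author(log_text):
    """Count commits per author from text holding one author name per line."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())

def read_author_log(repo_path="."):
    """Run `git log` in repo_path and return one author name per commit."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:%an"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return result.stdout

# Canned output so the example does not need a repository.
sample = "alice\nbob\nalice\n"
counts = commits_per_author(sample)

# In a real repository: counts = commits_per_author(read_author_log("."))
```

On the command line, `git shortlog -s -n` prints a similar per-author summary directly.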
It's been quite a while since I did any statistics so I am struggling with the definitions of a Poisson distribution. What I understand by the "rate is constant" is that if a customer purchases 1 thing on average in a week, they purchase 4 things on average in a four-week period. Is this correct?
Where I believe I am confused is with the final sentence. Is this saying that the time between a customer's purchases would grow exponentially as time goes on? Doesn't this contradict the idea that we have a constant rate of purchase?
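The definition being asked about is not quoted here. Assuming its final sentence says the time between purchases is exponentially distributed, that describes the shape of the distribution of the gaps, not gaps that grow over time. A small simulation under that assumption (the rate of 1 purchase per week and the function name are illustrative) shows a constant rate and exponential gaps coexisting:

```python
import random

def simulate_purchases(rate_per_week, total_weeks, seed=0):
    """Return purchase times of a Poisson process with a constant rate."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_week)  # gap drawn from Exponential(rate)
        if t >= total_weeks:
            return times
        times.append(t)

total_weeks = 40_000
times = simulate_purchases(rate_per_week=1.0, total_weeks=total_weeks)

# Gaps between consecutive purchases (the first gap starts at time 0).
gaps = [b - a for a, b in zip([0.0] + times[:-1], times)]
mean_gap = sum(gaps) / len(gaps)                # close to 1 / rate = 1 week
per_four_weeks = len(times) / (total_weeks / 4)  # close to 4 purchases

# The gaps do not grow: early and late halves have about the same mean.
half = len(gaps) // 2
early_mean = sum(gaps[:half]) / half
late_mean = sum(gaps[half:]) / (len(gaps) - half)
```

So the reading in the first paragraph holds: about 1 purchase per week means about 4 per four weeks, and the average gap stays near one week throughout.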
I've been asked to port a legacy data processing application over to Java.
The current version of the system is composed of a number of (badly written) Excel sheets. The sheets implement a big loop: a number of data sources are polled. These sources are a mixture of CSV and XML-based web services.
The process is conceptually simple:
It's stateless, which means the calculations that run are purely dependent on the inputs. The results from the calculations are published (currently by writing a number of CSV files in some standard locations on the network).
Having published the results the polling cycle begins again.
The process will not need an admin GUI; however, it would be neat if I could implement some kind of web-based control panel. It would be nothing pretty and purely for internal use. The control panel would do little more than display stats about the source feeds and possibly force a refresh of the input feeds in the event of a problem. This component is purely optional in the first delivery round.
A...
I want to get started on HMMs, but don't know how to go about it. Can people here give me some basic pointers on where to look?
More than just the theory, I like to do a lot of hands-on work. So I would prefer resources where I can write small code snippets to check my learning, rather than just dry text.
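As an example of the kind of small snippet that fits this approach, the forward algorithm for a discrete HMM takes only a few lines of plain Python. The two-state model below is a toy with made-up probabilities, not taken from any particular resource.

```python
from itertools import product

def forward(obs, start, trans, emit):
    """Forward algorithm: P(observation sequence) under a discrete HMM.

    start[i]    = P(first state is i)
    trans[i][j] = P(next state is j | current state is i)
    emit[i][o]  = P(observing symbol o | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)

# Toy two-state model with two observable symbols (0 and 1); all numbers made up.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

p = forward([0, 1, 0], start, trans, emit)

# Sanity check: the probabilities of all length-3 sequences should add up to 1.
total = sum(
    forward(list(seq), start, trans, emit) for seq in product([0, 1], repeat=3)
)
```

Changing the matrices and re-running the sanity check is a quick way to confirm an implementation before moving on to Viterbi decoding or parameter estimation.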
This is a follow-up question to: ruby variable scoping across classes. The solution makes sense to me conceptually, but I can't get it to work. I thought maybe with more code someone could help me.
I have a class Login that declares a new IMAP class, authenticates, and picks a mailbox.
I am then trying to create a separate class that will "do stuff" in the mailbox, for example calculate the number of emails received. The problem is that the @imap instance of Net::IMAP doesn't pass from the Login class to the Stat class -- I'm getting no method errors for imap.search in the new class. I don't want to re-log in and re-authenticate each time I need to "do some stuff" with the mailbox. I'm trying to implement the solution in the other thread, but can't seem to get it to work.
Here's the Login class:
class Login
  def initialize(user, domain, pass)
    @username = user
    @domain = domain
    @pass = pass
    # check if gmail or other domain
    gmail_test = @domain.include? "gmail.com"
    if gmail_test ==...
Applied probability is an important branch of probability, including computational probability. Since statistics uses probability theory to construct models to deal with data, as I understand it, I am wondering what the essential difference is between a statistical model and a probability model. Does a probability model not need real data? Thanks.
Hello!
I want to fit a dataset with a sum of two distributions: Gaussian + Poisson.
The dataset can have up to 3000 numbers, which should be enough for reasonable fitting. Is there any convenient way to do it without programming? For example, with Origin software? Or RStudio?
Hi all,
We have count data, but it appears that this is overdispersed. Therefore the assumed Poisson distribution should be replaced by a quasi-Poisson or a negative binomial. Although there is some literature around this topic (for instance see http://fisher.utstat.toronto.edu/reid/sta2201s/QUASI-POISSON.pdf), it is rather technical, and we were wondering if there is a pragmatic approach in R to determine whether to use Poisson, quasi-Poisson or negative binomial as the underlying distribution of the response data?
Thanks in advance!
I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 states). I want to fit a regression for each state so that at the end I have a vector of lm responses. I can imagine doing a for loop over the states, running the regression inside the loop and adding the results of each regression to a vector. That does not seem very R-like, however. In SAS I would do a 'by' statement and in SQL I would do a 'group by'. What's the R way of doing this?