R: sum of rowSums off by one


Tags : R csv

Vaibhav Mali

259
For my stats class, I'm using R to do some of the math for my term project. The class doesn't call for it, but I want to supplement myself by learning R, which is my weaker language.

Using this data: skittle-data.csv (Every row was an individual bag of skittles submitted by each student)

I'm trying to generate some charts and other things to satisfy the assignment. While doing so, I noticed that in determining the total number of skittles I was off by 1.

When I load the csv into a dataframe I make summations of the rows, and then sum those summations to get the total, like this:
```
skittles = read.csv("skittle-data.csv", header = TRUE)
columnTotals = colSums(skittles, na.rm=FALSE, dims = 1)
rowTotals = rowSums(skittles, na.rm=FALSE, dims = 1)
total = sum(rowTotals, na.rm=FALSE, dims = 1)

print("Column Totals")
print(columnTotals)

print("Row Totals (individual bags)")
print(rowTotals)

print('Total skittles')
print(total)
```
Running my code from my terminal outputs:
```
[1] "Column Totals"
   Red Orange Yellow  Green Purple
   269    289    296    295    300
[1] "Row Totals (individual bags)"
 [1] 62 62 60 60 60 62 59 61 57 59 62 59 59 62 64 62 63 61 60 60 58 59 60 58
[1] "Total skittles"
[1] 1450
```
Yet Excel shows the total number of skittles is 1449.

This is clearly user error on my part, but what is my mistake? Conceptually I think I've done this correctly, so the only thing I can assume is that I've misunderstood some part of the summation functions (colSums, rowSums, sum, etc.).

This post was edited by Vaibhav Mali at July 21, 2021 2:04 PM IST
July 20, 2021 2:07 PM IST

0
Viaan Prakash

461

It's the dims = 1 argument in sum. With x = c(1,1), sum(x) gives you 2, but sum(x, dims = 1) gives 3, so that argument is adding an extra 1 to the total.

This post was edited by Viaan Prakash at July 21, 2021 1:57 PM IST

July 21, 2021 1:56 PM IST

0
Samar Patil

346 3

Your error stems from your sum function:

total = sum(rowTotals, na.rm=FALSE, dims = 1)

sum() has no dims argument (its signature is sum(..., na.rm = FALSE)), so it's interpreting dims = 1 as another number to be added. Remove it and it'll work as intended.
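To see the effect without the CSV, here is a minimal sketch that reuses the row totals printed in the question's output (assuming that printout is the complete set of bags). The names withDims and withoutDims are just illustrative.

```r
# Row totals copied from the output posted in the question
rowTotals <- c(62, 62, 60, 60, 60, 62, 59, 61, 57, 59, 62, 59,
               59, 62, 64, 62, 63, 61, 60, 60, 58, 59, 60, 58)

# sum() takes (..., na.rm = FALSE); dims is not one of its arguments,
# so dims = 1 falls into ... and is summed as an extra value of 1
withDims <- sum(rowTotals, na.rm = FALSE, dims = 1)
print(withDims)      # 1450

# Without dims, only the row totals are added
withoutDims <- sum(rowTotals)
print(withoutDims)   # 1449
```

The dims argument does belong to colSums and rowSums, so it can stay on those two calls (1 is its default there anyway); only the sum call needs to change.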

July 24, 2021 2:55 PM IST

0
