Drop data frame columns by name

  • I have a number of columns that I would like to remove from a data frame. I know that we can delete them individually using something like:
    df$x <- NULL
    But I was hoping to do this with fewer commands.
    Also, I know that I could drop columns using integer indexing like this:
    df <- df[ -c(1, 3:6, 12) ]
    But I am concerned that the relative position of my variables may change.
    Given how powerful R is, I figured there might be a better way than dropping each column one by one.
      December 30, 2020 9:46 AM IST
  • For a pandas DataFrame in Python (not an R data frame):
    1. Drop the column. DataFrame has a drop() method that removes rows or columns by the specified label names and the corresponding axis. ...
    2. Delete the column. del is also an option: you can delete a column with del df['column name']. ...
    3. Pop the column.
      August 19, 2021 12:48 PM IST
  • Probably the easiest way is within(), which returns a modified copy, so assign the result back:

    df <- within(df, rm(x))

    Or, for multiple variables:

    df <- within(df, rm(x, y))

    Or, if you're dealing with data.tables (per "How do you delete a column by name in data.table?"):

    dt[, x := NULL]   # Deletes column x by reference instantly.
    dt[, !"x"]        # Selects all but x into a new data.table.

    or for multiple variables:

    dt[, c("x","y") := NULL]
    dt[, !c("x", "y")]
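
    As a quick check of the base R within() approach above, here is a minimal sketch; the data frame and its column names are made up for illustration. It shows that within() hands back a modified copy, so the original data frame stays untouched unless you assign the result back to it.

```r
# Toy data frame; the column names are illustrative only.
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)

# within() evaluates rm() on a copy of df and returns that modified copy.
trimmed <- within(df, rm(x, y))

names(trimmed)  # only z is left
names(df)       # df itself still has x, y and z
```

    To drop the columns from df itself, assign the result back: df <- within(df, rm(x, y)).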
      July 28, 2021 4:44 PM IST
  • There's also the subset command, useful if you know which columns you want:

    df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
    df <- subset(df, select = c(a, c))


    UPDATED after a comment by @hadley: to drop columns a and c, you could do:

    df <- subset(df, select = -c(a, c))
      July 30, 2021 2:51 PM IST
  • You could use %in% like this:

    df[, !(colnames(df) %in% c("x","bar","foo"))]
      September 24, 2021 1:57 PM IST
  • In pandas (Python), DataFrame has a drop() method that removes rows or columns by the specified label names and the corresponding axis.

    import pandas as pd
    
    # Create a dataframe from a dict
    df = pd.DataFrame({"a": [1,2,3], "b":[2,4,6]})
    print("The DataFrame object before deleting the column")
    print(df)
    df.drop('a', inplace=True, axis=1)
    print("The DataFrame object after deleting the column a")
    print(df)
      September 30, 2021 12:49 PM IST
  • You can use a simple character vector of names:
    DF <- data.frame(
      x=1:10,
      y=10:1,
      z=rep(5,10),
      a=11:20
    )
    drops <- c("x","z")
    DF[ , !(names(DF) %in% drops)]

    Or, alternatively, you can make a vector of the columns to keep and refer to them by name:

    keeps <- c("y", "a")
    DF[keeps]

    EDIT: For those not yet familiar with the drop argument of the indexing function: if you want to keep a single column as a data frame, do:

    keeps <- "y"
    DF[ , keeps, drop = FALSE]

    With drop = TRUE (the default when the argument is omitted), unnecessary dimensions are dropped, so you get back a vector with the values of column y.
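
    To see the drop argument at work together with the name-based approach above, here is a minimal sketch. The helper name drop_cols is hypothetical (it is not part of base R); it only wraps the %in% idiom with drop = FALSE so that the result stays a data frame even when a single column remains.

```r
# Hypothetical helper (not part of base R): remove columns by name and
# always return a data frame, even if only one column is left.
# Names in `drops` that are not columns of `data` are simply ignored.
drop_cols <- function(data, drops) {
  data[, !(names(data) %in% drops), drop = FALSE]
}

DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)

two_left <- drop_cols(DF, c("x", "z"))       # y and a remain
one_left <- drop_cols(DF, c("x", "z", "a"))  # only y remains, still a data frame

# Same selection without drop = FALSE: the single column collapses to a vector.
as_vector <- DF[, !(names(DF) %in% c("x", "z", "a"))]

is.data.frame(one_left)   # TRUE
is.data.frame(as_vector)  # FALSE
```

    Because the columns are matched by name, this keeps working if their positions in the data frame change later.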

      December 30, 2020 1:19 PM IST