Replace specific characters within strings

  • I would like to remove specific characters from strings within a vector, similar to the Find and Replace feature in Excel.

    Here are the data I start with:

    group <- data.frame(c("12357e", "12575e", "197e18", "e18947"))
    

    I start with just the first column; I want to produce the second column by removing the e's:

    group       group.no.e
    12357e      12357
    12575e      12575
    197e18      19718
    e18947      18947
      October 7, 2021 6:40 PM IST
  • With a regular expression and the function gsub():

    group <- c("12357e", "12575e", "197e18", "e18947")
    group
    [1] "12357e" "12575e" "197e18" "e18947"
    
    gsub("e", "", group)
    [1] "12357" "12575" "19718" "18947"


    Here gsub() replaces every occurrence of "e" in each element of the vector with the empty string "".

    See ?regexp or ?gsub for more help.
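
    To get the two-column layout from the question, the same call can be applied to a data frame column. The following is a minimal sketch, assuming the column is named `group` (the original data frame does not name it) and that "e" should be matched literally, which is why `fixed = TRUE` is passed:

```r
# Data from the question; the column name `group` is an assumption.
df <- data.frame(group = c("12357e", "12575e", "197e18", "e18947"),
                 stringsAsFactors = FALSE)

# fixed = TRUE treats "e" as a literal string instead of a regular expression.
df$group.no.e <- gsub("e", "", df$group, fixed = TRUE)

print(df)
```

    Note that sub() would only remove the first "e" in each string, whereas gsub() removes all of them.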
      October 12, 2021 1:19 PM IST
  • Even simpler, in Python, with a plain loop that swaps one character for another:

    text = "a:b:c:d"
    output = ''
    for c in text:
        if c == ':':
            output += '/'
        else:
            output += c
    print(output)

    output: a/b/c/d
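
    Since the question is about R, the same character-for-character swap might be sketched there without a loop. This assumes base R only; the variable names are illustrative:

```r
text <- "a:b:c:d"

# chartr() translates single characters one-for-one.
swapped <- chartr(":", "/", text)

# gsub() with fixed = TRUE does the same and also allows an empty replacement.
swapped_gsub <- gsub(":", "/", text, fixed = TRUE)

print(swapped)
print(swapped_gsub)
```

    chartr() requires the old and new character sets to have the same length, so it cannot delete characters; for deletion, gsub() with an empty replacement is the usual choice.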
      October 13, 2021 1:20 PM IST
  • Use the stringi package:

    library(stringi)
    
    group<-data.frame(c("12357e", "12575e", "197e18", "e18947"))
    stri_replace_all(group[,1], "", fixed="e")
    [1] "12357" "12575" "19718" "18947"
      October 22, 2021 2:36 PM IST
  • Strings in Python are immutable, so you cannot treat them as a list and assign to indices.
    Use .replace() instead:
    line = line.replace(';', ':')
    If you need to replace only certain semicolons, you'll need to be more specific. You could use slicing to isolate the section of the string to replace in:
    line = line[:10].replace(';', ':') + line[10:]
    That'll replace all semi-colons in the first 10 characters of the string.
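
    For the R setting of the original question, a comparable "replace only in the first 10 characters" step could be sketched with substr() and gsub(). The sample string and variable names here are assumptions made for illustration:

```r
line <- "a;b;c;d;e;f;g;h;i;j"

# Split the string into the first 10 characters and the remainder.
head_part <- substr(line, 1, 10)
tail_part <- substr(line, 11, nchar(line))

# Replace semicolons only in the first part, then glue the pieces back together.
result <- paste0(gsub(";", ":", head_part, fixed = TRUE), tail_part)

print(result)
```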
      October 8, 2021 8:17 PM IST