
Issue with data format when using textscan()

  • I am trying to read data from a .txt file into a matrix in MATLAB for plotting, but textscan raises an error while reading the data. It seems to be caused by the time column.
    I am using the following code snippet:
     
    fileID = fullfile('SI010118.txt')
     
    C = textscan(fileID, '%{dd.MM.yyyy}D %{HH:MM:SS}T %f %f %f %f %f %f %f %d %d %d %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %{HH:MM:SS}T %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f')
    The error is as follows:
     
    Error using textscan
    Unable to parse the format character vector at position 16 ==> %{HH:MM:SS}T %f %f %f %f %f %f %f %d %d %d %f %f %f %f %f %f %f %f %f %f
    %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f
    %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %{HH:MM:SS}T %f %f %f %f %f %f %f
    %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f
    %f %f %f
    Date formats must be of the form %T or %{...}T.
    The .txt file I am using is attached.
      September 4, 2021 9:07 PM IST
  • For a duration, the format letters should all be lower case:
     

    %{hh:mm:ss}T

    However, the data appears to be semicolon-delimited. You'd have more luck with readtable:
     

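    % Detect the delimiter and column layout from the file itself
    opts = detectImportOptions('D:\SI010118.txt');
    % Treat the first column as a datetime with the file's timestamp layout
    opts = setvartype(opts,1,'datetime');
    opts = setvaropts(opts,1,'InputFormat','dd.MM.uuuu HH:mm:ss');
    T = readtable('D:\SI010118.txt',opts)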
      September 7, 2021 1:37 PM IST
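  • That second datetime format should be:
     

     %{HH:mm:ss}D

    D is for datetime and T is for duration. In a datetime format, lower-case mm means minutes and ss means seconds; upper-case MM is the month and S is fractional seconds.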
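
    A minimal sketch of the D versus T distinction, assuming a made-up sample string (it is not taken from the attached file) and the default whitespace delimiter. The variable names sample, d and t are only for illustration:

    ```matlab
    % Made-up sample text: a date followed by a time of day.
    sample = '01.01.2018 00:00:44';

    % %{...}D reads a datetime; %{...}T reads a duration.
    % Note the lower-case letters in the duration format.
    C = textscan(sample, '%{dd.MM.yyyy}D %{hh:mm:ss}T');

    d = C{1};   % datetime value
    t = C{2};   % duration value

    disp(class(d))
    disp(class(t))
    ```

    Adding the two (d + t) would give a single datetime carrying both the date and the time of day.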

    Also, if you want to start reading from the line that begins with:

     01.01.2018 00:00:44;29.59;30.16;29.59; etc...
    

    you should use the HeaderLines name-value pair when calling textscan.
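
    A rough sketch of skipping header lines, under some loud assumptions: the file contents, the two header lines, and the three numeric columns are all made up here, since the real layout of SI010118.txt is not shown in this thread. Note that textscan expects a file identifier from fopen rather than a file name:

    ```matlab
    % Write a small made-up file so the sketch is self-contained.
    fname = [tempname '.txt'];
    fid = fopen(fname, 'w');
    fprintf(fid, 'Station log (made-up header)\n');
    fprintf(fid, 'Time;A;B;C\n');
    fprintf(fid, '01.01.2018 00:00:44;29.59;30.16;29.59\n');
    fprintf(fid, '01.01.2018 00:01:44;29.60;30.17;29.58\n');
    fclose(fid);

    % Open the file for reading and pass the identifier to textscan.
    fid = fopen(fname, 'r');
    C = textscan(fid, '%{dd.MM.yyyy HH:mm:ss}D %f %f %f', ...
        'Delimiter', ';', 'HeaderLines', 2);   % skip the two header lines
    fclose(fid);
    delete(fname);

    timestamps = C{1};      % datetime column
    values = [C{2:end}];    % numeric columns joined into one matrix
    ```

    For the real file, the number of header lines and the number of %f specifiers would have to match its actual layout.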

      September 8, 2021 12:21 PM IST
  • textscan reads the string str, or the file associated with fid, and parses it according to format. The function is an extension of strread and textread. Differences include: the ability to read from either a file or a string, additional options, and additional format specifiers.

    The input is interpreted as a sequence of words, delimiters (such as whitespace), and literals. The characters that form delimiters and whitespace are determined by the options. The format consists of format specifiers interspersed between literals. In the format, whitespace forms a delimiter between consecutive literals, but is otherwise ignored.

    The output C is a cell array where the number of columns is determined by the number of format specifiers.

    The first word of the input is matched to the first specifier of the format and placed in the first column of the output; the second is matched to the second specifier and placed in the second column, and so forth. If there are more words than specifiers, the process is repeated until all words have been processed or the limit imposed by the optional repeat argument has been met.

    The string format describes how the words in str should be parsed. As in fscanf, any (non-whitespace) text in the format that is not one of these specifiers is considered a literal. If there is a literal between two format specifiers then that same literal must appear in the input stream between the matching words.
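
    As a small illustration of the points above (literals in the format, and the format being reused when the input has more words than specifiers), here is a sketch on a made-up string; the text 'Level' and the numbers are only for illustration:

    ```matlab
    % Made-up input: four words, but the format below has only two specifiers.
    str = 'Level1 12.34 Level2 23.54';

    % 'Level' is a literal: it must appear in the input and is not returned.
    % The two-specifier format is applied repeatedly until the input is used up.
    C = textscan(str, 'Level%d %f');

    levels = C{1};   % integers read by %d
    values = C{2};   % doubles read by %f
    ```

    C has one column per specifier, so it is a 1-by-2 cell array here, with each cell holding one entry per repetition of the format.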
      September 9, 2021 12:53 PM IST