Viaan Prakash's Documents

Real-world recreated business-related spreadsheet for data preprocessing

February 28, 2022

Raw data for real analytical use cases, across many industries and companies, is frequently delivered as Excel files. Such files usually cannot be fed directly into machine learning models; they must first be cleaned and preprocessed. Many different pitfalls can occur along the way, which makes data preprocessing a major time factor in a data scientist's daily work.

This dataset is an Excel spreadsheet modeled closely on a real case. To protect data and business results, it contains only simulated figures, but its form and structure match the real case and could be encountered in this way by a data scientist in a company. Such a file might be the result of a download from a financial controlling system, e.g. SAP.
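
To make the kind of pitfalls concrete, here is a minimal sketch of a cleaning step in plain Python. It does not reflect the actual layout of this spreadsheet: the rows, the column names (`Month`, `Department`, `Sales`), the `Total` subtotal marker, and the helper names `parse_number` and `clean_rows` are all hypothetical, chosen only to illustrate problems that commonly show up in exported business spreadsheets (title rows above the header, blank rows, cells left empty where a value was merged, subtotal rows, and European number formatting).

```python
from typing import Optional

# Hypothetical rows as they might arrive from a spreadsheet export.
# Layout, column names, and values are invented for illustration only.
RAW_ROWS = [
    ["Sales report", "", ""],            # title row above the real header
    ["", "", ""],                        # blank spacer row
    ["Month", "Department", "Sales"],    # the actual header row
    ["01.2021", "Toys", "1.234,50"],     # European number format
    ["", "Books", "987,00"],             # month left blank (merged cell)
    ["Total", "", "2.221,50"],           # subtotal row, not an observation
    ["02.2021", "Toys", "n/a"],          # missing-value marker
]


def parse_number(text: str) -> Optional[float]:
    """Convert text like '1.234,50' to 1234.5; return None if not numeric."""
    cleaned = text.strip().replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(rows, header_marker="Month"):
    """Return a list of dicts, one per data row, with pitfalls handled."""
    # Find the header row by its first cell and skip everything above it.
    header_idx = next(i for i, r in enumerate(rows) if r and r[0] == header_marker)
    header = rows[header_idx]

    records = []
    last_month = None
    for row in rows[header_idx + 1:]:
        # Skip rows that are entirely empty.
        if not any(cell.strip() for cell in row):
            continue
        # Skip subtotal rows so they are not counted as observations.
        if row[0].strip().lower() == "total":
            continue
        # Forward-fill the month when the cell was left blank.
        month = row[0].strip() or last_month
        last_month = month
        records.append({
            header[0]: month,
            header[1]: row[1].strip(),
            header[2]: parse_number(row[2]),
        })
    return records


records = clean_rows(RAW_ROWS)
for record in records:
    print(record)
```

In practice the rows would come from the Excel file itself rather than a hard-coded list, and which of these steps are needed depends entirely on the file at hand.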

  • License Type: Open Data Commons
  • Original Data Source (Attribution): https://www.kaggle.com/dgluesen/sales-and-workload-data-from-retail-industry