2. Instructions

2.1. List of features

This tool converts input CSV files into a format that MANUFACIA can read. It can also process input files as described below.

Feature                               Description
------------------------------------  ------------------------------------------------------------
Show version number                   Shows the version number of the tool.
Check format                          Checks the CSV file format and logs the result.
Correct format                        Automatically corrects the format, based on the error information, so that MANUFACIA can read the file. Only the locations where errors were found are logged.
Process data: Divide file             Divides a file into several files, each with at most the specified number of lines.
Process data: Extract rows            Extracts the specified block of rows.
Process data: Extract columns         Extracts the specified block of columns.
Process data: Thin out data           Removes every n-th line within the data area, where n is the specified interval.
Process data: Swap rows and columns   Swaps rows and columns.
Process data: Replace header line     Adds or overwrites the CSV file header.

The options below have no effect on their own; they are used together with the options above.

Additional feature      Description
----------------------  ------------------------------------------------------------
Input data path         Specifies the path where the input data files are located.
Rebuild time data       Rebuilds the time data while processing data.
Force data processing   Ignores errors and forces processing of the data files.

Important

  1. Correct the format first and then process the data, because data processing is performed only on CSV files that are in the proper format.

  2. The log file and the processed CSV files are written to the same directory as the original CSV files.

  3. Only one data processing feature can be run at a time. If multiple data processing options are specified on the command line, only the first one is used.

2.2. Feature detail

2.2.1. Show version information of the tool

  csv_checker -i

Displays the following information about this tool.

  • Version number

  • Default setting values

2.2.2. Format check

  csv_checker

Verifies the CSV files in the current directory. To check CSV files in another location, use the -p option.

The result will be logged in the following style.

Format check result

Result        Description
------------  ------------------------------------------------------------
OK (correct)  • Header comments are only in the first line of the file.
              • Values are separated by commas (,).
              • File encoding is UTF-8.
Error1        Comments appear on multiple lines.
Error2        Header comments are not the same in different files.
Error3        The delimiter is a semicolon (;).
Error4        The delimiter is white space or a tab.
Error5        Data are enclosed in double ("") or single ('') quotation marks.
Error6        There is white space before or after a value.
Error7        Double-byte characters are used in a value.
Error8        Non-numeric characters are used.
Error9        A value is NULL or white space.
Error10       A BOM (EF,BB,BF) is added to the file.
Error11       The file is not encoded in UTF-8.
Error12       A slash (/) is used.
undefined     Other errors, such as a missing header line or no data lines.
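To make a few of the result codes above concrete, the following Python sketch shows how such conditions could be detected in the raw bytes of one file. It is an illustration only and not the tool's actual implementation: the function name `check_bytes`, the order of the checks, and the simplified detection rules are assumptions, and only Error3, Error9, Error10, and Error11 are covered.

```python
import codecs


def check_bytes(raw: bytes) -> list:
    """Return result codes for one CSV file (hypothetical helper, simplified rules)."""
    found = []
    # Error10: the file starts with the UTF-8 BOM (EF,BB,BF).
    if raw.startswith(codecs.BOM_UTF8):
        found.append("Error10")
        raw = raw[len(codecs.BOM_UTF8):]
    # Error11: the bytes cannot be decoded as UTF-8.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        found.append("Error11")
        return found
    lines = text.splitlines()
    # Error3 (simplified): a semicolon appears anywhere in the file.
    if any(";" in line for line in lines):
        found.append("Error3")
    # Error9: an empty or white-space-only value in the data area (line 2 onward).
    for line in lines[1:]:
        if any(field.strip() == "" for field in line.split(",")):
            found.append("Error9")
            break
    return found or ["OK"]
```

For example, a file that starts with a BOM and has an empty value in its data area would be reported with both Error10 and Error9 under these assumptions.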

2.2.3. Correct format

  csv_checker -c

Runs the format check and corrects the detected errors so that MANUFACIA can read the files. The errors are corrected as follows.

Correction method for each error

Error number  Correction method
------------  ------------------------------------------------------------
Error1        Keeps only the last comment line and removes the rest.
Error2        No correction.
Error3        Replaces the delimiter with a comma (,). (Header line only.)
Error4        Replaces the delimiter with a comma (,). (Header line only.)
Error5        Removes the double/single quotation marks.
Error6        Removes the white space.
Error7        Converts to one-byte characters.
Error8        No correction.
Error9        Inserts 0.
Error10       Removes the BOM (three bytes).
Error11       Converts the file to UTF-8. (Only if the input file is encoded in Shift-JIS.)
Error12       Removes the slash.
undefined     No correction.
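The per-value corrections above (Error5, Error6, Error7, Error9, and Error12) can be pictured with the following Python sketch. It is not the tool's implementation: the helper names `correct_value` and `correct_line` and the order of the steps are assumptions, and Error7 is handled with Unicode NFKC normalization on the assumption that "double-byte characters" means full-width forms of ASCII characters.

```python
import unicodedata


def correct_value(value: str) -> str:
    """Apply the per-value corrections to one field (hypothetical helper)."""
    value = value.replace('"', "").replace("'", "")   # Error5: remove quotation marks
    value = unicodedata.normalize("NFKC", value)      # Error7: full-width to one-byte (assumption)
    value = value.strip()                             # Error6: remove surrounding white space
    value = value.replace("/", "")                    # Error12: remove slashes
    return value if value else "0"                    # Error9: insert 0 for an empty value


def correct_line(line: str) -> str:
    """Correct every comma-separated value in one data line."""
    return ",".join(correct_value(v) for v in line.split(","))
```

Under these assumptions, the line `1, 2 ,,'3'` becomes `1,2,0,3`.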

2.2.4. Process data

The following features are run by adding the corresponding option when calling csv_checker. Multiple data processing options cannot be specified at the same time. Both row and column numbers start from 1.

2.2.4.1. Divide file

  csv_checker -d <num>

Divides the data area of the file so that each output file contains at most <num> lines, and adds the header line as the first line of each output file.
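As a rough illustration of this behavior, the Python sketch below splits a list of lines into chunks. It is not the tool's code: the function name `divide` is hypothetical, and it assumes that <num> counts data lines only, with the header line added on top of each chunk.

```python
def divide(lines: list, num: int) -> list:
    """Split lines (header first) into chunks of at most num data lines each."""
    if num < 1:
        raise ValueError("num must be 1 or greater")
    header, data = lines[0], lines[1:]
    # Every chunk starts with a copy of the header line.
    return [[header] + data[i:i + num] for i in range(0, len(data), num)]
```

With five data lines and a value of 2, this yields three chunks holding 2, 2, and 1 data lines, each preceded by the header.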

2.2.4.2. Extract rows

  csv_checker -e <start> <end>

Extracts the block of the data area that starts at line <start> and ends at line <end> of the original file, and creates a new file. <start> must not be 1, because line 1 is the header line. If <end> is not given, rows are extracted through the end of the file. The header line is copied to the first line of every output file.
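The line numbering can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `extract_rows` is hypothetical, <end> is treated as inclusive, and a <start> of 1 is rejected with an exception, which may differ from how the tool reports it.

```python
from typing import Optional


def extract_rows(lines: list, start: int, end: Optional[int] = None) -> list:
    """Return the header plus file lines start..end (1-based, end inclusive)."""
    if start < 2:
        raise ValueError("start must be 2 or greater; line 1 is the header line")
    stop = len(lines) if end is None else end
    # File line n is lines[n - 1], so lines start..end are lines[start - 1:end].
    return [lines[0]] + lines[start - 1:stop]
```

For a file with lines `h, a, b, c, d`, extracting lines 3 to 4 gives `h, b, c`, and extracting from line 4 with no end gives `h, c, d`.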

2.2.4.3. Extract columns

  csv_checker -r <start> <end>

Extracts the block of columns that starts at column <start> and ends at column <end> of the original file, and creates a new file. If a specified number exceeds the number of columns in the CSV file, an error is logged.
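A minimal Python sketch of the column selection is shown below. The function name `extract_columns` is hypothetical, the header line is assumed to be cut in the same way as the data lines, and an out-of-range number raises an exception here, whereas the tool logs an error.

```python
def extract_columns(lines: list, start: int, end: int) -> list:
    """Keep columns start..end (1-based, end inclusive) of every line."""
    width = len(lines[0].split(","))
    if not 1 <= start <= end <= width:
        raise ValueError("column range is outside 1..%d" % width)
    # Column n is index n - 1, so columns start..end are fields[start - 1:end].
    return [",".join(line.split(",")[start - 1:end]) for line in lines]
```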

2.2.4.4. Thin out data

  csv_checker -t <interval>

Thins out the data lines at the interval specified by <interval>, from the top of the data area (2nd line) to the end of the file.

Example of thinning out data (-t option)

If the interval is three, the lines in the parentheses () will be removed. {header line, 1, 2, (3), 4, 5, (6), 7, 8, (9), …}
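The example above can be reproduced with a short Python sketch. It is an illustration, not the tool's code, and the function name `thin_out` is hypothetical; data lines are counted from 1 starting at the second line of the file.

```python
def thin_out(lines: list, interval: int) -> list:
    """Drop every interval-th data line; the header line is always kept."""
    if interval < 1:
        raise ValueError("interval must be 1 or greater")
    header, data = lines[0], lines[1:]
    # Data lines are numbered from 1; multiples of interval are removed.
    kept = [row for n, row in enumerate(data, start=1) if n % interval != 0]
    return [header] + kept
```

With nine data lines and an interval of three, lines 3, 6, and 9 are dropped and lines 1, 2, 4, 5, 7, and 8 remain after the header.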

2.2.4.5. Swap rows and columns

  csv_checker -s

Swaps the rows and columns of the data. The header line is also included in the swap.
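Swapping rows and columns is a matrix transpose. The Python sketch below shows the idea with a hypothetical function `swap`; it assumes every line has the same number of columns, since `zip` would otherwise silently truncate to the shortest line.

```python
def swap(lines: list) -> list:
    """Transpose comma-separated lines, header line included."""
    rows = [line.split(",") for line in lines]
    # zip(*rows) yields one tuple per original column.
    return [",".join(column) for column in zip(*rows)]
```

For the lines `t,a,b`, `1,2,3`, and `4,5,6`, the result is `t,1,4`, `a,2,5`, and `b,3,6`.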

2.2.4.6. Replace header line

  csv_checker -h

If header.txt exists in the current directory, the header line of the existing CSV files is replaced with the string specified in header.txt. The number of columns in header.txt should equal the number of columns in the data area. If header.txt contains multiple lines, only the first line is used.

In the following cases, header line will not be changed.

  • No header.txt in the current directory.

  • No header line at the first line of the file.

  • No header line entry in header.txt.
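The replacement rule can be sketched in Python as follows. The function name `replace_header` is hypothetical, and the sketch works on lists of lines rather than files. It covers only the cases of a missing or empty header.txt (passed as an empty list or a blank first line); detecting whether a CSV file has a header line is not attempted, and leaving the file unchanged on a column-count mismatch is an assumption.

```python
def replace_header(csv_lines: list, header_txt_lines: list) -> list:
    """Replace the first CSV line with the first line of header.txt."""
    # No header.txt content or no header entry: leave the file unchanged.
    if not csv_lines or not header_txt_lines or not header_txt_lines[0].strip():
        return list(csv_lines)
    new_header = header_txt_lines[0].strip()
    # Assumption: a column-count mismatch with the data area means no change.
    if len(csv_lines) > 1 and len(new_header.split(",")) != len(csv_lines[1].split(",")):
        return list(csv_lines)
    # Only the first line of header.txt is used.
    return [new_header] + csv_lines[1:]
```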

2.2.5. Additional options

These additional options have no effect on their own; use them together with the other options.

2.2.5.1. Specify input data path

  csv_checker ........ -p <path>

Specify a relative or absolute path in <path>. If the path contains white space, enclose the whole path in double quotation marks ("").

2.2.5.2. Rebuild time data

  csv_checker ........ -o

This option is used together with the divide file option (-d), the extract rows option (-e), or the thin out data option (-t).

The time data of the output file is rebuilt incrementally from 1 instead of using the time information of the corresponding original line.

This feature assumes that the time data is in the first column. If it is not, the conversion will not work properly.
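The renumbering can be pictured with the Python sketch below. It is an illustration only: the function name `rebuild_time` is hypothetical, and it assumes the time values are written as plain integers increasing by 1 per data line.

```python
def rebuild_time(lines: list) -> list:
    """Overwrite the first column of each data line with 1, 2, 3, ..."""
    header, data = lines[0], lines[1:]
    rebuilt = []
    for n, line in enumerate(data, start=1):
        fields = line.split(",")
        fields[0] = str(n)  # the time data is assumed to be in the first column
        rebuilt.append(",".join(fields))
    return [header] + rebuilt
```

For example, data lines whose original time values are 10, 20, and 40 after processing are renumbered to 1, 2, and 3, while the other columns are left untouched.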

2.2.5.3. Force data processing

  csv_checker ........ -f

If the format check detects errors such as Error2, Error8, or undefined, the data usually cannot be processed until the format is corrected. This option forces data processing by ignoring the errors. In that case, the areas that contain errors are not modified and their data is kept as is.