Linux – Using ‘split’ command

by OpenLib · September 23, 2024

The split command in Unix and Linux is used to divide a large file into smaller files. It’s particularly useful when you need to handle a large file in chunks or split data for easier processing or transmission.

Here’s a detailed breakdown of the usage of the split command:

Basic Syntax:

Bash

split [OPTION]... [INPUT [PREFIX]]

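INPUT: The name of the file you want to split. If INPUT is omitted or given as -, split reads standard input.
PREFIX: The prefix for the names of the output files. The default prefix is x, so the output files are named xaa, xab, xac, etc. If a prefix is provided, the output files use it instead (e.g., fileaa, fileab, etc.).
With no options, split writes 1000 lines to each output file.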
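As a rough illustration of the defaults, the sketch below runs split with no options and no prefix. The file name sample.txt and the workdir variable are made up for this example, and the commands run in a temporary directory so nothing else is touched.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# sample.txt is a hypothetical 2500-line input, one number per line.
seq 1 2500 > sample.txt

# No options and no prefix: 1000 lines per file, default prefix "x".
split sample.txt

# xaa and xab get 1000 lines each; xac gets the remaining 500.
wc -l xaa xab xac
```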

Commonly Used Options:

1. Split by Size:
To split a file based on size, use the following options:

Bash

split -b SIZE INPUT PREFIX

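SIZE: Size of each chunk. A plain number means bytes. In GNU split, the suffixes K, M, and G multiply by powers of 1024 (10M is 10 * 1024 * 1024 bytes), while KB, MB, and GB multiply by powers of 1000.
- Example: split -b 10M largefile part_
- This splits largefile into chunks of 10 MiB each with names like part_aa, part_ab, etc. The last chunk holds whatever remains and is usually smaller.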
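To check the chunk sizes, here is a small sketch that assumes GNU split. It builds a 25 MiB test file of zero bytes (the content is arbitrary, and workdir is a made-up name) and splits it into 10M pieces.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# largefile here is a 25 MiB test file (25 * 1024 * 1024 = 26214400 bytes).
head -c 26214400 /dev/zero > largefile

# 10M means 10 * 1024 * 1024 = 10485760 bytes per chunk.
split -b 10M largefile part_

# part_aa and part_ab are 10485760 bytes each; part_ac holds the last 5242880.
wc -c part_aa part_ab part_ac
```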

2. Split by Number of Lines:
To split a file based on the number of lines in each chunk:

Bash

split -l NUMBER INPUT PREFIX

NUMBER: The number of lines each output file should have.
- Example: split -l 1000 data.txt output_
- This splits data.txt into files of 1000 lines each with names like output_aa, output_ab, etc. The last file contains the remaining lines and may be shorter.
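A quick way to confirm where the line boundaries fall is to look at the first line of each chunk. This sketch fills data.txt with the numbers 1 to 2500 (an assumption made only so the boundaries are easy to see).

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# data.txt here holds the numbers 1..2500, one per line.
seq 1 2500 > data.txt

split -l 1000 data.txt output_

# The second chunk starts where the first one stopped: line 1001.
head -n 1 output_ab

# The last chunk holds only the 500 leftover lines.
wc -l < output_ac
```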

3. Split by Number of Files:
To split a file into a specific number of output files:

Bash

split -n NUMBER INPUT PREFIX

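NUMBER: The number of chunks (files) to create. This option is a GNU extension.
- Example: split -n 5 bigfile part_
- This splits bigfile into 5 parts of nearly equal size in bytes, named part_aa through part_ae. Because the cut points are byte offsets, a text line can be cut in two; use -n l/5 to keep lines whole.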
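The following sketch, which assumes GNU split, shows both forms. The 5000-byte size is chosen so it divides evenly by 5, and lines.txt and the line_ prefix are names invented for the example.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# bigfile here is 5000 bytes, so 5 parts come out at 1000 bytes each.
head -c 5000 /dev/zero > bigfile
split -n 5 bigfile part_
wc -c part_a?

# For text, l/5 also produces 5 files but never cuts inside a line.
seq 1 1000 > lines.txt
split -n l/5 lines.txt line_

# Concatenating the 5 pieces gives back the original text.
cat line_a? | cmp - lines.txt && echo "line_ parts rejoin cleanly"
```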

4. Split with Numeric Suffix:
By default, split uses alphabetical suffixes (xaa, xab, etc.). To use numeric suffixes instead:

Bash

split --numeric-suffixes=1 INPUT PREFIX

Example: split --numeric-suffixes=1 largefile part_
This creates files with names like part_01, part_02, etc. The =1 sets the first number; plain --numeric-suffixes (or its short form -d) starts at part_00.
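Here is a small sketch of the two numbering styles, assuming GNU split. The 30-line input and the zero_ prefix are made up for the example.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# largefile here is a 30-line test input.
seq 1 30 > largefile

# Numbering starts at 1: part_01, part_02, part_03.
split -l 10 --numeric-suffixes=1 largefile part_

# Short form -d starts at 0: zero_00, zero_01, zero_02.
split -l 10 -d largefile zero_

ls
```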

5. Custom Suffix Length:
To specify the length of the suffix (the default is 2 characters):

Bash

split -a LENGTH INPUT PREFIX

Example: split -a 3 largefile part_
This would create files with names like part_aaa, part_aab, etc., where the suffix is 3 characters long.
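The suffix length also applies to numeric suffixes. The sketch below uses a 30-line test input and a made-up num_ prefix to show both cases.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# largefile here is a 30-line test input.
seq 1 30 > largefile

# Three-letter suffixes: part_aaa, part_aab, part_aac.
split -l 10 -a 3 largefile part_

# Four-digit numeric suffixes: num_0000, num_0001, num_0002.
split -l 10 -a 4 -d largefile num_

ls
```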

6. Verbose Output:
To see which files are being created during the split operation:

Bash

split --verbose INPUT PREFIX

This option will print the name of each output file as it is being created.
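The messages can be captured if you want a record of what was written. This sketch assumes GNU split and saves them in a file called split.log (a name chosen for the example); the exact quoting in each message can differ between versions, so it is not reproduced here.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# largefile here is a 30-line test input.
seq 1 30 > largefile

# One message per output file is captured in split.log.
split --verbose -l 10 largefile part_ > split.log 2>&1

# Three chunks were written, so the log has three lines.
cat split.log
```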

7. Split by Size Without Breaking Lines:
If you want size-limited chunks that contain only complete lines:

Bash

split -C SIZE INPUT PREFIX

This puts as many complete lines as fit into each output file without exceeding SIZE bytes. A single line longer than SIZE is still split across files.
Example: split -C 1M largefile part_ ensures each file is no larger than 1 MiB and, as long as no line exceeds that size, never breaks a line.
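The difference between -C and -b is easiest to see on a tiny input. In this sketch every line is 4 bytes long including its newline, and the lines_ and bytes_ prefixes are names invented for the comparison.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Five 4-byte lines, 20 bytes in total.
printf 'aaa\nbbb\nccc\nddd\neee\n' > largefile

# -C 10: only two whole lines (8 bytes) fit under the 10-byte limit.
split -C 10 largefile lines_
wc -c lines_aa lines_ab lines_ac

# -b 10: exactly 10 bytes per chunk, so bytes_aa ends in the middle of "ccc".
split -b 10 largefile bytes_
cat bytes_aa
```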

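8. Round Robin Split:
If you want to split a file by distributing lines in a round-robin manner across several output files:

Bash

split --number=r/N INPUT PREFIX

N specifies how many files you want to split into, and r selects round-robin distribution: line 1 goes to the first file, line 2 to the second, and so on, wrapping around after file N. The similar form l/N also creates N files without breaking lines, but it keeps consecutive lines together in each file.
Example: split --number=r/3 inputfile part_ distributes the lines of inputfile across 3 files in a round-robin fashion.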
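In GNU split, round-robin distribution is selected with r/N, while l/N keeps consecutive lines together. A minimal sketch with a nine-line test input makes the r/N pattern visible:

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# inputfile here holds the numbers 1..9, one per line.
seq 1 9 > inputfile

# r/3 deals lines out like cards: 1 -> part_aa, 2 -> part_ab, 3 -> part_ac, 4 -> part_aa, ...
split -n r/3 inputfile part_

# Print each part on one line: "1 4 7", "2 5 8", "3 6 9".
for f in part_aa part_ab part_ac; do
    paste -s -d ' ' "$f"
done
```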

Examples:

Split a file into chunks of 1 MB each:

Bash

split -b 1M largefile part_

This splits largefile into chunks of 1 MB each, with file names starting from part_aa, part_ab, and so on.

Split a file into files with 500 lines each:

Bash

split -l 500 inputfile segment_

This splits inputfile into multiple files, each containing 500 lines (the last one may have fewer), with names like segment_aa, segment_ab, etc.

Split a file into 4 equal parts:

Bash

split -n 4 inputfile chunk_

This splits inputfile into 4 files of nearly equal size in bytes, named chunk_aa through chunk_ad.

Handling Binary Files:

Binary files have no meaningful line structure, so split them with the -b option, which counts bytes and ignores newlines. The chunks are exact byte ranges of the original, so nothing is altered or lost. Example:

Bash

split -b 512k binaryfile binpart_

This splits binaryfile into chunks of 512 KiB (524,288 bytes) each.
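One way to convince yourself that nothing was corrupted is to compare checksums. The sketch below assumes GNU split, uses random bytes as a stand-in for a real binary file, and relies on the standard cksum utility.

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# binaryfile here is 1.25 MiB (1310720 bytes) of random data.
head -c 1310720 /dev/urandom > binaryfile

split -b 512k binaryfile binpart_

# Two full 524288-byte chunks plus a 262144-byte remainder.
wc -c binpart_aa binpart_ab binpart_ac

# The checksum of the chunks read back to back equals that of the original.
original=$(cksum < binaryfile)
rejoined=$(cat binpart_* | cksum)
[ "$original" = "$rejoined" ] && echo "checksums match"
```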

Recombining Split Files:

To reassemble the split files back into one, use the cat command:

Bash

cat part_* > combined_file

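The shell expands part_* in sorted order, which matches the order of the suffixes split generated, so the pieces are concatenated in their original sequence and the result is identical to the original file. Give the combined file a name that does not match the pattern part_*, otherwise it would be picked up by the wildcard itself.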
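Before deleting the pieces, it is worth checking that the rebuilt file really matches. A minimal sketch, using original.txt as a stand-in name for the file that was split:

```shell
# Work in a throwaway directory (workdir is a hypothetical name).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# original.txt here is a 2500-line test input.
seq 1 2500 > original.txt
split -l 1000 original.txt part_

# Rebuild the file from its pieces.
cat part_* > combined_file

# Remove the pieces only after cmp confirms a byte-for-byte match.
if cmp -s original.txt combined_file; then
    echo "combined_file matches original.txt"
    rm part_*
fi
```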

Conclusion:

The split command is a powerful tool for dividing files into smaller pieces based on size, line count, or number of output files. It offers flexibility with custom prefixes, suffix lengths, and verbose output for ease of use. It’s commonly used when managing large datasets, splitting logs, or distributing files across systems.
