Excel files are ubiquitous in data workflows, but working with them in Linux environments often requires conversion to more universal formats like CSV.
This is especially important when dealing with large datasets or when working on headless servers where graphical applications aren’t available.
In this comprehensive guide, I’ll explore various methods for converting XLSX files to CSV format using command-line tools in Linux. Whether you’re managing a few spreadsheets or processing millions of rows, you’ll find effective solutions that match your specific needs.
Why Convert Excel Files to CSV?
Before diving into the methods, let’s understand why converting Excel files to CSV (Comma-Separated Values) format is beneficial:
- Universal compatibility: CSV files can be processed by virtually any data tool
- Text-based format: Easy to manipulate with standard Unix tools like grep, awk, and sed
- Plain data only: CSV holds just the cell values, with no styles, formulas, or macros (XLSX is zip-compressed, so the CSV is not necessarily smaller on disk)
- Automation-friendly: Perfect for inclusion in data processing pipelines
- Headless environment support: No GUI required for processing
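As a small illustration of the text-based point above, here is a sketch that runs standard tools against a made-up three-row CSV. The file contents and variable names are hypothetical, and the awk commands assume that no field contains a quoted comma; splitting on commas is not a full CSV parser.

```shell
# Build a tiny sample CSV (hypothetical data) in a temporary file.
sample_csv=$(mktemp)
cat > "$sample_csv" <<'EOF'
id,name,amount
1,alice,10
2,bob,25
3,carol,40
EOF

# Count the data rows, skipping the header line.
row_count=$(tail -n +2 "$sample_csv" | wc -l | tr -d ' ')

# Sum the third column with awk, again skipping the header.
total=$(awk -F',' 'NR > 1 { sum += $3 } END { print sum }' "$sample_csv")

# Select the rows whose second field is exactly "bob".
matches=$(awk -F',' '$2 == "bob"' "$sample_csv")

echo "rows=$row_count total=$total"
echo "$matches"

rm -f "$sample_csv"
```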
Method 1: Gnumeric’s ssconvert – The Fast and Reliable Solution
Gnumeric is a powerful spreadsheet application that includes a command-line utility called ssconvert, which efficiently converts between various spreadsheet formats.
Installation
On Debian-based systems (including Ubuntu):
apt-get install gnumeric
For minimal installation on headless servers:
apt-get install --no-install-recommends gnumeric
On macOS with Homebrew:
brew install gnumeric
Basic Usage
Converting a single file is straightforward:
ssconvert Book1.xlsx newfile.csv
The command will generate output similar to:
Using exporter Gnumeric_stf:stf_csv
Advanced Options
To export multiple sheets to separate CSV files:
ssconvert -S Book1.xlsx output.csv
This will create files like output.csv.0, output.csv.1, etc., one for each sheet.
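If the numeric suffix is inconvenient because the files no longer end in .csv, a small helper can rename them. The sketch below assumes the naming shown above (output.csv.0, output.csv.1, and so on); the function name rename_sheet_files is hypothetical, and it is worth checking what your ssconvert version actually produces before relying on it.

```shell
# rename_sheet_files BASE
# Renames BASE.<suffix> files (e.g. output.csv.0) to <stem>_<suffix>.csv
# (e.g. output_0.csv). Hypothetical helper; assumes BASE ends in .csv.
rename_sheet_files() {
  base=$1
  stem=${base%.csv}
  for f in "$base".*; do
    # When nothing matches, the unexpanded pattern is left behind; skip it.
    [ -e "$f" ] || continue
    suffix=${f#"$base".}
    mv -- "$f" "${stem}_${suffix}.csv"
  done
}
```

It would be called right after the conversion, for example: ssconvert -S Book1.xlsx output.csv && rename_sheet_files output.csv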
For custom separators (here a semicolon; fd://1 sends the result to standard output):
ssconvert -O 'separator=;' -T Gnumeric_stf:stf_assistant file.xlsx fd://1
For batch processing multiple files:
for f in *.xlsx; do ssconvert "$f" "${f%.xlsx}.csv"; done
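For unattended runs, a slightly more defensive version of this loop may be useful: it skips cleanly when no .xlsx files are present and reports which conversions failed. This is only a sketch. The function name convert_all and the idea of passing the converter as a second argument are assumptions made here so the loop can be tried with any command that takes an input path and an output path.

```shell
# convert_all DIR [CONVERTER]
# Runs CONVERTER <input.xlsx> <output.csv> for every .xlsx file in DIR.
# CONVERTER defaults to ssconvert; making it a parameter is an assumption
# of this sketch, not something the tools themselves require.
convert_all() {
  dir=$1
  converter=${2:-ssconvert}
  failed=0
  for f in "$dir"/*.xlsx; do
    # When nothing matches, the unexpanded pattern is left behind; skip it.
    [ -e "$f" ] || continue
    out=${f%.xlsx}.csv
    if "$converter" "$f" "$out"; then
      echo "converted: $f"
    else
      echo "failed: $f" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every conversion succeeded.
  [ "$failed" -eq 0 ]
}
```

Calling convert_all . converts the current directory with ssconvert, and the exit status can be checked by a calling script or cron wrapper.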
Method 2: LibreOffice – The Full-Featured Approach
LibreOffice is a comprehensive office suite that offers powerful command-line conversion capabilities.
Basic Usage
libreoffice --headless --convert-to csv filename.xlsx --outdir output_directory
The --headless flag ensures LibreOffice runs without launching a GUI, making it suitable for server environments.
Batch Processing
To convert all Excel files in a directory:
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done
Important Notes
- Close all running LibreOffice instances before executing these commands, or they may fail silently
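- In some cases the command only succeeds with sudo, which usually points to a permissions problem with the LibreOffice profile directory or the output directory; fixing those permissions is preferable to running as root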
- On macOS, you’ll need to use the full path:
/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to csv filename.xlsx
For UTF-8 encoding with preserved non-ASCII characters:
libreoffice --headless --convert-to "csv:Text - txt - csv (StarCalc):44,34,76,1,1/1" filename.xlsx
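The numbers after the filter name are filter options. As the LibreOffice filter options documentation describes them, the first three tokens are the field separator and the text delimiter as ASCII codes (44 is a comma, 34 a double quote) and the character set (76 is UTF-8). The sketch below builds the same string for a different separator; the function name lo_csv_filter is hypothetical, and the remaining tokens are copied unchanged from the command above.

```shell
# lo_csv_filter SEPARATOR
# Prints a --convert-to argument for LibreOffice that uses SEPARATOR
# (a single character) as the field separator. Hypothetical helper.
lo_csv_filter() {
  # printf with a leading quote yields the numeric code of the character.
  sep_code=$(printf '%d' "'$1")
  printf 'csv:Text - txt - csv (StarCalc):%s,34,76,1,1/1\n' "$sep_code"
}
```

A semicolon-separated export would then look like: libreoffice --headless --convert-to "$(lo_csv_filter ';')" filename.xlsx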
Method 3: xlsx2csv – The Lightweight Python Solution
For those who prefer a minimal approach, especially on headless servers where installing desktop applications might be impractical, Python’s xlsx2csv offers an excellent alternative.
Installation
Using pip:
pip install xlsx2csv
On Debian/Ubuntu:
apt-get install xlsx2csv
Basic Usage
xlsx2csv file.xlsx > output.csv
Working with Multiple Sheets
Export all sheets to a single file:
xlsx2csv file.xlsx --all > all.csv
Export a specific sheet (e.g., the second sheet):
xlsx2csv file.xlsx -s 2 > sheet2.csv
Export all sheets without the delimiter line that --all normally writes between sheets:
xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
Method 4: csvkit – The Data Processing Swiss Army Knife
csvkit is a suite of command-line tools for converting and manipulating CSV files, including excellent support for Excel files.
Installation
pip install csvkit
Or with Homebrew on macOS:
brew install csvkit
Basic Usage
in2csv data.xlsx > data.csv
csvkit also provides additional tools for data analysis, filtering, and manipulation, making it a great choice if you need to perform additional operations beyond conversion.
Method 5: R for Advanced Conversion
For those comfortable with R or needing advanced data manipulation capabilities, this approach offers flexibility and power.
Setup
Create a small Bash wrapper for convenience:
xlsx2txt(){
echo '
require(xlsx)
write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=F, row.names=FALSE, col.names=T, sep="\t")
' | Rscript --vanilla - "$1" 2>/dev/null
}
Usage
xlsx2txt file.xlsx > file.txt
This approach leverages R’s robust data handling capabilities and can be extended to perform complex transformations during conversion.
Performance Considerations
When processing large files or millions of rows, performance becomes crucial. As a rough qualitative comparison (not a benchmark), the methods line up as follows:
- ssconvert (Gnumeric): Excellent performance, even with large files
- xlsx2csv: Good performance with minimal resource usage
- LibreOffice: Moderate performance but handles complex formatting well
- csvkit: Good performance for most use cases but may slow down with very large files
- R-based solution: Performance varies based on system memory but offers advanced data manipulation
Choosing the Right Tool
Each method has its strengths and ideal use cases:
- ssconvert: Best for simplicity and performance on systems where installing Gnumeric is not an issue
- xlsx2csv: Ideal for headless servers and environments where minimal dependencies are preferred
- LibreOffice: Best when dealing with complex Excel files that require preserving special formatting features
- csvkit: Excellent when conversion is part of a larger data processing workflow
- R solution: Preferable when statistical analysis or data transformation is needed alongside conversion
Troubleshooting Common Issues
Permission Denied Errors
When installing xlsx2csv or similar tools system-wide with pip, you might encounter permission errors like:
IOError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/…'
Solution: Install the package for your user only (pip install --user xlsx2csv) or use a virtual environment.
Silent Failures with LibreOffice
If LibreOffice commands seem to have no effect:
- Ensure no LibreOffice instances are running
- Check system logs for errors
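- Check that your user can write to the output directory and the LibreOffice profile directory; treat running with sudo as a last resort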
GConf Warnings with Gnumeric
When running ssconvert on headless servers, you might see warnings like:
GConf-WARNING **: Client failed to connect to the D-BUS daemon
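These warnings are harmless. To hide them, redirect the output, keeping in mind that this also hides genuine error messages, so check the exit status in scripts: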
ssconvert input.xlsx output.csv > /dev/null 2>&1
FAQ
Can these methods handle Excel files with multiple sheets?
Yes, most of them can. ssconvert has the -S flag, and xlsx2csv has the -s and --all options. The LibreOffice command shown above exports only a single sheet (the active one) by default.
How do I specify different delimiters instead of commas?
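With ssconvert, use the -O 'separator=;' option together with -T Gnumeric_stf:stf_assistant. With LibreOffice, change the first number in the filter options string, which is the ASCII code of the separator (44 is a comma, 59 a semicolon). With xlsx2csv, use the -d option followed by your preferred delimiter.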
Which method is best for extremely large Excel files?
For files with millions of rows, xlsx2csv or ssconvert generally perform best. LibreOffice might struggle with extremely large files.
Do these methods preserve formatting and formulas?
No, CSV is a plain text format that doesn’t support Excel formatting or formulas. Only the calculated values will be preserved in the conversion.
Can these methods be incorporated into automated pipelines?
Absolutely. All methods are command-line based and can be easily incorporated into shell scripts, cron jobs, or other automation workflows.
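In a script or cron job it can help to check which converter is installed before doing any real work. The following is a minimal sketch; the function name first_available is hypothetical, and the order of the candidates is just one possible preference.

```shell
# first_available CMD...
# Prints the first command name that exists on PATH and returns 0,
# or returns 1 if none of them is installed.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  return 1
}

# Example: pick a converter before a scheduled job starts processing files.
if tool=$(first_available ssconvert xlsx2csv in2csv libreoffice); then
  echo "using $tool"
else
  echo "no XLSX converter found" >&2
fi
```

Note that each tool takes different arguments, so the calling script still needs a separate command line per converter.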
Conclusion
Converting Excel files to CSV format in Linux doesn’t have to be challenging. The command-line tools discussed in this guide provide efficient solutions for various scenarios, from one-off conversions to processing millions of rows in production environments.