qsv: Blazing-fast Data-Wrangling toolkit
<div align="center">
&nbsp; | Table of Contents
:--------------------------|:-------------------------
qsv logo<br/>_Hi-ho "Quicksilver" away!_<br/><sub><sup>original logo details * Base AI-reimagined logo * Event logo archive</sup></sub><br/>|qsv is a command line program for querying, slicing,<br>sorting, analyzing, filtering, enriching, transforming,<br>validating, joining, formatting, converting, chatting,<br>FAIRifying & documenting tabular data (CSV, Excel, etc).<br>Commands are simple, composable & ___"blazing fast"___.<br><br>* Commands<br>* Installation Options<br>* Whirlwind Tour / Notebooks / Lessons & Exercises<br>* FAQ<br>* Performance Tuning<br>* 👉 Benchmarks 🚀<br>* Environment Variables<br>* Feature Flags<br>* Goals/Non-goals<br>* Testing<br>* NYC SOD 2022/csv,conf,v8/PyConUS 2025/csv,conf,v9<br>* "Have we achieved ACI?" blogpost series<br>* Sponsor
</div>
<div align="center">
Try it out at qsv.dathere.com! <!-- markdownlint-disable-line -->
</div>
| <a name="available-commands">Command</a> | Description |
| --- | --- |
| apply✨<br>📇🚀🧠🤖🔣👆| Apply a series of string, date, math & currency transformations to given CSV column/s. It also has some basic NLP functions (similarity, sentiment analysis, eudex, and profanity, language & name gender detection). |
| applydp✨<br>📇🚀🔣👆 CKAN| <a name="applydp_deeplink"></a>applydp is a slimmed-down version of apply with only Datapusher+ relevant subcommands/operations (qsvdp binary variant only). |
| behead | Drop headers from a CSV. |
| cat<br>🗄️ | Concatenate CSV files by row or by column. |
| clipboard✨<br>🖥️ | Provide input from the clipboard or save output to the clipboard. |
| color✨<br>🐻‍❄️🖥️ | Outputs tabular data as a pretty, colorized table that always fits into the terminal. Apart from CSV and its dialects, Arrow, Avro/IPC, Parquet, JSON array & JSONL formats are supported with the "polars" feature. |
| count<br>📇🏎️🐻‍❄️ | Count the rows and optionally compile record width statistics of a CSV file. (11.87 seconds for a 15 GB, 27M-row NYC 311 dataset without an index. Instantaneous with an index.) If the polars feature is enabled, uses Polars' multithreaded, mem-mapped CSV reader for fast counts even without an index. |
| datefmt<br>📇🚀👆 | Formats recognized date fields (19 formats recognized) to a specified date format using strftime date format specifiers. |
| dedup<br>🤯🚀👆 | Remove duplicate rows (See also extdedup, extsort, sort & sortcheck commands). |
| describegpt<br>🌐🤖🪄🗃️📚⛩️ CKAN | <a name="describegpt_deeplink"></a>Infer a "neuro-symbolic" Data Dictionary, Description & Tags or ask questions about a CSV with a configurable, Mini Jinja prompt file, using any OpenAI API-compatible LLM, including local LLMs via Ollama, Jan & LM Studio.<br>(e.g. Markdown, JSON, TOON, Everything, Spanish, Mandarin, Controlled Tags;<br>--prompt "What are the top 10 complaint types by community board & borough by year?" - deterministic, hallucination-free SQL RAG result; iterative, session-based SQL RAG refinement - refined SQL RAG result) |
| diff<br>🚀 | Find the difference between two CSVs with ludicrous speed!<br/>e.g. _compare two CSVs with 1M rows x 9 columns in under 600ms!_ |
| edit | Replace the value of a cell specified by its row and column. |
| enum<br>👆 | Add a new column enumerating rows with incremental or UUID identifiers. Can also be used to copy a column or fill a new column with a constant value. |
| excel<br>🚀 | Exports a specified Excel/ODS sheet to a CSV file. |
| exclude<br>📇👆 | Removes a set of CSV data from another set based on the specified columns. |
| explode<br>🔣👆 | Explode rows into multiple ones by splitting a column value based on the given separator. |
| extdedup<br>👆 | Remove duplicate rows from an arbitrarily large CSV/text file using a memory-mapped, on-disk hash table. Unlike the dedup command, this command does not load the entire file into memory nor does it sort the deduped file. |
| extsort<br>🚀📇👆 | Sort an arbitrarily large CSV/text file using a multithreaded external merge sort algorithm. |
| fetch✨<br>📇🧠🌐 | Send/Fetch data to/from web services for every row using **HTTP GET**. Comes with HTTP/2 adaptive flow control, jaq JSON query language support, dynamic throttling (RateLimit) & caching, with optional persistent caching using Redis or a disk-cache. |
| fetchpost✨<br>📇🧠🌐⛩️ | Similar to fetch, but uses **HTTP POST** instead of GET. Supports HTML form (application/x-www-form-urlencoded), JSON (application/json) and custom content types, with the ability to render payloads from CSV data using the Mini Jinja template engine. |
| fill<br>👆 | Fill empty values. |
| fixlengths | Force a CSV to have same-length records by either padding or truncating them. |
| flatten | A flattened view of CSV records. Useful for viewing one record at a time.<br />e.g. `qsv slice -i 5 data.csv \| qsv flatten`. |
| fmt | Reformat a CSV with different delimiters, record terminators or quoting rules. (Supports ASCII delimited data.) |
| foreach✨<br>📇 | Execute a shell command once per record in a given CSV file. |
| frequency<br>📇😣🏎️👆🪄 Luau | Build frequency distribution tables of each column. Uses multithreading to go faster if an index is present (Examples: CSV, JSON, TOON). |
| geocode✨<br>📇🧠🌐🚀🔣👆🌎 | Geocodes a location against an updatable local copy of the GeoNames cities & the MaxMind GeoLite2 databases. With caching and multi-threading, it geocodes up to 360,000 records/sec! |
| geoconvert✨<br>🌎 | Convert between various spatial formats and CSV/SVG including GeoJSON, SHP, and more. |
| headers<br>🗄️ | Show the headers of a CSV. Or show the intersection of all headers between many CSV files. |
| index | Create an index (📇) for a CSV. This is very quick (even the 15gb, 28m …