linux_china Technology Radar

Polars

rust, data
Trial

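Polars is a DataFrame library implemented in Rust that uses the Apache Arrow Columnar Format as its memory model. It can be used from Rust, Python, Node.js, R, and SQL, and it provides a powerful expression API, query optimization, multithreading, and SIMD support.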

Core features of Polars:

  • Fast: Written from scratch in Rust, designed close to the machine and without external dependencies.
  • I/O: First class support for all common data storage layers: local, cloud storage & databases.
  • Intuitive API: Write your queries the way they were intended. Polars, internally, will determine the most efficient way to execute using its query optimizer.
  • Out of Core: The streaming API allows you to process your results without requiring all your data to be in memory at the same time.
  • Parallel: Utilises the power of your machine by dividing the workload among the available CPU cores without any additional configuration.
  • Vectorized Query Engine: Using Apache Arrow, a columnar data format, to process your queries in a vectorized manner and SIMD to optimize CPU usage.

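Polars supports a range of file types, such as CSV, Excel, Parquet, JSON, and Arrow, and can also connect to various databases, which makes data processing, transformation, and analysis straightforward.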

Polars & Friends

  • pandas: a fast, powerful, flexible and easy to use open source data analysis and manipulation tool
  • Ibis: the portable Python dataframe library

References