avadim / fast-excel-reader
轻量级且非常快速的PHP XLSX电子表格阅读器
Requires
- php: >=7.4
- ext-ctype: *
- ext-mbstring: *
- ext-xmlreader: *
- ext-zip: *
- avadim/fast-excel-helper: ^1.1
Requires (Dev)
- phpunit/phpunit: ^9.0
- dev-main
- v2.19.0
- v2.18.2
- v2.18.1
- v2.18.0
- v2.17.1
- v2.17.0
- v2.16.0
- v2.15.5
- v2.15.4
- v2.15.3
- v2.15.2
- v2.15.1
- v2.15.0
- v2.14.2
- v2.14.1
- v2.14.0
- v2.12.0
- v2.11.0
- v2.10.0
- v2.9.2
- v2.9.1
- v2.9.0
- v2.8.4
- v2.8.3
- v2.8.2
- v2.8.1
- v2.8.0
- v2.7.0
- v2.6.0
- v2.5.0
- v2.4.2
- v2.4.1
- v2.4.0
- v2.3.2
- v2.3.1
- v2.3.0
- v2.2.5
- v2.2.4
- v2.2.3
- v2.2.0
- v2.1.2
- v2.1.1
- v2.1.0
- v2.0.2
- v2.0.1
- v2.0.0
- v1.4.2
- v1.4.1
- v1.4.0
- v1.3.4
- v1.3.3
- v1.3.2
- v1.3.1
- dev-dev
This package is auto-updated.
Last update: 2024-09-22 19:37:56 UTC
README
FastExcelReader
FastExcelReader 是 FastExcelPhp 项目的一部分,该项目包括
- FastExcelWriter - 创建 Excel 电子表格
- FastExcelReader - 阅读Excel电子表格
- FastExcelTemplator - 从 XLSX 模板生成 Excel 电子表格
- FastExcelLaravel - 专门的 Laravel 版本
简介
这个库设计得轻量级,超级快,并且占用内存最少。
FastExcelReader 可以读取XLSX格式的Excel兼容电子表格(Office 2007+)。它只读取数据,但读取速度非常快,并且占用内存最少。
特性
- 仅支持XLSX格式(Office 2007+)的多个工作表
- 支持自动检测货币/数值/日期类型
- 支持自动格式化和自定义日期值的格式化器
- 库可以定义和从XLSX文件中提取图像
- 库可以读取单元格的样式选项 - 格式化模式、颜色、边框、字体等。
安装
使用 composer
将 FastExcelReader 安装到您的项目中
composer require avadim/fast-excel-reader
跳转到
用法
您可以在 /demo 文件夹中找到更多示例
简单示例
use \avadim\FastExcelReader\Excel; $file = __DIR__ . '/files/demo-00-simple.xlsx'; // Open XLSX-file $excel = Excel::open($file); // Read all values as a flat array from current sheet $result = $excel->readCells();
您将得到这个数组
Array
(
[A1] => 'col1'
[B1] => 'col2'
[A2] => 111
[B2] => 'aaa'
[A3] => 222
[B3] => 'bbb'
)
// Read all rows in two-dimensional array (ROW x COL) $result = $excel->readRows();
您将得到这个数组
Array
(
[1] => Array
(
['A'] => 'col1'
['B'] => 'col2'
)
[2] => Array
(
['A'] => 111
['B'] => 'aaa'
)
[3] => Array
(
['A'] => 222
['B'] => 'bbb'
)
)
// Read all columns in two-dimensional array (COL x ROW) $result = $excel->readColumns();
您将得到这个数组
Array
(
[A] => Array
(
[1] => 'col1'
[2] => 111
[3] => 222
)
[B] => Array
(
[1] => 'col2'
[2] => 'aaa'
[3] => 'bbb'
)
)
循环中逐行读取值
$sheet = $excel->sheet(); foreach ($sheet->nextRow() as $rowNum => $rowData) { // $rowData is array ['A' => ..., 'B' => ...] $addr = 'C' . $rowNum; if ($sheet->hasImage($addr)) { $sheet->saveImageTo($addr, $fullDirectoryPath); } // handling of $rowData here // ... } // OR foreach ($sheet->nextRow() as $rowNum => $rowData) { // handling of $rowData here // ... // get image list from current row $imageList = $sheet->getImageListByRow(); foreach ($imageList as $imageInfo) { $imageBlob = $sheet->getImageBlob($imageInfo['address']); } } // OR foreach ($sheet->nextRow(['A' => 'One', 'B' => 'Two'], Excel::KEYS_FIRST_ROW) as $rowNum => $rowData) { // $rowData is array ['One' => ..., 'Two' => ...] // ... }
逐行读取的另一种方式
// Init internal read generator $sheet->reset(['A' => 'One', 'B' => 'Two'], Excel::KEYS_FIRST_ROW); // read the first row $rowData = $sheet->readNextRow(); var_dump($rowData); // read the next 3 rows for ($i = 0; $i < 3; $i++) { $rowData = $sheet->readNextRow(); var_dump($rowData); } // Reset internal generator and read all rows $sheet->reset(['A' => 'One', 'B' => 'Two'], Excel::KEYS_FIRST_ROW); $result = []; while ($rowData = $sheet->readNextRow()) { $result[] = $rowData; } var_dump($result);
结果数组中的键
// Read rows and use the first row as column keys $result = $excel->readRows(true);
您将得到这个结果
Array
(
[2] => Array
(
['col1'] => 111
['col2'] => 'aaa'
)
[3] => Array
(
['col1'] => 222
['col2'] => 'bbb'
)
)
可选的第二个参数指定结果数组的键
// Rows and cols start from zero $result = $excel->readRows(false, Excel::KEYS_ZERO_BASED);
您将得到这个结果
Array
(
[0] => Array
(
[0] => 'col1'
[1] => 'col2'
)
[1] => Array
(
[0] => 111
[1] => 'aaa'
)
[2] => Array
(
[0] => 222
[1] => 'bbb'
)
)
结果模式的允许值
可以与结果模式结合使用的附加选项
例如
$result = $excel->readRows(['A' => 'bee', 'B' => 'honey'], Excel::KEYS_FIRST_ROW | Excel::KEYS_ROW_ZERO_BASED);
您将得到这个结果
Array
(
[0] => Array
(
[bee] => 111
[honey] => 'aaa'
)
[1] => Array
(
[bee] => 222
[honey] => 'bbb'
)
)
高级示例
use \avadim\FastExcelReader\Excel; $file = __DIR__ . '/files/demo-02-advanced.xlsx'; $excel = Excel::open($file); $result = [ 'sheets' => $excel->getSheetNames() // get all sheet names ]; $result['#1'] = $excel // select sheet by name ->selectSheet('Demo1') // select area with data where the first row contains column keys ->setReadArea('B4:D11', true) // set date format ->setDateFormat('Y-m-d') // set key for column 'C' to 'Birthday' ->readRows(['C' => 'Birthday']); // read other arrays with custom column keys // and in this case we define range by columns only $columnKeys = ['B' => 'year', 'C' => 'value1', 'D' => 'value2']; $result['#2'] = $excel ->selectSheet('Demo2', 'B:D') ->readRows($columnKeys); $result['#3'] = $excel ->setReadArea('F5:H13') ->readRows($columnKeys);
您可以通过在工作簿中定义的名称设置读取区域。例如,如果工作簿中定义了名为 Headers 的名称,其范围为 Demo1!$B$4:$D$4,那么您可以通过此名称读取单元格
$excel->setReadArea('Values'); $cells = $excel->readCells();
注意,由于该值包含工作表名称,因此该工作表变为默认工作表。
您可以在工作表中设置读取区域
$sheet = $excel->getSheet('Demo1')->setReadArea('Headers'); $cells = $sheet->readCells();
但是,如果您试图在另一个工作表上使用此名称,您将得到一个错误
$sheet = $excel->getSheet('Demo2')->setReadArea('Headers'); // Exception: Wrong address or range "Values"
如果需要,您可以使用带回调函数的 readSheetCallback()
方法完全控制读取过程
use \avadim\FastExcelReader\Excel; $excel = Excel::open($file); $result = []; $excel->readCallback(function ($row, $col, $val) use(&$result) { // Any manipulation here $result[$row][$col] = (string)$val; // if the function returns true then data reading is interrupted return false; }); var_dump($result);
日期格式化器
默认情况下,所有日期时间值都返回为时间戳。但您可以使用 dateFormatter()
改变此行为
$excel = Excel::open($file); $sheet = $excel->sheet()->setReadArea('B5:D7'); $cells = $sheet->readCells(); echo $cells['C5']; // -2205187200 // If argument TRUE is passed, then all dates will be formatted as specified in cell styles // IMPORTANT! The datetime format depends on the locale $excel->dateFormatter(true); $cells = $sheet->readCells(); echo $cells['C5']; // '14.02.1900' // You can specify date format pattern $excel->dateFormatter('Y-m-d'); $cells = $sheet->readCells(); echo $cells['C5']; // '1900-02-14' // set date formatter function $excel->dateFormatter(fn($value) => gmdate('m/d/Y', $value)); $cells = $sheet->readCells(); echo $cells['C5']; // '02/14/1900' // returns DateTime instance $excel->dateFormatter(fn($value) => (new \DateTime())->setTimestamp($value)); $cells = $sheet->readCells(); echo get_class($cells['C5']); // 'DateTime' // custom manipulations with datetime values $excel->dateFormatter(function($value, $format, $styleIdx) use($excel) { // get Excel format of the cell, e.g. '[$-F400]h:mm:ss\ AM/PM' $excelFormat = $excel->getFormatPattern($styleIdx); // get format converted for use in php functions date(), gmdate(), etc // for example the Excel pattern above would be converted to 'g:i:s A' $phpFormat = $excel->getDateFormatPattern($styleIdx); // and if you need you can get value of numFmtId for this cell $style = $excel->getCompleteStyleByIdx($styleIdx, true); $numFmtId = $style['format-num-id']; // do something and write to $result $result = gmdate($phpFormat, $value); return $result; });
有时,如果单元格的格式指定为日期但不含日期,则库可能错误地解释此值。为了避免这种情况,您可以禁用日期格式化
在此,单元格B1包含字符串 "3.2",单元格B2包含日期 2024-02-03,但两个单元格都设置为日期格式
$excel = Excel::open($file); // default mode $cells = $sheet->readCells(); echo $cell['B1']; // -2208798720 - the library tries to interpret the number 3.2 as a timestamp echo $cell['B2']; // 1706918400 - timestamp of 2024-02-03 // date formatter is on $excel->dateFormatter(true); $cells = $sheet->readCells(); echo $cell['B1']; // '03.01.1900' echo $cell['B2']; // '3.2' // date formatter is off $excel->dateFormatter(false); $cells = $sheet->readCells(); echo $cell['B1']; // '3.2' echo $cell['B2']; // 1706918400 - timestamp of 2024-02-03
图像函数
// Returns count images on all sheets $excel->countImages(); // Returns count images on sheet $sheet->countImages(); // Returns image list of sheet $sheet->getImageList(); // Returns image list of specified row $sheet->getImageListByRow($rowNumber); // Returns TRUE if the specified cell has an image $sheet->hasImage($cellAddress); // Returns mime type of image in the specified cell (or NULL) $sheet->getImageMimeType($cellAddress); // Returns inner name of image in the specified cell (or NULL) $sheet->getImageName($cellAddress); // Returns an image from the cell as a blob (if exists) or NULL $sheet->getImageBlob($cellAddress); // Writes an image from the cell to the specified filename $sheet->saveImage($cellAddress, $fullFilenamePath); // Writes an image from the cell to the specified directory $sheet->saveImageTo($cellAddress, $fullDirectoryPath);
单元格值类型
库尝试确定单元格值的类型,并且在大多数情况下它做得正确。因此,您得到数字或字符串值。日期值默认返回为时间戳。但您可以通过设置日期格式(参见 date() PHP 函数的格式选项)来改变此行为。
$excel = Excel::open($file); $result = $excel->readCells(); print_r($result);
上面的示例将输出
Array
(
[B2] => -2205187200
[B3] => 6614697600
[B4] => -6845212800
)
$excel = Excel::open($file); $excel->setDateFormat('Y-m-d'); $result = $excel->readCells(); print_r($result);
上面的示例将输出
Array
(
[B2] => '1900-02-14'
[B3] => '2179-08-12'
[B4] => '1753-01-31'
)
如何获取单元格样式的完整信息
通常读取函数只返回单元格值,但您也可以以样式读取值。在这种情况下,对于每个单元格,返回的将不是一个标量值,而是一个数组,如 ['v' => 标量值, 's' => 样式数组, 'f' => 公式]
$excel = Excel::open($file); $sheet = $excel->sheet(); $rows = $sheet->readRowsWithStyles(); $columns = $sheet->readColumnsWithStyles(); $cells = $sheet->readCellsWithStyles(); $cells = $sheet->readCellsWithStyles();
或者您可以选择只读取样式(不包含值)
$cells = $sheet->readCellStyles(); /* array ( 'format' => array ( 'format-num-id' => 0, 'format-pattern' => 'General', ), 'font' => array ( 'font-size' => '10', 'font-name' => 'Arial', 'font-family' => '2', 'font-charset' => '1', ), 'fill' => array ( 'fill-pattern' => 'solid', 'fill-color' => '#9FC63C', ), 'border' => array ( 'border-left-style' => NULL, 'border-right-style' => NULL, 'border-top-style' => NULL, 'border-bottom-style' => NULL, 'border-diagonal-style' => NULL, ), ) */ $cells = $sheet->readCellStyles(true); /* array ( 'format-num-id' => 0, 'format-pattern' => 'General', 'font-size' => '10', 'font-name' => 'Arial', 'font-family' => '2', 'font-charset' => '1', 'fill-pattern' => 'solid', 'fill-color' => '#9FC63C', 'border-left-style' => NULL, 'border-right-style' => NULL, 'border-top-style' => NULL, 'border-bottom-style' => NULL, 'border-diagonal-style' => NULL, ) */
但我们不推荐在大型文件中使用这些方法
一些有用的方法
Excel对象
getSheetNames()
-- 返回所有工作表名称数组sheet(?string $name = null)
-- 返回默认或指定的工作表getSheet(string $name, ?string $areaRange = null, ?bool $firstRowKeys = false)
-- 通过名称获取工作表getSheetById(int $sheetId, ?string $areaRange = null, ?bool $firstRowKeys = false)
-- 通过ID获取工作表getFirstSheet(?string $areaRange = null, ?bool $firstRowKeys = false)
-- 获取第一个工作表selectSheet(string $name, ?string $areaRange = null, ?bool $firstRowKeys = false)
-- 通过名称选择默认工作表并返回它selectSheetById(int $sheetId, ?string $areaRange = null, ?bool $firstRowKeys = false)
-- 通过ID选择默认工作表并返回它selectFirstSheet(?string $areaRange = null, ?bool $firstRowKeys = false)
-- 将第一个工作表设置为默认并返回它getDefinedNames()
-- 返回工作簿中定义的名称
工作表对象
name()
-- 返回字符串名称isActive()
-- 活跃的工作表isHidden()
-- 如果工作表被隐藏isVisible()
-- 如果工作表可见state()
-- 返回工作表字符串状态(用于isHidden()
和isVisible()
)dimension()
-- 返回工作表属性中的默认工作区域尺寸countRows()
-- 从尺寸计数行countColumns()
-- 从尺寸计数列firstRow()
-- 第一行编号firstCol()
-- 第一列字母readFirstRow()
-- 以数组形式返回第一行单元格的值readFirstRowWithStyles()
-- 以数组形式返回第一行单元格的值和样式
您想支持FastExcelReader吗?
如果您觉得这个包很有用,您可以在GitHub上给我一个星。
或者您可以捐款给我 :)
- USDT (TRC20) TSsUFvJehQBJCKeYgNNR1cpswY6JZnbZK7
- USDT (ERC20) 0x5244519D65035aF868a010C2f68a086F473FC82b
- ETH 0x5244519D65035aF868a010C2f68a086F473FC82b