andrew-svirin / replace-files-duplicates-php
PHP 库用于扫描文件重复项,并通过软链接或硬链接进行替换。
1.0.0
2019-09-26 15:41 UTC
Requires
- php: ^7.2
Requires (Dev)
- phpunit/phpunit: ^6.5
This package is auto-updated.
Last update: 2024-08-27 03:12:58 UTC
README
通过链接替换文件重复项。
概述
面向脚本的扫描目录中相等文件的工具,将最新文件通过硬链接或软链接替换旧文件。硬链接允许在不影响链接实例的情况下删除父文件,但文件或链接实例的修改将对两者产生影响。通过删除副本来减小存储大小的有用工具。
用法
定义缓存和服务的存储。
$scanStorage = new UnixScanStorage([__DIR_PATH_FOR_SCAN__]); $cacheStorage = new FileIndexStorage(__DIR_PATH_FOR_CACHE_STORAGE__); $replacementService = new ReplacementService($scanStorage, $cacheStorage);
扫描目录以构建索引。
$replacementService->scan(function (Record $file) { // Hash consists from concatenation file size + first byte + last byte. $fp = fopen($file->path, 'r'); fseek($fp, 10); $firstChar = fgetc($fp); fseek($fp, -10, SEEK_END); $lastChar = fgetc($fp); $fileSize = filesize($file->path); $hash = $fileSize . ord($firstChar) . ord($lastChar); return $hash; }, function (string $hashA = null, string $hashB = null) { // Compare hashes. $result = strnatcmp($hashA, $hashB); return $result; }, function (Record $file) { // Filter only txt files for scan. $ext = pathinfo($file->path, PATHINFO_EXTENSION); return in_array($ext, ['txt']); });
扫描后查找重复项并通过硬链接进行替换。
$duplicatesGen = $replacementService->findDuplicates(); while (($records = $duplicatesGen->current())) { $replacementService->replaceDuplicatesHard($records); $duplicatesGen->next(); }
计算重复项的字节大小。
$duplicatesGen = $replacementService->findDuplicates(); $duplicateSize = 0; $linkBlock = 1; while (($records = $duplicatesGen->current())) { /* @var $record Record */ $record = reset($records); $stat = stat($record->path); if (0 < $stat['blocks']) { $duplicateSize += ($stat['blocks'] * 512) - $stat['blksize']; } $duplicatesGen->next(); }