#1856 Memory leak in DirectoryListing / PHP Iterator?

Yinci opened 8 months ago

Bug Report

| Q | A |
| --- | --- |
| Flysystem Version | 3.29.1 |
| Adapter Name | flysystem-aws-s3-v3 |
| Adapter version | 3.29.0 |
| AWS SDK | 3.339.10 |
| Laravel Framework | 10.48.28 |
| Spatie Laravel Media Library | 10.15.0 |
| PHP | 8.1 |

Summary

In short, we have a long-running Laravel project with a lot of media (60k+ rows and counting). These media undergo conversions via the Media Library package from Spatie. All media is stored in S3. We use the provided clean command to clean up deprecated conversions. To do this, the underlying code retrieves the stored paths, which ends up calling the files method in the FilesystemAdapter provided by Laravel. The code will probably look familiar:

    public function files($directory = null, $recursive = false)
    {
        return $this->driver->listContents($directory ?? '', $recursive)
            ->filter(function (StorageAttributes $attributes) {
                return $attributes->isFile();
            })
            ->sortByPath()
            ->map(function (StorageAttributes $attributes) {
                return $attributes->path();
            })
            ->toArray();
    }

listContents returns a DirectoryListing instance, which is filtered, sorted, and mapped before toArray turns it into a plain array of string paths. Every time toArray is called (that is, every time the iterator contents are converted into an array), memory usage increases. The increase per call is small (around 2000 bytes per iteration), but with this much data it quickly adds up to a large amount of memory.
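To make the shape of that pipeline concrete, here is a minimal, framework-free sketch of a lazy listing that is only materialised when toArray is called. LazyListing is a hypothetical stand-in written for this issue, not Flysystem's DirectoryListing, and it leaves out sorting; it only assumes that filter and map wrap the source in generators and that toArray ends in iterator_to_array.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a lazy directory listing. This is NOT
 * Flysystem's DirectoryListing, only the same general shape.
 */
final class LazyListing
{
    /** @param iterable<mixed> $items */
    public function __construct(private iterable $items)
    {
    }

    public function filter(callable $keep): self
    {
        $source = $this->items;

        // The generator body does not run until the listing is iterated.
        $generator = (static function () use ($source, $keep): Generator {
            foreach ($source as $item) {
                if ($keep($item)) {
                    yield $item;
                }
            }
        })();

        return new self($generator);
    }

    public function map(callable $transform): self
    {
        $source = $this->items;

        $generator = (static function () use ($source, $transform): Generator {
            foreach ($source as $item) {
                yield $transform($item);
            }
        })();

        return new self($generator);
    }

    /** Drains the whole chain of generators into a re-indexed array. */
    public function toArray(): array
    {
        return is_array($this->items)
            ? $this->items
            : iterator_to_array($this->items, false);
    }
}

// Made-up entries standing in for what an adapter would list.
$entries = [
    ['path' => '1/conversions/b.jpg', 'type' => 'file'],
    ['path' => '1/conversions/thumbs', 'type' => 'dir'],
    ['path' => '1/conversions/a.jpg', 'type' => 'file'],
];

$paths = (new LazyListing($entries))
    ->filter(static fn (array $entry): bool => $entry['type'] === 'file')
    ->map(static fn (array $entry): string => $entry['path'])
    ->toArray();

print_r($paths);
```

In this sketch nothing is read from the source until toArray runs, so any memory that is retained would be allocated during that single call, which matches where the growth shows up for us.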

As you can expect, eventually the command runs out of memory.

I've tried to work out a fix, but I can't identify why the memory allocated around the toArray call is never released, so I can't say whether this is a Flysystem issue or a native PHP one. For now I've implemented a workaround: the process is chunked, meaning the memory is freed when one process exits and a new process is started to continue where the previous one left off. That is not an ideal solution, though. Any help would be appreciated.
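One way to narrow this down might be to check whether the growth survives a forced cycle collection. The helper below is a hypothetical sketch (retainedBytes is not part of Laravel or Flysystem), and the workload is a plain-PHP stand-in; in our project the callable would wrap the Storage::disk("media")->files("1/conversions") call instead.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: run $work repeatedly and report how many bytes
 * are still allocated after a forced garbage-collection cycle.
 */
function retainedBytes(callable $work, int $iterations): int
{
    $work(); // warm-up call, so one-time allocations are not counted
    gc_collect_cycles();
    $start = memory_get_usage();

    for ($i = 0; $i < $iterations; $i++) {
        $work();
    }

    gc_collect_cycles(); // release anything held only by reference cycles

    return memory_get_usage() - $start;
}

// Stand-in workload (an assumption for illustration): drain an iterator
// into an array and throw the result away.
$retained = retainedBytes(
    static fn (): array => iterator_to_array(new ArrayIterator(range(1, 1000)), false),
    100
);

echo "retained after 100 calls: {$retained} bytes\n";
```

If the retained figure stays close to zero once cycles are collected, the growth would point at collectable garbage rather than a true leak; if it keeps scaling with the iteration count, something is presumably still holding references.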

How to reproduce

(Laravel based snippet)

    $before = memory_get_usage();
    Storage::disk("media")->files("1/conversions");
    $mid = memory_get_usage();

    for ($i = 0; $i < 100; $i++) {
        Storage::disk("media")->files("1/conversions");
    }

    dd($before, $mid, memory_get_usage());

Outputs: 17822728 19260872 19915000
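For reference, the deltas between those three readings can be worked out as follows. This only does arithmetic on the numbers printed above; the averages apply to this particular run and directory.

```php
<?php

declare(strict_types=1);

// Values copied from the output above.
$before = 17822728;
$mid    = 19260872;
$after  = 19915000;

$firstCall = $mid - $before;        // cost of the first files() call
$next100   = $after - $mid;         // growth across the 100 loop iterations
$perCall   = intdiv($next100, 100); // average growth per call, rounded down

printf("first call: %d bytes\n", $firstCall);                            // 1438144
printf("next 100 calls: %d bytes (about %d per call)\n", $next100, $perCall); // 654128, 6541
```

So the first call accounts for roughly 1.4 MB, and the 100 repeated calls add roughly another 650 KB that is not released between calls.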