The context of our application is that we're walking our Box folder structure and extracting metadata for every file. It's an enterprise account, so rate limits aren't an issue per se, but there are ~100k documents strewn about 50 top-level folders (with depth going up to 12 folders deep). After 4 hours of linearly walking the folder structure, I'm noticing we're only a few hundred folders in and only a few thousand files, and there's an absolute metric ton of API calls going on.
* Can I walk folders and fetch file info from multiple threads, or will I hit some ceiling on concurrent requests?
* FileInfo attributes: the comment count, modifiedBy, and fileSize attributes all come back null in my requests. Do I need to add some kind of explicit filter to get them populated?
* FileInfo requests and FolderInfo requests: do these return *everything* on the default call, or should I request only the explicit fields I want? Right now this step is quite time-consuming.
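On the fields question: the Box REST API returns a default "mini" subset of attributes on listing calls, and attributes like `comment_count` and `modified_by` only come back when you ask for them via the `fields` query parameter (when `fields` is set, only those fields plus `id`/`type` are returned). A minimal sketch of building such a request URL — the folder ID here is a placeholder, and the field names are taken from Box's documented file object:

```python
from urllib.parse import urlencode

# Placeholder folder ID; authentication headers omitted for brevity.
FOLDER_ID = "12345"
BASE = f"https://api.box.com/2.0/folders/{FOLDER_ID}/items"

# Request only the fields we actually need; Box then returns these
# (plus id/type) instead of the default mini representation.
params = {
    "fields": "name,size,modified_by,comment_count",
    "limit": "1000",  # max page size => fewer round trips per folder
}
url = f"{BASE}?{urlencode(params, safe=',')}"
print(url)
```

Requesting a large `limit` per page and only the needed fields should cut the number of calls per folder substantially.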
Maybe the API just isn't up to the task of an exhaustive walk at high speed, but please let me know.