Large downloads via link
A collaborator has provided me with a link to a directory containing a total of 67GB of data (files are 2-3GB each). I've confirmed that I'm able to start downloading the files to my local machine, but it would be very inefficient to proceed this way when the data must end up on our remote server for analysis. I have been unable to use wget or curl to successfully transfer files to our server. I also tried rclone, but this only allows me to see the contents of my free, personal Box account and not the files shared via my collaborator's link. Can you tell me how to either use wget, curl, or a similar tool to transfer the shared data, or else how to make the shared files visible to rclone?
-
Hi Seth,
I'm a little confused from the description on how exactly your user was collaborated, but let me see if I can help.
To use a curl command to download a file, you would need to set up an application in our developer console. Apps in the console can use OAuth 2.0, JWT, or client credentials authentication. For your use case, it would be easiest to set up the client credentials type. After this, you can generate a token to use in the curl command linked above. However, the application would also need to be collaborated in the folder in question.
Based on looking into your user account, it looks like this post was made using a free account - this type doesn't have the ability to use the API. You would want to set up a developer account here. Note - you can't use the same email address.
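For reference, a token request with the client credentials type might look roughly like the sketch below. It assumes the app has client credentials enabled and is authorized; the function names and the three arguments (client ID, client secret, enterprise ID) are placeholders for illustration and do not come from this thread. The values are inserted as-is, so anything containing special characters would need URL-encoding first.

```shell
#!/usr/bin/env bash
# Sketch only: request an access token with the client credentials grant.
# The client ID, client secret and enterprise ID are placeholders that
# you would copy from your own app's configuration page.

# Build the form body for the token request (no network access here).
box_token_body() {
  local client_id="$1" client_secret="$2" enterprise_id="$3"
  printf 'grant_type=client_credentials&client_id=%s&client_secret=%s&box_subject_type=enterprise&box_subject_id=%s' \
    "$client_id" "$client_secret" "$enterprise_id"
}

# POST the body to the token endpoint; the JSON reply carries "access_token".
box_request_token() {
  curl -sS -X POST "https://api.box.com/oauth2/token" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "$(box_token_body "$1" "$2" "$3")"
}
```

Calling box_request_token with your three values should print a JSON document; the access_token field in it is what goes after "Authorization: Bearer" in later requests.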
hope this helps,
Alex, Box Developer Advocate
-
Alex,
Sorry if I was unclear. I gather collaboration has a specific meaning in Box. I only meant to say that someone sent me a link to data on their Box instance (uthscsa.app.box.com) and I'm trying to get the data onto our cluster. Our institution doesn't subscribe to Box so I don't have access to the developer tools and I'd prefer not to set up a paid account just for this download.
Since the person sharing the link is probably on an enterprise account, are there any instructions I can give to them that would let them set up the download so I can get the data more easily? I've also tried downloading to my local machine but at 2M/sec it's hard to get an 8h transfer to complete without errors.
Thanks for the help,
-Seth-
-
Ah, yes. Collaboration means they've added your user on the content in question... not using a shared link.
Setting up a developer account is free, or should be. You don't have to pay for it. That is going to be the way to go, as there isn't a way to use the API without setting up an application in the developer console.
Alex
-
Alex,
Ok, I have a dev account and I've created an app with client side credentials type (I think). I'm still not completely sure how to make use of it with this link I have. The syntax I can find in the docs here
https://developer.box.com/guides/downloads/shared-link/
says to do:
curl -i -X GET "https://api.box.com/2.0/files/12345/content" \
-H "Authorization: Bearer <ACCESS_TOKEN>" \
-H "BoxApi: shared_link=https://cloud.box.com/shared/static/gjasdasjhasd&shared_link_password=letmein" \
-L
Do I put the shared link I was given into the GET field? Is the authorization token the one from the App I created?
Thanks again
-Seth-
-
Great! The shared link would actually go after the shared_link= part, replacing https://cloud.box.com/shared/static/gjasdasjhasd&shared_link_password=letmein. But if I remember right, you have a shared link to a folder, right?
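As a rough illustration of where the shared link goes, the sketch below builds the BoxApi header and asks the shared items endpoint what the link points to, which is one way to check whether it is a file or a folder. The function names are made up for this example, and the token and link are whatever you were given; the password part only applies if the link is password protected.

```shell
#!/usr/bin/env bash
# Sketch only: put a shared link into the BoxApi header and look it up.

# Build the BoxApi header, with an optional shared link password.
box_shared_header() {
  local link="$1" password="${2:-}"
  if [ -n "$password" ]; then
    printf 'BoxApi: shared_link=%s&shared_link_password=%s' "$link" "$password"
  else
    printf 'BoxApi: shared_link=%s' "$link"
  fi
}

# Ask the API what the shared link points to. The JSON reply includes
# a "type" (file or folder) and an "id" for the shared item.
box_resolve_shared_link() {
  local token="$1" link="$2"
  curl -sS -X GET "https://api.box.com/2.0/shared_items" \
    -H "Authorization: Bearer ${token}" \
    -H "$(box_shared_header "$link")"
}
```

Note that the URL after GET stays on api.box.com; the shared link itself only ever appears inside the BoxApi header.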
-
Alex,
I think it worked. I modified the curl command like this:
curl -i -X GET "<url for my app>" -H "Authorization: Bearer <dev token>" -H "BoxApi: shared_link=<download url>" -L
A bunch of stuff got printed to the standard out but it looks like all 67G was downloaded incredibly fast. I'm just starting to QC to make sure it's all complete. Thanks for all the help, it's greatly appreciated.
-Seth-
-
Alex,
Sorry again, I spoke too soon and was looking at a related dataset not what I was aiming for. I've copied most of it below. Definitely this is progress from before but I'm not sure why it's not grabbing any files.
thanks,
HTTP/1.1 302 Found
Date: Wed, 25 Jan 2023 20:00:10 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Strict-Transport-Security: max-age=31536000
Set-Cookie: z=ngjk16uj6nnpc5hdb8ar60tafd; path=/; domain=.app.box.com; secure; HttpOnly
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Set-Cookie: z=ngjk16uj6nnpc5hdb8ar60tafd; Path=/; Domain=.app.box.com; Secure; HttpOnly; SameSite=None
Set-Cookie: box_visitor_id=63d18a4a52d841.47782219; expires=Thu, 25-Jan-2024 20:00:10 GMT; Max-Age=31536000; path=/; domain=.box.com; secure; SameSite=None
Set-Cookie: bv=OPS-45862; expires=Wed, 01-Feb-2023 20:00:10 GMT; Max-Age=604800; path=/; domain=.app.box.com; secure
Set-Cookie: cn=19; expires=Thu, 25-Jan-2024 20:00:10 GMT; Max-Age=31536000; path=/; domain=.app.box.com; secure
Set-Cookie: site_preference=desktop; path=/; domain=.box.com; secure
Set-Cookie: box_redirect_rm=dev_console_get; expires=Thu, 26-Jan-2023 20:00:10 GMT; Max-Age=86400; path=/; domain=.app.box.com; secure; HttpOnly
Set-Cookie: box_redirect_url=https%3A%2F%2Fapp.box.com%2Fdevelopers%2Fconsole%2Fapp%2F1937224; expires=Thu, 26-Jan-2023 20:00:10 GMT; Max-Age=86400; path=/; domain=.app.box.com; secure; HttpOnly
Set-Cookie: box_referrer_url=value; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.app.box.com; secure
Set-Cookie: site_preference=desktop; path=/; domain=.box.com; secure
Location: https://account.box.com/login?redirect_url=%2Fdevelopers%2Fconsole%2Fapp%2F1937224
Via: 1.1 google
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
HTTP/1.1 200 OK
Date: Wed, 25 Jan 2023 20:00:10 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Strict-Transport-Security: max-age=31536000
Set-Cookie: z=ok03k2gk3hml5nukq3ge6479co; path=/; domain=.account.box.com; secure; HttpOnly
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Set-Cookie: z=ok03k2gk3hml5nukq3ge6479co; Path=/; Domain=.account.box.com; Secure; HttpOnly; SameSite=None
Set-Cookie: box_visitor_id=63d18a4a7b8037.28389072; expires=Thu, 25-Jan-2024 20:00:10 GMT; Max-Age=31536000; path=/; domain=.box.com; secure; SameSite=None
Set-Cookie: bv=OPS-45862; expires=Wed, 01-Feb-2023 20:00:10 GMT; Max-Age=604800; path=/; domain=.account.box.com; secure
Set-Cookie: cn=19; expires=Thu, 25-Jan-2024 20:00:10 GMT; Max-Age=31536000; path=/; domain=.account.box.com; secure
Set-Cookie: site_preference=desktop; path=/; domain=.box.com; secure
Link: </css/vendor/fonts/Lato-Regular.woff>; rel=preload; as=font
Set-Cookie: box_redirect_url=value; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.account.box.com; secure
Set-Cookie: uid=value; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.box.com; secure
Via: 1.1 google
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
# then a bunch of html I'm not copying #
<script>
window.Box = window.Box || {};
Box.config = Box.config || {};
Box.config.currentRm = 'amsterdam_login_premium';
Box.config.requestToken = '64b5e28bba151ad0bf8d8578f7365fb902c7a72ef79b39ed0c64e733e08c0c8f';
Box.config.debug = 0;
Box.config.locale = 'en-US';
Box.config.isBoxEditAllowed = 1;
Box.config.isDeviceTrustFallbackEnabled = 0;
Box.config.useUnSecureUrlForComServer = 0;
Box.config.isBoxEditV4Enabled = 0;
Box.config.isDeviceTrustV4Enabled = 1;
Box.config.isConsoleLoggingEnabled = 0;
Box.config.shouldUseActiveXObjectJSRef = 1;
Box.config.isMSEdgeSupportForBoxEditEnabled = 1;
Box.config.isBoxToolsV3EolEnabled = 1;
Box.config.isSandbox = 0;
window.onload = function() {
if (Box.init) return;
Box.Application.init(Box.config);
};
</script>
<script src="data:text/javascript,Box.Application.init(Box.config);Box.init = true;" defer></script>
</body></html>
-
Alex,
I right clicked on my app to get the url and used that. My curl command string was:
https://app.box.com/developers/console/app/1937224
curl -i -X GET "https://app.box.com/developers/console/app/1937224" \
-H "Authorization: Bearer <ACCESS_TOKEN>" \
-H "BoxApi: shared_link=REDACTED" \
-L
thanks
-
Ah! That url should be the file url which you get from using this endpoint first.
It looks like our documentation that I sent may be a bit confusing.
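To make the order of operations concrete, here is a hedged sketch of the remaining steps for a shared folder: list the folder's items to get each file ID, then request each file's content URL. It assumes the folder ID is already known (for example from a shared items lookup); the function names, arguments and output file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: list a shared folder, then download one file by ID.

# API URL that lists the items in a folder (up to 1000 per request).
box_items_url() {
  printf 'https://api.box.com/2.0/folders/%s/items?limit=1000' "$1"
}

# API URL that returns (via redirect) the content of a file.
box_content_url() {
  printf 'https://api.box.com/2.0/files/%s/content' "$1"
}

# Print the folder listing as JSON; each entry has a "type", "id" and "name".
box_list_folder() {
  local token="$1" link="$2" folder_id="$3"
  curl -sS "$(box_items_url "$folder_id")" \
    -H "Authorization: Bearer ${token}" \
    -H "BoxApi: shared_link=${link}"
}

# Download one file to a local path, following the redirect with -L.
box_download_file() {
  local token="$1" link="$2" file_id="$3" out="$4"
  curl -sS -L -o "$out" "$(box_content_url "$file_id")" \
    -H "Authorization: Bearer ${token}" \
    -H "BoxApi: shared_link=${link}"
}
```

The download function leaves out -i and adds -o, since -i would write the response headers into the saved file along with the data.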
-
Alex,
Ok, if I've understood the process now, I run curl once to get the file directory url and then again to get the actual files? It does seem odd that I'd need to supply the shared link twice though so maybe I'm still doing it wrong?
Is the target of GET something other than the shared link I was sent? I just tried this command. Apologies for being slow on the uptake here.
curl -i -X GET "REDACTED" \
-H "Authorization: Bearer <my current token>" \
-H "BoxApi: shared_link=REDACTED" -L
# this returns the chunk of text below followed by an extremely long line of html.
HTTP/1.1 301 Moved Permanently
Date: Thu, 26 Jan 2023 20:49:10 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Location: REDACTED
Strict-Transport-Security: max-age=31536000
Via: 1.1 google
HTTP/1.1 200 OK
Date: Thu, 26 Jan 2023 20:49:10 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
X-Robots-Tag: noindex, nofollow
Strict-Transport-Security: max-age=31536000
Set-Cookie: z=s8jeb64oj4gpg4bta0asjk8n1v; path=/; domain=.app.box.com; secure; HttpOnly
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Set-Cookie: z=s8jeb64oj4gpg4bta0asjk8n1v; Path=/; Domain=.app.box.com; Secure; HttpOnly; SameSite=None
Set-Cookie: box_visitor_id=63d2e746355644.23676371; expires=Fri, 26-Jan-2024 20:49:10 GMT; Max-Age=31536000; path=/; domain=.box.com; secure; SameSite=None
Set-Cookie: bv=OPS-45868; expires=Thu, 02-Feb-2023 20:49:10 GMT; Max-Age=604800; path=/; domain=.app.box.com; secure
Set-Cookie: cn=21; expires=Fri, 26-Jan-2024 20:49:10 GMT; Max-Age=31536000; path=/; domain=.app.box.com; secure
Set-Cookie: site_preference=desktop; path=/; domain=.box.com; secure
Via: 1.1 google