Question

Script to download a Directory over http

  • January 31, 2018
  • 3 replies
  • 10 views


I have a remote HTTPS directory hosting pkg files, https://www.fileshere.com/.

I need to be able to download all the files in the directory. I have gone through many options in testing and cannot find anything that works. Using something native to macOS is a requirement. I have tried curl, but that only seems to work with single files and not a directory.

If there is some way to get a list of the files at that location and then feed it to curl, that would be nice, but I have not been able to figure that out either.

Any help you may have would be great.

3 replies

iJake
  • Contributor
  • January 31, 2018

The problem is that cURL does not recursively download files the way wget can, and wget is not installed by default on macOS. Your best bet would be to parse the HTML for the links and loop over them, letting cURL download each one.


  • Valued Contributor
  • February 1, 2018

Curl can't do it natively, but you can achieve it with a bit of scripting:

# Fetch the directory index, extract the href targets, and download each file.
for file in $(curl -s http://www.domain.com/path/to/files/ |
                  grep href |
                  sed 's/.*href="//' |
                  sed 's/".*//' |
                  grep '^[a-zA-Z].*'); do
    curl -s -O "http://www.domain.com/path/to/files/${file}"
done

The success of this will depend on whether or not the host supports directory listings/indexes.


  • Contributor
  • February 1, 2018

Seconding iJake, Python can do this using urllib2 (which is native).
If you give us an example of the page's HTML, I can give you an example scraping script.
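In the meantime, here is a minimal sketch of that urllib2 approach, assuming the page is a plain HTML directory index with <a href> links and that the files you want end in .pkg (the URL is the placeholder from the question; the .pkg filter and filenames are assumptions). It targets the Python 2 interpreter that ships with macOS; on Python 3 you would use urllib.request and html.parser instead.

#!/usr/bin/python
# Sketch: scrape the <a href> links from a directory index page and
# download any that end in .pkg. Python 2 / urllib2, as bundled with macOS.
import os
import urllib2
import urlparse
from HTMLParser import HTMLParser

BASE_URL = "https://www.fileshere.com/"  # placeholder URL from the question


class LinkParser(HTMLParser):
    """Collect the href attribute of every <a> tag on the page."""
    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


parser = LinkParser()
parser.feed(urllib2.urlopen(BASE_URL).read())

for href in parser.links:
    if not href.lower().endswith(".pkg"):  # assumed filter; adjust as needed
        continue
    file_url = urlparse.urljoin(BASE_URL, href)      # handles relative links
    filename = os.path.basename(urlparse.urlparse(file_url).path)
    print "Downloading %s" % filename
    with open(filename, "wb") as out:
        out.write(urllib2.urlopen(file_url).read())

Like the curl loop above, this only works if the server returns a simple HTML listing of the files; if the index is generated client-side or listings are disabled, there are no links to scrape.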