Script to download a directory over HTTP

strider_knh
Contributor II

I have a remote HTTPS directory hosting .pkg files, https://www.fileshere.com/.

I need to be able to download all the files in the directory. I have gone through many options in testing and cannot find something that will work. Using something native to macOS is a requirement. I have tried curl, but that only seems to work with single files, not a whole directory.

If there is some way to get a list of the files at that location and then feed it to curl, that would be great, but I have not been able to figure that out either.

Any help you may have would be great.

3 REPLIES

iJake
Valued Contributor

The problem is that cURL does not recursively download files the way wget can, and wget is not installed by default on macOS. Your best bet would be to parse the HTML of the directory listing for the links and loop over them, passing each one to cURL to download.

Aaron
Contributor II

cURL can't do it natively, but you can achieve it with a bit of scripting:

# Fetch the directory index, pull the href values out of the HTML,
# and download each listed file with curl.
base="http://www.domain.com/path/to/files/"

for file in $(curl -s "$base" |
                  grep href |
                  sed 's/.*href="//' |
                  sed 's/".*//' |
                  grep '^[a-zA-Z].*'); do
    curl -s -O "$base$file"
done

The success of this will depend on whether or not the host supports directory listings/indexes.

cornwella
New Contributor III

Seconding iJake's approach of parsing the HTML: Python can do this using urllib2 (which ships with macOS).
If you give us an example of the page's HTML, I can give you an example scraping script.
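
In the meantime, here is a minimal sketch, assuming Python 2's bundled urllib2 and a directory index that is plain HTML with the .pkg files exposed as double-quoted href attributes; the URL from the original post stands in as a placeholder:

import re
import urllib2

# Directory index from the original post; replace with the real URL.
base_url = 'https://www.fileshere.com/'

# Fetch the index page HTML.
html = urllib2.urlopen(base_url).read()

# Pull out every href that points at a .pkg file.
# Assumes the listing uses plain double-quoted href attributes.
pkg_names = re.findall(r'href="([^"]+\.pkg)"', html)

# Download each package into the current working directory.
for name in pkg_names:
    print 'Downloading %s' % name
    data = urllib2.urlopen(base_url + name).read()
    with open(name, 'wb') as f:
        f.write(data)

As with the shell version above, this only works if the server actually returns an HTML listing for that path.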