OK, now I could use some help: I have downloaded approx. 150,000 individual pages so far, and they are sitting on my disk drive. I need to maintain a "technical" organization so that, when I repeat the process as new documents show up, I don't download what I already have. The organization is therefore as follows:
- Each Catalog has a directory within this folder, following the catalog's number (e.g. 12450, 12460, etc.)
- Within each catalog directory, each file has its own sub-directory, identified by the file's ID (e.g. 5932, 5937, etc.). This is the number you see when you look closely at the URL for each file: it goes http://wwii.germandocsinrussia.org/de/nodes/[XXXX]-[TITLE], where XXXX is the number I am referring to. I need to maintain these so I later know which files are new and which are not.
- Within each file's directory are the individual pages, each prefixed by a 7-digit number I created just in case the pages do not come "in sequence", then a dash, and then the actual page number as given by the server. Doing this guarantees the proper sequencing on the file system.
- Last but not least, within each file's directory is one XML file that gives the file's metadata: signature, title, annotation, start and end date, and number of pages (if provided).
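For anyone who wants to script against this layout, here is a minimal Python sketch of the naming scheme described above. It is not the code I actually used: the function names are made up for illustration, and the `.jpg` extension is an assumption, since the page file type is not stated here.

```python
import re
from pathlib import Path

# Matches .../nodes/<ID>-<TITLE>; the ID is the run of digits before the dash.
NODE_URL = re.compile(r"/nodes/(\d+)-")


def node_id_from_url(url: str) -> str:
    """Return the numeric file ID embedded in a node URL."""
    match = NODE_URL.search(url)
    if match is None:
        raise ValueError(f"no node ID found in {url!r}")
    return match.group(1)


def page_filename(sequence: int, server_page: str, ext: str = ".jpg") -> str:
    """Build '<7-digit sequence>-<server page number><ext>'.

    The extension is an assumption; adjust it to the real page format.
    """
    return f"{sequence:07d}-{server_page}{ext}"


def already_downloaded(root: Path, catalog: str, file_id: str) -> bool:
    """Treat a file as downloaded if <root>/<catalog>/<file_id>/ exists and is not empty."""
    file_dir = root / catalog / file_id
    return file_dir.is_dir() and any(file_dir.iterdir())
```

The zero-padded prefix is what makes a plain alphabetical directory listing match the download order, even when the server's own page numbers are irregular.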
To make downloads easier, in the folder I am sharing you will find the structure described above, but each catalog directory contains one ZIP file for each of the files contained (and the ZIP contains the directory and the individual pages). So far so clear?
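The one-ZIP-per-file packaging can be sketched roughly like this in Python. Treat it as an illustration only: the function name and the `<file ID>.zip` naming are assumptions, and the pages are stored uncompressed on the guess that they are already-compressed images.

```python
import zipfile
from pathlib import Path


def zip_file_directory(file_dir: Path, out_dir: Path) -> Path:
    """Write <out_dir>/<file ID>.zip containing <file ID>/<everything in that directory>."""
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / f"{file_dir.name}.zip"  # ZIP naming is an assumption
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for path in sorted(file_dir.rglob("*")):
            if path.is_file():
                # Keep the file-ID directory as the top-level folder inside the ZIP.
                arcname = path.relative_to(file_dir.parent).as_posix()
                archive.write(path, arcname=arcname)
    return zip_path
```

Keeping the file-ID directory inside the archive means that unzipping several archives into the same catalog directory recreates the original structure.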
Here is the link to the folder. You need to navigate into the directory, then into any ZIP file, and then use the download button from within to download the ZIP...
I am giving you 12476 tonight, which is the Flakkorps and Flakdivisions. I need some folks to double-check the files you can grab against the website. In other words: if I give you a file, I need a few of you to grab some of them and make sure that I grabbed the file correctly: all pages included, metadata correct, zoom level as good as the server provides (I always picked level 8; I did not find a file I could zoom into any further).
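If you would rather script part of the double-check than count pages by hand, something along these lines might help. It is only a sketch under assumptions: you look up the expected page count on the website yourself and pass it in, every non-XML file in the directory is taken to be a page, and the function name is made up. It cannot judge zoom level or whether the metadata is correct.

```python
from pathlib import Path


def check_file_directory(file_dir: Path, expected_pages: int) -> list[str]:
    """Return a list of problems found; an empty list means nothing suspicious."""
    problems = []
    entries = [p for p in file_dir.iterdir() if p.is_file()]
    xml_files = [p for p in entries if p.suffix.lower() == ".xml"]
    pages = [p for p in entries if p.suffix.lower() != ".xml"]

    if len(xml_files) != 1:
        problems.append(f"expected 1 XML metadata file, found {len(xml_files)}")
    if len(pages) != expected_pages:
        problems.append(f"expected {expected_pages} pages, found {len(pages)}")

    # Page names look like '<7-digit sequence>-<server page number>.<ext>'.
    server_pages = [p.stem.split("-", 1)[-1] for p in pages]
    duplicates = sorted({n for n in server_pages if server_pages.count(n) > 1})
    if duplicates:
        problems.append(f"server page numbers appear more than once: {duplicates}")
    return problems
```

Run it on an unzipped file directory, e.g. `check_file_directory(Path("12476/5932"), expected_pages=...)`, and post whatever it reports.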
My computer will be zipping and uploading over the next hours... remember, these files are huge so this will take a while (despite my fast upload line). If you don't see the files, come back later...
Now, happy leeching and let me know what you find, good or bad so I know what I might have to correct...
Andreas