In an attempt to preserve the most important information at quakeone.com, I have been converting sub-forums to PDF documents. My first installment is from quake-help/general-help/. I intend to do as many of these as I can. I may even revisit ones I’ve already done and attempt to get a better mirror to work with. I think I have better ideas on how to go about this than I did when I started, and hopefully my next installments will be better. However, this one isn’t necessarily bad. It’s over 1800 pages of important questions and answers.
- this is not the entire sub-forum. It is the best I could do with what I was able to rip. It’s a whole damn lot of it though, maybe even better than 80%
- around 600 pages are pretty messed up, but the info is there. It’s just ugly
- I hand-pruned 532 pages out of this document, from practically blank pages to pages with entirely useless data (like forum indexes) to pages that had little more than a signature on them.
- I stripped every single solitary form out of this document (useless weight). Tens of thousands, and most of them by hand because Acrobat couldn’t handle selecting them all at once until I got about halfway through
- I purposely destroyed over 50,000 links. It’s not meant to be a website in a PDF. It’s meant to be a searchable document with an incredible amount of valuable information. Probably half of the links (or more) will die when quakeone finally does. We can’t rely on links for this.
- the final document is over 1800 pages long and only weighs a hair over 16 MB
- some images are missing and that may be important. I’m sorry. I was working with what I had
- there weren’t too many 400 Bad Request errors for actual pages. There seemed to be a decent amount for the CSS on a chunk of pages though. You would think a site mirror program wouldn’t act like every page has a different CSS, but you would be thinking wrong.
- this took forever and I don’t see this getting any easier as I capture more sub-forums
Overall, even though this document isn’t perfect, I’m pretty proud of it. We at least know that this much information won’t be lost. I’m going to try to do better on the next ones, but it’s really out of my hands what I end up with when I try to mirror such a large body of data.