Scrape course evaluation PDFs from my university’s website.
Technologies
Python 2.7
Selenium Webdriver
Problem
This code downloads 101 PDFs named download.pdf, download(1).pdf, download(2).pdf, and so on. It works up to download(100).pdf; after that, the Chrome driver pops up a Save As dialog and the program slowly starts to crash.
Attempted Solutions
I’ve tried renaming and moving download.pdf after each download, but for some reason the next download still ends up being named download(1).pdf, even though download.pdf should be an available name.
I’ve tried moving all files from the download directory to a permanent directory, but all that does is move the problem to another directory.
Code
#EDIT: Solution
Thanks to usandfriends and @oaktree for helping me figure out that I needed to use cookies. I rewrote the script to use urllib instead of Selenium. Thanks, guys!
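The rewritten script isn't reproduced here, so the following is only a minimal sketch of the cookies-plus-urllib idea, not the actual solution. It assumes Python 3's `urllib.request` (on Python 2.7 the equivalent module would be `urllib2`), and the helper names, the `evaluation_NNN.pdf` naming scheme, and the idea of passing the session cookies as a raw `Cookie` header string are all hypothetical. Because the script picks each file name itself, it never depends on the browser's download(N).pdf numbering.

```python
import urllib.request


def build_request(url, cookie_header):
    """Build a request that carries an already-authenticated session.

    cookie_header is the raw Cookie header value taken from a logged-in
    browser session, e.g. "name1=value1; name2=value2" (hypothetical format).
    """
    return urllib.request.Request(url, headers={"Cookie": cookie_header})


def numbered_name(index):
    """Pick the file name ourselves instead of relying on the browser."""
    return "evaluation_{:03d}.pdf".format(index)


def download_pdf(url, cookie_header, dest_path):
    """Fetch one PDF and write it to dest_path."""
    request = build_request(url, cookie_header)
    with urllib.request.urlopen(request) as response:
        data = response.read()
    with open(dest_path, "wb") as out:
        out.write(data)


def download_all(urls, cookie_header):
    """Download every URL in order, one explicitly named file per URL."""
    for index, url in enumerate(urls):
        download_pdf(url, cookie_header, numbered_name(index))
```

With a list of evaluation URLs and a valid cookie string, `download_all(urls, cookie_header)` would write evaluation_000.pdf, evaluation_001.pdf, and so on into the current directory.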
Thank You!
Thank you for taking the time to read this. I’m new to 0x00sec and have really loved it so far. This is a great community and I plan to stay. Thanks in advance for your help.
I probably could, but you can’t access the URL until you’ve entered your student credentials. The reason I was using Selenium instead of urllib was so that I could enter my username and password.
There are a ton of ways. I think the easiest is opening up the dev console in your browser of choice and then just copying and pasting the cookies, but you could also use an intercepting proxy to catch things on the fly (my preferred method). Additionally, there are browser plugins you can use to manage cookies way more easily than with the default dev console.
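For the copy-and-paste route, here is a rough sketch of turning a cookie string copied from the dev console into something a script can work with. It assumes the copied text looks like a normal `Cookie` header (`name=value` pairs separated by semicolons); the function names and the example cookie names are made up for illustration.

```python
def parse_cookie_header(raw):
    """Split a copied Cookie header string into a name -> value dict."""
    cookies = {}
    for pair in raw.split(";"):
        # partition keeps any "=" characters inside the value intact
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def to_cookie_header(cookies):
    """Join a name -> value dict back into a Cookie header string."""
    return "; ".join("{}={}".format(name, value)
                     for name, value in cookies.items())


if __name__ == "__main__":
    # Hypothetical cookies pasted from a logged-in browser session.
    copied = "JSESSIONID=abc123; theme=dark"
    print(parse_cookie_header(copied))
```

Keeping the cookies in a dict makes it easy to drop ones the server doesn't need before rebuilding the header with `to_cookie_header`.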