To automate PDF processing, first install the following packages from your distribution's repository:
apt-get install git qpdf exiftool pdftk poppler-utils tesseract-ocr imagemagick-6.q16
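Before going further, it can help to confirm that the tools actually landed on your PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper, and the binary names are assumptions (for example, that poppler-utils provides `pdftotext`, tesseract-ocr provides `tesseract`, and ImageMagick provides `convert`).

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed binary names for the packages installed above.
missing_tools git qpdf exiftool pdftk pdftotext tesseract convert
```

If the last line prints nothing, everything was found; any name it prints is a tool that still needs installing.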
Then clone PDF Ingest into a folder of your choice:
git clone https://github.com/tezcatlipoca/pdf_ingest
Enter the following folder:
Put the files you wish to convert into SRC and type the following command:
Wait for it to finish, and you’re done! 🙂
After discovering that phpMyAdmin is not suitable for dumping or importing large databases, I did a quick search into how to do it from the command line.
This worked well, and my database finally arrived uncorrupted. Host 1 is a cheap shared hosting provider with a few limitations regarding internal data transfer, CPU and memory. Host 2 is a VPS with 4 cores and 4GB of memory, with decent overall data speed.
Keep the passwords ready for both hosts.
mysqldump -u [USERNAME] -p [DBNAME] | gzip > [/path_to_file/DBNAME].sql.gz
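One thing to keep in mind with the one-liner above: in a plain `sh` pipeline the exit status is that of the last command, so if mysqldump dies halfway, gzip still exits happily and leaves a truncated archive. A minimal sketch of a more cautious variant follows, assuming you have disk space for a temporary uncompressed file. `safe_dump` and the `MYSQLDUMP_BIN` override are hypothetical names made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: dump to a temporary file first and only
# compress it if mysqldump itself reported success.
# Arguments: user name, database name, output path (e.g. DBNAME.sql.gz).
safe_dump() {
    db_user="$1"; db_name="$2"; out="$3"
    tmp_sql="$out.tmp"

    # MYSQLDUMP_BIN is an assumption of this sketch; it defaults to
    # "mysqldump". -p makes the client prompt for the password.
    if "${MYSQLDUMP_BIN:-mysqldump}" -u "$db_user" -p "$db_name" > "$tmp_sql"; then
        gzip -c "$tmp_sql" > "$out"
        status=$?
    else
        echo "mysqldump failed; no archive written" >&2
        status=1
    fi

    # Remove the temporary uncompressed dump either way.
    rm -f "$tmp_sql"
    return "$status"
}
```

Usage would look like `safe_dump [USERNAME] [DBNAME] [/path_to_file/DBNAME].sql.gz`. The trade-off is the extra disk space for the temporary file, which may matter on a small shared host.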
Copy the file over to Host 2
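Since the whole point was to avoid a corrupted database, it may be worth checking that the archive survives the trip. The sketch below is one way to do that, assuming `sha256sum` is available on both hosts; `dump_fingerprint` is a hypothetical helper name, and the host name in the comment is made up.

```shell
#!/bin/sh
# Hypothetical helper: verify the gzip archive, then print its SHA-256.
# Run it on Host 1 before copying and on Host 2 after copying;
# the two printed values should be identical.
dump_fingerprint() {
    # gzip -t tests the compressed file's integrity without unpacking it.
    gzip -t "$1" || return 1
    sha256sum "$1" | cut -d ' ' -f 1
}

# One possible way to copy the file (hypothetical user and host):
#   scp /path_to_file/DBNAME.sql.gz user@host2.example.com:/path_to_file/
```

If the fingerprints differ, copy the file again before touching the database on Host 2.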
gzip -d [/path_to_file/DBNAME].sql.gz
[/path_to_mysql/]mysql -u [USERNAME] -p
Be very careful in the next steps, since they involve a drop. And when you drop the wrong database, all that’s left is the cold sweat on your forehead.
DROP DATABASE [DBNAME];
CREATE DATABASE [DBNAME];
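The steps above end with an empty database; the import itself is not spelled out. A minimal sketch of that last step, run from the shell after leaving the mysql prompt, could look like the following. It assumes the dump was already unpacked with `gzip -d` as shown earlier; `import_sql` and the `MYSQL_BIN` override are hypothetical names for this example, not part of MySQL.

```shell
#!/bin/sh
# Hypothetical helper: feed an unpacked .sql dump to the mysql client.
# Arguments: path to the .sql file, user name, database name.
import_sql() {
    sql_file="$1"; db_user="$2"; db_name="$3"

    # Refuse to continue if the dump is missing or empty.
    [ -s "$sql_file" ] || { echo "missing or empty dump: $sql_file" >&2; return 1; }

    # MYSQL_BIN is an assumption of this sketch; it defaults to "mysql".
    # Set it to [/path_to_mysql/]mysql if the client is not on PATH.
    "${MYSQL_BIN:-mysql}" -u "$db_user" -p "$db_name" < "$sql_file"
}
```

Usage would be along the lines of `import_sql [/path_to_file/DBNAME].sql [USERNAME] [DBNAME]`, after which the client prompts for the password and reads the statements from the file.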
Source, and more about conditional dumping: https://www.lullabot.com/articles/importexport-large-mysql-databases