As a consultant, I often need to move content from one format into another. It usually means a lot of content, stored in all sorts of ways. I cut my teeth on this kind of work way back in the 1980s when I worked at Bain & Company. Consulting teams there would bring us all sorts of junk (“some sort of magnetic media”) and want us to make sense of it. Because of this, I’m rarely scared of data conversions. It’s also one of the reasons I love Sharegate so much: for SharePoint, it makes much of this type of work easy!
In one of my current projects, I need to export content from multiple SharePoint 2007 sites so it can be loaded into a third-party platform. The content is stored in lists, and I can easily export it with Sharegate. However, there is a Rich Text field (the most important field, of course), and the content in that field can far exceed Excel’s limit of 32,767 characters in a single cell.
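If you want to know up front how many items would be affected by that limit, a quick check like the one below can help. This is only a sketch in Python, not part of my actual process: it assumes the list items are already in memory as dictionaries, and the field names `ID` and `Body` are made up for illustration.

```python
# Excel's documented maximum number of characters in a single cell.
EXCEL_CELL_CHAR_LIMIT = 32_767


def rows_over_excel_limit(rows, field="Body"):
    """Return the rows whose rich-text field would not fit in one Excel cell.

    `rows` is assumed to be an iterable of dicts; `field` is a hypothetical
    column name for the Rich Text content.
    """
    return [row for row in rows if len(row.get(field) or "") > EXCEL_CELL_CHAR_LIMIT]


# Hypothetical sample data: one short item and one that is too long for Excel.
sample = [
    {"ID": 1, "Body": "<p>short</p>"},
    {"ID": 2, "Body": "<p>" + "x" * 40_000 + "</p>"},
]

too_long = rows_over_excel_limit(sample)
print([row["ID"] for row in too_long])  # prints [2]
```

If that list comes back non-empty, an Excel-based export will not hold those items in a single cell, which is why I went a different route.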
Instead, I’m using Microsoft Access (more about this later) and some VBA to export the Rich Text to HTML files. I’ll have thousands of them, and I need to convert them to PDFs.
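The export step itself is simple in any language. My version is VBA inside Access, but here is a rough Python sketch of the same idea, writing one HTML file per list item. It assumes the rows are already loaded as dictionaries; the `ID` and `Body` field names, the file naming scheme, and the wrapper markup are all assumptions for illustration.

```python
import tempfile
from pathlib import Path


def export_html_files(rows, out_dir, id_field="ID", html_field="Body"):
    """Write each row's rich-text field to its own UTF-8 HTML file.

    Returns the list of paths written. Field names are hypothetical.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for row in rows:
        body = row.get(html_field) or ""
        # Wrap the fragment in a minimal page so a converter sees a full document.
        page = (
            '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
            f"<title>Item {row[id_field]}</title></head>\n"
            f"<body>\n{body}\n</body></html>\n"
        )
        target = out_path / f"{row[id_field]}.html"
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written


# Hypothetical usage: export two items into a throwaway folder.
demo_rows = [
    {"ID": 101, "Body": "<p>First item</p>"},
    {"ID": 102, "Body": "<p>Second item</p>"},
]
with tempfile.TemporaryDirectory() as tmp:
    paths = export_html_files(demo_rows, tmp)
    print([p.name for p in paths])  # prints ['101.html', '102.html']
```

Naming each file after the item's ID keeps it easy to match a PDF back to its source item later.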
Surprisingly, Binging and Googling took me to all sorts of fly-by-night possibilities, but not to the most obvious one: Adobe Acrobat DC does this right out of the box! I didn’t even find this on the Adobe support site.
And you’re done. In my first test, converting about 400 HTML files, some with embedded images, took about 10 minutes.
This post is as much for future me as it is for anyone else, but I hope it will help some others out.
by Marc D Anderson via Marc D Anderson's Blog