<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David&#039;s technobabble &#187; PowerGUI</title>
	<atom:link href="http://bable.cybermarshall.com/tag/powergui/feed/" rel="self" type="application/rss+xml" />
	<link>http://bable.cybermarshall.com</link>
	<description>David&#039;s thoughts about this and that</description>
	<lastBuildDate>Fri, 22 Jan 2010 18:29:56 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>OCR&#8217;ing all of the PDF files in a SharePoint Document Library using PowerShell and Solid PDF Tools</title>
		<link>http://bable.cybermarshall.com/2009/05/27/ocring-all-of-the-pdf-files-in-a-sharepoint-document-library-using-powershell-and-solid-pdf-tools/</link>
		<comments>http://bable.cybermarshall.com/2009/05/27/ocring-all-of-the-pdf-files-in-a-sharepoint-document-library-using-powershell-and-solid-pdf-tools/#comments</comments>
		<pubDate>Thu, 28 May 2009 03:39:33 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[PowerShell]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[WSS]]></category>
		<category><![CDATA[OCR]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[PowerGUI]]></category>
		<category><![CDATA[searchable PDF]]></category>

		<guid isPermaLink="false">http://bable.cybermarshall.com/?p=572</guid>
		<description><![CDATA[A recent review of the PDF documents in our Document Control Library revealed that most were &#8220;image-only&#8221; PDFs. We&#8217;ve run our document control system on different versions of SharePoint technologies since SharePoint Portal Server 2001. We are currently running SharePoint 2007. I&#8217;m surprised that no one previously noticed that most of our PDF [...]]]></description>
			<content:encoded><![CDATA[<p>A recent review of the PDF documents in our Document Control Library revealed that most were &#8220;image-only&#8221; PDFs. We&#8217;ve run our document control system on different versions of SharePoint technologies since SharePoint Portal Server 2001. We are currently running SharePoint 2007. I&#8217;m surprised that no one previously noticed that most of our PDF files were not showing up in searches.</p>
<p>The question is: <em>&#8220;How can we get all of these PDFs reprocessed to be searchable for a reasonable cost?&#8221;</em><span id="more-572"></span> I spent some time reviewing various PDF tools and was surprised that few were &#8220;truly&#8221; scriptable in batch. Some tools would not run on a server or could not interface with an &#8220;HTTP Network Place&#8221;. Some development kits were marketed and priced for building your own PDF tools, and they carry a pretty spendy price tag. These were out of scope for this year&#8217;s budget. Some, like Acrobat, had a batching mechanism in their GUI. However, this would require us to:</p>
<ul>
<li>Manually check out all of the PDFs</li>
<li>Download them into a folder structure</li>
<li>Batch them in the UI</li>
<li>Upload all of the OCR&#8217;ed PDFs</li>
<li>Manually check in all of the PDFs</li>
</ul>
<p>I don&#8217;t think so! We are talking about tens of thousands of documents. This would be labor-intensive and prone to human error.</p>
<p>I&#8217;d just about given up finding something that would fit into the current budget when I received an email notice about Solid Converter PDF v5 being available. While reviewing the v5 features, I noticed Solid PDF Tools v5 was out and it was just $20 more than the upgrade to Solid Converter PDF v5. Lo and behold, Solid PDF Tools v5 claims that it has a scriptable interface. I looked further and discovered their script reference manual and <a href="http://developer.soliddocuments.com/" onclick="pageTracker._trackPageview('/outgoing/developer.soliddocuments.com/?referer=');">developer blog</a>. I downloaded the trial and verified that the software worked. I sent them my $59 upgrade fee and voil&#224;, I had a <em>&#8220;scriptable enough&#8221;</em> tool that could re-scan &#8220;image-only&#8221; PDF files to create &#8220;searchable&#8221; PDF files. <em>I say &#8220;scriptable enough&#8221; because it gets the job done, but Solid PDF Tools v5 needs to load the GUI, load the splash screen, and display the UI for each file processed. This seems to add 10-15 seconds to the processing time of each file.</em></p>
<p>Now to convert the &#8220;image-only&#8221; PDFs in SharePoint. Once again, I decided to try PowerShell rather than write a C# program to interface with SharePoint. At the same time, I decided to give the <a href="http://www.powergui.org/downloads.jspa" onclick="pageTracker._trackPageview('/outgoing/www.powergui.org/downloads.jspa?referer=');">PowerGUI Tools</a> a try. I found the PowerGUI Script Editor to be quite useful for developing, debugging and running my script.</p>
<p>The &#8220;proof-of-concept&#8221; result is ~100 lines of PowerShell code that:</p>
<ul>
<li>processes all of the webs in a SharePoint site</li>
<li>processes all of the folders and sub-folders</li>
<li>processes all of the PDF documents and sends them to OCR processing</li>
</ul>

<div class="wp_codebox"><table><tr id="p5722"><td class="code" id="p572code2"><pre class="powershell" style="font-family:monospace;"><span style="color: #008000;">## References</span>
<span style="color: #000000;">&#91;</span>void<span style="color: #000000;">&#93;</span><span style="color: #000000;">&#91;</span><span style="color: #008080;">System.Reflection.Assembly</span><span style="color: #000000;">&#93;</span>::<span style="color: #800000;">LoadWithPartialName</span><span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;Microsoft.SharePoint&quot;</span><span style="color: #000000;">&#41;</span> 
<span style="color: #008000;"># System.IO types live in mscorlib, which PowerShell already loads, so no extra assembly load is needed.</span>
&nbsp;
<span style="color: #800080;">$SolidPDFTools</span> <span style="color: pink;">=</span> <span style="color: #800000;">&quot;SolidPDFTools.exe&quot;</span>; <span style="color: #008000;"># found through the PATH entry added under MAIN</span>
<span style="color: #800080;">$LocalFileFolder</span><span style="color: pink;">=</span> <span style="color: #800000;">&quot;D:\spwork\input&quot;</span>;
<span style="color: #800080;">$OCRWorkFolder</span><span style="color: pink;">=</span> <span style="color: #800000;">&quot;D:\spwork\output&quot;</span>;
<span style="color: #800080;">$OCRWorkLogFolder</span><span style="color: pink;">=</span> <span style="color: #800000;">&quot;D:\spwork\logs&quot;</span>;
<span style="color: #800080;">$OCRScriptFile</span><span style="color: pink;">=</span> <span style="color: #800000;">&quot;D:\spwork\ocr.sdscript&quot;</span>;
&nbsp;
<span style="color: #0000FF;">function</span> script:write_local_file<span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span><span style="color: pink;">,</span> <span style="color: #800080;">$fileFolder</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
	<span style="color: #800080;">$fs</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">New-Object</span> System.IO.FileStream<span style="color: #000000;">&#40;</span>$<span style="color: #000000;">&#40;</span><span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$fileFolder</span> <span style="color: #800080;">$file</span>.Name<span style="color: #000000;">&#41;</span><span style="color: pink;">,</span> <span style="color: #000000;">&#91;</span>System.IO.FileMode<span style="color: #000000;">&#93;</span>::Create<span style="color: #000000;">&#41;</span>
	<span style="color: #800080;">$bw</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">New-Object</span> System.IO.BinaryWriter<span style="color: #000000;">&#40;</span><span style="color: #800080;">$fs</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #000000;">&#91;</span><span style="color: #008080;">Byte</span><span style="color: #000000;">&#91;</span><span style="color: #000000;">&#93;</span><span style="color: #000000;">&#93;</span> <span style="color: #800080;">$binfile</span> <span style="color: pink;">=</span> <span style="color: #800080;">$file</span>.OpenBinary<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;    
	<span style="color: #800080;">$bw</span>.<span style="color: #008080; font-weight: bold;">Write</span><span style="color: #000000;">&#40;</span><span style="color: #800080;">$binfile</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$bw</span>.Close<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$fs</span>.close<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0000FF;">function</span> script:ocr_local_file<span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span><span style="color: pink;">,</span> <span style="color: #800080;">$in_fpath</span><span style="color: pink;">,</span> <span style="color: #800080;">$out_fpath</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
	<span style="color: #800080;">$infile</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$in_fpath</span> <span style="color: #800080;">$file</span>.Name;
	<span style="color: #800080;">$infile</span> <span style="color: pink;">=</span> <span style="color: #800080;">$infile</span> <span style="color: #FF0000;">-replace</span><span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: pink;">,</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$outfile</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$out_fpath</span> <span style="color: #800080;">$file</span>.Name;
	<span style="color: #800080;">$outfile</span> <span style="color: pink;">=</span> <span style="color: #800080;">$outfile</span> <span style="color: #FF0000;">-replace</span><span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: pink;">,</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$logfile</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$OCRWorkLogFolder</span> <span style="color: #800000;">&quot;debug.log&quot;</span>;
	<span style="color: #800080;">$logfile</span> <span style="color: pink;">=</span> <span style="color: #800080;">$logfile</span> <span style="color: #FF0000;">-replace</span><span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: pink;">,</span><span style="color: #800000;">&quot;\\&quot;</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$DBG</span><span style="color: pink;">=</span><span style="color: #800000;">'&lt;&lt;/Level/Verbose /Emit (starting) /FileName ('</span> <span style="color: pink;">+</span>  <span style="color: #800080;">$logfile</span> <span style="color: pink;">+</span><span style="color: #800000;">') &gt;&gt; Trace'</span>;
	<span style="color: #800080;">$INP</span><span style="color: pink;">=</span><span style="color: #800000;">'&lt;&lt;/Input ('</span> <span style="color: pink;">+</span> <span style="color: #800080;">$infile</span> <span style="color: pink;">+</span> <span style="color: #800000;">')'</span>;
	<span style="color: #800080;">$OTP</span><span style="color: pink;">=</span><span style="color: #800000;">'/Output ('</span> <span style="color: pink;">+</span> <span style="color: #800080;">$outfile</span> <span style="color: pink;">+</span> <span style="color: #800000;">')'</span>;
	<span style="color: #800080;">$OCR</span><span style="color: pink;">=</span><span style="color: #800000;">'/OCR/Searchable'</span>;
	<span style="color: #800080;">$CRE</span><span style="color: pink;">=</span><span style="color: #800000;">'&gt;&gt; Create'</span>;
	<span style="color: #800080;">$XIT</span><span style="color: pink;">=</span><span style="color: #800000;">'EXIT'</span>;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$DBG</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$INP</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-append</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$OTP</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-append</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$OCR</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-append</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$CRE</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-append</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;
	<span style="color: #008080; font-weight: bold;">Write-output</span> <span style="color: #800080;">$XIT</span> <span style="color: pink;">|</span> <span style="color: #008080; font-weight: bold;">Out<span style="color: #FF0000;">-File</span></span> <span style="color: #800080;">$OCRScriptFile</span> <span style="color: #008080; font-style: italic;">-append</span> <span style="color: #008080; font-style: italic;">-encoding</span> ascii;  
&nbsp;
	<span style="color: #800080;">$sTemp</span> <span style="color: pink;">=</span> <span style="color: pink;">&amp;</span>SolidPDFTools <span style="color: pink;">/</span>i <span style="color: #800080;">$OCRScriptFile</span> <span style="color: pink;">/</span>f script 
	<span style="color: #008080; font-weight: bold;">Write-Host</span> <span style="color: #800080;">$sTemp</span>;
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0000FF;">function</span> script:upload_ocr_result<span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span><span style="color: pink;">,</span> <span style="color: #800080;">$ocr_fpath</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
	<span style="color: #800080;">$ocrfile</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$ocr_fpath</span> <span style="color: #800080;">$file</span>.Name;
	<span style="color: #800080;">$fs</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">New-Object</span> System.IO.FileStream<span style="color: #000000;">&#40;</span>$<span style="color: #000000;">&#40;</span><span style="color: #008080; font-weight: bold;">Join-Path</span> <span style="color: #800080;">$ocr_fpath</span> <span style="color: #800080;">$file</span>.Name<span style="color: #000000;">&#41;</span><span style="color: pink;">,</span> <span style="color: #000000;">&#91;</span>System.IO.FileMode<span style="color: #000000;">&#93;</span>::Open<span style="color: #000000;">&#41;</span>
	<span style="color: #800080;">$br</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">New-Object</span> System.IO.BinaryReader<span style="color: #000000;">&#40;</span><span style="color: #800080;">$fs</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #000000;">&#91;</span><span style="color: #008080;">Byte</span><span style="color: #000000;">&#91;</span><span style="color: #000000;">&#93;</span><span style="color: #000000;">&#93;</span> <span style="color: #800080;">$binfile</span> <span style="color: pink;">=</span> <span style="color: #800080;">$br</span>.ReadBytes<span style="color: #000000;">&#40;</span><span style="color: #800080;">$br</span>.BaseStream.Length<span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$file</span>.SaveBinary<span style="color: #000000;">&#40;</span><span style="color: #800080;">$binfile</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$br</span>.close<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #800080;">$fs</span>.close<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
 <span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0000FF;">function</span> script:process_a_folder<span style="color: #000000;">&#40;</span><span style="color: #800080;">$folder</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
	<span style="color: #800080;">$files</span> <span style="color: pink;">=</span> <span style="color: #800080;">$folder</span>.Files;
	<span style="color: #0000FF;">foreach</span><span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span> <span style="color: #0000FF;">in</span> <span style="color: #800080;">$files</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
		<span style="color: #0000FF;">if</span> <span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span>.Name.ToLower<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>.EndsWith<span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;.pdf&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
			<span style="color: #0000FF;">if</span> <span style="color: #000000;">&#40;</span><span style="color: #800080;">$file</span>.CheckOutStatus <span style="color: #FF0000;">-eq</span> <span style="color: #800000;">&quot;None&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
				<span style="color: #800080;">$file</span>.CheckOut<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
				write_local_file <span style="color: #800080;">$file</span> <span style="color: #800080;">$LocalFileFolder</span>;
				ocr_local_file <span style="color: #800080;">$file</span> <span style="color: #800080;">$LocalFileFolder</span> <span style="color: #800080;">$OCRWorkFolder</span>;
				upload_ocr_result <span style="color: #800080;">$file</span> <span style="color: #800080;">$OCRWorkFolder</span>; 
				<span style="color: #800080;">$file</span>.CheckIn<span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;version has had OCR processing performed&quot;</span><span style="color: #000000;">&#41;</span>;
				<span style="color: #800080;">$file</span>.Update<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
				<span style="color: #000000;">&#125;</span>
			<span style="color: #000000;">&#125;</span>
		<span style="color: #000000;">&#125;</span>
	<span style="color: #800080;">$sub_folders</span> <span style="color: pink;">=</span> <span style="color: #800080;">$folder</span>.SubFolders;
	process_folders<span style="color: #000000;">&#40;</span><span style="color: #800080;">$sub_folders</span><span style="color: #000000;">&#41;</span>;
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0000FF;">function</span> script:process_folders<span style="color: #000000;">&#40;</span><span style="color: #800080;">$folders</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
	<span style="color: #0000FF;">foreach</span><span style="color: #000000;">&#40;</span><span style="color: #800080;">$folder</span> <span style="color: #0000FF;">in</span> <span style="color: #800080;">$folders</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>
		process_a_folder<span style="color: #000000;">&#40;</span><span style="color: #800080;">$folder</span><span style="color: #000000;">&#41;</span>;
	<span style="color: #000000;">&#125;</span>
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0000FF;">function</span> script:append<span style="color: pink;">-</span>path <span style="color: #000000;">&#123;</span>
	<span style="color: #800080;">$oldPath</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">get-content</span> Env:\Path;
	<span style="color: #800080;">$newPath</span> <span style="color: pink;">=</span> <span style="color: #800080;">$oldPath</span> <span style="color: pink;">+</span> <span style="color: #800000;">&quot;;&quot;</span> <span style="color: pink;">+</span> <span style="color: #000080;">$args</span>;
	<span style="color: #008080; font-weight: bold;">set-content</span> Env:\Path <span style="color: #800080;">$newPath</span>;
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #008000;"># MAIN</span>
append<span style="color: pink;">-</span>path <span style="color: #000000;">&#40;</span><span style="color: #008080; font-weight: bold;">resolve-path</span> <span style="color: #800000;">'D:\Program Files\SolidDocuments\Solid PDF Tools\SPDFT'</span><span style="color: #000000;">&#41;</span>.Path
&nbsp;
<span style="color: #800080;">$site</span> <span style="color: pink;">=</span> <span style="color: #008080; font-weight: bold;">new-object</span> Microsoft.SharePoint.SPSite<span style="color: #000000;">&#40;</span><span style="color: #800000;">&quot;http://my-sp-server&quot;</span><span style="color: #000000;">&#41;</span>;   
<span style="color: #800080;">$siteweb</span> <span style="color: pink;">=</span> <span style="color: #800080;">$site</span>.OpenWeb<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;   
<span style="color: #800080;">$webs</span> <span style="color: pink;">=</span> <span style="color: #800080;">$siteweb</span>.Webs;   
<span style="color: #0000FF;">foreach</span><span style="color: #000000;">&#40;</span><span style="color: #800080;">$web</span> <span style="color: #0000FF;">in</span> <span style="color: #800080;">$webs</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span>   
	<span style="color: #800080;">$folders</span> <span style="color: pink;">=</span> <span style="color: #800080;">$web</span>.Folders;
	process_folders<span style="color: #000000;">&#40;</span><span style="color: #800080;">$folders</span><span style="color: #000000;">&#41;</span>;
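	<span style="color: #000000;">&#125;</span>
<span style="color: #008000;"># Release the unmanaged resources held by the SPWeb and SPSite objects</span>
<span style="color: #800080;">$siteweb</span>.Dispose<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;
<span style="color: #800080;">$site</span>.Dispose<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;</pre></td></tr></table></div>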
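
<p>The six Out-File calls in ocr_local_file could also be collapsed into a single write. What follows is only a sketch under assumptions: New-OcrScriptText is a hypothetical helper name that is not part of the script above, the example paths are made up, and the script lines are copied from the strings that ocr_local_file already builds rather than from the Solid PDF Tools script reference.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="powershell" style="font-family:monospace;"># Sketch only: New-OcrScriptText is a hypothetical helper, not part of the original script.
function New-OcrScriptText([string]$inFile, [string]$outFile, [string]$logFile) {
	# Double each backslash, matching the -replace("\\","\\") calls in ocr_local_file.
	$inEsc  = $inFile.Replace('\', '\\')
	$outEsc = $outFile.Replace('\', '\\')
	$logEsc = $logFile.Replace('\', '\\')
	$lines = @(
		"&lt;&lt;/Level/Verbose /Emit (starting) /FileName ($logEsc) &gt;&gt; Trace",
		"&lt;&lt;/Input ($inEsc)",
		"/Output ($outEsc)",
		"/OCR/Searchable",
		"&gt;&gt; Create",
		"EXIT"
	)
	# Join with CRLF so the whole script can be written to disk in one call.
	[string]::Join("`r`n", $lines)
}

# Example call with made-up paths; Set-Content writes the whole script at once.
$text = New-OcrScriptText 'D:\spwork\input\a.pdf' 'D:\spwork\output\a.pdf' 'D:\spwork\logs\debug.log'
Set-Content -Path (Join-Path $env:TEMP 'ocr.sdscript') -Value $text -Encoding ASCII</pre></td></tr></table></div>

<p>Building the text in one place makes it easier to inspect what is handed to Solid PDF Tools before anything is run, and it avoids reopening the script file six times per document.</p>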

<p>I do plan to add some logging, replace the hard-coded variables, and look at using streams instead of Byte[] to be more flexible and scalable. I&#8217;ll need some error handling to deal with things like download or upload failures before I can run this in production. I&#8217;m also trying to work out how to determine whether a PDF is &#8220;image only&#8221; or searchable. However, the &#8220;proof-of-concept&#8221; does work.</p>
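
<p>For the error handling, one option is to wrap the per-file steps so that a failure undoes the check-out instead of leaving the document locked. This is only a sketch under assumptions: it needs PowerShell 2.0 or later for try/catch, Invoke-WithCheckOut is a hypothetical helper name, and it relies only on the CheckOut, CheckIn and UndoCheckOut methods that SPFile provides.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="powershell" style="font-family:monospace;"># Sketch only: assumes PowerShell 2.0+ and an object exposing CheckOut(),
# CheckIn(comment) and UndoCheckOut(), as SPFile does.
# Invoke-WithCheckOut is a hypothetical helper, not part of the original script.
function Invoke-WithCheckOut($file, [scriptblock]$action) {
	$file.CheckOut()
	try {
		# Run the download/OCR/upload steps; discard their output so that
		# the function returns only $true or $false.
		$null = &amp; $action $file
		$file.CheckIn("version has had OCR processing performed")
		return $true
	}
	catch {
		# Put the document back the way it was and report the failure.
		$file.UndoCheckOut()
		Write-Warning ("OCR failed for " + $file.Name + ": " + $_)
		return $false
	}
}</pre></td></tr></table></div>

<p>Inside process_a_folder, the three helper calls would move into the script block passed as the second argument, with the file received through param($f). Keep in mind that try/catch only sees terminating errors, so a failed run of the external OCR program would still need an explicit check, for example that the output file exists before it is uploaded.</p>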
]]></content:encoded>
			<wfw:commentRss>http://bable.cybermarshall.com/2009/05/27/ocring-all-of-the-pdf-files-in-a-sharepoint-document-library-using-powershell-and-solid-pdf-tools/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
