<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Code Flux</title>
	<atom:link href="http://chrisdolan.net/talk/feed/" rel="self" type="application/rss+xml" />
	<link>http://chrisdolan.net/talk</link>
	<description>Ideas and tools to improve programming throughput.</description>
	<lastBuildDate>Sat, 28 May 2011 19:15:12 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Detecting failed redactions in PDF</title>
		<link>http://chrisdolan.net/talk/2011/05/28/detecting-failed-redactions-in-pdf/</link>
		<comments>http://chrisdolan.net/talk/2011/05/28/detecting-failed-redactions-in-pdf/#comments</comments>
		<pubDate>Sat, 28 May 2011 19:15:12 +0000</pubDate>
		<dc:creator>Chris Dolan</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://chrisdolan.net/talk/?p=88</guid>
		<description><![CDATA[Timothy Lee of Princeton's Center for Information Technology Policy studied an online database of 1.8 million court documents with my CAM::PDF library to automatically detect solid rectangles drawn over text, representing failed attempts to censor content.]]></description>
				<content:encoded><![CDATA[<p>You&#8217;ve probably seen one or more news stories like this in the last year: a government or corporation releases a document with sensitive text covered with black rectangles, but the text is still technically present in the document and private information is accidentally released.</p>

<p>Timothy Lee of Princeton&#8217;s Center for Information Technology Policy has published a fascinating article titled <em><a href="http://freedom-to-tinker.com/blog/tblee/studying-frequency-redaction-failures-pacer">Studying the Frequency of Redaction Failures in PACER</a></em> in which he describes an online database of 1.8 million PDF court documents. He used my <a href="http://search.cpan.org/dist/CAM-PDF/">CAM::PDF</a> Perl library to automatically search all of the documents attempting to detect solid rectangles drawn over text. Then he manually examined a subset of those, discovering that many of them are cases of accidentally leaked personal information. He proposes that his technique should be used as a pre-processing filter to detect problematic documents before they are added to the database.</p>

<p>Technologically, his technique takes advantage of some interesting details of Adobe&#8217;s Portable Document Format (PDF). In a PDF, the visual elements are all layered on top of each other using document coordinates. My library can list every element along with its position on the page. His code tries to identify black rectangles and then looks for text elements whose coordinates overlap those rectangles. His code is not perfect because it also seems to flag text that is drawn intentionally on top of background rectangles, but better safe than sorry with sensitive documents.</p>
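
<p>The overlap test described above can be sketched in a few lines. This is only an illustration in Python, not Timothy Lee&#8217;s code and not the CAM::PDF API: the <code>Box</code> class, the <code>find_covered_text</code> helper, and the sample coordinates are all made up for the example, and it assumes the rectangles and text positions have already been extracted from the page.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page coordinates (hypothetical structure)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap when they intersect on both axes.
        return (min(self.x1, other.x1) > max(self.x0, other.x0)
                and min(self.y1, other.y1) > max(self.y0, other.y0))


def find_covered_text(rects, texts):
    """Return the text strings whose box overlaps any solid rectangle.

    rects: list of Box, one per solid rectangle
    texts: list of (string, Box) pairs, one per text run
    """
    return [s for s, box in texts if any(box.overlaps(r) for r in rects)]


# Made-up sample data: one rectangle and two text runs.
rects = [Box(100, 700, 300, 715)]
texts = [
    ("SSN 000-00-0000", Box(110, 702, 250, 713)),  # under the rectangle
    ("Page 1 of 3", Box(100, 50, 160, 60)),        # elsewhere on the page
]
print(find_covered_text(rects, texts))
```

<p>Like the approach it imitates, this sketch ignores drawing order, so text drawn deliberately on top of a background rectangle would be reported too.</p>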
]]></content:encoded>
			<wfw:commentRss>http://chrisdolan.net/talk/2011/05/28/detecting-failed-redactions-in-pdf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CAM::PDF v1.54 fixes appendPDF bug</title>
		<link>http://chrisdolan.net/talk/2011/03/26/cam-pdf-fixes-appendpdf-bug/</link>
		<comments>http://chrisdolan.net/talk/2011/03/26/cam-pdf-fixes-appendpdf-bug/#comments</comments>
		<pubDate>Sat, 26 Mar 2011 18:30:19 +0000</pubDate>
		<dc:creator>Chris Dolan</dc:creator>
				<category><![CDATA[PDF]]></category>
		<category><![CDATA[Perl]]></category>

		<guid isPermaLink="false">http://chrisdolan.net/talk/?p=83</guid>
		<description><![CDATA[I maintain the open source CAM::PDF Perl library in my free time. This library, originally authored by me at Clotho Advanced Media starting in 2002, is a high-performance low-level PDF editing tool. It doesn&#8217;t have support for sophisticated authoring tasks (see PDF::API2 for that!) but it is good for utility work like concatenating two PDFs [...]]]></description>
				<content:encoded><![CDATA[<p>I maintain the open source <a href="http://search.cpan.org/dist/CAM-PDF/">CAM::PDF</a> Perl library in my free time. This library, originally authored by me at Clotho Advanced Media starting in 2002, is a high-performance low-level PDF editing tool. It doesn&#8217;t have support for sophisticated authoring tasks (see <a href="http://search.cpan.org/dist/PDF-API2/">PDF::API2</a> for that!) but it is good for utility work like concatenating two PDFs together or deleting pages from a document or encrypting a PDF.</p>

<p>I just fixed a bug where appending a big PDF to a small one (&#8220;big&#8221; in terms of number of internal objects, which correlates with page count or byte count but is subtly different) often went wrong because I generated object ID numbers by simply incrementing a counter in the small PDF.  Sometimes, that counter matched IDs in the bigger PDF, but those IDs are supposed to be unique per document, so things went badly. This bug is now fixed by simply taking the max ID of the two docs as the new counter value before incrementing to make a new ID number.</p>
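
<p>A toy model of the collision, in Python rather than Perl, may make the fix easier to picture. The function names and ID sets below are invented for illustration and do not mirror CAM::PDF&#8217;s internals; the sketch only assumes that each document has a set of integer object IDs.</p>

```python
def next_id_buggy(small_ids):
    # Old behaviour as described: only the small document's counter is consulted.
    return max(small_ids) + 1


def next_id_fixed(small_ids, big_ids):
    # Fixed behaviour: start from the largest ID found in either document.
    return max(max(small_ids), max(big_ids)) + 1


small = {1, 2, 3}        # object IDs in the small document
big = set(range(1, 11))  # object IDs in the big document being appended

print(next_id_buggy(small) in big)       # True: the new ID collides
print(next_id_fixed(small, big) in big)  # False: the new ID is unused
```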

<p>I never stumbled on the bug in my own work because I always appended (or prepended) the smaller doc to the larger one for performance reasons.  CAM::PDF v1.54 is on its way to CPAN as I type this.</p>

<p>I&#8217;m grateful to Charlie Katz of the Harvard-Smithsonian Center for Astrophysics for providing me with a simple test case that exhibits the problem!</p>
]]></content:encoded>
			<wfw:commentRss>http://chrisdolan.net/talk/2011/03/26/cam-pdf-fixes-appendpdf-bug/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Perl is my calculator</title>
		<link>http://chrisdolan.net/talk/2011/03/20/perl-is-my-calculator/</link>
		<comments>http://chrisdolan.net/talk/2011/03/20/perl-is-my-calculator/#comments</comments>
		<pubDate>Mon, 21 Mar 2011 01:18:08 +0000</pubDate>
		<dc:creator>Chris Dolan</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://chrisdolan.net/talk/?p=78</guid>
		<description><![CDATA[When I want to do some quick math, I find it easier to type a quick Perl command than to launch a calculator app. I always have Terminal.app or Cygwin rxvt open, so I type something like this: perl -le 'print 32139 + 92176' which yields the copy-pasteable result: 124315 There are possibly faster solutions, [...]]]></description>
				<content:encoded><![CDATA[<p>When I want to do some quick math, I find it easier to type a quick Perl command than to launch a calculator app.  I always have Terminal.app or Cygwin <a href="http://en.wikipedia.org/wiki/Rxvt">rxvt</a> open, so I type something like this:</p>

<pre><code>perl -le 'print 32139 + 92176'
</code></pre>

<p>which yields the copy-pasteable result:</p>

<pre><code>124315
</code></pre>

<p>There are possibly faster solutions, like Google (try typing &#8220;20 + 20&#8221; in your search bar, then down arrow, select-all and copy) or <a href="http://guides.macrumors.com/Quicksilver#Calculator">Quicksilver</a>&#8217;s math mode on Mac. But Perl is in my muscle memory, and I can do quick things like appending the results to a file without even thinking about it.</p>

<p>In the command above, I&#8217;ve included two flags: <code>-l</code> and <code>-e</code>.  <code>-l</code> is shorthand to put a newline at the end of every print statement.  <code>-e</code> tells Perl that the next argument will be a script to execute.</p>
]]></content:encoded>
			<wfw:commentRss>http://chrisdolan.net/talk/2011/03/20/perl-is-my-calculator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Static typing, no workarounds</title>
		<link>http://chrisdolan.net/talk/2011/02/28/static-typing-no-workaround/</link>
		<comments>http://chrisdolan.net/talk/2011/02/28/static-typing-no-workaround/#comments</comments>
		<pubDate>Mon, 28 Feb 2011 07:23:25 +0000</pubDate>
		<dc:creator>Chris Dolan</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[workarounds]]></category>

		<guid isPermaLink="false">http://chrisdolan.net/talk/?p=74</guid>
		<description><![CDATA[A simultaneous benefit and curse of loosely-typed languages (Perl, Ruby, Javascript, etc) is that a programmer can do just about anything with a third-party library. The ability to &#8220;fix&#8221; other people&#8217;s code running in the same process is sometimes called &#8220;monkeypatching&#8221;. Stricter languages, like Java, make it dramatically harder to accomplish the same goals. [...]]]></description>
				<content:encoded><![CDATA[<p>A simultaneous benefit and curse of loosely-typed languages (Perl, Ruby, Javascript, etc) is that a programmer can do just about anything with a third-party library.  The ability to &#8220;fix&#8221; other people&#8217;s code running in the same process is sometimes called &#8220;monkeypatching&#8221;.</p>

<p>Stricter languages, like Java, make it dramatically harder to accomplish the same goals.  This is intentional, because Java also allows sandboxing of untrusted code in the same code space (pretty much impossible in existing dynamic languages despite <a href="http://search.cpan.org/perldoc?Safe">efforts</a>).</p>

<p>But what do you do if there is a problem with the third-party code and you can&#8217;t change it?  In Java, you may just be screwed.  For example, the <a href="http://river.apache.org/">Apache River</a> distributed computing SDK uses &#8220;throws java.rmi.RemoteException&#8221; as a hint about which server methods are allowed to be invoked from the client.  But, oops, Android&#8217;s Dalvik VM <a href="http://groups.google.com/group/android-developers/browse_thread/thread/a2642f228bfff316">omitted</a> all of the java.rmi.* classes to save resources.  So trying to load a class with that &#8220;throws&#8221; declaration causes a &#8220;NoClassDefFoundError&#8221;.  Because there&#8217;s no way to inject a third-party class into the &#8220;java.*&#8221; package space, and Android developers have not been willing to add this simple Exception to Dalvik, this <a href="http://groups.google.com/group/android-platform/browse_thread/thread/9b108cf75a100e3a">effectively kills</a> all reasonable hopes of using River on Android.  The River folks even talked about extreme hacks of rewriting the classes on load, but that&#8217;s too impractical.  The answer from Android fans in scenarios like this seems to be &#8220;<a href="http://stackoverflow.com/questions/1873497/xml-parsing-android">Why do you want to use that library anyway?</a>&#8221;</p>

<p>A dynamic language would have just created a stub for that exception if it didn&#8217;t exist.  Is that better or worse?</p>
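
<p>For a rough idea of what such a stub looks like, here is a minimal Python sketch. The <code>load_exception</code> helper and the <code>java_rmi</code> module name are hypothetical, chosen only to echo the River example; the point is that the missing class can be fabricated at run time.</p>

```python
import importlib


def load_exception(module_name, class_name):
    """Return the named exception class, or a stand-in stub if it is missing."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except (ImportError, AttributeError):
        # Create an empty subclass of Exception with the expected name.
        return type(class_name, (Exception,), {})


# "java_rmi" is a hypothetical module name that is not expected to exist.
RemoteException = load_exception("java_rmi", "RemoteException")

try:
    raise RemoteException("server unreachable")
except RemoteException as err:
    print(type(err).__name__, err)
```

<p>The calling code never learns whether it got the real class or the stub, which is exactly the convenience and the risk in question.</p>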
]]></content:encoded>
			<wfw:commentRss>http://chrisdolan.net/talk/2011/02/28/static-typing-no-workaround/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JsChilicat: headless Javascript unit testing</title>
		<link>http://chrisdolan.net/talk/2010/12/22/jschilicat-headless-javascript-unit-testing/</link>
		<comments>http://chrisdolan.net/talk/2010/12/22/jschilicat-headless-javascript-unit-testing/#comments</comments>
		<pubDate>Wed, 22 Dec 2010 07:37:30 +0000</pubDate>
		<dc:creator>Chris Dolan</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://chrisdolan.net/talk/?p=69</guid>
		<description><![CDATA[One of my Avid colleagues started a Sourceforge project he calls JsChilicat to simplify automated unit testing of Javascript code. The tool invokes the Rhino Javascript engine with the EnvJS and QUnit frameworks on the JS side. EnvJS emulates the browser&#8217;s DOM to trick your Javascript into thinking it&#8217;s working in an actual browser. JsChilicat [...]]]></description>
				<content:encoded><![CDATA[<p>One of my Avid colleagues started a Sourceforge project he calls <a href="http://jschilicat.sourceforge.net/">JsChilicat</a> to simplify automated unit testing of Javascript code. The tool invokes the <a href="http://www.mozilla.org/rhino/">Rhino</a> Javascript engine with the <a href="http://www.envjs.com/">EnvJS</a> and QUnit frameworks on the JS side.  EnvJS emulates the browser&#8217;s DOM to trick your Javascript into thinking it&#8217;s working in an actual browser. JsChilicat outputs JUnit-compatible XML, including <code>console.log()</code> output.  It also does code coverage reports, but I haven&#8217;t played with that feature yet.  To the best of my Google-fu, this is the only headless Javascript test kit that does code coverage.</p>

<p>The result is brilliant, but a little rough (ExtJS and EnvJS don&#8217;t play nice, for example). The documentation is currently insufficient, but just barely so. One or two examples of usage would be enough to reduce this to a half-hour learning curve.</p>

<p>I have no idea where the name comes from&#8230;  Maybe it&#8217;s a pun in German or something?  <img src='http://chrisdolan.net/talk/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://chrisdolan.net/talk/2010/12/22/jschilicat-headless-javascript-unit-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
