FUSE + PDF

Chris Dolan

Equilibrious LLC

cdolan@cpan.org

July 26, 2008

permalink: http://chrisdolan.net/barcamp08/fuse-pdf.html

Roadmap


  1. Demonstrate filesystem-in-PDF
  2. Define FUSE
  3. Describe PDF implementation
  4. Show FUSE code

Demonstration...

This application is a Mac Cocoa GUI front end for the CPAN module Fuse::PDF. Hand it almost any PDF file and it will mount that file under /Volumes.

The command line version is much more powerful.

What is FUSE?

Filesystem in Userspace

What is FUSE?

Perl wrapper

What's special about PDF?

  • File structure is just a tree of strings, arrays, hashes, numbers, references and blobs.
  • Renderers start at the trailer, follow /Root to the Catalog hash, then follow /Pages to the Pages tree, whose /Kids entries lead to each Page hash.
  • Adding a new key to the Catalog hash embeds arbitrary metadata efficiently without affecting rendering, because renderers skip keys they do not recognize.

PDF internals - Catalog, Pages

%PDF-1.4
1 0 obj
<< /Type /Catalog /FusePDF << /FusePDF_FS 79 0 R >>
   /Metadata 78 0 R /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Count 1 /Kids [ 5 0 R ] >>
endobj
3 0 obj
<< /CreationDate (D:20071111223720Z) /Creator (Adobe Illustrator 10)
   /ModDate (D:20071111163806-06'00') /Producer (Adobe PDF library 5.00) >>
endobj
5 0 obj
<< /Type /Page /ArtBox [ 135 603.67383 434.4668 679 ]
   /Contents 74 0 R /MediaBox [ 0 0 612 792 ] /Parent 2 0 R
   /Resources << /ColorSpace << /CS0 66 0 R /CS1 67 0 R >>
   /Font << /TT0 68 0 R >> /ProcSet [ /PDF /Text ] >>
   /Thumb 72 0 R /TrimBox [ 0 0 612 792 ] >>
endobj
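
To make the lookup chain concrete, here is a minimal sketch that models the objects above as plain Perl data and walks them the way a renderer would. The `%objects` layout, the `{ ref => N }` convention, and the `resolve` helper are assumptions made for illustration, not how Fuse::PDF or any PDF library stores things. The trailer's /Root is assumed to point at object 1, as it does in this file.

```perl
use strict;
use warnings;

# Hypothetical in-memory model of the objects above: each indirect object is
# keyed by its object number, and a reference like "2 0 R" becomes { ref => 2 }.
my %objects = (
    1 => { Type => 'Catalog', Pages => { ref => 2 },
           FusePDF => { FusePDF_FS => { ref => 79 } } },
    2 => { Type => 'Pages', Count => 1, Kids => [ { ref => 5 } ] },
    5 => { Type => 'Page', Parent => { ref => 2 },
           MediaBox => [ 0, 0, 612, 792 ] },
);
my %trailer = ( Root => { ref => 1 } );

# Follow an indirect reference; anything else passes through unchanged.
sub resolve {
    my ($value) = @_;
    return $value if ref $value ne 'HASH' || !exists $value->{ref};
    return $objects{ $value->{ref} };
}

# Walk trailer -> Catalog -> Pages -> first Page.
my $catalog = resolve($trailer{Root});
my $pages   = resolve($catalog->{Pages});
my $page    = resolve($pages->{Kids}[0]);
print "first page is a $page->{Type} with MediaBox @{ $page->{MediaBox} }\n";

# The extra key sits beside /Pages; the walk above never looks at it.
my $fs_ref = $catalog->{FusePDF}{FusePDF_FS}{ref};
print "filesystem data lives in object $fs_ref\n";
```

The page walk never touches /FusePDF, which is why the extra key can ride along without changing what gets drawn.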

PDF internals - Trailer

trailer
<< /ID [ <3ea45250c17a85697af93ca7662ae46f>
<b6e75d95e111a3302f3c74228156deb2> ]
/Info 3 0 R /Root 1 0 R /Size 80 >>
startxref
265166
%%EOF
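
A reader opens a PDF from the end, so the trailer is the entry point. The sketch below pulls the key values out of the trailer text shown above with regular expressions. It is only an illustration: the variable names are made up, the /ID entry is left out for brevity, and a real parser would tokenize the dictionary instead of pattern-matching it.

```perl
use strict;
use warnings;

# Tail of the file as shown above (the /ID entry is omitted for brevity).
my $tail = <<'END';
trailer
<< /Info 3 0 R /Root 1 0 R /Size 80 >>
startxref
265166
%%EOF
END

# The number after "startxref" is the byte offset of the cross-reference table.
my ($xref_offset) = $tail =~ /startxref\s+(\d+)\s+%%EOF\s*\z/
    or die "no startxref found";

# /Root names the Catalog object; /Size is the number of xref entries.
my ($root_obj) = $tail =~ m{/Root\s+(\d+)\s+\d+\s+R}
    or die "no /Root in trailer";
my ($size) = $tail =~ m{/Size\s+(\d+)}
    or die "no /Size in trailer";

print "xref table starts at byte $xref_offset\n";
print "Catalog is object $root_obj; the table has $size entries\n";
```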

PDF internals - Crossref table

xref
0 80
0000000000 65535 f 
0000000012 00000 n 
0000000112 00000 n 
0000000171 00000 n 
0000000007 00001 f 
0000000328 00000 n 
0000000694 00000 n 
0000000008 00001 f 
0000000009 00001 f 
0000000010 00001 f 
0000000011 00001 f 
0000000012 00001 f 
0000000013 00001 f 
0000000014 00001 f 
0000000015 00001 f 
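
Each row of the table is a fixed-width record: a 10-digit field, a 5-digit generation number, and a flag (`n` for in use, `f` for free). As a rough sketch, the first few rows above can be decoded like this. The `@entries` structure is an assumption for illustration, and the rows are numbered from object 0 because the subsection header reads `0 80`.

```perl
use strict;
use warnings;

# First rows of the table above, one per object, starting at object 0.
my @rows = (
    '0000000000 65535 f',
    '0000000012 00000 n',
    '0000000112 00000 n',
    '0000000171 00000 n',
    '0000000007 00001 f',
    '0000000328 00000 n',
);

my @entries;
for my $obj_num (0 .. $#rows) {
    my ($field, $gen, $flag) = $rows[$obj_num] =~ /^(\d{10}) (\d{5}) ([nf])\s*$/
        or die "malformed xref row: $rows[$obj_num]";
    push @entries, {
        obj    => $obj_num,
        gen    => $gen + 0,
        in_use => ($flag eq 'n' ? 1 : 0),
        # For in-use objects the first field is a byte offset; for free
        # objects it is the number of the next free object.
        ($flag eq 'n' ? 'offset' : 'next_free') => $field + 0,
    };
}

for my $e (@entries) {
    if ($e->{in_use}) {
        printf "object %d starts at byte %d\n", $e->{obj}, $e->{offset};
    }
    else {
        printf "object %d is free (next free: %d)\n", $e->{obj}, $e->{next_free};
    }
}
```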

FUSE in Perl

use Fuse;
Fuse::main(
   mountpoint => '/Volumes/mnt',
   getattr  => \&fs_getattr,    readlink => \&fs_readlink,
   getdir   => \&fs_getdir,     mknod    => \&fs_mknod,
   mkdir    => \&fs_mkdir,      unlink   => \&fs_unlink,
   rmdir    => \&fs_rmdir,      symlink  => \&fs_symlink,
   rename   => \&fs_rename,     link     => \&fs_link,
   chmod    => \&fs_chmod,      chown    => \&fs_chown,
   truncate => \&fs_truncate,   utime    => \&fs_utime,
   open     => \&fs_open,       read     => \&fs_read,
   write    => \&fs_write,      statfs   => \&fs_statfs,
   threaded => 0, debug => 1);

FUSE in Perl

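sub fs_read {
   my ($path, $size, $offset) = @_;
   # _parse_path returns a file hashref, or an errno number on failure
   my ($f) = _parse_path($path);
   return -$f if !ref $f;          # FUSE expects a negative errno
   return substr $f->{content}, $offset, $size;
}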
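
The callback above leans on `_parse_path`, which the slide does not show. The sketch below fills it in with a hypothetical stand-in over an in-memory tree, so the read path can be tried without mounting anything. The `$root` tree and this `_parse_path` body are assumptions for illustration; the real module keeps its tree inside the PDF and its helper may differ.

```perl
use strict;
use warnings;
use POSIX qw(ENOENT ENOTDIR);

# Hypothetical in-memory tree standing in for the data stored in the PDF.
my $root = {
    type  => 'd',
    files => {
        'hello.txt' => { type => 'f', content => "Hello, PDF!\n" },
        'docs'      => { type => 'd', files => {} },
    },
};

# Hypothetical stand-in for _parse_path: returns a node hashref on success,
# or a positive errno number on failure (callers negate it).
sub _parse_path {
    my ($path) = @_;
    my $node = $root;
    for my $name (grep { length } split m{/}, $path) {
        return ENOTDIR() if $node->{type} ne 'd';
        $node = $node->{files}{$name} or return ENOENT();
    }
    return $node;
}

# Same callback as on the slide.
sub fs_read {
    my ($path, $size, $offset) = @_;
    my ($f) = _parse_path($path);
    return -$f if !ref $f;
    return substr $f->{content}, $offset, $size;
}

print fs_read('/hello.txt', 3, 7), "\n";   # three bytes starting at offset 7
print fs_read('/missing', 10, 0), "\n";    # negative errno for a missing file
```

Returning either data or a negative errno from the same sub is what lets each callback stay this short.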

FUSE in Perl

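use POSIX qw(ENOENT ENOTDIR ENOTEMPTY);   # errno constants

sub fs_rmdir {
   my ($path) = @_;
   # Look up the parent directory and the final path component
   my ($p, $name) = _parse_path_to_parent($path);
   return -$p if !ref $p;                  # lookup failed: $p is an errno
   my $f = $p->{files}->{$name};
   return -ENOENT() if !ref $f;
   return -ENOTDIR() if 'd' ne $f->{type};
   return -ENOTEMPTY() if 0 != keys %{ $f->{files} };
   delete $p->{files}->{$name};
   $p->{nlink}--;                          # parent loses the child's ".." link
   $p->{mtime} = time;
   return 0;                               # success
}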

Conclusion

  • FUSE lets you create filesystems in userspace, even from Perl
  • PDF is a generic data structure that can carry extra data, such as a filesystem