About ten years ago, Yahoo appeared on the net. It was so very cool because it was a central place where you could find anything you were looking for. By being popular itself, Yahoo decided what on the web was cool and what was not. Aside from the original NCSA list of all web pages (!), it was the first portal site I ever saw. Since then, portals have risen and fallen (e.g. Lycos, Netscape, Altavista). Today, Google released a beta of its new portalized homepage. It’s nice that they have plans to let you uber-customize it, but right now it’s quite ho-hum.

The problem with portals is that they decide what’s important and what’s not. If your interests coincide with the portal authors’, then you are well served. If you are not a perfect match, you will have to sift through much chaff to reach the kernel of information you seek. Some portals have tried to offer some degree of personalization. For example, Slashdot offers pre-defined sections for broad niches, as well as the ability to filter out unwanted topics and authors (e.g. JonKatz). That goes a long way towards satisfying the needs of the tech geeks, but it still relies on the much-maligned Slashdot editors to make the initial content selections from which readers may filter.

To get a full serving of content, the typical avid reader would hit his or her 10+ sites on a regular basis to keep up on the latest.


Having to hit more than 10 sites just to see if they have anything new is tedious. Tabbed browsing and bookmarked tab groups do help cut down the overhead, but it’s still overhead. Why not put the computer to work to do some of this search work for us?

Enter aggregation.

Aggregation tools build your own personal portal for you by periodically querying the sources that interest you and pulling the prime content into a digestible form. The basic tools to permit this have been around for a while, but only in the last year has it really come together. The important bit was that the content producers out there had to standardize on a computer-readable structure for their offerings. Thus, we now see RSS, RDF, and Atom links all over the place, collectively called feeds (plus OPML files for sharing lists of feeds). These are XML-based file formats that collect abstracts of web-based content. The small files are fast to download and easy to process, so it’s simple to write software to read them. This simplicity created an explosion of parsers, which then snowballed into a further explosion of feed offerings.
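To show just how simple “simple to write software to read them” is, here’s a minimal sketch in Python that parses an RSS 2.0 document with nothing but the standard library. The feed text below is a made-up example standing in for a real fetched feed; real ones are just bigger:

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed
# (real feeds are fetched over HTTP, but the structure is the same).
RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def parse_feed(xml_text):
    """Return (channel title, list of (item title, link)) from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return channel.findtext("title"), items

title, items = parse_feed(RSS)
print(title)        # Example Blog
print(items[0][0])  # First post
```

A real aggregator has to cope with the messy parts too (encodings, malformed markup, the several flavors of RSS), which is exactly why shared parser libraries took off.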

The next step is to accumulate all of these feeds (and perhaps sort and filter them along the way) into a body of information that the user can absorb. This aggregation process is the real energy saver. Popular tools include client-side solutions like NetNewsWire, Sage (a Firefox extension) and now Safari RSS. On the server side, aggregators that produce concise HTML include PlanetPlanet (my personal favorite), Bloglines and Radio Userland.
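At its core, that accumulate-sort-filter step is nothing exotic: pool the items from every feed and order them newest-first. A rough sketch, using made-up item tuples in place of items parsed from real feeds:

```python
from datetime import datetime

# Hypothetical items already parsed from several feeds:
# each is (source, title, published datetime).
feed_a = [("Planet", "Release notes", datetime(2005, 5, 18)),
          ("Planet", "Build tips", datetime(2005, 5, 16))]
feed_b = [("TomTongue", "Road trip", datetime(2005, 5, 17))]

def aggregate(*feeds):
    """Pool items from many feeds and order them newest-first."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda item: item[2], reverse=True)

for source, title, when in aggregate(feed_a, feed_b):
    print(when.date(), source, title)
```

Client-side readers and server-side tools like PlanetPlanet differ mainly in where this loop runs and how the merged list is rendered, not in the merging itself.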

In my experience, aggregation has let me absorb about four times as many news stories as before in about half the time. Much of this time saving comes from:

  1. not having to visit all of those pages,
  2. skipping the ads (and the associated download time)
  3. some content filtering, and
  4. knowing when to stop.

This last point is critical for me. All I have to remember is what was the youngest story I read last session. When I see that story again, I know that I’m all caught up. Without aggregation, I have to remember what was the last story I read for every site. That’s a lot of mental energy. Aggregation makes it more like an email inbox — one stream of information — and that’s cool.

What are our computers for if not to act as agents for our interests?


Aggregation will only become more significant in the next couple of years. Google is aggregating newspaper sites all over the world. Many open-source developer groups rely on Planet feeds, like PlanetMozilla, to get a quick read on what’s current. As more sources provide feeds, feed parsers get their remaining bugs worked out (e.g. encoding differences, XHTML vs. HTML, etc.), and authors get better at self-filtering (I do not want to see vacation photos from the Mozilla developers!), aggregates will come into their prime. By leveraging primary sources, the news can be even fresher than from portals.

The major remaining obstacles are discovery and trust. Discovering new sources is time consuming, because you generally read a lot of material that is of low interest. But relying on others to discover sources for you just leads back to the portal days where you lack control. Like a financial portfolio, I posit that diversity is the key to a good aggregate today. My personal reading list is a mix of primary sources (like Tom Tongue), intermediate aggregates (the Planet sites) and rehashed news moderated by an editor (like MozillaZine). Primary sources are good because you get the news soonest and have the greatest control over what you read. Edited content is good because someone who (presumably) has a brain has read every item before you did and culled the worst of the trolls out. Intermediate aggregates are somewhere in between: they include authors that are usually interesting.

Social networking can help with both the discovery and trust issues. A promising future direction for aggregation tools is the sharing of moderation between friends. For example, I’ve had some success with DaringFireball (recommended to me by Peter Erwin, IIRC), but not enough to add it to my daily rotation. I would be thrilled to have an automated way for friends to send me a “Best of…” for that feed. Future aggregation clients will allow the user to flag the best/worst stories and republish that list for their friends to see. That re-publication would be subject to normal aggregate filtering, of course, so you could just cherry-pick from common interests with friends.
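Such a “Best of…” feed would be cheap to produce: flag the items you liked, then re-serialize just the flagged ones as a feed of their own, ready for your friends’ aggregators. A hypothetical sketch (the function and field names are mine, not any real aggregator’s API):

```python
import xml.etree.ElementTree as ET

def best_of_feed(flagged_items, feed_title):
    """Build an RSS 2.0 document from items a reader has flagged.
    flagged_items: list of (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    for title, link in flagged_items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")

xml = best_of_feed([("Great post", "http://example.com/great")],
                   "Best of my reading")
print(xml)
```

Because the output is just another feed, it plugs straight back into the normal aggregate filtering described above.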


Aggregates are the new portals, offering users more control than ever over the way they take in information. I wonder what comes next?

5 thoughts on “Aggregation”

  1. Lots of good stuff here. Your article reminded me of my own recent ruminations on trust networks, which I’ve just summarized on my blog.

    Actually, that “Best of…” feature you mention should be something easy for NetNewsWire or some other aggregator software maker to integrate, once they figured out a mechanism for actually distributing the information.

Hmmm, one way to do it would be to have an “incoming” blog, where friends would post articles to that blog, and then you’d subscribe to the RSS feed for that blog. Except, if I had to post the same cool article to a dozen friends’ incoming blogs, I’d get tired of that REAL fast. 😉


  2. “All I have to remember is what was the youngest story I read last session. When I see that story again, I know that I’m all caught up. “

    I’m just starting to use feeds and only with Safari right now, but one thing I wish it had is the ability to only show me the new posts. Right now, you can only limit by Date (although you can sort by New-ness). This would make looking at aggregate “folders” of feeds much easier for me.

  3. > I’m just starting to use feeds and only with Safari right now, but one thing I wish it had is the ability to only show me the new posts.


That’s one feature I was experimenting with under PlanetPlanet. If you visit and then start clicking “x” buttons, stories you have clicked will permanently leave the page. This feature is fragile, however, and it’s tedious to click all of the links. We need something more like trn (an NNTP newsreader) which flags articles you view and hides them from future visits.


  4. “We need something more like trn (nntp reader) which flags articles you view and hides them from future visits.”

    Safari already knows how many new posts it’s received since it displays the number in the bookmark bar folder entry. For it, I think it’s just a minor UI addition. Maybe I’ll poke around in the JS for the RSS display page and see what I can break 🙂

Oh, and one more nice feature that I just noticed: it actually downloads the feeds locally (probably common among feed readers…) so that they’re available when you’re off-line… like on the bus!

  5. Matt,

    Interesting. I look forward to trying Safari’s RSS features when I eventually upgrade to 10.4.

Its behavior of downloading and storing the feed locally is indeed a common feature. The reason is that the feed might change unexpectedly. Slashdot, for example, rolls items off its RSS feed pretty fast. If you only read the RSS and don’t check and store the feed every day, you will miss stories.

