NewsGator's 2.5 Billion RSS Articles
14/3/2008 external link
Microsoft published a case study about our use of SQL Server 2008. It's a pretty good summary of how much "stuff" we have to manage.
NewsGator makes life easier for individuals and companies by aggregating Really Simple Syndication (RSS) data feeds from across the Web to provide users with customized content delivery, enabling everyone to essentially create their own electronic newspaper. The company, which also provides Software as a Service to more than 50 media outlets including CNN and USA Today, stores some 2.5 billion RSS articles totalling about 4 terabytes on clustered databases running Microsoft® SQL Server® database software. NewsGator is upgrading its database infrastructure to SQL Server 2008 Enterprise Edition (64-bit) running on the Windows Server® 2008 for 64-Bit Systems operating system to take advantage of a number of new features, including enhanced Database Mirroring for high availability, Backup Compression to reduce storage needs, and Resource Governor for allocating processing resources.
NewsGator
Activity Scoring in NewsGator Online
21/2/2008 external link
The Great from the Many
In January, when we made all of our client readers available for free, we said we were collecting usage data to make the experience better for all users. Today, we released a feature based on that data.
At the top of the NewsGator Online reader, you’ll see a “Sort” option. When you click it, you’ll see the “Sort By Activity” option. If you choose that, you’ll see something like the display below when you click on a feed or folder.
Your unread posts will be sorted based on total user activity in NewsGator’s online reader, FeedDemon, NetNewsWire, Inbox, and Go!. The green bar gives a graphical view of the total activity based on a scale much like the decibel system. In a sense, you can think of this as the “noise level” for the post. Posts that completely fill up the green bar are generating a lot of “noise”.
Behind the scenes, millions of rows of activity data are run through an algorithm to produce this score and scale it to this view. Actions like clipping a post or emailing it to a friend affect the score more than just clicking the title link. We learned a lot about scoring based on our experience with our NewsGator Enterprise Server (NGES) product. (NGES actually goes a step further and calculates a projected relevance score for you based on your personal feed scores and the activity scores of your co-workers.)
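To make the idea concrete, here is a minimal sketch of how a score like this might be computed. It is not NewsGator's actual algorithm: the action weights, the function names, and the 60 dB ceiling are all assumptions chosen for illustration. Heavier actions count for more than a click, and the total is mapped onto a logarithmic, decibel-like scale to fill the bar.

```python
import math

# Hypothetical weights: the post only says that clipping or emailing a post
# counts for more than clicking its title link. The numbers are made up.
ACTION_WEIGHTS = {"click": 1.0, "clip": 5.0, "email": 5.0}

def raw_score(action_counts):
    """Weighted sum of activity counts for one post."""
    return sum(ACTION_WEIGHTS.get(action, 0.0) * count
               for action, count in action_counts.items())

def bar_fill(score, max_db=60.0):
    """Map a raw score to a 0.0-1.0 bar fill on a decibel-like log scale.

    max_db is an assumed ceiling: a score of 10 ** (max_db / 10) fills the bar.
    """
    if score <= 1.0:
        return 0.0
    decibels = 10.0 * math.log10(score)
    return min(decibels / max_db, 1.0)

posts = {
    "quiet post": {"click": 12},
    "busy post": {"click": 4000, "clip": 150, "email": 90},
}
for title, counts in posts.items():
    print(title, round(bar_fill(raw_score(counts)), 2))
```

On a log scale like this, each tenfold increase in activity adds the same amount to the bar, which is why only posts with far more activity than the rest fill it completely.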
So if you’re in a hurry and just want to see the stories that are getting the most attention or if you’re just curious about how stories stack up against each other in terms of user engagement, flip on “Sort By Activity” to get the great posts filtered by our many users.
--------------------------------------------------
Brian Kellner is VP Products at NewsGator and has been an advocate for relevancy and discovery services and features leading the next generation of RSS in both consumer and enterprise markets.
NewsGator
Attention Data: Content vs. User
12/2/2008 external link
Much has been made about our recent move to make our RSS client applications free to users. To recap, last month we removed all license fees for our client applications (NetNewsWire, FeedDemon, Go mobile apps, our online reader, and Outlook plugin), and in exchange we eliminated telephone support and enabled a data syncing process between the apps and our online service that went beyond our subscription data to what we refer to as "attention data".
The telephone support bit of this was a no-brainer: we rarely had someone call for support, and most of our users go to the online forums for help. So in effect, removing telephone support was more symbolic than anything else, as the actual impact on resource allocation was pretty minimal.
The attention data topic is considerably more interesting to cover. While most commenters have adopted a wait-and-see approach, some have raised some good questions about what we are doing with that data, which in aggregate totals millions of individual line items each day. Our network datacenter now covers 2.1 million feeds that poll at least hourly, collecting well over 7 million new items of content each day.
We archive this content as well, but it's not a complete cache of the blogosphere, as would be easy to conclude, because we only archive feeds that our users subscribe to, and in each case we are limited to the amount of content that the feed exposes. Some feeds are full text and far too many are still excerpts, but at any rate it's a lot of content in both the current 24-hour set and the archive.
First and foremost, attention data is metadata about what happens to content. At one level it's as simple as someone clicking on a headline in a feed to open a post, but also included are the actions that people take on content, such as clipping, tagging, bookmarking, and sharing of individual content items.
There are two kinds of attention data or, put more accurately, one set that puts the user at the center and another that puts the content item at the center. We're interested in both, but have different mechanisms for collecting each.
The free release last month focused on the attention data about content, which is why we went to some lengths to explain how we were anonymizing it. Quite honestly, it's not interesting to me that Joe Smith clicked on, bookmarked, and then sent to a friend a post in GigaOM. What is interesting is that a post in GigaOM got clicked on, bookmarked, and shared. It's not interesting to me that Joe Smith did this because I don't have any demographic data about Joe Smith, therefore the commercial value of that information is low, but this isn't to suggest that the "Joe Smith dataset" isn't interesting to Joe Smith... more on that in a minute.
Why is this attention data useful? Simply put, attention implies content authority and quality; if you share something, I can assume that you found it useful, and we can then use that signal in our attention algorithm. The scoring generated by our attention algorithm can be used to make search more accurate, and it can be packaged as an API that we make available to our partners to enable their services to better filter and sort content.
We don't sell this data to marketing companies because, with no demographic information attached to it, it's worthless in that context. Recall that this attention data is focused on content and not users, and the purpose is to improve existing functions and enable new features. For example, one of our media customers is using this to generate a list of stories that received a high degree of attention in the prior 24 hours and that they did not publish through their sites. In other words, we are using attention data to tell them the things they did not know they didn't know.
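The media-customer example can be sketched as a simple filter. This is an illustration under stated assumptions, not our production code: the item fields (`url`, `score`, `seen_at`), the function name, and the score threshold are all hypothetical.

```python
from datetime import datetime, timedelta

def missed_stories(items, published_urls, now, min_score=50.0, window_hours=24):
    """Return high-attention items from the window that the customer did not publish.

    items: iterable of dicts with the hypothetical keys 'url', 'score', 'seen_at'.
    published_urls: set of URLs the customer already ran on its own sites.
    """
    cutoff = now - timedelta(hours=window_hours)
    hits = [item for item in items
            if item["seen_at"] >= cutoff
            and item["score"] >= min_score
            and item["url"] not in published_urls]
    # Highest attention first.
    return sorted(hits, key=lambda item: item["score"], reverse=True)

now = datetime(2008, 2, 12, 12, 0)
items = [
    {"url": "http://example.com/a", "score": 90.0, "seen_at": now - timedelta(hours=3)},
    {"url": "http://example.com/b", "score": 75.0, "seen_at": now - timedelta(hours=30)},
    {"url": "http://example.com/c", "score": 60.0, "seen_at": now - timedelta(hours=5)},
    {"url": "http://example.com/d", "score": 10.0, "seen_at": now - timedelta(hours=1)},
]
published = {"http://example.com/c"}
for story in missed_stories(items, published, now):
    print(story["url"], story["score"])
```

Note that nothing in the filter refers to a user: it works entirely on content-centered attention data.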
Last year we did expose something we were involved with that speaks to the user perspective of attention, APML. This standard, which builds on the success of OPML, is attractive for some very important reasons. First and foremost, APML creates a single database about user subscriptions and attention data items, rather than attempting to merge and sync separate databases around each. Second, it's a true industry standard that is emerging through a process of cooperation rather than imposition, and lastly, it makes attention data portable.
We fundamentally believe that data about your browsing habits is yours and that means you should be able to take it with you wherever you go. APML does this much in the same way that OPML does it for subscription data, and that has been a very successful model.
In many ways the ultimate commercial value of attention data is speculative, but we are not totally flying blind here either as we do have concrete examples about how it is enhancing the value of network functions that are important to our consumer and commercial clients. Speaking as a user, the APML piece is very important to me because I can accumulate this data over time and transfer it from service to service without penalty, and as more services take advantage of APML I will receive benefits as a user.
-------------------------------------------
Jeff Nolan is vice president of the software-as-a-service group at NewsGator Technologies. Based in the Bay Area, Jeff also writes frequently on these topics on his personal blog, Venture Chronicles.
NewsGator
comScore widget matrix numbers are inaccurate
4/2/2008 external link
Techcrunch recently published a post about “The Widget Kings” which promoted the comScore widget matrix as a symbol of rank among widget manufacturers. We did a little research on the accuracy of these numbers. To make a long story short, we found the numbers entirely inaccurate and incomplete as a ranking of widget vendors.
This is not a new perspective; both GigaOm and Jeff Jarvis posted about this back in June 2007 when the comScore list was first released.
It’s difficult enough to track traffic accurately on the internet, much less widgets, so we weren't surprised to see some inconsistencies; it is to be expected when reports like these are first generated. But when the numbers are deceptive and wrong, the report loses all credibility as an independent ranking of widget vendors.
Let’s compare the April list with the report just released in November. For our analysis, we looked into the changes in the standings and tried to validate their statistics with Compete and Alexa. While we appreciate that comScore, Compete and Alexa don't all track the same way, we were hoping these sites could at least give us a sense of whether traffic was increasing or decreasing over that time period.
April 2007 comScore Widget Matrix
November 2007 comScore Widget Matrix
Here are the things that jump out immediately.
1) Brightcove is off the list. They went from 16.9 million uniques to less than 14.9 million? Let’s try to corroborate that. Here are charts from Compete.com and Alexa.com.
Again, traffic is difficult to measure, but at the very least, both Compete and Alexa point to flat growth, not an 11% loss in audience.
2) Slide.com dropped from 117.1 million uniques to 39 million. Sounds like they are in trouble? Not according to Alexa and Compete.
3) Musicplaylist.us at 15 million uniques in 4/07 and 11/07…
How does this work? Traffic to musicplaylist looks to be in a freefall.
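For reference, the changes implied by the comScore figures quoted above can be worked out in a few lines. This is only the arithmetic on the numbers in this post (in millions of uniques); the function name is ours, and Brightcove's November figure is an upper bound, since all we know is that it fell below 14.9 million.

```python
def percent_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0

# (April 2007, November 2007) uniques in millions, as quoted in this post.
matrix = {
    "Brightcove": (16.9, 14.9),        # November value is an upper bound
    "Slide.com": (117.1, 39.0),
    "Musicplaylist.us": (15.0, 15.0),
}
for vendor, (april, november) in matrix.items():
    print(f"{vendor}: {percent_change(april, november):+.1f}%")
```

That works out to a drop of at least 11.8% for Brightcove (the 11% cited above, rounded down), a drop of about two thirds for Slide.com, and no change at all for Musicplaylist.us.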
I could go on – none of the numbers seem to make sense. Is comScore playing a shell game for their paying clients? Or is this a true third party representation of widget traffic?
Let us know your thoughts in the comments!
If SuperPoke is the crack cocaine of Facebook, then perhaps Widgets are a mild narcotic
19/1/2008 external link
There’s been a lot of talk of late about “viral loops” (see Jeff Nolan or Andrew Chen), how they define the development of social applications and how they are the secret sauce of social networks. Without going into significant detail on the topic, the basic premise is this: viral loops will help your application get distributed by encouraging you to interact with your friends. Most of the Facebook applications are solely focused on this premise (hence the SuperPoke reference in the title).
The parallel on the widget front is an "interaction loop": the hooks built into widgets that encourage interaction, either by responding to content or by sharing with others. The main difference between 'interaction' and 'viral' loops is that not all interactions are viral; some interactions simply benefit the user through personalization or community interaction. Widgets differ slightly from social applications in that the end goal isn't always to get the user to send to a friend. Widgets are typically used to provide a service to the end user, such as presenting personalized information or content.
So let’s take a look at some of the ways interaction loops are put together.
This is a simple video widget from NewsGator that plays videos and shows comments. There are 9+ different interaction loops within this widget.
1. Email link (& other share options) – the most obvious and first on the list. By clicking on the email button, a user is able to share the content with a friend, which gives the content more exposure. The experience endears the user to the widget (where they found the interesting story) because the content was interesting enough to respond to. (Side note: email is a proxy for all of the ‘response’ options in the widget - IM, blog, send to phone, etc.)
2. Email message – on the receiving end of the article email, the user is presented with links to add the widget from which the article was sent. If the article interests the person to whom it was sent (which in most cases it will) then there is a higher than average chance that the user will also be interested in the widget it came from, so it is important to ensure that the user who received the email can also add that widget to their blog, social network or personal start page.
3. Ratings – you are more likely to vote on something that other users have already rated. Rating is the most explicit form of content engagement, and it's also the easiest and least intrusive. As an interaction loop, rating is great because you know exactly how the user feels about the content (it's a rating, after all).
4. Comment link - as far as interactions are concerned, commenting is one of the most valuable of interactions. A user who comments on an article is showing significant engagement - they are responding to the content, thinking about it, taking a risk by replying.
5. Get This link - This link is the gateway to a bunch of additional interaction and sharing options. This needs to be tracked as it relates to the other options presented; if a user doesn't click something else after this click, there is a problem...
6. Sharing destination links - Taking a widget and putting it on your blog, personal start page or social network represents the highest level of commitment to content. This shows more than just an interest in the content, but an interest as well in the editor or publisher of the widget.
7. Create your own widget link - questionable whether this should be considered an interaction loop - it is a shortcut to NewsGator's signup form for the widget framework.
8. Send widget to a friend link - like the email link, this shows recommendation for both the content and the publisher
9. Send widget to a friend message - users who receive this message are shown the latest headlines from the widget as well as the sharing destination links, encouraging interaction with the added benefit of a referral from a friend.
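The tracking concern raised for the Get This link (item 5) can be sketched as a simple drop-off measure. This is an illustrative sketch only: the event names and the session format are hypothetical, not taken from NewsGator's widget framework.

```python
# Hypothetical event names for the options that sit behind the Get This link.
FOLLOW_UPS = {"share_destination", "create_widget", "send_widget"}

def get_this_dropoff(sessions):
    """Fraction of 'get_this' clicks that are not followed by any follow-up click.

    sessions: list of sessions, each an ordered list of event names.
    """
    opened = 0
    abandoned = 0
    for events in sessions:
        for i, event in enumerate(events):
            if event != "get_this":
                continue
            opened += 1
            # Look only at what the user did after this particular click.
            if not any(later in FOLLOW_UPS for later in events[i + 1:]):
                abandoned += 1
    return abandoned / opened if opened else 0.0

sessions = [
    ["rate", "get_this", "share_destination"],
    ["get_this"],
    ["email"],
]
print(get_this_dropoff(sessions))
```

A high drop-off here would suggest that the options presented after the click are not what users expected to find.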
These are just a few of the interaction loops that are possible. How you present these loops should reflect the targeting goals of your widget. Another discussion for a later blog post is to explore the value of each of these loops, as well as the monetization options for each.
Do you have a perspective on this topic? Let us know in the comments!
Why Use a Desktop Feed Reader
11/1/2008 external link
I was going to write a quick post about why web-based readers don't work very well within the enterprise, then I noticed that Nick already wrote that.
Most web-based readers (NewsGator's being an exception) can't subscribe to secure feeds. I don't know about you, but that's a show-stopper for me - I have a number of password-protected feeds that I absolutely have to keep track of.
Web-based readers can't access "behind-the-firewall" feeds. For example, we have an internal server which runs FogBugz, and I'm subscribed to several FogBugz feeds which alert me to problem reports and inquiries regarding my software. I can't add these critically important feeds to a web-based reader.
Most web-based readers offer no offline support, and even when they do, offline reading is still far better in FeedDemon (this screencast shows why). FeedDemon doesn't just download your articles so you can read them offline - it can also prefetch the images they contain and the pages they link to, enabling you to browse the web without an Internet connection. Your web-based reader can't do that. This is one of those features that you don't think you'll need - until you do.
Many desktop readers are full-fledged web browsers, complete with access to your favorites, tabbed browsing, etc. In fact, FeedDemon is my web browser - I rarely use an external browser anymore. If you haven't used a browser that's also a powerful RSS reader, you're missing out.
If you live in Microsoft Outlook, you can use an RSS reader like NewsGator Inbox which integrates with Outlook, complete with flagging, indexing, filtering, archiving, and all the other features Outlook power users rely on.
Desktop readers have access to local resources, enabling a slew of features that aren't available in web-based readers. For example, desktop readers can integrate with your favorite blogging client, or download podcasts and copy them to your iPod or WMP device. NetNewsWire even integrates with iPhoto, Twitterrific, Mail, and iCal.
Desktop readers give you a choice about which feeds to keep completely private. Want your reading habits regarding a subset of your FeedDemon subscriptions kept completely on your local computer? Just put them in a folder that's not synchronized.
And of course, speed is often another benefit. Web app performance has become a lot better over the past few years, but we're not at the point where JavaScript in the browser can compete with native performance :)
NewsGator Makes Client Apps Free
9/1/2008 external link
NewsGator announced today the general availability of NetNewsWire 3.1, FeedDemon 2.6, and NewsGator Go! for Windows Mobile 2.0. The public beta of NewsGator Inbox 3.0 also began today. The award-winning products for PC (FeedDemon), Mac (NetNewsWire), Microsoft Outlook (Inbox), and mobile (NewsGator Go!) deliver a best-of-breed RSS reading experience that synchronizes through NewsGator’s online platform. All of the new product versions deliver a better user experience with the inclusion of significant performance, usability, and relevance enhancements.
NewsGator also announced that all of its client RSS reader products are now available free of charge and include free synchronization and other services. Users can now enjoy the great features and performance of all the web and desktop readers as well as free mobile options for iPhone, Windows Mobile, and BlackBerry (powered by FreeRange) all synchronized to provide the same view of their RSS content no matter where they read. Enterprise customers will continue to enjoy the extended value of having all these clients synchronize with NewsGator Enterprise Server. The combination of innovation in client reader features and the ability to leverage core platform data and capabilities have made NewsGator products dominant in the enterprise.
“It’s all about ubiquity,” said Greg Reinacker, CTO and founder. “We have over 100 Fortune 2000 companies using our enterprise server and client products. In selling to these enterprises we discovered that thousands of knowledge workers were already using one or more of our client products and we learned that we could drive the relevance of everyone’s experience by using the community’s anonymous content consumption patterns throughout the system. In general, we found that the more people that used our system, the more relevant we could make the product for each user. By making it easier for knowledge workers to use our clients we drastically increase the size of our user community. Enterprises that then deploy our server can take advantage of the synchronization and increased relevance for every user supported by the system. Likewise, we can extend these capabilities to our online platform which currently serves well over one million consumers and indexes seven million new articles per day. The result is tremendous value and continued innovation for both consumers and enterprise users.”
JB Holston, CEO and president, said “We are uniquely positioned to drastically expand the community of people working with RSS content. The larger that community, the more valuable the experience for every user and for all of our services customers. That insight has led us to undertake the significant investment to make all of our client products free. Our 50 media and online publishing customers can now take advantage of this valuable intersection of community and content. With services such as our widget framework, these customers can syndicate the most relevant content, on a branded basis, to a much wider audience. In addition, we are rapidly growing our Software as a Service (SaaS) business by partnering with brands, agencies, and information service companies who can use attention and relevance metadata, provided by the NewsGator platform, to offer much richer and more relevant content solutions to their audiences. Look for more announcements about these efforts in the coming weeks.”
[JN update: I had the wrong quote from Mike in the first version, I updated this graph] Mike Gotta, Principal Analyst with Burton Group, said, “Burton Group has long been focusing on the evolution and business impact of RSS platforms in the enterprise. Making readers more available for all workers inside an organization makes it easier for people to participate in an RSS platform and makes it more possible to deliver a consistent feed reading experience across multiple application contexts. Not only is this a usability and productivity benefit for workers but the enterprise wins as well since the underlying RSS platform is able to capture and analyze relevant data on a much broader basis. Wider adoption of its free clients will likely generate a tremendous amount of relevance information, which can be used to augment systems installed at customer sites.”
The improved relevance information in NewsGator’s online platform will drive further innovation within each of the RSS clients. “When I joined NewsGator,” said Nick Bradbury, creator of FeedDemon, “I had a vision of FeedDemon leveraging the power of NewsGator’s platform to give PC users a better reading experience. Making all the clients free helps make that platform even richer and more powerful – I’m very excited about the possibilities and have some great features planned for the next release of FeedDemon.”
“The NewsGator platform enhancements will also enable great new features for Mac users,” said Brent Simmons, creator of NetNewsWire. “Syncing with the NewsGator platform already makes NetNewsWire faster and lets users read their feeds on an iPhone. With these enhancements it will be even easier to recommend the best feeds and the coolest stories in NetNewsWire – creating a better user experience for both consumer and enterprise users.”
All client products are available for free download from the NewsGator website (www.newsgator.com). Additional information is available via this FAQ.
2008 : The Year of RSS
2/1/2008 external link
Marshall Kirkpatrick penned a good post just before Christmas on 2007, highlighting the year in RSS, which made me consider the year ahead in terms of both consumer and enterprise RSS adoption. I polled some of our leading thinkers on the subject and here's what came back:
Portal Plumbing. Using RSS / Atom as a way for backend systems to funnel information (both publishing and retrieving) into single access points that are easy to use and easy to manage. Users won't need to know why all the information they find important is on one page. This is a no-brainer in many ways, and it reflects what is happening on the consumer side of the business. A large segment of users are taking advantage of start pages like iGoogle and Netvibes to be their own aggregator, and are using RSS to accomplish this. There is no reason why this should not also happen behind the firewall.
RSS will also drive more of the backend of social networks. Users are far better filters than any software - so finding relevant information for (and from) various nodes of your social graph will become even more important. RSS will be the transfer protocol between yourself and your social networks.
Ease of use will be greatly enhanced with discovery and filtering mechanisms to help you find new content and sort/organize the feeds you already subscribe to. People simply don't say "I need more content", but they do consistently say "I need better content".
The debate about privacy is far from over. As more attention-based services that respond to your defined preferences and observed behaviors emerge, the question will again be asked about opt-out or opt-in by default.
Publishing your reactions to this information via Atom publishing (or maybe some other unknown protocol) will become more important as well within social networks. A widely adaptable comment publishing protocol will emerge that would allow me to comment on an item, no matter where I read it, and have my comments visible to anyone else reading that item no matter where they decide to read it. My comments could also be mined for category information and push the original item into more relevancy for others in my social network.
Within the enterprise, the use of authenticated feeds to access transaction and master data systems will rise. Some of these feeds will be user-oriented (e.g. my sales leads) while others will be persistent search based (e.g. a feed for all customer support issues related to a specific customer).
Whatever is in store for us in the year ahead, RSS as an infrastructure technology is achieving critical mass. In the blogosphere we take for granted the omnipresence of RSS but the vast majority of the market is still untapped and ripe for disruption.
----------------------------------------------------------------------------
Jeff Nolan is VP Corp Dev for NewsGator Technologies.
Merry Christmas
21/12/2007 external link
From all of us at NewsGator, MERRY CHRISTMAS! We hope that you and your families enjoy a peaceful and happy holiday.
Open Social Developers Journal - Show Me the Money
29/11/2007 external link
Sometimes it’s nice to be old. Well, it’s at least nice to be able to remember mistakes so that we can hopefully avoid repeating them. Back in the “Web 1.0” days, there was a period of euphoria where everything related to the internet was certain to be incredibly profitable. And now we are in the era of “Web 2.0”…So how does anyone make money in the OpenSocial world? We’ll be chatting more about the business perspective for this in an upcoming webinar, but let’s just focus on the developer’s side for a moment here. Advertisements stand out as the clearest path to dollars in most scenarios. What does it take to include advertisements in an OpenSocial Gadget?
First, here’s the world’s shortest primer on web-based advertising. Ads make money when the end user sees them or clicks on them. The ads that make the most money tend to match the other content on the page (e.g. put sports memorabilia ads next to sports news articles) and/or target characteristics of the user (e.g. offer Denver Broncos tickets to a person who lives in Denver).
Since most social sites require a login, the ad networks can have trouble spidering the pages to know what the page is really about. Some networks have options (which are not available to all accounts) that allow the request for an ad unit to say what the content is about (“this page is about automobiles – send me car ads”). Of course, this only works if the application somehow has insight into what the page is displaying. Clearly an application should know what kind of content it is showing. If I write a dating gadget, it’s probably safe to request ads about dating (hold that thought for a moment…)
Targeting by user seems really intuitive for social gadgets. Given the success of Facebook, several ad networks have appeared that focus exclusively on these opportunities. It seems likely enough that someone will have a fairly robust offering. Most likely it ends up being a “least common denominator” offering like Lookery which just uses Age, Sex, and Location. But what happens when a container site can’t tell the application the user’s gender? And whose problem will it be to translate all of the possible ways that “location” might be described in different social sites into something the ad network understands?
In addition, OpenSocial has the concept of the user and a visitor. When I browse my friend’s profile, I’m a visitor. But the application doesn’t have any rights to get my profile information, so what kind of ad should be displayed in that situation?
But the real challenge here is understanding the rules of the container site. If I make an application that’s about dating, and a dating site decides to support OpenSocial applications, I really doubt they would like to see ads for their competitors showing up in my application on their site. Today there’s no API that conveys any rules for advertising. Pragmatic social sites will probably screen every application to see if it meets their standards. But what prevents developers from modifying the application later? And just how quickly will social sites review applications?
Remember the early days of getting an application into the Facebook directory. Right now, I’m waiting for one of the OpenSocial launch partner sites to respond to my request to add Didja Hear to their approved list.
Advertising within the gadget isn’t the only path to money for OpenSocial Applications. But other monetization options like affiliate sales programs or cross-promotion fees for other application developers have both technical and business challenges as well.
In many ways, this is very similar to the interior decorator problem. Application developers can build unique, highly-customized versions of their applications for each target social site. Or they can take their chances with some very basic advertising strategy which will work everywhere (but probably not perform well anywhere).
The ideal solution seems like it would be an ad network designed specifically for OpenSocial. It would need a really large inventory of ads along with an API that would allow container sites to express some basic concepts (including forbidden advertisers and targeting rules). It’s a good bet there are some bright minds considering this problem right now…
I’m not old enough to have experienced the 1849 gold rush, but I’ve heard that the best money was made in supplying the prospectors. Clearly in the “web 1.0” world, a lot of companies profited by solving the difficult problems in a new emerging landscape. There’s gold in these OpenSocial hills, but we haven’t yet seen who’s going to profit the most from it.
P.S. While we’re waiting for the container sites and APIs to get stable, you can check out this page to catch a quick screencast about Didja Hear!? We’ll be posting links and instructions on how to add Didja Hear to different social sites on this page as well.
Brian Kellner is VP of Products at NewsGator


