Category: SpokenWord.org

The Amazon Web Services (AWS) Outage

Like many other sites hosted on AWS, all of The Conversations Network’s websites went down at 1:41am PDT on April 22, 2011. It would be 64.5 hours until our sites and other servers would be fully restored. A lot has been written about this outage, and I’m sure there’s more to come. Don MacAskill, another early adopter of AWS, has posted a good explanation of SmugMug’s experiences during the outage. Phil Windley and I are hoping to interview our friend Jeff Barr from AWS for Phil’s Technometria podcast once the dust has settled at Amazon.

Many pundits have suggested this event highlights a fundamental flaw in the concept of cloud computing. Others have forecast doom and gloom for AWS in particular. I disagree with both arguments. While it certainly was the most significant failure of cloud computing to date, I predict this event will become not much more than a course correction and a “teachable moment” for Amazon, their competitors, all cloud architects and of course us here at The Conversations Network. For the geeks in the audience, I’m going to describe our architecture, the AWS services we utilize, and give a bit of an explanation about what happened and what we learned.

The Conversations Network utilizes three basic AWS services, plus a few more that aren’t really pertinent to this episode. Our servers are actually instances of AWS Elastic Compute Cloud (EC2) servers. The root filesystem for each server is stored in a small (15GB) AWS Elastic Block Store (EBS) volume. Not only are these volumes faster than local storage, they’re also persistent. So if/when an EC2 instance stops, the root filesystem for that instance remains intact and will continue to be usable if the instance is re-started. [EC2 instances are booted from Amazon Machine Images (AMIs). In our case, these are based on Fedora 8 (Linux) customized to our standards. The AMIs are identical for all our servers, but the EBS root filesystems, which change dynamically once a server is booted, are unique to each server.]

We also use EBS volumes for non-relational storage. For example, we have one large EBS volume for IT Conversations and other podcast filesystems. This holds all the audio files and images used on the website. We have another for SpokenWord.org, and so on. These EBS volumes are each mounted to one EC2 instance, which in turn shares them with the other servers via NFS. Finally, we use the Relational Database Service (RDS) for our MySQL databases. Like EBS, this is a true service as opposed to a “box” or physical server.

One very important feature of EBS is that you can take snapshots at any time. For example, we make a snapshot each night of each EBS volume. We keep all snapshots of all volumes (other than the EC2 root filesystems) for the past seven days, plus the weekly snapshots for the past four weeks and the monthly snapshots for the past year. The cost of keeping a snapshot is based only upon the incremental differences since the previous snapshot, so it’s quite a reasonable backup strategy even for large volumes so long as they don’t have changes that are both major and frequent.
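The retention rule above (dailies for a week, weeklies for four weeks, monthlies for a year) can be sketched in a few lines of Python. This is only an illustration, not the script we actually run: the function name is hypothetical, and it assumes that the "weekly" snapshot is the one taken on a Sunday and the "monthly" snapshot is the one taken on the 1st of the month.

```python
from datetime import date, timedelta


def snapshots_to_keep(snapshot_dates, today):
    """Return the nightly snapshot dates worth keeping, oldest first."""
    keep = set()
    for d in snapshot_dates:
        age = (today - d).days
        if age < 0:
            continue  # ignore anything dated in the future
        if age < 7:
            keep.add(d)  # every nightly snapshot from the past seven days
        elif age <= 28 and d.weekday() == 6:
            keep.add(d)  # assumption: Sunday's snapshot is the "weekly"
        elif age < 365 and d.day == 1:
            keep.add(d)  # assumption: the 1st of the month is the "monthly"
    return sorted(keep)


if __name__ == "__main__":
    today = date(2011, 4, 22)
    nightly = [today - timedelta(days=n) for n in range(400)]
    for d in snapshots_to_keep(nightly, today):
        print(d.isoformat())
```

Anything not returned by the function would be a candidate for deletion. A real script would also have to confirm that every volume is on the list in the first place.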

Designing any server architecture, cloud-based or otherwise, requires that you consider the failure modes. What can fail? What will you lose when that happens? How will you recover? Automatically or manually? How long will recovery take for each failure mode? It’s not about eliminating failures — you can’t really do that. Rather, it’s about planning to deal with them. And like traditional architectures, the cost of the configuration increases geometrically as you increase the reliability (ie, decrease the amount of time it will take to recover from a failure).

We’ve been using AWS for more than four years. During the period when IT Conversations was part of GigaVox Media, we were the basis of one of the first case studies published by Amazon. [Here's a diagram of one of our AWS-based configurations.] Because The Conversations Network (a non-profit) runs on a shoestring budget and can’t afford the level of redundancy deployed by some commercial enterprises (eg, SmugMug), we’re not looking for a particularly high-reliability architecture. Until last week, we’ve had EC2 instances that haven’t stopped in well over a year. We can’t tolerate any significant loss of data so we need the redundant storage of EBS, but a 99.9% uptime is good enough for us, and that’s what we’ve had from AWS until now. Because of our experience with the high reliability of AWS, we have never gotten around to automating the re-launching of EC2 instances in case of failure. We do use two separate monitoring services, and there are two of us (me and Senior Sysadmin Tim) who are capable of restarting servers, etc., if something does go wrong.
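To put rough numbers on that 99.9% figure, here is a back-of-the-envelope calculation in Python. The function names are hypothetical, and the arithmetic assumes a 365-day year with a single 64.5-hour outage and no other downtime.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_target):
    """Hours of downtime per year that still meet the given uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_target)


def availability(downtime_hours):
    """Fraction of the year a service was up, given its total downtime."""
    return 1 - downtime_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"{allowed_downtime_hours(0.999):.2f} hours allowed per year at 99.9%")
    print(f"{availability(64.5):.2%} availability with one 64.5-hour outage")
```

In other words, 99.9% leaves a budget of under nine hours a year, and one outage of this length brings the year down to roughly 99.3%.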

AWS operates in five regions around the world. We happened to pick US East in Virginia instead of US West (northern California) for no particular reason. Within each region there are multiple physical locations called availability zones. These are probably separate data centers within a metropolitan area. The availability zones within a region are connected by very high-speed fiber. This means you can have some degree of geographic redundancy by deploying servers in multiple availability zones, or achieve even greater protection by also deploying duplicate systems in multiple regions. The latter is far more complex, since the connectivity between regions is not as good as between availability zones. Our needs are humble, so all of The Conversations Network EC2 instances, EBS volumes and RDS databases are located in the us-east-1a availability zone. And of course, that’s where last week’s failures occurred.

Amazon hasn’t yet said what the original failure was. All of our EC2 instances were running and they could communicate with the RDS databases. I think the problem might have been the association between the EC2 instances and the EBS volumes. The volumes used as root filesystems were reachable, but not the others that contained our site-specific files.

After a few hours of downtime, I decided to re-boot our EC2 instances and that’s when things went from bad to worse. All of our EC2 instances entered the Twilight Zone. They were stuck in the “stopping” state. The operating system halted (no SSH access) but the servers didn’t release their EBS volumes. I could have launched all-new EC2 instances, but I wouldn’t be able to connect them to the volumes and hence, no websites.

Because of our backup strategy, however, we did have one more option: We had snapshots of our EBS volumes. I could have created all-new EBS volumes from the daily snapshots, and I could have done so in a different availability zone to get away from the problems. But there was one gotcha. We make the backup snapshots at 2am Pacific time each night. The failure occurred 19 minutes before that, which means our snapshots lacked the most-recent 24 hours of activity: new programs, audio and image files, logs, etc. As with the few previous problems we’ve had with AWS (mostly of our own making) we thought this outage would be fixed quickly. It was a tradeoff: It seemed better to wait an hour or two rather than to re-launch with day-old data.

Of course “an hour or two” dragged on. Soon the outage was 24 hours old; then 48. It always seemed that the fix was imminent, so we delayed the restart process. Eventually, we decided to go ahead, and that’s when we discovered our one real mistake. Remember that we make snapshots of our EBS volumes every night? Well it turned out that we weren’t making those snapshots of all of our volumes. There was one volume that we somehow missed. The only snapshot we had of that volume was from the date it was created, more than a year ago. That means we would have had to launch our sites with some very old data. In this case, when we finally got access to the most-recent data (on the in-limbo EBS volumes) it would be difficult to reconcile it all. In the end, we decided just to wait it out. Finally, after 64.5 hours, the one EC2 instance that was holding hostage our last EBS volume stopped. We were then able to re-attach that volume to a newly-launched instance. We brought up all-new EC2 instances, attached all the then-current volumes and we were up and running, still in availability zone us-east-1a.

So what did we learn from all this? We re-learned that you have to think through these architectures carefully and understand the failure modes. But most importantly, we learned that once you have a good plan, you have to follow through with it. If we had been making nightly snapshots of that one remaining EBS volume all along, we would have been able to re-start the websites with day-old data at any time, regardless of the problems AWS was having disconnecting EBS volumes from running EC2 instances.

I also have a new strategy for deciding when to stop waiting for AWS to recover and instead switch to the snapshots: Once the length of the outage exceeds the age of the backups, it makes more sense to switch to the backups. If the backups are six hours old, then after six hours of downtime, it makes sense to restart from backups. In this case, we should have done that after the first 24 hours.
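That rule of thumb is simple enough to write down as code. The sketch below is illustrative only; the function name is hypothetical, and it treats "exceeds" as "is at least as long as", to match the six-hour example.

```python
from datetime import timedelta


def should_restore_from_backup(outage_length, backup_age_at_failure):
    """True once the outage has lasted as long as the backups were old
    at the moment of failure, i.e. waiting has cost as much as restoring."""
    return outage_length >= backup_age_at_failure


if __name__ == "__main__":
    # Our case: the last snapshots were about 23 hours 41 minutes old.
    backup_age = timedelta(hours=23, minutes=41)
    for hours in (2, 12, 24, 64.5):
        outage = timedelta(hours=hours)
        print(hours, should_restore_from_backup(outage, backup_age))
```

The point of the rule is that it can be decided in advance, rather than in the middle of an outage when the fix always seems imminent.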

But we still know we don’t have ultimate redundancy: We still have to re-start things manually. So long as we accept the downtime, we can survive the total failure of the us-east-1a availability zone and even the entire US East region. That’s because all EBS volumes are first replicated to multiple availability zones within the region, and our nightly snapshots are stored in Amazon’s Simple Storage Service (S3), which is replicated across multiple regions. So our current data can survive a failure within a region and our day-old data can survive a failure of our entire region.

We still have a few things to clean up and repair from this experience, but all-in-all we remain fairly happy with how things turned out. We didn’t, after all, lose any data. And while we aren’t proud that our sites were down for nearly three days, the world as we knew it did not come to an end. Maybe our team is even glad to have a few days off. (Too bad we couldn’t have told them in advance.) We still have one EC2 instance that refuses to stop, but it’s one of those that used NFS to reach EBS volumes attached to another server. Amazon says “We’re working on it.” Other than that, we’re now better prepared for the next failure, so long as it’s just like this one. Actually, I think we’re in pretty good shape for most events I can foresee. As for AWS, it continues to be a great platform for us.

Curators Wanted: SpokenWord.org

Over the past two months we’ve been discussing the future of SpokenWord.org with our advisors, directors and members. We now have a new plan for SpokenWord.org and we need your help.

The web is awash with audio and video. There are great programs out there, but they’re just too hard to separate from the noise. We created SpokenWord.org because we wanted to help people locate the best podcasts, videos and slideshows. We got the basics right — topics and collections — but our homepage in particular isn’t discriminating enough. Literally every five minutes we display the latest programs in each topic, but they’re not filtered. There’s little sense of what’s worth watching or listening to as opposed to just being “new”.

What’s missing is the human touch. For example, I’ve recently become obsessed with photography, and I’ve been looking everywhere for the best podcasts and videos to help me learn more. Along the way I’ve had to work my way through all sorts of junk in order to find the good stuff. If only there were a photography guru who would take the time to find the best podcasts and individual episodes for me. That would be awesome.

So that’s what we’re doing in SpokenWord.org 2.0. We’re building a team of expert curators, each with his or her own specialty. These curators will find the very best audio and video programs and use SpokenWord.org to present them to you. These curators and their collections will be the primary feature of our website.

Is there a topic you’re particularly passionate and knowledgeable about? Would you be willing to share your expertise by maintaining a curated list of feeds and episodes for SpokenWord.org? Would you like to become one of our curators?

There’s no monetary compensation for your effort, but I think you’ll be rewarded by the appreciation you receive and the credibility you’ll gain within your niche. We’re going to work hard to spread the word about SpokenWord.org and our curators, and I think being the SpokenWord.org curator for a particular topic will eventually carry some real weight.

We’re still early in the process of implementing the website features to support this new concept. In fact, the concept itself is still evolving. If you’re interested either in becoming a curator or just participating in the discussion of how our curation system will function, please join the brand-new Google Group dedicated to SpokenWord.org curation.

We’ll soon have a way for you to formally apply to become a curator, but for now, joining the discussion is the best way to get involved.

Taking a Step Back

IT Conversations will be seven years old in three weeks, and as often happens at this time of year I find myself taking a step back from the day-to-day issues surrounding The Conversations Network to try and see the big picture. Where are we and where are we going?

I’ve published the Annual Report and assimilated the results from our annual survey of members as I do every year, but those only address the mostly tactical issues (How well are we doing what we’re already doing?) as opposed to the more strategic ones (What should we be doing?).

This time around I’m going to go through the process more publicly than usual, partly because blogging about it helps me organize my thoughts, but mostly because I want to get input from as many people as possible.

When I started IT Conversations in 2003 virtually no one else was posting free audio recordings of conferences, events and interviews. It was relatively hard to do, so I had to invent many of the tools, processes and even a suitable content-management system for high-volume audio post-production. Over the years this became known as podcasting and hundreds of thousands of people learned how to do it.

Two years ago with help from our Boards of Advisors and Directors I realized that podcasting and video had become so easy and ubiquitous that the needs of the larger community had shifted from “How do you do it?” to “How do you find it?” The discussions that followed led to the creation of SpokenWord.org, our site for finding and sharing audio and video podcasts.

But while SpokenWord.org now has metadata for over 640,000 audio and video programs from nearly 7,500 RSS feeds, it hasn’t really caught on in the way that IT Conversations did in those early years. Ask most geeks, and they’ve probably heard of IT Conversations. But aside from our 4,000+ registered members, virtually no one has ever heard of SpokenWord.org. Sure, we haven’t done much to promote it, but neither did we do so for IT Conversations. SpokenWord.org just isn’t solving a big enough problem for enough people to make it worth our users’ time and effort to tell someone else about it.

Taking stock, what are our assets and our strengths?

  1. We have an excellent team of 35 (active) part-time writers, producers and audio engineers who create IT Conversations, Social Innovation Conversations and CHI Conversations, and good processes for recruiting, training and management.
  2. We have excellent processes and technology for audio post-production, task allocation, content management and automated show assembly.
  3. We have a good metadata directory for audio/video programs and feeds with personal-collection features (SpokenWord.org).
  4. We have an archive of 2,500 of our own programs.
  5. We do this all for less than $35,000 per year.

And weaknesses?

  1. The growth of podcasting (not just ours) is flat.
  2. SpokenWord.org has a very small user base and in its current form isn’t solving any big problems.

Don’t get me wrong. The Conversations Network’s channels are the best podcasts on their topics and SpokenWord.org is a terrific resource for those who do use it. But I believe we can (and should) do a lot more with what we have.

The Conversations Network is a 501(c)(3) non-profit, which implies a mission to benefit the public. So the question to you (staff, listeners, members and readers) is: What should we do next to continue that mission? I’ve got my own ideas, but I want to hear from you first.

SpokenWord.org — Freestyle

Our annual survey of SpokenWord.org members included five essay-style questions. Here are some of the answers that don’t necessarily correlate with any consensus; they’re just the most interesting.

“How can we improve SpokenWord.org? (What’s the one thing you wish we did that we don’t already do?)” (53 answers)

  • “Most popular” (today, this week, this month, ever) by category is a plus. [There was some consensus on this idea of per-category most-popular lists.]
  • I find it confusing for reasons I can’t articulate. It’s not crystal clear exactly what I’m supposed to do. [I sense that's true for many first-time visitors.]
  • Ogg Vorbis content encoding option [We don't control the encoding; that's up to the publishers.]
  • It would be great if SpokenWord.org could offer files from the Internet Archive. [Yes, we need to re-visit that idea.]
  • Sync with any mp3 player. [We've published extensive APIs with the hope that others will pick up this ball and run with it.]
  • Make discovery easier. I would also like a feed or a page that shows all new programs. [From many questions like this I get the feeling that people don't realize that we get thousands of new programs every day.]

“What do you like most about SpokenWord.org?” (62 answers)

  • The variety of content. [By far the most common response.]
  • That it provides an open, public place to archive ratings data on podcasts.
  • The ability to simplify the process of managing podcasts and subscribe to only a few collections in iTunes.
  • One stop shopping and not iTunes-centric. ["Not iTunes" shows up frequently.]

“If you were running SpokenWord.org, what would you do to increase the number of people who use it?” (49 answers)

  • Advertise [No budget!]
  • Joint programs with schools, colleges and other educational institutions (younger people have larger social networks)
  • Redesign the homepage.
  • Try and get some influential technologists using it, such as Leo Laporte, Patrick Norton, Dave Winer, etc.
  • Make it easy to post programs and collections on Facebook and Twitter.

“The Conversations Network is a U.S. 501(c)(3) non-profit public-benefit corporation. How can we make SpokenWord.org better fulfill its mission of service to the community?” (30 answers)

  • Maybe some collaboration with PBS and NPR.
  • Introduce it to other non-profit organizations that are doing a Podcast.
  • Create an educational hub similar to iTunes U.

“Anything else you want to tell us?” (35 answers)

  • Keep up the great work, and thanks for all you do.
  • No

SpokenWord.org — The Features

We asked SpokenWord.org listeners to rate the various features of our service, and here’s what we learned:

Most Important

  • Browsing by category (2.8 out of 3.0)
  • Finding individual episodes (2.6)
  • Finding new RSS feeds (2.5)
  • Personal Collections (2.4)

Helpful

  • Ratings (2.4)
  • Tags (2.4)
  • “Most popular” lists (2.3)
  • Improve the website design (2.2)
  • Automated recommendations (2.1)

“Don’t Need It”

  • iPod/iPhone integration (2.2)*
  • non-Apple device integration (2.2)*
  • A mobile-device version of the web site (2.0)
  • “Send to a friend” (1.7)
  • More screencast tutorials (1.7)
  • Following others’ collections (1.7)
  • Widgets for blogs (1.5)
  • Post to Twitter and/or Facebook (1.4)

* It might have been better to combine these into a choice for “mobile device integration.” Since our listeners are 50/50 Apple and non-Apple, the combined interest in mobile-device integration might have been quite high.

It seems that sharing and other social-networking features are among the least important to our listeners, whereas features for personal use rank quite highly.

We should also note that this is a survey of those who are registered for SpokenWord.org, and most likely those who find some value in the service as-is. Therefore these criteria are not necessarily the same as what might attract new users with different preferences.

SpokenWord.org — Your Habits

Following up on my first post on the SpokenWord.org annual survey results…

  • 43% of respondents have created at least one collection. Half of them are actively using collections today.
  • 7% frequently rate programs or feeds. Another 35% do so, but rarely.
  • Listening/watching is done via:
    • Android devices (3%)
    • iPhones/iPods (56%)
    • iTunes on computers (37%)
    • other portable devices (45%)
  • 6% of respondents are paid members of The Conversations Network.
  • 8% have donated to The Conversations Network.
  • 1% have donated specifically to support SpokenWord.org.

SpokenWord.org — The Survey

We’ve just completed our annual survey of SpokenWord.org listeners and starting today I’ll be reporting some of the results here on Blogarithms. Overall:

  • We emailed a link to the survey to 3,176 registered members of SpokenWord.org.
  • 250 (8%) of those clicked through to the survey.
  • 174 started the survey.
  • 147 completed the survey.
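For anyone who wants to check the arithmetic, the response funnel above can be worked through in a few lines of Python. The counts are the ones just listed; the function and variable names are only illustrative.

```python
FUNNEL = [
    ("emailed", 3176),
    ("clicked through", 250),
    ("started", 174),
    ("completed", 147),
]


def funnel_rates(steps):
    """For each step, return (name, count, share of the first step,
    share of the previous step)."""
    total = steps[0][1]
    previous = total
    rates = []
    for name, count in steps:
        rates.append((name, count, count / total, count / previous))
        previous = count
    return rates


if __name__ == "__main__":
    for name, count, of_total, of_previous in funnel_rates(FUNNEL):
        print(f"{name:16} {count:5}  {of_total:6.1%} of emailed  {of_previous:6.1%} of previous step")
```

The 8% click-through rate is 250 out of 3,176. Measured against everyone we emailed, the completion rate works out to a little under 5%.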

“How important are…?” On a scale of 1 (Not for Me) to 3 (Important):

  • Audio (2.83) 86% said “Important (3.0)”
  • Video (1.91)
  • Free Audiobooks (2.29)
  • Paid (Audible.com) Audiobooks (1.70)

The ratio of audio/video is expected, but I was surprised to see the ratings of both free and paid audiobooks.

“Have you ever watched or listened to…?”

  • Public radio (70%)
  • Free audiobooks from Librivox.org (38%)
  • YouTube.edu (31%)
  • Paid audiobooks from Audible.com (27%)
  • Fora.tv (25%)
  • Free audiobooks from Podiobooks (20%)
  • WGBH Network Forum (11%)

Again, the surprise for me is the high percentage of listeners to both free and paid audiobooks.

The APIs are Here!

It may not be as exciting as when Steve Martin discovered “The new phone books are here!” in the 1979 Carl Reiner film The Jerk, but we are starting to roll out full APIs for SpokenWord.org. It’s a RESTful interface and the first flavor of response formats is JSON, so it should be easy to use from any programming language. (We plan to offer XML responses as an option if enough developers complain about JSON.)

If you’ve used the Twitter APIs, you’ll see that we modeled ours after theirs in many ways. We also took the idea of a Remote Key for authentication from FriendFeed. (OAuth is coming soon.) The initial methods allow you to set and get ratings of programs, feeds and collections and to retrieve extended metadata about individual programs. We’ll be publishing new methods very quickly, but we’re anxious to get feedback from developers before we go too far. The full API documentation is available online. If you have comments, questions, suggestions or bug reports about the new APIs, post them to our API Forum or join our API Mailing List.
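To give a feel for what a call might look like, here is a rough Python sketch that builds (but does not send) a request to set a rating. Treat everything specific in it as an assumption: the base URL is a placeholder, the path and JSON field are invented for illustration, and it assumes the Remote Key is sent as the password in HTTP Basic authentication. The API documentation is the authority on the real method names and parameters.

```python
import base64
import json
import urllib.request

# Placeholder only; this is NOT the real SpokenWord.org API endpoint.
API_BASE = "https://api.example.org"


def build_rating_request(username, remote_key, program_id, rating):
    """Build an authenticated PUT request that would set a program's rating.

    The URL layout and the "rating" field are hypothetical.
    """
    url = f"{API_BASE}/programs/{program_id}/rating.json"
    body = json.dumps({"rating": rating}).encode("utf-8")
    # Assumption: username plus Remote Key via HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{remote_key}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    req = build_rating_request("doug", "my-remote-key", 12345, 4)
    print(req.get_method(), req.full_url)
    # Actually sending it would be: urllib.request.urlopen(req)
```

Because the interface is RESTful and returns JSON, the equivalent in any other language should be about as short.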

Special thanks go out to all of those who have participated on that list to help us design a set of APIs that people will actually use.

SpokenWord.org Town Hall Conference Call

Interested in what’s coming from SpokenWord.org? Want to participate in the discussion? Join us for a conference call on Thursday:

August 27, 2009
Noon Pacific Daylight Time
Phone Number: +1.724.444.7444
Call ID: 18232

It’s best to access via the TalkShoe web site if you want to speak or ask questions: http://www.talkshoe.com/tc/18232

Our agenda will include:

  • new-feature planning
  • APIs
  • traffic-building activities

I hope you can join us on Thursday. If not, we’ll of course be making an MP3 recording available. Can’t make it? Email your questions or agenda items in advance: doug@conversationsnetwork.org

The Book Oven

Hugh McGuire and I have met only once, but we immediately recognized in each other similar ambitions, motivations and values. While I was building The Conversations Network, Hugh was doing the same for LibriVox. (Thanks to Jon Udell for introducing us.) And for the past year, as I was working on SpokenWord.org (with Hugh’s help as an advisor) he was creating an excellent new site: the Book Oven.

If you’re involved in any aspect of publishing (as a writer, editor, proofreader, small publisher, designer or agent) you need to check this out. After successfully publishing more than 2,500 audiobooks on LibriVox, Hugh refers to the new project as “cloud-based publishing.” Crowdsourcing itself isn’t new, but the Book Oven promises to apply crowdsourcing to all aspects of publishing. The first component is called Bit-Sized Edits: sort of a mesh of Nathan McFarland’s CastingWords (based on Amazon’s Mechanical Turk) and the reCAPTCHA project.

As Hugh admits, the Book Oven is just getting started, but there’s already enough there to make it worth your time to visit and get involved. If you’re in the publishing world, you’ll want to be part of the Book Oven from the beginning.

A Great New Feedback Widget

Two weeks ago we posted a simple survey asking for your input on new directions for SpokenWord.org. This week we’ve gone a big step further and given you direct access to our to-do list. Not only can you vote for ideas already on the list, but you can also add your own. (And we take those votes seriously.)

Look for the new red “feedback” button on the left edge of every SpokenWord.org page. From there you can go to the Feedback Forum (the to-do list) or report a bug.

OK, so many of you did this just two weeks ago, but this is a better opportunity to see what we’re thinking and to express your opinions in far more detail. Thanks to UserVoice for this great new service.

Shortcut to the Feedback Forum

More ID Partners for SpokenWord.org

Thanks to a terrific service called RPX from JanRain, I’ve rewritten all of the code on SpokenWord.org for third-party identity providers. In addition to OpenID and Facebook, you can now use the following to log in to the site: Google, Yahoo!, WordPress, Windows Live, Blogger, Flickr, AOL and LiveJournal. Whereas I previously spent many days (each) to implement raw OpenID and Facebook Connect, getting the basic mechanism of RPX up and running takes only about two hours. And most of that is just waiting for a CNAME to appear in DNS. When you’re done, you instantly have access to a whole slew of third-party ID providers. I did spend a few more days writing about 800 lines of code (yeah, most of it was re-purposed) to fully integrate RPX into our existing identity system. But that’s only required if you need to allow users to link to their existing logins and you don’t want to use JanRain’s simplified identity-mapping service. And now, as JanRain adds more features and identity providers to RPX, we get them with no development/integration effort at all.

Collection Limits for SpokenWord.org

Because SpokenWord.org collections can subscribe to feeds and even follow other collections, they can grow to a size that is unmanageable. We’ve therefore added three ways in which you can keep your collections under control.

  1. Limit the number of programs.
  2. Limit the age of programs.
  3. Limit the size of a collection’s RSS feed.

On your collection’s page, click the Info link under “Edit This Collection”.

1. “Remove oldest programs when there are more than [count] or [age].” The default value for [count] is 1,000, the maximum number of programs any collection can contain. If you want to keep your collection smaller, select another value: 10, 25, 100 or 250. As you add new programs, earlier-added programs will be removed in order to maintain the maximum size you specify.

2. Likewise [age] tells us how long to keep programs from the date you collect them. The default is never to delete them (by age), but you can change this to automatically remove programs that have been in your collection for more than one week, one month or one year.

3. “Most-recent programs to include in RSS feed: [count].” By default, we’ll include up to 100 programs from your collection in its RSS feed. But you can use this option to change that value to 10, 25, 50, 100, 250 or “all”.

Note: Although you can set all of these values now, only #3 (RSS limits) is operational. We won’t turn on #1 and #2 until at least Wednesday morning (July 15) at 9am Pacific time to allow you time to modify your collections that may be affected by the change.
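For the programmers among you, here is roughly how the three limits interact, as a small Python sketch. It is not our production code: the class and function names are hypothetical, "all" is represented as None, and it assumes the age limit is applied before the count limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Program:
    title: str
    collected_at: datetime  # when the program was added to the collection


def apply_limits(programs: List[Program], now: datetime,
                 max_count: int = 1000,
                 max_age: Optional[timedelta] = None) -> List[Program]:
    """Limits #1 and #2: drop programs collected too long ago, then keep
    only the most recently collected max_count programs (newest first)."""
    kept = [p for p in programs
            if max_age is None or now - p.collected_at <= max_age]
    kept.sort(key=lambda p: p.collected_at, reverse=True)
    return kept[:max_count]


def rss_items(programs: List[Program],
              rss_limit: Optional[int] = 100) -> List[Program]:
    """Limit #3: the most recent rss_limit programs, or all of them if None."""
    newest_first = sorted(programs, key=lambda p: p.collected_at, reverse=True)
    return newest_first if rss_limit is None else newest_first[:rss_limit]
```

With the defaults (1,000 programs, no age limit, 100 items in the feed) a small collection is left untouched, which is why most of you won’t need to change anything.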

Facebook Connect for SpokenWord.org

Yesterday I rolled out Facebook Connect for SpokenWord.org, and if you have a Facebook account I urge you to stop by, give it a try, and let us know if it works for you. The integration is about two-thirds done, but you probably won’t notice the missing one-third. It has been an interesting process so far. I previously implemented OpenID, and I expected something similar, but that’s not the case. The concepts of the two systems are similar, but the realities are quite different. For example:

  • Facebook’s documentation is awful. Rather than one or two coherent documents there are dozens of wiki pages written, as far as I can tell, by the developers themselves, not good tech writers. Each page is written in a different style and documents (usually incompletely) one small piece of the big picture. To actually integrate Facebook into an existing identity system, there are many moving parts, more than necessary.
  • Although a FB user explicitly authorizes your application, FB refuses to supply his or her email address through the API. Instead, there’s a very Baroque system by which you send FB hashed versions of the email addresses of all your existing registered members in advance so that Facebook can then let you know that one of them matches a FB user at the time that user authorizes your application. But if a new (to you) FB user logs into your site, you don’t have that existing data. (OpenID’s API gives you an email address if the user approves.)
  • The Facebook Terms of Service are oppressive. They must have been written by Facebook’s Business Prevention Division. For example, you are not allowed to store (in a database) any personal data you receive from Facebook Connect. When a user authorizes our app, FB sends us the user’s first and last names. We’re allowed to display those while the user is connected, but not thereafter. (We get around this by asking the user to give us this data independently.) I noticed that TechCrunch uses Facebook Connect for comments, so I was curious what would happen if I left a comment on their blog and then de-authorized the TechCrunch app. Sure enough, my comments disappeared from their site, and when I re-enabled the app, the comments re-appeared. Weird.
  • The email thing is particularly nasty, for while we’re not sending FB our users’ email addresses unencrypted (which would violate our own Privacy Policy), we are sending an MD5 hash of those addresses. This means FB can compare the hashes we send them to the 100+ million email addresses they already have, allowing them to determine that someone is a registered member on our site even before that person authorizes the use of his/her FB identity to access our site.
  • FB requires that if a user is logged in via Facebook, you display that user’s Facebook photo on every page they view. No reason is given for this requirement, and very few Facebook Connect sites do so. (Digg is an exception.) Note that this (and other ToS issues) requires that you load FB’s supporting JavaScript on every page.
  • Oh, did I mention how bad their documentation is?
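To see why hashing the email addresses doesn’t hide much from a party that already knows them, consider this short Python sketch. It is an illustration of the general point only, not Facebook’s actual hash format or matching code. The function names are hypothetical, and the normalization (trimming whitespace and lower-casing) is an assumption.

```python
import hashlib


def email_hash(email):
    """MD5 hex digest of a normalized (trimmed, lower-cased) email address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def match_known_addresses(received_hashes, known_emails):
    """Return the known addresses whose hashes appear in received_hashes.

    Anyone holding a large list of addresses can hash each one the same
    way and look up the hashes they were sent.
    """
    lookup = {email_hash(e): e for e in known_emails}
    return [lookup[h] for h in received_hashes if h in lookup]


if __name__ == "__main__":
    # What a site like ours would send: hashes only, no addresses.
    sent = [email_hash("alice@example.com"), email_hash("bob@example.com")]
    # What the receiving party already has on file.
    on_file = ["alice@example.com", "carol@example.com"]
    print(match_known_addresses(sent, on_file))
```

The hash keeps the address from a casual observer, but it is no protection at all against someone who can guess or already holds the address.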

All of that said — and there are many more issues — we’ve had many requests for this integration as a way to make it easier to register for and login to SpokenWord.org. I hope you find it valuable.