What's Global Moxie?

Global Moxie is the hypertext laboratory of Josh Clark, whose projects include the Big Medium web content management system. Josh creates web applications and websites from his multimedia studio in Paris, France.

What's Big Medium?

Big Medium is flexible, easy-to-use server software for creating and editing websites directly from your browser. Check out the features or download now.

Spock: Where No Search Engine Has Gone Before

Posted Apr 16, 2007 (updated Jun 13, 2007)

spock.jpg
Photo © Paramount Pictures.

What’s in a name? In my case, a Star Trek officer.

I have the Google-crippling misfortune of sharing a name with the actor Josh Clark, who had a recurring role on the Star Trek: Voyager series. Alas, it seems that Google has more regard for galactic federations than for global moxies. The interstellar Josh Clark makes his star turn several times in the Google results before I finally limp my way onto the page behind a handful of other earthbound Josh Clarks.

Consider the plight of the hundreds (thousands! millions!) of people trying to find out more about yours truly, only to get caught up in a nebula of Josh Clark parallel dimensions. Call it my Star Trek problem, a disruption in the Google search continuum.

We all have some version of this problem, of course. By now Google has introduced just about everyone to their moniker doppelgangers. And when searching for people, we’ve all had the frustrating experience of combing the results for the real McCoy. Given that 30 percent of web searches are people-related,[1] it’s surprising that search companies haven’t put more effort into helping us sort out individuals.

Good news: Spock may have the answer to the Star Trek problem. I’m not talking about the pointy-eared science officer, either.

Spock.com is a people search engine that is slated to launch in a couple of months (it’s currently in private beta testing). Tim O’Reilly and Michael Arrington have both recently offered previews of the service, and it seems compelling. From Tim’s review:

You can search for a class of person, say politicians, or people associated with a topic -- say Ruby on Rails. The spock robot automatically creates tags for any person it finds (and it gathers information on people from Wikipedia, social networking sites like LinkedIn and Facebook), but it also lets users add tags of their own, and vote existing tags up or down to strengthen the associations between people and topics. Users can also identify relationships between people (friend, co-worker, etc.), upload pictures, and provide other types of information.

In other words, Spock promises to gather and organize all of the various information and links about a person onto a single profile page. Unlike Google, where a single person can dominate search results and make it difficult to find average folks, every person gets just one result in Spock. Since you can narrow searches by field or affiliation, finding the right person suddenly becomes much easier.
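To make the tagging-and-voting idea concrete, here is a minimal Python sketch. It is not Spock's implementation, which hasn't been published; the `PeopleIndex` class, its method names, and the person IDs are all hypothetical. It assumes each person has exactly one ID, so a search returns each person at most once, ranked by net votes for the tag.

```python
from collections import defaultdict


class PeopleIndex:
    """Hypothetical index: one entry per person, with community-voted tags."""

    def __init__(self):
        # person_id -> tag -> net vote score
        self._tags = defaultdict(lambda: defaultdict(int))

    def vote(self, person_id, tag, delta=1):
        """Vote a tag up (positive delta) or down (negative delta)."""
        self._tags[person_id][tag.lower()] += delta

    def search(self, tag):
        """Return person IDs whose net score for the tag is positive, best first."""
        tag = tag.lower()
        hits = [
            (tags[tag], person_id)
            for person_id, tags in self._tags.items()
            if tags.get(tag, 0) > 0
        ]
        # Highest score first; ties broken alphabetically by ID.
        return [person_id for _, person_id in sorted(hits, key=lambda h: (-h[0], h[1]))]


index = PeopleIndex()
index.vote("josh-clark-actor", "actor", 3)
index.vote("josh-clark-developer", "web developer")
index.vote("josh-clark-developer", "actor", -1)  # a down-vote removes a bad match
print(index.search("actor"))
```

Because results are keyed by person rather than by page, a prolific namesake can rank higher but can never occupy more than one slot.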

This Is Your Life

spock_ericschmidt.png
Google CEO Eric Schmidt's profile at Spock. Photo via Tim O'Reilly.

As a developer, though, I find the single summary page to be particularly interesting. If it works as advertised, it has the potential to provide a single URL to sum up a person’s online identity. That kind of canonical address for people has been elusive but strikes me as both powerful and important.

The TV show This Is Your Life presented the lives of its guests (celebrities and everyday people alike) by way of a hefty book consulted by the host. It was a good gimmick, and there was something charming in the idea that your own life story might be out there somewhere in a handsome, single-edition volume.

We could really use the equivalent of a “this is your life” book online. At the moment, no such single web reference exists for people. For better or worse, Wikipedia pages have emerged as an it’ll-do substitute for a standard URL reference for notable people, but often at the expense of more reliable sources and without much help for the not-so-notable.

There’s real value in having a single standard URL for, well, just about anything. I had the good fortune of attending a talk in London last year by the ever-interesting Tom Coates. He described best practices for participating in our ever-growing web of data and, in particular, how developers can cultivate and contribute searchable data to the web. Principal among his points was the importance of giving primary data objects single, permanent URLs. Everything in your website, in other words, should have a single home with one and only one URL. Among other things, this makes it easy for outside sites and services to find and reference your content. It makes the web a better place.
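As a rough illustration of the "one and only one URL" rule, here is a small Python sketch that collapses equivalent spellings of an address into a single canonical form. The normalization rules are my own assumptions for the example (drop `www.`, trailing slashes, query strings, and fragments); a real site would choose rules that fit its own URL scheme, and the `canonical_url` name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Collapse equivalent spellings of a URL into one canonical address."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat "/articles/spock/" and "/articles/spock" as the same object.
    path = parts.path.rstrip("/") or "/"
    # Assumption for this sketch: query strings and fragments never identify an object.
    return urlunsplit((parts.scheme.lower(), host, path, "", ""))


print(canonical_url("HTTP://WWW.Example.com/articles/spock/?ref=feed#comments"))
# http://example.com/articles/spock
```

A site would typically redirect every variant to the canonical form so that outside services only ever see one address per object.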

Things like news articles and products tend to have a single address on the web, since these items are typically controlled by a single organization and site. Not so for people. We’re complicated and crafty creatures who spread, tribble-like, across the net, across organizations, across activities. I have many homes: my blog, my company, my flickr account, my past employers, my past projects, my music, my bookmarks, my alma mater, you name it. Ironically, the more that you participate online, the more difficult it is to capture a defined picture of who you are on the web, and a single URL certainly won’t do it.

After his talk, Tom was kind enough to indulge a few of my questions, and we talked about the challenge of correlating external IDs. That is, how should a website or app refer to data that it does not control, particularly when that data is spread out across multiple sites? How to create a single URL to represent that kind of distributed data?

Tom admitted that there was not yet a good answer for that one but that he imagined that we could eventually develop URLs that point to link clouds, clusters of URLs that all reference the same topic and in aggregate provide a coherent description. His hope was that a service would emerge that could find the affinities among various URLs to create a canonical identity for people, films, events, etc.
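One way to picture a link cloud is as a set of URLs that have been asserted, pair by pair, to describe the same thing. The Python sketch below groups such URLs with a union-find structure. This is only my guess at a data structure, not something Tom proposed; the `LinkCloud` class and the example URLs are hypothetical, and the hard part (deciding which URLs actually belong together) is assumed to happen elsewhere.

```python
class LinkCloud:
    """Groups URLs that have been asserted to describe the same person or thing."""

    def __init__(self):
        self._parent = {}  # url -> representative url (union-find forest)

    def _find(self, url):
        self._parent.setdefault(url, url)
        while self._parent[url] != url:
            # Path halving keeps lookups fast as clusters grow.
            self._parent[url] = self._parent[self._parent[url]]
            url = self._parent[url]
        return url

    def link(self, a, b):
        """Record that two URLs refer to the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller root so results are deterministic.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep

    def cluster(self, url):
        """Return every URL known to describe the same entity as `url`."""
        root = self._find(url)
        return {u for u in list(self._parent) if self._find(u) == root}


cloud = LinkCloud()
cloud.link("https://blog.example.com/josh", "https://photos.example.com/joshclark")
cloud.link("https://photos.example.com/joshclark", "https://company.example.com/about")
print(sorted(cloud.cluster("https://company.example.com/about")))
```

The representative URL of each cluster could then serve as the canonical address, with the rest of the cluster available as the aggregate description.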

“What have you done with Spock’s brain?”

spock_brain.jpg
Photo © Paramount Pictures.

This seems to be exactly what Spock promises to do, for people at least, but much depends on Spock’s smarts. Neither gathering the info nor separating out individuals is an easy task (Spock proposes a clever mix of web spidering and community contributions, but they’re offering $50,000 to the coder who can make it better). There are also spam and privacy issues to consider in all of this.

Whether or not Spock gets it right, people search and identity aggregation is a problem that’s crying out to be solved, and it’s starting to look like that might happen sooner rather than later. At a minimum, Spock’s early preview press suggests that its approach points in the right direction.

Meantime, as a developer, I’d like to know how I can help. How can Big Medium generate pages in a way that helps search engines make people-smart associations? Is it through microformats? Will a format emerge to let developers tag pages with an individual’s unique ID and indicate how they’re affiliated to the page? Will there be an API that lets web applications fetch the data, or at least the URL, for individuals?
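If microformats turn out to be part of the answer, the markup could be as simple as an hCard wrapped around a person's name and home URL. Here is a hedged Python sketch of how a CMS might emit one. The class names `vcard`, `fn`, `url`, and `org` come from the hCard microformat; the `hcard` function itself and the example values are hypothetical, and nothing here reflects how Big Medium or Spock actually works.

```python
from html import escape


def hcard(name: str, url: str, org: str = "") -> str:
    """Render a minimal hCard for a person; values are HTML-escaped."""
    lines = [
        '<div class="vcard">',
        f'  <a class="fn url" href="{escape(url, quote=True)}">{escape(name)}</a>',
    ]
    if org:
        lines.append(f'  <span class="org">{escape(org)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


print(hcard("Josh Clark", "https://example.com/people/josh-clark", org="Global Moxie"))
```

A crawler that understands hCard could then tie the page to a specific person by the `url` value rather than by an ambiguous name.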

Looking forward to seeing how this shakes out. At the very least, I’m hoping that Spock will help this Josh Clark to escape my Star Trek captors.

1. I plucked this stat from Michael Arrington's article about the Spock search engine.

Comments

8 comments on this page (times are local Paris time).

Sean
Apr 16, 2007 9:40pm [ 1 ]

Spock is hiring bright engineers!

Reasons to Work at SPOCK

  1. Free lunch every day and tons of snacks to keep you full all day.

  2. We still have a very small engineering team. You won't be a cog in the wheel here.

  3. Giant flat-panel monitors for all engineers and a new Mac or PC.

  4. You get to work with Jay.

  5. SPOCK has potential to be a part of every Internet user's daily life!

apply now! jobs@spock.com

Josh Clark
Apr 17, 2007 11:58am [ 2 ]

While I'm flattered that the Spock team took note of my post, I have to confess that Sean's post dampens my optimism a bit. I wish Sean had taken this blog post as an opportunity to answer some of my closing questions re: how can the larger developer community help to make the web ecosystem friendlier to this new type of service?

Spam management is going to be a huge part of making this type of service work as a reliable source of content. This type of drive-by recruiting ad doesn't exactly suggest an enlightened view of spam on Spock's part. I hope I'm wrong about that.

Apr 20, 2007 2:49pm [ 3 ]

Don't have the same problem as you, Josh, as to the best of my knowledge, I am one of only two "Mark Thristan"s in the world (and the other one, a distant cousin, doesn't appear to have much of a web presence at all). I think Web 2.0 (much as I hate the moniker) - by focusing on the social - really makes disambiguated single points of reference to individuals highly important: issues of trust and reputation pretty much circle around "known" individuals. I've not got access to the Spock Beta yet, but one query I would have would be, what is the process for correcting information on a page if Spock's algorithms get it wrong about someone? How are they going to go about validating those sorts of requests requiring a proof of identity?

Josh
Apr 20, 2007 5:52pm [ 4 ]

Sounds like I'd be better off in the Thristan clan than lost in my crowd of Clarks. You accepting applications?

I haven't gotten an invite to Spock yet either, so I'm just working from what I've read second-hand. My understanding, though, is that they'll allow community contributions and edits, so that the result will be somewhat wiki-like in its ability to self-correct. This of course brings up the usual issues of vandalism and accuracy.

Apparently you'll also be able to claim your own page (not sure how they'll verify your identity), which will allow you to have some measure of control about what's displayed about you. Of course, that could likewise encourage spam or at least the temptation to pad one's page.

Possibly even more daunting are the general privacy issues. The dustup when Facebook aggregated members' information a couple of months ago shows that people get itchy when their info is massaged without their express permission. Could be explosive if not handled correctly.

A lot of challenges ahead in order to get this right in terms of both accuracy and privacy. I hope somebody can pull it off, whether it's Spock or another outfit.

May 29, 2007 5:59pm [ 5 ]

I am beta testing Spock. I fear it may be overrun by bozos, rather than populated with serious technicians and uber geeks. Spock needs an About, FAQ, and User Guide, with Glossary.

I am not sure how to tag my contacts. They need more information on site, if they want to build a really valuable People Search. Type in "geek" and get a lot of weirdos. Type in "web developer" get nothing. Type in "Web 2.0" get the 5 contacts I tagged thusly.

Workin' on it.

Nice user-friendly captcha, BTW.

May 29, 2007 6:01pm [ 6 ]

You can edit (i.e., tag, etc.) other people's profile pages. Weird. Thus, they need to control who becomes members. Probably a mistake to let us beta testers import all our Gmail contacts.

May 29, 2007 8:16pm [ 7 ]

Vaspers has a point here, as usual. If everyone is allowed to tag, things could easily get out of hand (e.g., contentious articles on Wikipedia, vandalism). I don't want someone to tag me "pedophile," for example, even as a "joke." Even if caught by the Spock team pretty much immediately, it could easily ruin a reputation or scare away a potential client. Some other issues, too.

This bit, in their T.O.S., is a little sad, too: "You may not take the results from a Spock search and reformat and display them, or mirror the Spock home page or results pages on your Web site. You may not "meta-search" or aggregate Spock search results."

Yet another service trying to lock in the data we generate.

I've also been unable to claim a couple of other "ME's" so far that were automatically pulled from Friendster and MySpace. Might be a bug, but might be my fault. Wish they had a bit more guidance for the Beta testers. What do you want help on, Spock?

Josh
May 30, 2007 8:58am [ 8 ]

I just got a beta invite, too, and have only had a little time to tinker first-hand, and I was disappointed, too, by usability and performance issues (I found it painfully slow to edit a profile).

However, I'm not necessarily opposed to user-tagged profiles, including profiles of others. I agree that it does raise spam and vandalism possibilities, but it's also an interesting opportunity to gather profile info that might not otherwise be available. An interesting experiment that will, I agree, need some kind of additional moderation. I'm not convinced that the ability of the community to vote up and down tags on individual profiles will do the trick for anyone but the most high-profile people. So, yes, this approach will definitely need some tinkering and policing, but it strikes me as worthwhile to explore.

The terms of use, though, are particularly disappointing, since they appear to discourage mixing, mashing and other uses outside of the Spock site. That pretty much seems to deep-six Spock as a useful, portable provider of canonical URLs, which initially struck me as its most important potential service. Whether Spock or someone else provides it, that's the service that seems most useful to me.

I do plan to poke around with Spock a bit more; like other commenters, I'm not sure that I've really gotten my head around the service or its features yet. A user guide or tour would be helpful.
