Archive for the ‘Uncategorized’ Category

Our new release – an abundance of features awaits you!

Tuesday, May 12th, 2009

We deployed our latest release, and it’s got a ton of good stuff. Coupled with yesterday’s announcement of our iPhone app (thanks, @TechFlash, for covering it!), we have a lot going on.

Here’s what’s in the release. First, a new home page featuring our EvriFeed, a constantly updating stream of the latest about interesting topics in the EvriVerse. We use our unique approach to find interesting topics (grammatical subjects and objects in sentences being written on the web) and stream them to the home page. It’s always updating, so check back frequently to find something surprising.

The EvriFeed shows you the Topic in question, a link to the relevant article on the web, and the most recent Evri connections for this topic.

Also on the new home page, we are showing more of the breadth and depth of our content in the new Browse box on the right-hand side. You can easily find topics in our major domains (Politics, Entertainment, Business and Sports) or go directly to our full browse experience, which we talked about recently.

Continuing with the home page, we have made your collections, and other interesting ones, much easier to find. If you haven’t yet created an account and built some collections, you really should. This is the best way to track your outside interests in real time on the web.

Our Profile pages have new stuff as well. First, we now show you any collections (yours or others) that this topic is part of. This is a great way to discover the power of collections. If you are on the page for Star Trek (the new movie) for example, you can see it’s in three collections, including this one that I created.

We also now provide a way for you to find out how any topic is related to anything you want to type in. Even if it’s not in the Dig Deeper section of the page, you can type in any search term that you want to connect to the current profile and see how they are related. In this case, I was on the Microsoft profile and wanted to see how they were currently related to Yahoo. I entered Yahoo in the ‘filter by keyword’ box and got the results right there. Cool!

The last thing I’ll mention for now is that we have added interactive stock charts to all of our public company profiles, thanks to our friends at Wikinvest. This is a highly interactive in-page application that gives more than just a share price. Coupled with the other information we have on the page, our company profiles just got a lot more valuable. (Though GM may not feel that way about their page today!)

I’ll do follow-up posts exploring some of these features in more detail. Please let us know what you think, and what you would like to see us work on next and what we need to fix. You can always email me (neil[at]evri.com). And don’t forget that you can find us on Twitter: @evri is the company account, and you can get to me @neilr.

Semantic Web Meetups Come to Seattle

Monday, April 27th, 2009

Last week I read about the first Semantic Web meetup in Vancouver, BC, organized by Melanie Courtot. I started wondering why we don’t have something similar down here in Seattle, given all the folks in the startup arena, academia, and the larger corporate world working on things involving the word “semantic.” I was chatting with Neil, our CEO here at Evri, and he mentioned that Alex Iskold, Founder/CEO of Adaptive Blue and feature writer for ReadWriteWeb, was going to be in town on May 6th; Alex was interested in presenting to a meetup group if there was one. We said, well no, there isn’t one, but let’s try to get one together.

So, inspired by Melanie, I was just about to create a meetup group on meetup.com when I got a notice telling me that one had, only moments before, been created by Jillian McRae. Jillian was interested in getting the semantic community together, as well as meeting up with folks heading down to SemTech 2009 in June. So I joined the group, contacted Jillian, and explained the sequence of events. Jillian replied, “Serendipitous!”

So there you have it, folks: a swirling nexus of events due to lots of excitement over all things semantic. Come to our first meetup of soggy Seattleites now Semantically Webbed. Mingle with folks, drink some, eat some, and listen to Alex present on Get Glue and other semantic tech. All the details are HERE.

Our latest release: Browse, Twitter, Quotes and much more.

Friday, April 17th, 2009

We deployed our latest release last night, and it’s packed with new stuff. Here’s a rundown of the new features.

Browse
You can now browse through all of our millions of entity profile pages! Go here to find the category you are interested in, or just click a category name next to anything on the site. We have everything from Amusement Parks (there are 489 of those) to Zoos (389) and everything in between. You can also add anything you want to follow to a Collection directly from the browse lists.

Quotes & Tweets on Entity Detail Pages
Our profile pages (EDPs) have a couple of new features. In addition to the new browse capabilities, you can see related tweets about the topic of the page. We use Evri to process the text of the tweets and find other relevant entities buried in the sometimes overwhelming volume of the Twitter stream.

Want to know who is saying what, about whom? Now you can with the new Quotes feature on Evri.com. You can see quotes by people, and quotes about people, places and things. We use Evri’s linguistic parsing to find quotes in the tens of thousands of sources we read every day. This very cool feature finds what people in the news are talking about and keeps it dynamically updated.

Your favorite things – to go!
Want to take your favorite things with you? From any of our profile pages you can now get embed code to put your favorite things anywhere on the web! Perfect for blogs, home pages, or anywhere else you want to keep up with your interests.

Collections
Our Collections feature is a whole new way for you to keep track of your interests on the web. We have given Collections a “newsfeed” view, which provides content recommendations based on all the entities in your collection. We comb all of our sources to find the most compelling, up-to-date content about your collection as a whole. Collections are public, so you can easily share them around the web.

In addition, once you’re signed in, you’ll see your username linked in the top right – click this to access a list of all your collections. We added the ability to delete collections from this page as well.

Widget Gallery
In addition to the single topic widget, we also launched a new format, the Post widget. Check out this and all of our other applications here.

It’s in the API
Most of what’s above is, or will shortly be, available in our API. If you are a developer, take a look at our API docs, and please do let us know what you are building.

We want to hear from you.
Send me any feedback, comments, suggestions, ok? Please do try out the new features and let us know what you think. You can email me (neil[at]evri.com), or get to us via Twitter: @evri or @neilr.

Structuring Data is a Dirty Business

Monday, March 16th, 2009

Why is George W. Bush a criminal?
Why is Joe Biden a football player?
Why is Tiger Woods a journalist?
Why is Britney Spears a director?

Our goal is not to make any editorial decisions about former President Bush or anyone else; in fact, we prefer to not editorialize any of our data. Our philosophy is to allow data to speak for itself.

Why it’s not what you think
We are building our knowledge base from numerous structured and semi-structured data sources, like Freebase and Wikipedia. So, for example, if you head over to Freebase and check out the George W. Bush page (http://www.freebase.com/view/en/george_w_bush), you can see that under the Crime topic, Freebase has information about his conviction for drunk driving. Although this happened years ago in his younger days, in our system a conviction for a crime (at this time, any crime) leads an individual to be assigned the facet “criminal.” Facets are the broad categories we use to classify people: think politician, Olympic medalist, and so forth.

Likewise, Joe Biden was the halfback for the Blue Hens football team at his alma mater, the University of Delaware. Tiger Woods writes a weekly column for Golf Digest, and Britney Spears has directed music videos. All of these data lead our system to facet these entities accordingly.
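
To make the faceting idea concrete, here is a minimal sketch of a rule table that turns source facts into facets. It is an illustration only, not our production code: the property names (`crime_conviction`, `football_position`, and so on), the `FACET_RULES` table, and the `facets_for` function are all made up for this example. Note that each facet keeps the source fact that triggered it, so the context behind a surprising label is not lost.

```python
# Hypothetical rule table: a property found in a source record -> the facet it triggers.
# The property names are invented for this sketch; real sources use their own schemas.
FACET_RULES = {
    "crime_conviction": "criminal",
    "football_position": "football player",
    "column_written": "journalist",
    "music_video_directed": "director",
}


def facets_for(properties: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return each triggered facet together with the source facts behind it."""
    result: dict[str, list[str]] = {}
    for prop, values in properties.items():
        facet = FACET_RULES.get(prop)
        if facet and values:
            # Keep the evidence so the classification can be shown in context.
            result.setdefault(facet, []).extend(values)
    return result


# Facts taken from the examples above; "alma_mater" has no rule, so it adds no facet.
biden = {
    "football_position": ["halfback, University of Delaware Blue Hens"],
    "alma_mater": ["University of Delaware"],
}
print(facets_for(biden))
```

Because the rule fires on any value of a property, a single youthful conviction or one college football season is enough to produce the facet, which is exactly the behavior described above.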

Bringing order to the universe is challenging
As the curator, my main efforts involve resolving the structural differences between our sources at a high level, rather than at the individual level. For example, one data source may include individual baseball players classified according to the positions they play. Do we roll all the players together as ‘baseball player’, or do we retain the source’s designations of ‘pitcher’, ‘first baseman’, or do we create a hybrid of both representations?

The decisions we make at the higher level can on occasion translate into seemingly odd classifications. This is why exposing our users to the context of the classifications is so important.  A richer experience with the world’s information is what we are trying to build and is one of the most interesting aspects of what we could do.

The Grammar Student’s Guide to Radiohead

Tuesday, July 15th, 2008

Here at Evri, we talk a lot about searching less. When we talk about searching less, we are talking about you, our users with precious time — we want you to search less — we aren’t talking about our machines, because they do an awful lot of searching so you don’t have to. So how are they, our racks and racks of computers, searching so you can understand more?

Well, it comes down to teaching our machines to read documents more like humans do, that is, to understand more of the meaning of the documents they index. This is very different from what traditional keyword-based search technology does. When typical search engines encounter a document, they treat it like a bag of words: the associations between the words, how they interconnect and form actual meaning, are lost. Consider the following text snippet from a Starpulse article:

Howard insists they won’t be copying Radiohead’s idea and making their disc only available on the internet. [...] He tells BBC Radio 1, “We won’t be doing the same thing as Radiohead, no.” [...] Last year, Radiohead released In Rainbows as an Internet download and allowed fans to name their own price for the album.

Now from this snippet of text, your favorite search engine will store this data something like:

Radiohead – 3
Howard – 1
Rainbows – 1
released – 1
Internet – 1

and so on. I’m simplifying things a lot for the sake of discussion, but basically, your favorite search engine is maintaining a list of words, and keeping track of how many times those words appear in a given document. This approach works quite well for finding websites, but not very well for discovering facts, or relationships describing how people, places and things interconnect.
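
For the curious, the bag-of-words bookkeeping described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular search engine works: the `bag_of_words` function and its tokenizing rules (strip a possessive 's, keep runs of letters, stay case-sensitive) are my own assumptions, chosen so the counts line up with the list above.

```python
import re
from collections import Counter

# The Starpulse snippet quoted above, with the elided parts left out.
SNIPPET = (
    "Howard insists they won't be copying Radiohead's idea and making their "
    "disc only available on the internet. He tells BBC Radio 1, \"We won't be "
    "doing the same thing as Radiohead, no.\" Last year, Radiohead released "
    "In Rainbows as an Internet download and allowed fans to name their own "
    "price for the album."
)


def bag_of_words(text: str) -> Counter:
    """Count how often each word appears, ignoring order and grammar."""
    # Drop possessive 's so "Radiohead's" is counted as "Radiohead".
    text = re.sub(r"'s\b", "", text)
    # A token is any run of letters; punctuation and digits are discarded.
    tokens = re.findall(r"[A-Za-z]+", text)
    return Counter(tokens)


counts = bag_of_words(SNIPPET)
for word in ("Radiohead", "Howard", "Rainbows", "released", "Internet"):
    print(word, counts[word])
```

Nothing in the resulting table records who released what, which is the information the rest of this post is about.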

Now consider how Evri’s approach is different. For this same snippet of text, our machines will break the snippet out into multiple sentences. For each sentence, our machines will, in essence, diagram the sentence similar to what you did back in 7th grade grammar class. So, for every grammatical clause in a sentence, our system creates a data structure like that shown below.

In the last sentence of the snippet above, our system will store a relationship like: Radiohead > released > In Rainbows

In addition, our system knows that Radiohead is a band, released is a verb, and In Rainbows is an album. If a sentence said “Radiohead of Oxfordshire may release an album called In Rainbows,” our system would store Oxfordshire as the suffix modifier of Radiohead and would store the verb release as conditional. Knowing that a verb is conditional or negated is important because this information can be used to determine where in a list of results the relationship should appear. Likewise, if a subsequent sentence says something like “The band’s experiment proved successful,” our system will know that “The band” refers to Radiohead, because it attempts to resolve anaphora much as humans do.

Finally, this triplet-style data structure is searchable at web scale and web speed through a query language. The language is quite flexible, but basically it allows our recommendation and information navigation applications to formulate effective queries in a precise manner. For example, a query like:

[musical_artist] OR [band] > praise > Radiohead

is being used to render the right column in the entity detail page shown in the screen shot below.

When you actually click on a person or organization, like Billy Corgan, the system will execute a more refined query like:

Billy Corgan > praise > Radiohead
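
Here is a toy sketch, in Python, of how subject > verb > object relationships and the two queries above might fit together. It is not our actual index or query language. The `Relationship` class, the `query` function, the bracket convention for matching by type, and the sample records are all assumptions made for illustration; in particular, "Example Band" is invented data, and verbs are assumed to be stored in their base form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One subject > verb > object clause, plus the flags discussed above."""
    subject: str
    verb: str                      # assumed to be stored in base form ("praise")
    obj: str
    subject_types: frozenset = frozenset()
    conditional: bool = False      # e.g. "may praise"
    negated: bool = False          # e.g. "did not praise"


def matches_subject(rel: Relationship, pattern: str) -> bool:
    # "[band]" matches any subject of that type; plain text matches by name.
    if pattern.startswith("[") and pattern.endswith("]"):
        return pattern[1:-1] in rel.subject_types
    return rel.subject == pattern


def query(store, subjects, verb, obj):
    """Return matching relationships, with plain statements ranked first."""
    hits = [
        r for r in store
        if r.verb == verb and r.obj == obj
        and any(matches_subject(r, p) for p in subjects)
    ]
    # Negated and conditional statements sink toward the end of the list.
    return sorted(hits, key=lambda r: (r.negated, r.conditional))


# Sample records. "Example Band" is made up purely for this sketch.
STORE = [
    Relationship("Example Band", "praise", "Radiohead",
                 frozenset({"band"}), conditional=True),
    Relationship("Billy Corgan", "praise", "Radiohead",
                 frozenset({"musical_artist"})),
    Relationship("Radiohead", "release", "In Rainbows", frozenset({"band"})),
]

broad = query(STORE, ["[musical_artist]", "[band]"], "praise", "Radiohead")
refined = query(STORE, ["Billy Corgan"], "praise", "Radiohead")
print([r.subject for r in broad])
print([r.subject for r in refined])
```

The broad query corresponds to the typed pattern with OR, and the refined query to what runs after you click a specific person. A linear scan like this obviously would not work at web scale; the point is only the shape of the data and of the match.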

One of the challenges our scientists and engineers face is how to formulate these types of queries in clever ways so you, the user, do not have to; I’ll save this discussion for another day, however.

Finally, we published a book chapter last year that does a more thorough job explaining our approach, and additional grammatical treatments our system performs. So if you’re interested, see the Natural Language Processing and Text Mining book chapter titled A Case Study in Natural Language Based Web Search.