Archive for 2008

A tour of Evri.com

Wednesday, October 22nd, 2008

Evri.com is the only way to browse the connections that exist across the web. Watch this video to see how the Seattle Mariners’ record-breaking Ichiro is connected to other people, places and things in articles, blog posts, videos and images around the world.


Evri.com Profile Page Tour from Evri on Vimeo.

Evri’s Garden Sprouts Some Search

Tuesday, September 30th, 2008

We thought about launching a labs site where we could showcase our latest gadgetry, but decided none of us really fancy wearing lab coats. Many of us have gardens, however, and a few of us wear overalls, so we figured we’d instead start a garden to sprout new ideas. So, voila: we have a new section of our site called Evri’s Garden where we’ll be showcasing our fresh but not fully farmed veggies. Our first garden sprout is Evri Search, which I’ll spend some time chatting about.

Evri Search exposes our text analysis infrastructure, which automatically identifies and makes available the linguistic links connecting people, places and things found on the web. To provide this enhanced search capability, Evri Search performs an exhaustive, deep analysis, based on natural language processing, of every sentence in our corpus. This search interface allows you to interact directly with the same back-end system our scientists and engineers use every day to fine-tune the algorithms used in our applications to search on your behalf. The help section on the search page is pretty exhaustive, so I thought it would be more entertaining to just walk through some interesting queries.

One of my favorite queries is to find corporate acquisitions. To do so using the Evri query language (EQL), I can construct a query like:

[Organization/Name]>buy>[Organization/Name]

In this query, I am asking the system for all sentences containing a grammatical clause where the source of an action is a named organization (usually companies but also nonprofits and government agencies), the action is the verb buy (or similar verbs), and the target of the action is also a named organization. Here is a screenshot of the results the day I executed the query:

Note: the system has over 24,000 instances of acquisitions, and I am shown them in ranked order. One day I will chat more about how we do this ranking, but in the meantime, suffice it to say many factors impact this ordering, including, but not limited to: relevance of the document, verb condition, importance of the entities, relationship parts of speech, relationship redundancy, document timeliness, and credibility of the source.

Now also note: I’m shown the name of the acquiring company and the name of the acquired company; I’m not sent off to a web page to sift through acquisitions nor do I need to merge results from multiple websites containing acquisitions. A key goal of traditional search engines, as well as many semantic search engines, is to point users to documents, or web sites, where users are expected to read the results and assimilate the information they are after. Evri Search excels at distilling relationships, or facts, from disparate web sites — this ultimately enables users to read less, and understand more. Now let’s expand the first result:

Note: the relationship: Bank of America > buy > Merrill Lynch was extracted from multiple different sentences, or different ways of expressing the same thing. Also note: you can click on the article titles to visit the article and read the sentence containing the matched relationship in context. Let’s do a slight modification of this query now, and execute:

[Organization/Name]>buy>[Organization/Name] PREP CONTAINS [Money]

Now we are asking for the same relationships as before, except now we only want relationships where the complement of the preposition is a monetary amount. In other words, the sentence should contain language like: Company X bought Company Y for Z dollars. Here is the first result expanded:

Note: in every sentence, the monetary amount of the acquisition is mentioned. Now let’s say we want to get even more constrained: we only care about acquisitions in the media sector that mention the amount. We could constrain the query a bit more:

[Organization/Name]>buy>[Organization/Name] PREP CONTAINS [Money] CONTEXT CONTAINS media

Now we are asking the system for the same results as before, except the context (the sentence containing the relationship, the sentence before or the sentence after) must contain the word media. Note: the results are now focused on the media sector:

You may, on occasion, note that the sentence matching a query does not contain the name of the entity. For example, in the query:

shark>attack>[Person/Name]

I expanded the first result (when I ran the query), and got:

Note: the shark attack victim is not mentioned in the matched sentence shown in black. This is because the pronoun she is referring to Bettina Pereira mentioned in a previous sentence. Evri Search is able to understand, similar to the way a human does, that pronouns (along with other anaphora like the company and the lawyer) refer to other named entities.
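To make the three acquisition queries above a little more concrete, here is a minimal Python sketch of how a pattern with PREP CONTAINS and CONTEXT CONTAINS filters might be checked against already-extracted relationships. This is only an illustration, not how Evri Search is implemented: the `Relationship` record, the `matches` function, its parameter names, and the sample companies are all made up for the example, and the sketch ignores the "similar verbs" expansion and the ranking described earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Relationship:
    """Hypothetical record for one extracted clause."""
    subject: str
    subject_type: str          # e.g. "Organization"
    verb: str                  # normalized verb, e.g. "buy"
    obj: str
    obj_type: str
    prep_type: Optional[str]   # type of the preposition's complement, e.g. "Money"
    context: str               # matched sentence plus the sentences around it


def matches(rel, subject_type, verb, obj_type,
            prep_contains=None, context_contains=None):
    """Return True if rel satisfies the pattern and any optional filters."""
    if (rel.subject_type, rel.verb, rel.obj_type) != (subject_type, verb, obj_type):
        return False
    if prep_contains is not None and rel.prep_type != prep_contains:
        return False
    if context_contains is not None and context_contains.lower() not in rel.context.lower():
        return False
    return True


# Made-up records for illustration only; these are not real search results.
RELS = [
    Relationship("Alpha Corp", "Organization", "buy", "Beta Media", "Organization",
                 "Money", "Alpha Corp bought Beta Media for $10 million."),
    Relationship("Gamma Inc", "Organization", "buy", "Delta LLC", "Organization",
                 "Money", "Gamma Inc bought Delta LLC for $5 million."),
    Relationship("Epsilon Ltd", "Organization", "buy", "Zeta Co", "Organization",
                 None, "Epsilon Ltd agreed to buy Zeta Co."),
]

pattern = ("Organization", "buy", "Organization")
all_buys = [r for r in RELS if matches(r, *pattern)]
with_price = [r for r in RELS if matches(r, *pattern, prep_contains="Money")]
media_only = [r for r in RELS
              if matches(r, *pattern, prep_contains="Money", context_contains="media")]
```

Each added filter narrows the previous result set, mirroring how the three queries above go from all acquisitions, to acquisitions with a stated amount, to those whose surrounding context mentions media.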

I’ll now leave this post with a few queries to help you get started geeking out with Evri Search. Feel free to try these queries out on your favorite keyword or semantic search engine.

Finally, if you find any great searches you’d like to share, use the +share link in the top right of your browser, which links to all your favorite bookmarking apps, or drop them in the comments section here. Have fun geeking out on Evri Search.

Evri Inside

Monday, September 29th, 2008

We have gotten lots of great feedback – positive and constructive – since we opened up the beta last week. All of your feedback is helpful, even when critical. I thought it would help to give a little more background on what we are doing, and how we think you might get the most out of visiting and using Evri.com.

We think one of the best ways to find us is through existing content that already carries our content recommendation widgets. From there, you might explore with the widget, go read something else on the same or a related topic, or head directly to an Evri profile page.

Sarah Palin - Joe Biden article

When you reach an Evri profile page, our hope is that you see us as more of a browse engine than anything. We definitely aren’t, and aren’t trying to be, a search engine. Because of our methods, using natural language and focusing on the named entities in content (the proper nouns), we train and configure the system on specific subject areas. For that reason, our Find box only finds things for which we have a profile page. We don’t want to disappoint you by taking you to a dead-end “No Results” page. The subject areas we focused on first are Politics and Entertainment, so these are the areas you can expect we’ll have really good coverage.  As a result, we have great pages about both major US figures (like presidential candidates Barack Obama and John McCain) and international ones like the new head of Israel’s Kadima Party, Tzipi Livni. We also do a good job on entertainment, from not-yet-released movies like Neil Gaiman’s Coraline, to classic movie stars like the recently departed and much-loved Paul Newman. We do have lots of coverage in other areas, but it’s not comprehensive…yet!

In coming months, we will add more coverage in Business, Technology, Sports, Health & Medicine, and others. What we definitely won’t have for a little while longer are many pages about non-proper nouns or “conceptual entities”. For example the term “Golden Retriever”, mentioned in an article, isn’t designed to work well in the current product. We think there is tremendous value in applying what we do to these types of subjects, and we will get there, but for right now we aren’t focusing on making this work. (Although, take a look at the UI at the bottom of this page for an example of what we get today for the subject “Bank Failure”.)

Another thing that is easy for our beta testers to sometimes miss is how much stuff we actually have. The key is the two-column grid that appears alongside the “Tinkertoy” relationship display. As an example, look at this section of the Angelina Jolie profile page. I have selected “Actor” in the left column of the “Top Articles” section. The results include a list of the Actors she is related to, with the top ten documents about these relationships listed below.

Angelina Jolie on Evri

Selecting any of the actors in the right-hand column will cause the documents and the media to change to reflect your choice. Here, I’ve selected Billy Bob Thornton:

Angelina & Billy Bob Thornton

With this UI you can browse a very large number of web documents without being overwhelmed by a new list of keyword-based search results at each step. If it looks like we don’t have everything, that’s because we show the top ten results for each combination that you are browsing. And, coming full circle, that’s the idea: to encourage serendipitous discovery and browsing of web content about the things you are interested in. Where else could you browse from Angelina Jolie to Hamid Karzai in 2 clicks (the route goes through Jude Law)?

Please don’t forget to send us feedback - it really is appreciated.

The Beta is Open

Wednesday, September 24th, 2008

We’re pretty excited today, as we are opening the beta and removing the password restriction to the site. We think this is the first step on a long journey in making the web better for browsing and discovering what’s important and meaningful to you. With almost a billion connections between People, Places and Things contained in millions of documents written every day, our widgets and Profile Pages are designed to let you discover more, with less work.

We’re rolling out some new features along with opening the system up to everybody.

  • Use the widget on your site! You can now directly add the widget to your TypePad, Blogger or WordPress blog post, or cut-and-paste some script to add it to any blog or web page.
  • Try it with your own content: Just paste your text into the submit box on this page, and see what the Evri content recommendation widget comes up with. Try it with a blog post, article, or any text at all. As with the rest of Evri, it will work best with documents about People, Products, and Things of general interest.
  • Sharing: Share a favorite profile page, or even a cool grammatical query you built in Evri’s Garden.
  • Image Carousel: We have added photos in addition to videos to every Profile Page. We will continue to enrich the media experience of the site over the next weeks and months.
  • Our Garden, where we will feature new stuff that isn’t ready for release, and behind-the-scenes technology that we use at Evri. Right now we have our unique search technology featured there. We will have another blog post with more details, but you can play with it right now.

We are all happy with, and excited by, what we’ve gotten done so far, but I am sure we have a lot more to do, and do better. There is a feedback link on every page and we would really like you to use it. Or, send me feedback directly at neil@evri.com.

Images on the Vicky Cristina Barcelona Page

images on the new Vicky Cristina Barcelona Profile page.

You Asked, We Delivered - New Features, More Entities, and More!

Wednesday, August 13th, 2008

We’ve gotten some great feedback from our early beta users, and we’ve been busy at Evri world HQ working on updates to Evri.com, the Evri widgets, and the rest of our products. Probably the single most frequent feedback request was: “I am looking for a specific person (or place or thing) — where’s the “Find” box?” Well, we heard you, and it’s now there — a brand-spanking new Find box on every page. Just start typing, and if we have a match, it’ll show up in the drop down list. Pick what you are interested in, and you will go right to the Evri profile page for that topic.

What else is new? Here are some highlights:

More entities - We have doubled the number of People, Products, and Things for which we have profile pages. As you can see from the Find list image above, we even have Klingons!

Navigation and UI Improvements - we’ve made some changes to the user interface and experience to improve how the site and widgets work. The home page lists are now streamlined, and we have made it easier to get right to a profile page (just click on the Evri icon). On the profile pages, we have grouped things together in what we think is a more logical fashion, and added better navigation to related profile pages.

New widget types, and a special page for them! Content publishers, bloggers, and anyone interested in cool new widgets should look at our new Partners page. We have some new widget types on display to show how Evri can increase user engagement on your site. If you are a blogger or other publisher that is interested in Evri, please contact us.

Also, lots of improvements to performance, stability, and overall quality (at least, we think so.) We would of course love to know what you think as well — if you haven’t signed up for the beta, please do. And, if you already have, please send us feedback on the new features.

The Grammar Student’s Guide to Radiohead

Tuesday, July 15th, 2008

Here at Evri, we talk a lot about searching less. When we talk about searching less, we are talking about you, our users with precious time — we want you to search less — we aren’t talking about our machines, because they do an awful lot of searching so you don’t have to. So how are they, our racks and racks of computers, searching so you can understand more?

Well, it comes down to teaching our machines to read documents more like the way humans do, that is, to understand more of the meaning of the documents they index. This is very different from what traditional keyword-based search technology does. Typical search engines, when they encounter a document, treat the document like a bag of words: the associations between the words, and how they interconnect to form actual meaning, are lost. Consider the following text snippet from a Starpulse article:

Howard insists they won’t be copying Radiohead’s idea and making their disc only available on the internet. [...] He tells BBC Radio 1, “We won’t be doing the same thing as Radiohead, no.” [...] Last year, Radiohead released In Rainbows as an Internet download and allowed fans to name their own price for the album.

Now from this snippet of text, your favorite search engine will store this data something like:

Radiohead - 3
Howard - 1
Rainbows - 1
released - 1
Internet - 1

and so on. I’m simplifying things a lot for the sake of discussion, but basically, your favorite search engine is maintaining a list of words, and keeping track of how many times those words appear in a given document. This approach works quite well for finding websites, but not very well for discovering facts, or relationships describing how people, places and things interconnect.
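The word-count view described above can be reproduced in a few lines of Python. This is a deliberately simplified sketch of the bag-of-words idea, not a description of any real search engine's index: the `bag_of_words` helper is a name made up for this example, the tokenizer is a bare regular expression, and the counts are case-sensitive (which is why "Internet" and "internet" are counted separately).

```python
import re
from collections import Counter

# The Starpulse snippet quoted above, with the elided parts left out.
SNIPPET = (
    "Howard insists they won't be copying Radiohead's idea and making their "
    "disc only available on the internet. He tells BBC Radio 1, \"We won't be "
    "doing the same thing as Radiohead, no.\" Last year, Radiohead released "
    "In Rainbows as an Internet download and allowed fans to name their own "
    "price for the album."
)


def bag_of_words(text):
    """Count case-sensitive alphabetic tokens, discarding all word order."""
    return Counter(re.findall(r"[A-Za-z]+", text))


counts = bag_of_words(SNIPPET)
for word in ("Radiohead", "Howard", "Rainbows", "released", "Internet"):
    print(f"{word} - {counts[word]}")
```

Because word order is thrown away, `bag_of_words("Radiohead released In Rainbows")` and `bag_of_words("In Rainbows released Radiohead")` produce identical counts, so this representation alone cannot say who released what.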

Now consider how Evri’s approach is different. For this same snippet of text, our machines will break the snippet out into multiple sentences. For each sentence, our machines will, in essence, diagram the sentence similar to what you did back in 7th grade grammar class. So, for every grammatical clause in a sentence, our system creates a data structure like that shown below.

In the last sentence of the snippet above, our system will store a relationship like: Radiohead > released > In Rainbows

In addition, our system knows that Radiohead is a band, released is a verb, and In Rainbows is an album. If a sentence said: Radiohead of Oxfordshire may release an album called In Rainbows, our system will store Oxfordshire as the suffix modifier of Radiohead, and will store the verb release as being conditional; knowing that a verb is conditional or negated is important as this information can be used to determine where in a list of results this relationship should appear. In addition, if a subsequent sentence says something like: The band’s experiment proved successful, our system will know that The band refers to Radiohead; this is because our system attempts to resolve anaphora similar to the way humans do. Finally, this triplet-style data structure is searchable at web scale and web speed by searches expressible in a query language; this query language is quite flexible, but basically allows our recommendation and information navigation applications to formulate effective queries in a precise manner. For example, a query like:

[musical_artist] OR [band] > praise > Radiohead

is being used to render the right column in the entity detail page shown in the screen shot below.

When you actually click on a person or organization, like Billy Corgan, the system will execute a more refined query like:

Billy Corgan > praise > Radiohead
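As a rough illustration of the triplet idea, the sketch below stores a few clauses as records and answers the two queries above with a tiny in-memory filter. Everything here is an assumption made for the example: the `Clause` fields, the `find` function, the sample records (including the invented "Some Band"), and the rule of simply listing conditional or negated clauses after plain assertions. The real system's data structure, query engine and ranking are not described in enough detail in this post to reproduce.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical record for one grammatical clause (subject > verb > object)."""
    subject: str
    subject_type: str
    verb: str
    obj: str
    obj_type: str
    subject_modifier: str = ""   # e.g. "of Oxfordshire"
    conditional: bool = False    # "may release"
    negated: bool = False        # "did not praise"


def find(clauses, subject_types=None, subject=None, verb=None, obj=None):
    """Return clauses matching the constraints, plain assertions first."""
    hits = [
        c for c in clauses
        if (subject_types is None or c.subject_type in subject_types)
        and (subject is None or c.subject == subject)
        and (verb is None or c.verb == verb)
        and (obj is None or c.obj == obj)
    ]
    # sorted() is stable and False sorts before True, so conditional or
    # negated clauses are pushed below asserted ones.
    return sorted(hits, key=lambda c: c.conditional or c.negated)


# Illustrative records only; "Some Band" is invented for the example.
CLAUSES = [
    Clause("Radiohead", "band", "release", "In Rainbows", "album",
           subject_modifier="of Oxfordshire", conditional=True),
    Clause("Radiohead", "band", "release", "In Rainbows", "album"),
    Clause("Some Band", "band", "praise", "Radiohead", "band", negated=True),
    Clause("Billy Corgan", "musical_artist", "praise", "Radiohead", "band"),
]

# [musical_artist] OR [band] > praise > Radiohead
praisers = find(CLAUSES, subject_types={"musical_artist", "band"},
                verb="praise", obj="Radiohead")

# Billy Corgan > praise > Radiohead
corgan = find(CLAUSES, subject="Billy Corgan", verb="praise", obj="Radiohead")

releases = find(CLAUSES, verb="release")
```

The first query constrains the subject by type and the second by name, which is the difference between filling the column of related artists and drilling into one of them.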

One of the challenges our scientists and engineers face is how to formulate these types of queries in clever ways so you, the user, do not have to; I’ll save this discussion for another day, however.

Finally, we published a book chapter last year that does a more thorough job explaining our approach, and additional grammatical treatments our system performs. So if you’re interested, see the Natural Language Processing and Text Mining book chapter titled A Case Study in Natural Language Based Web Search.

“We’re not a band, we’re a company” or, The Evri Ontology Explained

Friday, July 11th, 2008

The quote in the title is from an (in)famous 1980 interview with John Lydon (Rotten) and Keith Levene of the group Public Image Limited. They spent a good portion of the time saying, “We’re not a band, we’re a company” to an obviously perplexed host, Tom Snyder.

Other than appealing to post-punk fanboys, why am I talking about this? Well, the PiL boys are raising a point near and dear to our hearts: what is the difference between a “band” and a “company”? And how do you tell the difference programmatically? This is important because we use software to “read” web content, extract all of the named entities, and then try to categorize them against our standard ontology. We refer to the individual things (Barack Obama, James Bond, and Paris, France) and subjects (Grammy Awards, World War II, and USA Patriot Act) of the world that our users would want to know more about, or understand the connections between, as ‘entities.’

At Evri, we use six ontology types: People, Locations, Organizations, Products, Events, and Concepts (the last four are grouped as Things on our homepage). These are intentionally broad; they are intended to be the most immutable part of our description of things: once a person, always a person. We have a couple of other ‘root’ level types, Temporal and Numeric, that are useful when analyzing content and making recommendations but are not shown directly in the user interface. These types are not fixed for all time: we have built things in a way that makes it easy to add more basic types as the scope of our entity coverage grows.

For the more dynamic description of an entity’s characteristics or role, we use the term facet. Each entity can be of only one basic type, but can have many facets. A location is only a Location, but it can have multiple facets (State Capital and County Seat, for example). The dynamic and extensible nature of facets lets us rapidly respond to emerging descriptions in the web content our systems analyze. You can think of facets as a kind of tagging.
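One way to picture the "one basic type, many facets" rule is the small Python sketch below. It is only an illustration under stated assumptions: the `EntityType` enum, the `Entity` class and its `add_facet` method are names invented here, "Example City" is a placeholder entity, and "Port City" is a made-up facet; none of this is Evri's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    """The six basic ontology types named in the post."""
    PERSON = "Person"
    LOCATION = "Location"
    ORGANIZATION = "Organization"
    PRODUCT = "Product"
    EVENT = "Event"
    CONCEPT = "Concept"


@dataclass
class Entity:
    name: str
    type: EntityType                           # exactly one basic type
    facets: set = field(default_factory=set)   # any number of tag-like facets

    def add_facet(self, facet):
        """Attach a new facet without touching the basic type."""
        self.facets.add(facet)


# "Example City" is a placeholder, not a real profile page.
city = Entity("Example City", EntityType.LOCATION, {"State Capital", "County Seat"})
city.add_facet("Port City")   # facets can grow as new descriptions emerge
```

Keeping the type as a single fixed field and the facets as an open-ended set reflects the split between the stable part of the description and the part that changes as new content is analyzed.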

We show the current facet(s) of an entity at the top of each profile page. You can see in the screenshot of the top of Bono’s Evri Profile that his facets are ‘Musician’ and ‘Activist.’

We use facets in many ways. First, to provide information to the user, so that you know more about the person, place or thing you are looking at. But it’s not just for display. We use this information to help with our document analysis systems. Facet information helps with entity disambiguation, for example. It’s very important for us to have the highest degree of precision in identifying which particular ‘Michael Jackson’ is referred to in content on the web. Unlike keyword search, being correct here is crucial to what we are building.

michael jackson disambiguation

Facets also help with making recommendations. Our systems, and our curators, learn what kinds of activities and relationships occur most often for particular facets. We can then use this information to create templates to highlight these actions and relationships in our user interface. For example, ‘Musician’ facets are often “performing” so we make sure that we highlight this action, if it exists, on a musician’s profile page.
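The template idea can be sketched as a simple lookup from facet to the actions worth highlighting. This is a toy illustration, not Evri's recommendation logic: the `FACET_TEMPLATES` table and the `actions_to_highlight` function are hypothetical, and only the pairing of ‘Musician’ with performing comes from the paragraph above; the other entries are invented.

```python
# Hypothetical facet-to-action templates. Only Musician -> "perform" is taken
# from the post; the remaining entries are invented for illustration.
FACET_TEMPLATES = {
    "Musician": ["perform", "release"],
    "Activist": ["campaign"],
}


def actions_to_highlight(facets, observed_actions):
    """Return templated actions for these facets that actually occur."""
    highlighted = []
    for facet in sorted(facets):                      # sorted for a stable order
        for action in FACET_TEMPLATES.get(facet, []):
            if action in observed_actions and action not in highlighted:
                highlighted.append(action)
    return highlighted


# Actions found in content about an entity with two facets.
result = actions_to_highlight({"Musician", "Activist"}, {"perform", "meet"})
```

Here `result` contains only "perform": it is the one templated action that was actually observed, matching the "if it exists" condition in the text.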

Lastly for now, I would mention that once we figure out our browse UI, an important way to explore our Evri profiles will be by pivoting on these facets.

We will have future posts from the team that works on these systems as soon as they finish them :)

The Evri Widget In Action (& other stuff)!

Wednesday, July 9th, 2008

We have made it possible now for those of you in the Beta Preview to see how the first version of the Evri widget works. If you go to the main beta page you will find a list of three recent news headlines at the bottom of the page. Click on one of these to display the article, then click on the large banner in the middle of the page.

Evri Article Page

Article with Widget

We have also made it a bit easier to get to Evri Profile Pages from the Home page and elsewhere. Based on your feedback, we realized that this was a bit opaque… We have some bigger changes coming at the end of this sprint, but for now, we have added a simple way to go directly to a Person, Place, or Thing’s Profile Page. Just look for the small Evri icon next to the name, and click! You get right to where you wanted to go.

Home Page with Turtle

Beta Preview Coverage

Wednesday, July 2nd, 2008

We’ve gotten a lot of good, constructive coverage for our beta preview. Thanks to all who wrote and who have provided feedback. Nice to get TechCrunched, and see us in VentureBeat, ReadWriteWeb, and in Seattle’s own John Cook’s blog. VentureBeat’s headline, “Evri launches semantic site to help blaze paths through the Internet” obviously pleased us, as did Lee Graham’s, “Is Evri the Twitter of Digg?” (I am pretty sure that would be a good thing, if I follow the meaning!) Our friends at Freewaregenius hit the nail on the head: “Evri: re-discover the interconnected web.”

All of the feedback is great to hear, and we are going to make some small, mid-sprint changes to address some of what we’re hearing. I hope to post a screencast later tonight to provide some guide to the cool stuff we are doing.

A Picture is Worth…

Wednesday, June 25th, 2008

The artists at Cohitre have been kind enough to create a pictorial representation of our beta launch. In their words, “The Evri Creature represents genetic diversity and the dinosaur symbolizes the meaning of life.”

Evri Beta Launch Picture