2 years ago, mid-November
Creating a specialist vertical search engine using the Yahoo! BOSS API
Posted by pbirnie under technology
Truevent is a search engine built on top of the Yahoo! BOSS API.
I was talking to a search researcher about their offering.
researcher: i think the underlying theory is a bit overstretched...
researcher: well, i think there's nothing new about it
Paul Birnie: you mean the linking of words associated to other words
Paul Birnie: well at least boss is achieving something - making lots of people try different things
researcher: they fed the system with documents about eco-stuff
researcher: modeled, somehow, the word patterns
Paul Birnie: like simple signature detection
researcher: and now do some kind of re-ranking of boss-results based on the similarity to that model
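The re-ranking idea from the chat can be sketched in a few lines. This is only my guess at the general shape, not Truevent's actual method: the "model" here is a plain bag-of-words count over the topical documents, similarity is cosine similarity, and the `title` and `abstract` result fields are assumed names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_model(documents):
    """Sum word counts over the topical corpus into one term-frequency vector."""
    model = Counter()
    for doc in documents:
        model.update(tokenize(doc))
    return model


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(results, model):
    """Sort results by similarity to the model, most similar first.

    The sort is stable, so results with equal scores keep the order
    the search engine gave them.
    """
    def score(result):
        text = result.get("title", "") + " " + result.get("abstract", "")
        return cosine(Counter(tokenize(text)), model)

    return sorted(results, key=score, reverse=True)
```

A real system would presumably weight terms (for example with TF-IDF) rather than use raw counts, but the pipeline has the same shape: build a model from the corpus, score each result against it, and sort.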
6 easy steps to making a fortune by creating a cool vertical search engine startup based on the BOSS API:
- sign up for a BOSS appid
- use the BOSS mashup framework to retrieve search results from the BOSS API
- pull data from another web service at the same time
- mix and rerank the results based on some "patent pending" technology
- get bought by Yahoo!, Google or Microsoft - either because it's good PR or because your product actually works.
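The retrieve, pull and mix steps above can be sketched as follows. Everything here is a stand-in: the two fetch functions return made-up sample data instead of calling real services, and the "patent pending" technology is simply a sort on a popularity number from the second service.

```python
def search_results(query):
    """Stand-in for the search API call; returns made-up sample data."""
    return [
        {"url": "http://example.com/a", "title": "A"},
        {"url": "http://example.com/b", "title": "B"},
        {"url": "http://example.com/c", "title": "C"},
    ]


def popularity_signals(query):
    """Stand-in for a second web service that returns a score per URL."""
    return {"http://example.com/b": 40, "http://example.com/c": 5}


def mix_and_rerank(results, signals):
    """Attach each URL's signal (0 when missing) and sort by it, highest first.

    Ties keep the search engine's original order because the sort is stable.
    """
    mixed = [dict(r, signal=signals.get(r["url"], 0)) for r in results]
    return sorted(mixed, key=lambda r: r["signal"], reverse=True)


ranked = mix_and_rerank(search_results("eco travel"), popularity_signals("eco travel"))
print([r["title"] for r in ranked])
```

Swapping the lambda for a smarter scoring function is where the actual product would live; the plumbing around it stays this small.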
The problem with search is that it is very hard to get users to switch from one search engine to another. All of the major search engines use MLR (machine-learned ranking) to rank results, and finding a new page feature that improves ranking by more than 1% for the average user is very hard: all the easy stuff, and even the hard stuff, has been done.
Custom vertical search engines:
Creating custom vertical search engines is not new: Google and Yahoo have both supported "site restrict" to create a custom vertical search engine for years.
Google provides Google Co-op Custom Search Engine.
Yahoo has shut down sitebuilder, which also used to support site restrict to a custom list of sites.
AFAIK, the BOSS API is currently limited to around 22 site restricts before you run out of space in the GET web service call. I wonder whether POSTs are supported, although POSTs are not YOS compliant.
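To see why the number of site restricts is bounded by the GET URL, here is a rough sketch that counts how many `site:` terms fit under a length budget. The endpoint, the `q` and `appid` parameter names, the `site:a OR site:b` query form and the 2048-character limit are all assumptions for illustration, not the real BOSS request format, so the count it gives will not match the figure of around 22.

```python
from urllib.parse import urlencode

BASE_URL = "https://search.example.com/v1/web"  # hypothetical endpoint
MAX_URL_LENGTH = 2048  # assumed limit; the real one depends on the server


def build_url(query, sites, appid="YOUR_APP_ID"):
    """Build a GET URL whose query is restricted to the given sites."""
    restrict = " OR ".join(f"site:{site}" for site in sites)
    full_query = f"{query} ({restrict})" if sites else query
    return BASE_URL + "?" + urlencode({"appid": appid, "q": full_query})


def max_sites_that_fit(query, sites, limit=MAX_URL_LENGTH):
    """Return how many leading entries of `sites` fit within the URL limit."""
    count = 0
    for n in range(1, len(sites) + 1):
        if len(build_url(query, sites[:n])) > limit:
            break
        count = n
    return count
```

Note that URL encoding inflates every `:` and space, so the budget runs out sooner than the raw query length suggests.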
The problem with the search market is that it's really hard to get users to switch to another supplier.
BOSS mashup framework:
I really like the mashup framework that comes with BOSS: you simply instantiate an object and pass a web service URL to the constructor. The framework automatically finds a collection in the response and creates a dictionary from it. You can then write SQL-like syntax to mash up (mix) the data returned from each of the services.
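To show the idea without depending on the framework itself, here is a plain-Python sketch of the two things described above: finding the collection inside a parsed response, and an SQL-like join across two services. The function names are mine, not the framework's API, and the JSON shapes are invented for the example.

```python
import json


def find_collection(node):
    """Depth-first search for the first non-empty list of dicts in a response."""
    if isinstance(node, list):
        if node and all(isinstance(item, dict) for item in node):
            return node
        children = node
    elif isinstance(node, dict):
        children = node.values()
    else:
        return None
    for child in children:
        found = find_collection(child)
        if found is not None:
            return found
    return None


def inner_join(left, right, key):
    """Roughly SELECT * FROM left JOIN right USING (key).

    Assumes every row contains `key`; on clashing field names the
    right-hand row wins.
    """
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


search = json.loads(
    '{"response": {"count": 2, "results": ['
    '{"url": "u1", "title": "One"}, {"url": "u2", "title": "Two"}]}}'
)
other = json.loads('{"items": [{"url": "u2", "tags": "green"}]}')

joined = inner_join(find_collection(search), find_collection(other), "url")
print(joined)
```

Only the row whose `url` appears in both responses survives the join, which is the "mix" step in miniature.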

December 14th, 2009 at 5:28 pm
Thanks for the nice post. Any idea how many sites can be restricted currently in yahoo boss api?