Simian Systems has released a PHP-based website search engine called SiteSearch Pro.
The technology, says the company, lets visitors to your website search not only your web files, news stories and product database, but also the content of other file types stored on your web server, including Word, Excel, PowerPoint, PDF, vCard, iCal, RTF, HTML, XML, MP3, and Photoshop (PSD) files.
Custom types may also be added. Essentially, any file of any type that contains text may be spidered by SiteSearch, letting webmasters offer their visitors a search utility that covers a variety of files, not just web pages. Built on PHP and the Apache Lucene project, SiteSearch provides a documented XML-RPC API for integrating external applications with the SiteSearch query server.
Search results can also be retrieved in RSS newsfeed format. Simian Systems reports that the client PHP classes require no special PHP extensions or configuration changes, since the client and the query server communicate solely through XML-RPC. SiteSearch Pro is part of the Sitellite Content Management System. Sitellite and its creator, Simian Systems, have been around since the turn of this century. Simian maintains an open-source CMS project of the same name as its main product.
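
To give a rough idea of what talking to a query server over XML-RPC involves, here is a minimal sketch in Python using only the standard library. It builds and decodes the request body without contacting any server. The method name "sitesearch.query" and its two parameters (search terms and a result limit) are assumptions made for illustration; the actual method names and signatures are defined in the SiteSearch API documentation and may differ.

```python
import xmlrpc.client

# Hypothetical method name; the real SiteSearch API may use a different one.
METHOD = "sitesearch.query"


def build_search_request(terms, limit=10):
    """Serialize a search call as an XML-RPC request body (an XML string).

    The (terms, limit) parameter list is an assumption for illustration.
    """
    return xmlrpc.client.dumps((terms, limit), methodname=METHOD)


def parse_request(payload):
    """Decode an XML-RPC request body back into (params, method name)."""
    return xmlrpc.client.loads(payload)


payload = build_search_request("annual report")
params, method = parse_request(payload)
```

Because the whole exchange is plain XML over HTTP, any language with an XML-RPC library could play the client role. In a live setup, `xmlrpc.client.ServerProxy` pointed at the query server's URL would send a request like this one and return the decoded response.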