Category Archives: SharePoint Search

Configure Thesaurus in MOSS

I'm working on an architecture review document this week and it suggests, among other things, that the client consider using the thesaurus to help improve the end user search experience. Having never done this myself, I wanted to do a quick hands-on test so that my suggestion is authentic.

It was surprisingly difficult to figure out how to do, even though it is, in fact, quite easy. There's a pretty good bit of information on the thesaurus (check here and here, for example). However, the docs are either WSS 2.0 / SPS 2003 oriented or they don't actually spell out what to do after you've made your changes in the thesaurus. They provide a great overview and a fair bit of detail, but not enough to cross the finish line.

These are the steps that worked for me:

  1. Make the changes to the thesaurus. (See below for an important note.)
  2. Go to the server and restart the "Office SharePoint Server Search" service.
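
To make step 1 a bit more concrete: a thesaurus change is often just a new expansion set, meaning a group of terms that search should treat as synonyms. The Python sketch below builds one as text. The `<expansion>` and `<sub>` element names reflect my understanding of the thesaurus file format, and the helper name `expansion_set` is made up for illustration, so compare the output against any sample entries in your own thesaurus file before pasting it in.

```python
from xml.sax.saxutils import escape


def expansion_set(terms):
    """Return an <expansion> fragment in which each term is a synonym of the others.

    Assumption: the thesaurus file groups synonyms as <sub> children of <expansion>.
    """
    if len(terms) < 2:
        raise ValueError("an expansion set needs at least two terms")
    lines = ["<expansion>"]
    for term in terms:
        # Escape &, < and > so the fragment stays well-formed XML.
        lines.append("    <sub>" + escape(term) + "</sub>")
    lines.append("</expansion>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(expansion_set(["car", "automobile", "vehicle"]))
```

The printed fragment is what you would add to the thesaurus file by hand. Nothing takes effect until the search service is restarted as in step 2.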

A tip of the hat to Mr. J. D. Wade (bio). He provided the key bit about restarting the search service and rescued me from endless, time-consuming and unnecessary iisresets and full index crawls. This episode proves, once again, that Twitter is the awesome. (Follow me on Twitter here. I follow any SharePoint person that follows me.)

I don’t know if this functionality is available in WSS. Either way, please leave a comment or email me and I'll update this post.

Important note: There’s conflicting information on which XML thesaurus file to change. There’s this notion of "tsneu.xml" as being the "neutral" thesaurus. I wasted some time working with that one. In my case, I needed to change the "tsenu.xml" file located under the folder of the application ID itself: \\win2003srv\c$\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Application\3c4d509a-75c5-481c-8bfd-099a89554e17\Config. I assume that in a multi-farm situation, you'd want to make this change everywhere a query server is running.

</end>

Subscribe to my blog.

SharePoint and FAST: the Reese's Peanut Butter Cups of Enterprise Apps?

I’ve finished up day 2 of FAST training in sunny Needham, MA, and I’m bursting with ideas (which all the good training classes do to me). One particular aspect of FAST has me thinking and I wanted to write it down while it was still fresh, before normal day-to-day "stuff" pushed it out of my head.

We SharePoint WSS 3.0 / MOSS implementers frequently face a tough problem with any reasonably-sized SharePoint project: How do we get all the untagged data loaded into SharePoint such that it all fits within our perfectly designed information architecture?

Often enough, this isn’t such a hard problem because we scope ourselves out of trouble: "We don’t care about anything more than 3 months old." "We’ll handle all that old stuff with keyword search and going-forward we’ll do it the RIGHT way…" Etc.

But what happens if we can’t scope ourselves out of trouble and we’re looking at 10’s of thousands or 100’s of thousands (or even millions) of docs, the loading and tagging of which is our devout wish?

FAST might be the answer.

FAST’s search process includes a lot of moving parts but one simplified view is this:

  • A crawler process looks for content.
  • It finds content and hands it off to a broker process that manages a pool of document processors.
  • The broker process hands it off to one of the document processors.
  • The document processor, via a pipeline process, analyzes the bejeezus out of the document and hands it off to an index builder type process.

On the starship FAST, we have a lot of control over the document processing pipeline. We can mix and match about 100 pipeline components and, most interestingly, we can write our own components. Like I say, FAST is analyzing documents every which way but Sunday and it compiles a lot of useful information about those documents. Those crazy FAST people are clearly insane and obsessive about document analysis because they have tools and/or strategies to REALLY categorize documents.

So … using FAST in combination with our own custom pipeline component, we can grab all that context information from FAST and feed it back to MOSS. It might go something like this:

  • Document is fed into FAST from MOSS.
  • Normal crazy-obsessive FAST document parsing and categorization happens.
  • Our own custom pipeline component drops some of that context information off to a database.
  • A process of our own design reads the context information, makes some decisions on how to fit that MOSS document within our IA and marks it up using a web service and the object model.
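
Here is a rough Python sketch of that flow, just to show the shape of the idea. None of this is FAST's or MOSS's actual API: `analyze` is a toy stand-in for FAST's document analysis, `save_context` plays the role of our custom pipeline component, and the table name, topic mapping and URLs are all made up for illustration. An in-memory SQLite database stands in for the real one.

```python
import sqlite3
from collections import Counter

# Hypothetical mapping from a detected topic to a content type in our IA.
TOPIC_TO_CONTENT_TYPE = {"invoice": "Finance Document", "contract": "Legal Document"}


def analyze(doc):
    """Toy stand-in for FAST's analysis: the most frequent known topic word wins."""
    words = Counter(doc["text"].lower().split())
    known = [(words[topic], topic) for topic in TOPIC_TO_CONTENT_TYPE if words[topic] > 0]
    doc["topic"] = max(known)[1] if known else None
    return doc


def save_context(doc, db):
    """Hypothetical custom pipeline stage: drop the context into a database."""
    db.execute(
        "INSERT INTO doc_context (url, topic) VALUES (?, ?)",
        (doc["url"], doc["topic"]),
    )
    return doc


def run_pipeline(doc, stages):
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


def plan_tags(db):
    """Separate process: read the saved context and decide how to tag each document."""
    rows = db.execute("SELECT url, topic FROM doc_context ORDER BY url").fetchall()
    return [(url, TOPIC_TO_CONTENT_TYPE.get(topic, "Untagged")) for url, topic in rows]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE doc_context (url TEXT PRIMARY KEY, topic TEXT)")
stages = [analyze, lambda d: save_context(d, db)]

docs = [
    {"url": "http://moss/docs/a.doc", "text": "Invoice 42 invoice total due"},
    {"url": "http://moss/docs/b.doc", "text": "Meeting notes from Tuesday"},
]
for doc in docs:
    run_pipeline(doc, stages)

if __name__ == "__main__":
    for url, content_type in plan_tags(db):
        print(url, "->", content_type)
```

The sketch stops at the decision. Actually writing the chosen content type back to the MOSS document (the web service / object model step) is left out, since that part depends entirely on your environment.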

Of course, no such automated process can be perfect but thanks to the obsessive (and possibly insane-but-in-a-good-way) FAST people, we may have a real fighting shot at a truly effective mass load process that does more than just fill up a SQL database with a bunch of barely-searchable documents.

</end>

Faceted Search Fence Sitter No More

I had reason today to play around with the CodePlex faceted search project.

It's been around for a while, but I hesitated to download and use it for the usual reasons (mainly lack of time), plus outright fear 🙂

If you're looking to improve your search and explore new options, download it and install it when you have an hour or so of free time. I followed the installation manual’s instructions and it took me less than 20 minutes to have it installed and working. It provides value from minute zero.

It does look pretty hard to extend. The authors provide a detailed walk-through for a complex BDC scenario. I may be missing it, but I wish they would also provide a simpler scenario involving one of the pre-existing properties or maybe adding one new managed property. I shall try and write that up myself in the next period of time.

Bottom line: in a few minutes, you can install it, configure it, use it and add some pretty cool functionality to your vanilla MOSS search and be a hero 🙂

</end>

SharePoint Wildcard Search: “Pro” Is Not a Stem of “Programming”

On the MSDN search forum, people often ask something like this:

"I have a document named ‘Programming Guide’ but when I search for ‘Pro’ search doesn't find it."

It may not feel like it, but that amounts to a wildcard search. The MOSS/WSS user interface does not support wildcard search out of the box.

If you dig into the search web parts, you'll find a checkbox, "Enable search term stemming". Stemming is a human-language term. It’s not a computer language substring() type of function.

These are some stems:

  • "fish" is a stem to "fishing"
  • "major" is a stem to "majoring"

These are not stems:

  • "maj" is not a stem to "major"
  • "pro" is not a stem to "programmer"
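
The difference is easy to see in a few lines of Python. This is only a toy: `toy_stem` strips a handful of English suffixes and is nothing like the stemmer the search engine really uses, and all three function names are invented for this illustration. It does show why "fish" finds "fishing" under stemming while "pro" only finds "programmer" under a prefix (wildcard) match.

```python
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_match(query, term):
    """True when both words reduce to the same stem."""
    return toy_stem(query) == toy_stem(term)


def prefix_match(query, term):
    """True when the term starts with the query, i.e. a 'pro*' style wildcard."""
    return term.lower().startswith(query.lower())


if __name__ == "__main__":
    for query, term in [("fish", "fishing"), ("major", "majoring"),
                        ("maj", "major"), ("pro", "programmer")]:
        print(query, term, "stem:", stem_match(query, term),
              "prefix:", prefix_match(query, term))
```

Stemming works on whole words that share a root; a prefix match works on raw characters. The "Pro" question above is asking for the second thing.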

The WSS/MOSS search engine does support wild card search through the API. Here is one blog article that describes how to do that: http://www.dotnetmafia.com/blogs/dotnettipoftheday/archive/2008/03/06/how-to-use-the-moss-enterprise-search-fulltextsqlquery-class.aspx

A 3rd party product, Ontolica, provides wild card search. I have not used that product.

</end>
