SharePoint and FAST: A Reese's Peanut Butter Cup of Enterprise Apps?

I just finished day 2 of FAST training in sunny Needham, MA, and I'm bursting with ideas (which all good training classes do for me). One particular aspect of FAST has me thinking and I wanted to write it down while it was still fresh, before normal day-to-day "stuff" pushed it out of my head.

We SharePoint WSS 3.0 / MOSS implementers frequently face a tough problem with any reasonably-sized SharePoint project: How do we get all the untagged data loaded into SharePoint such that it all fits within our perfectly designed information architecture?

Often enough, this isn’t such a hard problem because we scope ourselves out of trouble: "We don’t care about anything more than 3 months old." "We’ll handle all that old stuff with keyword search and going-forward we’ll do it the RIGHT way…" Etc.

But what happens if we can't scope ourselves out of trouble and we're looking at tens of thousands or hundreds of thousands (or even millions) of docs, the loading and tagging of which is our devout wish?

FAST might be the answer.

FAST’s search process includes a lot of moving parts but one simplified view is this:

  • A crawler process looks for content.
  • It finds content and hands it off to a broker process that manages a pool of document processors.
  • The broker process hands it off to one of the document processors.
  • The document processor runs the document through a pipeline, analyzes the bejeezus out of it, and hands the result off to an index-builder type process.
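To make the pipeline idea concrete, here is a minimal sketch of what a custom document-processor stage could look like. Everything here is an illustrative assumption: the class name, the `process(document)` hook, and the attribute names are invented for the example and are not the actual FAST API.

```python
# Hypothetical pipeline stage: the process() hook and the attribute
# names ("body", "categories") are assumptions for illustration only.

class CategorizerStage:
    """Examines extracted document text and attaches category metadata."""

    # Toy keyword-to-category map standing in for FAST's real analysis.
    KEYWORD_CATEGORIES = {
        "invoice": "Finance",
        "contract": "Legal",
        "resume": "HR",
    }

    def process(self, document):
        # 'document' is assumed to be a dict of attributes produced by
        # earlier stages (e.g. a text extractor).
        text = document.get("body", "").lower()
        categories = sorted(
            {cat for kw, cat in self.KEYWORD_CATEGORIES.items() if kw in text}
        )
        document["categories"] = categories or ["Uncategorized"]
        return document


doc = {"url": "http://example/doc1.docx", "body": "Q3 invoice and contract terms"}
print(CategorizerStage().process(doc)["categories"])  # ['Finance', 'Legal']
```

A real stage would, of course, lean on FAST's own linguistic and entity-extraction output rather than a keyword map, but the shape (take a document in, enrich it, pass it along) is the point.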

On the starship FAST, we have a lot of control over the document processing pipeline. We can mix and match about 100 pipeline components and, most interestingly, we can write our own components. Like I say, FAST is analyzing documents every which way but Sunday and it compiles a lot of useful information about those documents. Those crazy FAST people are clearly insane and obsessive about document analysis because they have tools and/or strategies to REALLY categorize documents.

So … using FAST in combination with our own custom pipeline component, we can grab all that context information from FAST and feed it back to MOSS. It might go something like this:

  • Document is fed into FAST from MOSS.
  • Normal crazy-obsessive FAST document parsing and categorization happens.
  • Our own custom pipeline component drops some of that context information off to a database.
  • A process of our own design reads the context information, makes some decisions about how that MOSS document should fit within our IA, and tags it using web services and the object model.
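The last two steps above might be sketched like this. The table layout, the category-to-content-type mapping, and the `tag_in_moss()` helper are all assumptions for illustration; a real version would issue an `UpdateListItems` call against the MOSS Lists.asmx web service, or use the object model server-side.

```python
import sqlite3

def decide_content_type(category):
    # Trivial stand-in for "decisions about how the document fits our IA".
    mapping = {"Finance": "Financial Record", "Legal": "Legal Document"}
    return mapping.get(category, "Document")

def tag_in_moss(url, content_type):
    # Placeholder: in reality this would call the MOSS Lists web service
    # (UpdateListItems) or the object model to stamp the metadata.
    print(f"Tagging {url} as {content_type}")

# Context rows dropped off by our hypothetical pipeline component.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_context (url TEXT, category TEXT)")
conn.execute(
    "INSERT INTO doc_context VALUES ('http://moss/docs/inv1.docx', 'Finance')"
)

# The feedback process: read context, decide, tag.
for url, category in conn.execute("SELECT url, category FROM doc_context"):
    tag_in_moss(url, decide_content_type(category))
```

The nice property of this design is that the decision logic lives outside both FAST and MOSS, so you can rerun or refine it without re-crawling anything.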

Of course, no automated process like this can be perfect, but thanks to those obsessive (and possibly insane, but in-a-good-FAST-way) people, we may have a real fighting shot at a truly effective bulk-load process that does more than just fill up a SQL database with a bunch of barely searchable documents.

</end>

Subscribe to my blog.

