A while ago, a student of mine asked if it was possible to use the taxonomic alternative labels from the Managed Metadata Service as thesaurus items in SharePoint Server Search.
There’s no built-in way to do this, but it is possible to generate the required Xml with a little PowerShell.
The Concept
The Term Store contains Term Groups, which contain Term Sets, which in turn contain nested Terms. Each Term can have (optional) synonyms called Labels.
If we find all the Terms with Labels, we can write them out in the correct format as a chunk of Xml, and pipe it into a thesaurus file.
Thesaurus files for SharePoint Server Search are kept under the file path:
%ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\Applications\GUID-query-0\Config
A sample thesaurus file is shown below:
<XML ID="Microsoft Search Thesaurus">
  <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
    <expansion>
      <sub>Internet Explorer</sub>
      <sub>IE</sub>
      <sub>IE8</sub>
    </expansion>
    <replacement>
      <pat>NT5</pat>
      <pat>W2K</pat>
      <sub>Windows 2000</sub>
    </replacement>
  </thesaurus>
</XML>
TechNet explains how this file works and how to manage thesaurus files.
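The mapping from one Term to thesaurus Xml can be sketched without touching SharePoint at all. The helper below is purely illustrative: the name New-ThesaurusExpansion and its parameters are my own, not part of any SharePoint API. It puts the Term name and all of its Labels into a single expansion set, which is one possible design choice, and it escapes characters such as & so the result stays valid Xml.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): builds one <expansion> set
# from a term name and its alternative labels.
function New-ThesaurusExpansion {
    param(
        [string]   $TermName,
        [string[]] $Synonyms
    )

    # Term name first, then every non-empty label that differs from it,
    # with duplicates removed.
    $entries = @(
        @($TermName) + @($Synonyms | Where-Object { $_ -and $_ -ne $TermName }) |
            Select-Object -Unique
    )

    # A set with fewer than two entries has nothing to expand to.
    if ($entries.Count -lt 2) { return }

    # Escape &, <, > and quotes so term names cannot break the Xml.
    $subs = $entries | ForEach-Object {
        '<sub>' + [System.Security.SecurityElement]::Escape($_) + '</sub>'
    }

    '<expansion>' + ($subs -join '') + '</expansion>'
}

New-ThesaurusExpansion -TermName 'Internet Explorer' -Synonyms 'IE', 'IE8'
# <expansion><sub>Internet Explorer</sub><sub>IE</sub><sub>IE8</sub></expansion>
```

Grouping all Labels into one set means a search for any one of them expands to all the others. Emitting a separate two-entry set per Label is simpler, but only links each Label back to the Term name.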
The Script
The script below shows the principle. It iterates over the Terms in each Term Store and finds their Labels. Where the Label is not the same as the name of the Term itself, it represents a synonym and we add it to the Xml.
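function Extract-SPThesaurusFromTermLabels {
    param([string] $webUrl)

    # Open a taxonomy session for the given site
    $ts = Get-SPTaxonomySession -Site $webUrl

    Write-Output "<XML ID='Microsoft Search Thesaurus'>"
    Write-Output "<thesaurus xmlns='x-schema:tsSchema.xml'>"

    # Walk Term Stores > Groups > Term Sets > Terms > Labels.
    # GetAllTerms() includes nested Terms; .Terms returns only the top level.
    $ts.TermStores | % { $_.Groups | % { $_.TermSets | % { $_.GetAllTerms() | % {
        # A Label that differs from the Term name is a synonym
        $_.Labels | ? { $_.Term.Name -ne $_.Value } | % {
            # Escape &, < and > so the output stays well-formed
            $name  = [System.Security.SecurityElement]::Escape($_.Term.Name)
            $label = [System.Security.SecurityElement]::Escape($_.Value)
            Write-Output ("<expansion><sub>" + $name + "</sub><sub>" + $label + "</sub></expansion>")
        }
    } } } }

    Write-Output "</thesaurus>"
    Write-Output "</XML>"
}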
You can then redirect the output of this command to an Xml file, and optionally use this in place of your existing Thesaurus file with something like this:
Extract-SPThesaurusFromTermLabels http://sharepoint > tsLANG.xml
As always, please back up your original Thesaurus files and check the output of this before you use it!
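One quick way to check the output is to confirm that it parses as Xml at all. The function below is a minimal sketch, and its name Test-ThesaurusXml is my own. It only tests well-formedness (for example an unclosed <expansion> element or an unescaped & in a Term name), not whether the search service will accept the file.

```powershell
# Hypothetical helper: returns $true if the file parses as well-formed Xml.
function Test-ThesaurusXml {
    param([string] $Path)

    try {
        # Resolve to a full file system path; stop if the file is missing.
        $fullPath = (Resolve-Path -Path $Path -ErrorAction Stop).ProviderPath

        # XmlDocument.Load throws if the content is not well-formed.
        $doc = New-Object System.Xml.XmlDocument
        $doc.Load($fullPath)
        return $true
    }
    catch {
        Write-Warning ("Thesaurus check failed: " + $_.Exception.Message)
        return $false
    }
}
```

For example, run Copy-Item on the existing thesaurus file in the Config folder to keep a backup, then call Test-ThesaurusXml .\tsLANG.xml and only replace the original if it returns True.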