Making Sense of the SharePoint World

May 27, 2010

Successful SharePoint 2010 People Search


Finding your Way through the Configuration Maze

SharePoint has two basic configuration modes:

- SharePoint sets up "Everything" for you
- You set up "Everything" manually

There is precious little in between these two extremes. The good news is, if you let SharePoint configure everything, chances are everything will work. The bad news is, these settings rarely reflect best practices, and if (when?) you want to tweak some of those settings later you often find that one change has to lead to another, and another, and another in order to get back to working order. By the time you're done you may as well have done it manually in the first place.

Configuring SharePoint 2010 to do people search is one such area. The first half of the manual configuration (or reconfiguration) process is setting up the User Profile import. That is fairly well documented in several places. Probably the best is by fellow MVP Spencer Harbar in his article "A Rational Guide to Implementing SharePoint Server 2010 User Profile Synchronization".

The Bread of the Sandwich

Given how comprehensive Spencer's article is, you wouldn't think that there is anything more to say, and in truth, it is the meat of the issue and often the hardest part to get working. But as I said, that is only half of the story - getting user profile data into SharePoint. What my article is about is letting your users find the information. Since some of this comes before, and some comes after, the AD configuration in Spencer's article, you could think of this as the bread of the sandwich.

Once Central Administration is up and running, the first thing it offers is the opportunity to let another Wizard configure all of your service applications for you, and set up a default SharePoint web application. If you followed Spencer's advice, you said "No" to its kind offer. His article assumes you did, and gives instructions for setting things up completely manually. For this article, I'll assume you said "Yes" and want to fix things up. For completeness, I cover some of the same ground, and you can safely follow either set of instructions for creating the User Profile service application.

Again, if you say "Yes", you'll get something that works. But if you look carefully, you'll discover two big things that violate good configuration practice for production environments:

  1. The Search service application is configured to use the Server Farm/Database Access account as the default content access account.
  2. My Sites and the Profile host site collection are configured to live within that first web application, which is named with the host name of your central administration server.

The first one is easy to address - on the surface. Create a suitable domain account, then in Central Administration, go to your Search service application and assign it to be the default content access account.

image

SharePoint will give it a default read policy on every web application associated with that service application. That's great as far as it goes, but hold that thought for a moment. I'll be coming back to it shortly.

As for the second issue, having the personal sites embedded in a content web application, you'll need to delete and re-create the User Profile Service application to resolve that. Or create the service application for the first time if you didn't invoke the wizard. Whether correcting from the wizard or creating the applications for the first time, other than the deletion, the steps (and some of the potential issues) are the same.

First, create a "normal" web application for your profiles and personal sites. Create a site collection at the root of the web application using either the "Blank" or "MySite Host" template.

Second, go to your Service Applications page and, from the New button, select User Profile Service Application. Like most service applications, this one requires you to allocate an application pool and a number of databases. The page suggests leaving the databases at their default names, which you can do; if you do, make sure the databases from the original service application (if any) are deleted first. Otherwise, give them names appropriate to your environment.

Toward the end of the configuration page, specify the server in your farm that you want to host the profile sync service, and enter the web application you defined in the previous step.

MyWebApp

After you accept your settings, wait for the service application to finish creating. (You will return to the UI before that process completes.) Now would be a good time to go read Spencer's article to see what you should have done to get to this point, and have your AD administrator set the permissions required for your profile import account.

By that time, you should be able to complete the User Profile service application configuration as instructed.

The Last Piece of Bread

In a perfect world, you would be done. Of course, we don't live in a perfect world. Chances are, you'll get a wonderful set of profiles imported, and you can navigate to them and see everything. If your users create MySites, you'll probably even be able to find their content. But do a people search, and you get a whole bunch of "nothing". That's because you're not actually crawling the profile store - at least not successfully.

Time to go back to Central Administration, and first look at your Search service application's management page. Click the Content Sources link on the left hand side, and open/edit your Local SharePoint Sites content source. In the Start Addresses section, you will see a box with entries similar to those below:

image

Notice the sps3: line. This is the protocol SharePoint uses to read profiles. (Note: It isn't a "protocol", per se. It just instructs SharePoint to call a specific web service hosted at that address.) If you ran the wizard to configure your service applications, it will be pointing at the original web application created by it. You'll need to change it to reflect your new profile web application, then save the changes to your content source definition. Also, if you deleted the original wizard-created web application (or aborted its creation), you'll need to delete the regular http: line referencing it.
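To make the start-address change concrete, here is a minimal Python sketch of the same edit applied to a list of entries. It is only an illustration: the real change is made by hand in the content source in Central Administration. The function name and the host names `portal.example.local`, `intranet.example.local`, and `my.example.local` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def retarget_profile_crawl(addresses, new_profile_host, removed_hosts=()):
    """Return start addresses with sps3: entries pointed at the new profile host.

    addresses        -- current start addresses from the content source
    new_profile_host -- host name of the new profile/My Site web application
    removed_hosts    -- hosts of web applications that no longer exist
    """
    removed = {host.lower() for host in removed_hosts}
    result = []
    for address in addresses:
        parts = urlsplit(address)
        if parts.scheme.lower() == "sps3":
            # The sps3: entry tells the crawler which host serves the profiles.
            result.append(urlunsplit(parts._replace(netloc=new_profile_host)))
        elif (parts.hostname or "").lower() in removed:
            # Drop regular entries that reference a deleted web application.
            continue
        else:
            result.append(address)
    return result


# Hypothetical example: the wizard-created web application has been deleted.
current = [
    "http://portal.example.local",
    "sps3://portal.example.local",
    "http://intranet.example.local",
]
print(retarget_profile_crawl(current, "my.example.local",
                             removed_hosts=["portal.example.local"]))
```

Entries that point at web applications you kept pass through untouched; only the sps3: line and the orphaned http: line change.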

You might think (again) that that's all there is, but again you'd probably be wrong. Once you make the change above, you'll probably start seeing access denied errors on that "server". Remember when we assigned a new default content access account way back in step one? Well, even though it has permission to read the contents of the web site, the service under the sps3 protocol leads right back to the User Profile service application, and you didn't tell that application to let the new content access account in.

To do that, navigate to the Manage Service Applications page, and highlight your User Profile Service Application. Click the Administrators icon in the ribbon.

ProfileAdmins

You'll need to add your default content access account to the list of "administrators". It won't really be an administrator - notice that there is an array of permissions available. Once you add the account, highlight it and ensure that the "Retrieve People Data for Search Crawlers" permission is checked, as shown below:

PermissionDialog

Click OK, and reset IIS on the profile import server. Maybe even reboot it.

Best Practices?

At last, you're done. You should now have functioning user profiles and people search, configured in accordance with "best" practices. (Yeah, "best" is relative...) Still, there are reasons for this kind of configuration. It gives you an easily manageable farm, with excellent control over My Sites - ensuring that personal content is in separate databases from your corporate portal data. The account used to crawl won't be the "all powerful" Farm account, and you can tell the difference through access and audit logs between administrative access to resources and the search crawler's.

Now, wasn't that a tasty sandwich?


Oct 12, 2009

Knowing Your Limitations


"2.1 Billion ID's Should be Enough for Anybody!"

One of the more infamous stories about Bill Gates is that he once said "640K of memory should be enough for anyone." That wasn't true - he never said it - but it did point up the frustration that came from one of the design limits of the original IBM PC. The memory between 640K and 1MB (which was the physical limit of the CPU) was allocated by IBM for video, I/O buffers, and lots of other "housekeeping", and therefore couldn't be accessed by DOS. This was fine at the time, when the typical computer came with 64K of RAM, and even expanding to 512K was a luxury; but when applications (like Lotus 1-2-3, dBase III, and even Windows itself) became complicated enough to require that memory, and more powerful CPUs became available that allowed access to even more, that big "gap" before getting to the extended memory required more effort to program around than anyone could have predicted. (Yes, that's way over-simplified, but it is enough to get the point across...)

The reason I bring up this little history lesson is to point out that when you are designing products, you have to set limits somewhere. Sometimes these limits are intrinsic, like the 1MB maximum RAM of the 8088 CPU. Others are compromises, like how much of that 1MB to allocate for system housekeeping, and where to locate it in the address space. You hope you set these high enough that most users will never see them, but they are there.

SharePoint also has a number of limits. Most of them are well documented. Some of them are "soft" limits - places where you see performance degradation. Others are "hard" limits, like the maximum size of an integer value. But some limits are buried under the covers, because they are internal to a function, and users never see the processes that are impacted. If they are set high enough, the users will never even know they exist.

Crawling Forward

Unfortunately, there is a limit that wasn't set high enough. This was buried deep inside the MOSS and MSS search databases. Most database tables have a field for a unique identifier. This is automatically incremented every time a new row is added. Typically, a SQL Server Integer (int) is used for this ID, allowing up to just over 2 billion items to be added (2,147,483,647 if you must know). That's a lot. But this value just goes up - it isn't decremented if you delete a row.

In the SharePoint Search DB, there is a table that keeps track of all of the links in your crawled content. Whenever you do a new crawl, rows are added to and deleted from this table. This table originally used the int referenced above for its ID field. Now, there can be a lot of links in a SharePoint site, but still, 2.1 billion should take an awfully long time to reach, and in most cases it does. But reach it you can. For very large and complicated sites, if you do a full crawl every day (which deletes and replaces all of the link references) you can reach it faster than you might (and the developers did) think.
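A rough back-of-the-envelope estimate shows how a daily full crawl can eat through the ID range. The Python sketch below is only a model: it assumes every full crawl re-inserts every link row and that each insert consumes a new ID. The figure of 10 million link rows is a made-up example, not a measurement from any real farm.

```python
INT_MAX = 2**31 - 1  # largest value a SQL Server int identity can hold


def days_until_exhausted(links_per_full_crawl, crawls_per_day=1, already_used=0):
    """Estimate whole days of crawling before the identity column runs out.

    Assumes each full crawl deletes and re-inserts every link row, and that
    IDs of deleted rows are never handed out again.
    """
    remaining = INT_MAX - already_used
    ids_per_day = links_per_full_crawl * crawls_per_day
    return remaining // ids_per_day


# Hypothetical farm: 10 million link rows rewritten by one full crawl a day.
print(days_until_exhausted(10_000_000))  # 214
```

Under those assumptions the table runs dry in about seven months, even though it never holds more than 10 million rows at any one time.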

So, what happens if SharePoint actually hits this limit, and runs out of IDs? It isn't pretty. Essentially the crawling process gets stuck. It asks the database for permission to write the next available row, and since there isn't an ID that can be given to it, the database just says "no". Unfortunately, SharePoint doesn't take no for an answer, and keeps asking. You will, occasionally, see an error in the event log talking about a SQL Identity failure, but unless you were aware of this possibility, it wouldn't make much sense.

Recovering

This also prevents you from effectively controlling search. Because SharePoint insists on finishing the last thing it was doing, you can't stop the crawl. Because there isn't much to go on in the logs, and it takes some SQL Server proficiency to accurately diagnose the problem, many times, this results in folks rebuilding their SSP, with all of the pain and agony that entails, just for the want of an ID.

Note: At this point, you need to consider the search index on this SSP corrupt. There is nothing that can recover the ability to crawl new content without resetting your index and doing a full crawl as described in the prevention section below.

Even if you can successfully diagnose it, there are very few supportable solutions that *don't* involve rebuilding the SSP one way or another. Remember, directly modifying the SharePoint databases yourself can result in an unsupported state. So, if you reset the seed of the maxed-out table to 1 in order to get control of the crawl back and stop it, you should restore the search database from a backup to reach a production state before you reset the crawled content (see below), which resets the database to an initialized state.

You can also restore your whole SSP from a backup, but that's almost as much fun as rebuilding it, and it assumes you have a restorable backup of your SSP.

An Ounce of Prevention

Obviously, it is much better to prevent this problem from occurring in the first place than to try recovering from it. There are a couple of ways to do this. The first and best is to upgrade your SharePoint environment to Service Pack 2. Among the many enhancements in SP2, the ID fields in the search databases that were prone to maxing out are updated to "big" integers. BigInts have twice as many bits as regular integers (64 instead of 32). That doesn't just double the capacity, though. It makes it roughly 4.3 billion times as large. (For those who really need to know, it makes the number of possible IDs 9,223,372,036,854,775,807!) So, if it took 6 months to reach the old limit, it would take about 26 billion months to reach the new one.
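The arithmetic behind that comparison can be checked in a few lines of Python. The two maximums are the standard ranges of the SQL Server int and bigint types. The helper name is just for illustration, and the six-month figure is the same hypothetical used above. Using the unrounded ratio gives a slightly larger result than rounding it to an even 4 billion.

```python
INT_MAX = 2**31 - 1     # SQL Server int:    2,147,483,647
BIGINT_MAX = 2**63 - 1  # SQL Server bigint: 9,223,372,036,854,775,807

# How many times larger the bigint range is (a little over 2**32).
growth = BIGINT_MAX / INT_MAX


def months_to_exhaust_bigint(months_to_exhaust_int):
    """Scale a time-to-exhaustion for int up to bigint at the same insert rate."""
    return months_to_exhaust_int * growth


print(f"{BIGINT_MAX:,}")
print(f"{growth:,.0f} times larger")
print(f"{months_to_exhaust_bigint(6):,.0f} months")
```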

If you can't upgrade to SP2, you should consider adding a periodic reset of the index into your maintenance plans - especially if you have a very large corpus, with lots of links. The option to do this is available from the Quick Launch on the Search Administration page.

image

Resetting the crawled content doesn't impact your settings, keywords and best bets, etc... But it does delete your existing index and completely resets the search crawl database - including the table ID fields. After the reset, search results will not be available until a full crawl is performed, so you should schedule this to take place during a down time and/or notify your users of the search outage. If you have multiple content sources defined, you will need to crawl all of them.

When you select reset, you will get a screen asking if you want to turn off search alerts during the reset. It will default to being selected, and you should leave it that way.

image

The alerts can be reactivated once your crawls have been completed.

Conclusion

As Clint Eastwood once said as Dirty Harry, "A man's got to know his limitations." Everyone, and every thing, has limits.

Limits are only a problem when you don't know about them, and don't take them into account. SharePoint, as powerful as it is, has plenty of them. In addition to the hidden limit I covered in today's article, you might want to review some of the better-known limits in the SharePoint planning material: Planning for Software Boundaries.


Aug 19, 2009

My Free SharePoint Twitter Integration Components


Yes - I Still Like Twitter!

If you've been following my saga over the last few weeks, you'll know that I was temporarily suspended from Twitter due to a cross-site attack that caused an inappropriate spam link to be injected into my tweetstream. While I am still disappointed that it took Twitter customer service almost two weeks to reinstate me, I do still like Twitter.

In an effort to "bury the hatchet", I am re-posting links to some components I wrote to bring Twitter into SharePoint. The first two are simple and fancy Federated Location Definitions for Search Server 2008, or MOSS Search (post-Infrastructure Update). The third is a simple Data View web part that can provide a twitter search result on any SharePoint page, including WSS.

(Note: For all of the download links below, right-click and choose "Save target as" to retrieve them.)

Federated Locations

See the original articles: Part 1, Part 2

Download the "basic" Twitter search results Federated Location Definition

Download the "deluxe" Twitter search results Federated Location Definition
image image

Data View Web Part

See the article on how to create this part.

Download this part.

image

You can see all three components in action here.


Mar 5, 2009

Binary Free SharePoint Twitter Search Web Part


No Assembly (or C#, or VB) Required

Searching Twitter from SharePoint has become all the rage since I originally posted my Twitter Search Federation articles (Part 1, Part 2). Federation is great if you have Search Server, or the Infrastructure Updates. But what if you are only using WSS? Or what if you just want to drop a Twitter search into any old SharePoint page, rather than a full Results page? And more critically - what if you don't have direct access to the SharePoint server in order to install a binary web part and feature - with or without a Solution Package (WSP)?

Well, buried in Part 2 of my article was the solution: a Data View Web Part (DVWP) that displays the results of a Twitter search. In the original article, that DVWP was just an interim step on the way to Federation. For this article, a form of that web part is the actual goal. So, I'm going to start by re-using the DVWP section of the Federation article - with a tweak or two :). But then, I'll also show you how to do two very important things - connect the web part to an input form (or any other web part), and export it for use on other SharePoint sites.

Note: You can find a link to download the Twitter Search Results Web Part at the end of this article.

A Data View Refresher

The Data View Web Part is a way to display information from virtually any source within SharePoint. Data Views are created in SharePoint Designer, in association with another feature called the Data Source Library. This is not to be confused with the "Business Data Catalog", or BDC. While both the Data Source Library and the BDC deal with presenting data from external sources within SharePoint, the BDC is a part of MOSS Enterprise, and allows a much deeper integration of the data with various aspects of SharePoint. The Data Source Library, on the other hand, is available in all editions of SharePoint - from WSS on up - and is primarily used to generate Data View/Data Form Web Parts.

Data Views and the Data Source Library are a very powerful combination - so much so that almost two whole chapters of my book are devoted to them. Obviously, I can't go into that kind of detail here, but while this particular example is fairly simple, it covers a lot of ground.

The link between Federation and Data Views is pretty close. In fact, prior to Search Server or the Infrastructure Updates, you could use a Data View to achieve very similar results. The original Federation article took advantage of this by building the look we wanted in a Data View, then transferring it into the Federated Location definition. Here, the Data View itself is the finished product.

Creating a Data View

Before we can create a Data View, we need to create a new item in the Data Source Library for our Atom feed.

To do this, select "Manage Data Sources..." from the Data View menu in SharePoint Designer to summon the Data Source Library task pane. Atom and RSS feeds fall into the category of "Server-side Scripts" that return XML, so expand the Server-side Scripts section and click "Connect to a script or RSS Feed." You will see the dialog below. Fill in the URL with the same Twitter Atom query we have been using: http://search.twitter.com/search.atom?q=sharepoint (See Part 1 of the original article for details on how this was derived.)

image

The query parameter (q) will automatically be passed into the list as soon as you change the focus from the URL field. "SharePoint" will become the default parameter value, and give us something to see as we customize the look. If you are following along, you can replace "SharePoint" with any default query term that might be appropriate to your environment. Make sure the Runtime Parameter box is checked, otherwise you can't change the query later.
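To see what the runtime parameter amounts to, here is a small Python sketch that builds the same Atom query URL for an arbitrary term. The endpoint is the one used in this article. The function name is made up for illustration, and SharePoint Designer does the equivalent substitution for you when the Runtime Parameter box is checked.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "http://search.twitter.com/search.atom"  # URL used in this article


def build_search_url(term="sharepoint"):
    """Return the Atom search URL for a term, passed as the q parameter."""
    # urlencode escapes spaces and punctuation so the term is safe in a URL.
    return SEARCH_ENDPOINT + "?" + urlencode({"q": term})


print(build_search_url())               # the default query from the dialog
print(build_search_url("data views"))   # a different runtime value for q
```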

Now that we have the Data Source, we need a place to put it. This can be any web part page. You can use a search results page if you feel so inclined, but it doesn't need to be one.

Once you have a web part page open, select a Web Part Zone, and then pick "Insert Data View..." from the Data View menu. The Data Source you created above will have a drop-down menu associated with it. Select "Show Data".

You will see the Data Source Details task pane, with the structure of the Twitter Atom feed displayed.

image

I've maximized my task pane for this screen shot in order to show you how the SharePoint Designer data source displays the entire structure of the feed. Notice the folders and item scrolls for the various elements. The Twitter Atom feed is a "hierarchical" data source. This means that the data has nested, potentially (and in this case, actually) repeating, elements, which in turn may have their own nested elements.

For now, the primary entity we are interested in is the "Entry" folder. Look at the screen shot to the right. Highlight the elements in the "Entry" folder as shown, and select "Multiple Item View" from the "Insert Selected Fields as..." menu. (Yes, I know. It looks like a button, but trust me - it's a menu!)

A table will be inserted into the web part. That's got most of the information we want, but it isn't terribly pretty. So, let's fix it up!

The first column contains the "href" entity. Ironically, even though there is a separate entity for the Author, one of the two links listed for each user is the Author's avatar. The other is a link to the Twitter URL of the tweet itself. For our results, we really only want the avatar, so we're going to do two things - Change the display to show the image instead of the URL, and hide the other URL.
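If it helps to see the "hierarchical" structure outside of SharePoint Designer, the Python sketch below flattens a tiny Atom document into one row per entry and keeps only the avatar link. The sample feed is invented for illustration, and using rel="image" to identify the avatar is an assumption made for this sample, not something taken from the screen shots.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace prefix for ElementTree

# Hypothetical sample feed with one repeating <entry> element.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sharepoint - Twitter Search</title>
  <entry>
    <title>Trying out Data Views</title>
    <content type="html">Trying out &lt;b&gt;Data Views&lt;/b&gt;</content>
    <updated>2009-03-05T12:00:00Z</updated>
    <link rel="alternate" type="text/html" href="http://twitter.com/example_user/statuses/1"/>
    <link rel="image" type="image/png" href="http://example.com/avatar.png"/>
    <author>
      <name>example_user (Example User)</name>
      <uri>http://twitter.com/example_user</uri>
    </author>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Flatten each repeating entry into a dict, keeping only the avatar link."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.findall(ATOM + "entry"):
        # Each entry carries two links; index them by their rel attribute.
        links = {link.get("rel"): link.get("href")
                 for link in entry.findall(ATOM + "link")}
        rows.append({
            "avatar": links.get("image"),
            "content": entry.findtext(ATOM + "content"),
            "updated": entry.findtext(ATOM + "updated"),
            "name": entry.findtext(ATOM + "author/" + ATOM + "name"),
            "uri": entry.findtext(ATOM + "author/" + ATOM + "uri"),
        })
    return rows


for row in read_entries(SAMPLE_FEED):
    print(row)
```

The nested author element is reached with a two-step path, which is the same nesting the Data Source Details pane shows as folders.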

To change to an image view, click one of the URLs in the href column. To the right will be a little box with a chevron in it:

image

When you click it, you will have choices to modify the current field. Select Picture.

image

You will get a warning that URLs and Pictures can be dangerous. We know that, so click Yes.

The changes you make here will affect all of the items of that series. (You probably noticed that they were all highlighted in a different color when you clicked on any one of them.)

Once you have done that, to suppress the other image (which will show as a "broken" picture), right-click the broken link and select "Conditional Formatting". In the Conditional Formatting task pane, select "Show Content" from the "Create" menu (another one of those "buttony" menus). In the Condition Criteria box, set the conditions like this:

image

The broken link will go away.

Next, we want to merge the rest of the cells in the row. This is just like any other table action - highlight the data cells (not the labels) for content, updated, name, and uri. Right-click, and select "Modify/Merge Cells". Now we're cooking! Just a couple more tweaks, and it will be there.

Select the tweet content text, and change its format to Rich Text (just like changing the image format above).

Select the date, and format it to your regional liking.

Notice that we have a link to the Author, the Author's name, and the Author's avatar. Wouldn't it be great to have the name and the avatar actually link to the Author's page? Well, we can. If you click the chevron by the link, you will see that the field being displayed is called "ddw1:author/ddw1:uri". For the text, change the format to Hyperlink, and you will see the following dialog. You can use the "fx" icons to select the fields you wish to use in the hyperlink, or enter the values manually. In either case, you want the "Text to display" and "Address" fields to be set as shown:

image

Setting the link on the picture is easy, too. Just right-click the image, and select "Hyperlink" from the context menu. Set the address to the same token as you used above. Now you can delete the field that shows the text of the author link.

You should now have a web part that looks a lot more like what you would expect from a Twitter search:

image

Pretty good, but I'm still not satisfied. :)

Notice the chevron icon in the upper-right corner of the web part.

If you click it, you will summon the "Common Data View Tasks" menu:

image

Click Data View Properties. You will get this dialog:

image

Click "Show view header" and "Show view footer", then click the "Paging" tab.

Click "Limit the total number of items displayed to:" and enter a reasonable number for a search results page. (I picked 5). Click OK.

Display the Data Source Details task pane, and drag the first title field available into the newly created header. Click in the footer, and delete the Item Count. In the "link" group (above the "title" field you just used), make sure item 1 (rel = "alternate") is selected. Highlight the "href" and select "Item(s)" from the Insert Selected Field menu. Change its format to a Hyperlink. Leave the Address as-is, but change the Text to Display to "More Results..."

I'm going to delete the field name row, rearrange the fields slightly, and also apply the style "ms-searchChannelTitle" to the Header cell. This results in a part that looks like this:

image

Now I'll make one more change to this web part to allow it to respond to a URL query string. From the Common Data View Tasks menu, select Parameters. You should see this dialog, displaying the "q" parameter that got created when we built the Data Source:

image

From the Parameter Source menu, select Query String. This will add a field for you to enter the name of the URL parameter you will be passing. The standard for SharePoint Search keyword parameters is "k", so I suggest using this (without the quotes, of course). This allows you to use this web part on a standard SharePoint results page and have it respond appropriately. But you can also then use the k parameter in the query string of any SharePoint page you drop the part on!

Your Twitter Search Results web part is done! You can save the page and close SharePoint Designer.

The Twitter Search Results Web Part in Action!

When you display the page on which you created the Twitter Search Results web part, you will see the default result set:

image

To show that the URL Query string works, append a "?k=twitter" to the URL (again, no quotes), and hit the Enter key. The results will change to Tweets containing the word Twitter:

image

Notice also how the search form recognized the "k" parameter, and set it as the default keyword for an internal SharePoint search...

Now that we know the part is working, delete the k parameter from the address bar and hit Enter to return to the default page. We need to do this because the query string parameter will override the web part connection we're going to be making. (You might want to keep that in mind, as there may be times you find that behavior useful in your own data views...)
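The precedence described here can be modeled in a few lines of Python. This is a sketch of the behavior as this article describes it, not of SharePoint's internals. The function name is hypothetical, and the default term stands in for whatever you set on the Data Source.

```python
from urllib.parse import parse_qs


def resolve_search_term(query_string, connection_value=None, default="sharepoint"):
    """Pick the value that ends up in the data source's q parameter.

    Order modeled here: the k value from the URL query string wins, then a
    value supplied through a web part connection, then the default.
    """
    k_values = parse_qs(query_string.lstrip("?")).get("k")
    if k_values:
        return k_values[0]
    if connection_value:
        return connection_value
    return default


print(resolve_search_term("?k=twitter", connection_value="moss"))  # twitter
print(resolve_search_term("", connection_value="moss"))            # moss
print(resolve_search_term(""))                                     # sharepoint
```

That first case is why the k parameter has to come out of the address bar before the connection is tested.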

Let's insert a "form" web part on the page. From Site Actions, select Edit Page. In one of your Web Part zones, click the Add a Web Part link, and select Form Web Part from the Miscellaneous section, and click the Add button:

image

By default, this form will have a text box, and a "Go" button, and will be called "Form Web Part". On your site, you will probably want to set the web part properties to give it a different label, such as "Twitter Search". For purposes of this article, I'm going to move directly on to setting up the connection.

From the edit menu on the part you just inserted, select Connections.

image

From the fly-out submenu, select Provide Form Values To, and select the Twitter Results web part. You will get this dialog:

image

Select Get Parameters From, and click the "Configure" button. The dialog will then ask for the Consumer Field Name. Select "q" (the parameter name), and click the "Finish" button.

image

You can then also exit Edit mode on your page.

Now just enter a term into your form, and click "Go". You will get Twitter Search results for the term you selected!

image

You don't need to use a form for the web part connection. You can connect to almost any web part in SharePoint to get query parameters. For example, you could connect to a client list and use the company name as the search term. You could then just click on each client's record to see their Twitter buzz.

Exporting and Importing the Twitter Results Web Part

Important: Remove the web part connection created in the previous exercise before exporting! 

One of the great things about the Data Views you create in SharePoint Designer is that you can easily export them for use on virtually any SharePoint site that has access to the data used to define it. To do this, use the little arrow in the top right corner to summon the web part menu, and select Export:

image

A standard download box will appear, allowing you to save the file to your local machine. In our case, the part will be called Twitter_Results.webpart. ".webpart" is one of two extensions you might see when exporting SharePoint web parts. (The other is ".DWP")

So, what do you do with the file once you have it? You import it back into SharePoint! As mentioned earlier, it doesn't need to go on the same site - or in this case, even the same server! As long as the server you are installing the part on has access to Twitter, you can use this part.

There are two ways to import the part:

  1. Directly importing onto a page
  2. Adding it to the Web Part Gallery

To import it onto a page, start the same way as always: From Site Actions, pick Edit Page. Then click Add a Web Part over a Web Part zone. However, this time, you need to click the link on the bottom of the window: "Advanced Web Part gallery and options"

image

This will close the dialog and open the Add Web Parts task pane. At the top of the task pane is a menu. Select Import from the menu: image

This changes the task pane to Import mode. From here, you can either type the path to the .webpart file, or use the standard Windows file dialog to browse for it:

image

Click Upload, and the part will appear on a list below the form. You can then drag it into the Web Part zone of your choice.

The downside of this method is that you have to re-import the web part for every page on which you want it to appear. Fortunately, you can make it available to any page in your site by adding it to the Web Parts Gallery.

To access the Web Parts gallery: From the Site Actions menu, select Site Settings. On the Site Settings page, you will see the list of Galleries. Click the "Web Parts" link.

image

Essentially the Web Parts gallery works just like any other document library in SharePoint:

image

To add the web part, click Upload, and browse to the web part file. After you upload it, you can enter a full description, and determine the context(s) in which the part will be selectable.

image

Click OK to complete the Save process. From then on, your Twitter Results web part will be available from the standard Add Web Parts dialog box:

image

Of course, these techniques apply to almost any Data View or Content Editor Web Parts you create, not just this Twitter Search.

You can Download this part here!