Thursday, 29 April 2010

XML sitemaps

Search engines such as Google can make use of XML sitemaps to discover content. Such sitemaps are useful if a site:

  • Has dynamic content
  • Has pages that aren't easily discovered during the crawl process (e.g. rich media)
  • Is new and has few links to it
  • Has a large archive of content pages that are not well linked to each other, or are not linked at all

The XML sitemap protocol is an open standard defined by http://www.sitemaps.org.

According to the specification:

  • You can provide multiple sitemap files
  • Each sitemap file can contain at most 50,000 URLs
  • Sitemap files must be no larger than 10 MB when uncompressed
  • Compression is allowed
  • Multiple sitemap files should be listed in a sitemap index file
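
As a rough illustration of the limits above, the following Python sketch splits a list of URLs into sitemap files of at most 50,000 entries and builds a sitemap index that lists them. The function name, the sitemapN.xml naming scheme and the base_url parameter are assumptions made for this example. It does not check the file size limit or apply compression.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file URL limit from the specification


def build_sitemaps(urls, base_url, max_urls=MAX_URLS_PER_FILE):
    """Return (index_xml, {filename: sitemap_xml}) for the given URLs."""
    files = {}
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        name = "sitemap%d.xml" % (len(files) + 1)  # hypothetical naming scheme
        files[name] = ET.tostring(urlset, encoding="unicode")

    # The index lists every sitemap file by its absolute URL.
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in files:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = base_url + name
    return ET.tostring(index, encoding="unicode"), files
```

Calling build_sitemaps(urls, "http://example.com/") returns the index document plus a dictionary of sitemap documents, which can then be written to disk under the names used in the index.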

The location of a sitemap file determines which URLs it can contain.

“A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/.” – see http://www.sitemaps.org/protocol.php#location
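
The location rule quoted above can be sketched as a small Python check. The function name is made up for this example, and it covers only the default rule: same scheme and host, and a path at or below the directory that holds the sitemap.

```python
import posixpath
from urllib.parse import urlsplit


def allowed_in_sitemap(sitemap_url, page_url):
    """True if page_url sits at or below the directory holding the sitemap."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # The page must use the same scheme and host as the sitemap itself.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    directory = posixpath.dirname(sitemap.path)
    if not directory.endswith("/"):
        directory += "/"
    return page.path.startswith(directory)
```

With the URLs from the quotation, a sitemap at http://example.com/catalog/sitemap.xml accepts pages under /catalog/ and rejects pages under /images/.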

See http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156184 for Google specifics.

Identifying canonical content

Search engines may down-rank pages if the same content is referenced by multiple URLs. This would apply in cases where URL parameters are added to the URL to put the page in different modes (e.g. sort order). It would also apply in the case of pages that are referenced by different portals (i.e. different domains).

For example, the following URLs could all point to the same content:

  • http://<domain name here>/mypage.aspx
  • http://<domain name here>/mypage.aspx?sort=desc
  • http://<domain name here>/mypage.aspx?sort=desc&page=2
  • http://<another domain name here>/mypage.aspx

To help search engines identify the canonical version of the content, add a <link> tag to the <head> section of each duplicate page:

<link rel="canonical" href="http://<domain name here>/mypage.aspx" />
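
If the tag is generated by code, one possible approach is sketched below in Python. It assumes that query-string parameters never change the content and that a single preferred domain is known. The function name and the canonical_host parameter are illustrative only.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url, canonical_host):
    """Build a <link rel="canonical"> tag for url.

    Assumes the query string can be dropped and that canonical_host is
    the preferred domain; both are choices made for this example.
    """
    parts = urlsplit(url)
    canonical = urlunsplit((parts.scheme, canonical_host, parts.path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)
```

All four example URLs above would then produce the same tag, pointing at mypage.aspx on the preferred domain.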

For Google’s guidance go here: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

More on duplicate content:

Friday, 23 April 2010

Getting around proxy issues

Some proxy issues can be addressed by altering the machine.config file. For example, ClickOnce deployments may fail when requests have to pass through a proxy.

Details of the appropriate machine.config elements can be found here:

http://msdn.microsoft.com/en-us/library/aa903360

<configuration>
  <system.net>
    <defaultProxy enabled="true" useDefaultCredentials="true"/>
  </system.net>
</configuration>

Wednesday, 14 April 2010

Problems recycling SQL Server logs

SQL Server sometimes has problems cycling the error log or the SQL Server Agent log. This can result in maintenance plans reporting failure.

These errors may manifest themselves on the server as missing files in the log sequence (e.g. there is an ERRORLOG file, ERRORLOG.1 is missing, but there is an ERRORLOG.2 etc.). Occasionally the problem can be fixed by renaming the files so the sequence is complete again.
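
To spot such gaps quickly, a directory listing can be checked with a few lines of Python. This is only a sketch: the function name is invented for this example, and it assumes the archived logs are named ERRORLOG.1, ERRORLOG.2 and so on (pass base="SQLAGENT" for the Agent logs).

```python
import re


def missing_log_numbers(filenames, base="ERRORLOG"):
    """Return the archive numbers missing from a list of log file names."""
    pattern = re.compile(r"^%s\.(\d+)$" % re.escape(base), re.IGNORECASE)
    numbers = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            numbers.add(int(match.group(1)))
    if not numbers:
        return []
    # Every number from 1 up to the highest archive should be present.
    return [n for n in range(1, max(numbers) + 1) if n not in numbers]
```

Feeding it the output of os.listdir() for the SQL Server log folder gives the numbers that would need to be restored by renaming.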

To recycle the error log or the SQL Server Agent log manually, try running one of the following:

EXEC msdb.dbo.sp_cycle_agent_errorlog

EXEC sp_cycle_errorlog

DBCC ERRORLOG

You can also recycle the SQL Server Agent log via Management Studio by right-clicking on SQL Server Agent > Error Logs and choosing Recycle from the popup menu. However, it is possible that the log simply won’t recycle and the attempt fails with error 22022.

In this case a safe bet is to restart the SQL Server Agent service.

Thursday, 1 April 2010

Function for splitting text data

Try using the following function:

USE [DatabaseNameHere]
GO

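CREATE FUNCTION dbo.Split
    (
      @String VARCHAR(8000),
      @Delimiter CHAR(1)
    )
RETURNS @temptable TABLE ( items VARCHAR(8000) )
AS BEGIN
    DECLARE @idx INT
    DECLARE @slice VARCHAR(8000)

    SELECT  @idx = 1
    -- Nothing to split: return an empty table
    IF LEN(@String) < 1
        OR @String IS NULL
        RETURN

    WHILE @idx != 0
        BEGIN
            -- Position of the next delimiter (0 if none is left)
            SET @idx = CHARINDEX(@Delimiter, @String)
            IF @idx != 0
                SET @slice = LEFT(@String, @idx - 1)
            ELSE
                SET @slice = @String

            -- Skip empty items (e.g. between two adjacent delimiters)
            IF ( LEN(@slice) > 0 )
                INSERT  INTO @temptable ( items )
                VALUES  ( @slice )

            -- Drop the processed item and its delimiter. SUBSTRING is used
            -- instead of RIGHT(@String, LEN(@String) - @idx) because LEN
            -- ignores trailing spaces and can miscount the remainder.
            SET @String = SUBSTRING(@String, @idx + 1, 8000)
            IF LEN(@String) = 0
                BREAK
        END
    RETURN
   END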

Then execute a query like the following:

SELECT TOP 10 * FROM dbo.Split('Item1,Item2,Item3',',')
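
For comparison, the behaviour of dbo.Split can be approximated outside the database. The Python sketch below is not part of the original function. It mirrors the main rule (empty items are skipped, order is preserved) but not T-SQL's special handling of trailing spaces.

```python
def split_items(text, delimiter=","):
    """Approximate dbo.Split: return the non-empty pieces of text, in order."""
    if not text:
        return []
    return [piece for piece in text.split(delimiter) if piece]
```

For the query above, split_items("Item1,Item2,Item3") yields the same three items that the function returns as rows.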

The code was taken from here: http://www.logiclabz.com/sql-server/split-function-in-sql-server-to-break-comma-separated-strings-into-table.aspx