<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Data Quality Chronicle</title>
	<atom:link href="http://dqchronicle.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://dqchronicle.wordpress.com</link>
	<description>An event based log about a service offering</description>
	<lastBuildDate>Tue, 01 Dec 2009 05:28:55 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='dqchronicle.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/2d1ef2833156419569246769738dfe75?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Data Quality Chronicle</title>
		<link>http://dqchronicle.wordpress.com</link>
	</image>
			<item>
		<title>Microsoft Dynamics CRM Duplicate Consolidation Management</title>
		<link>http://dqchronicle.wordpress.com/2009/12/01/microsoft-dynamics-crm-duplicate-consolidation-management/</link>
		<comments>http://dqchronicle.wordpress.com/2009/12/01/microsoft-dynamics-crm-duplicate-consolidation-management/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 05:28:55 +0000</pubDate>
		<dc:creator>wesharp</dc:creator>
				<category><![CDATA[data quality]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[master data management]]></category>
		<category><![CDATA[MasterID]]></category>
		<category><![CDATA[merging duplicates]]></category>
		<category><![CDATA[Microsoft Dynamics CRM]]></category>

		<guid isPermaLink="false">http://dqchronicle.wordpress.com/?p=329</guid>
		<description><![CDATA[After receiving a comment on last month&#8217;s post I decided to do a follow-up and detail a little further how Microsoft Dynamics CRM manages the merging of duplicate records.  For the purposes of this post I&#8217;ll stick to using Contacts as the example.  However, the same is true for Accounts and many other tables. 
For our sample records let&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>After receiving a comment on last month&#8217;s post I decided to do a follow-up and detail a little further how Microsoft Dynamics CRM manages the merging of duplicate records.  For the purposes of this post I&#8217;ll stick to using Contacts as the example.  However, the same is true for Accounts and many other tables. </p>
<p>For our sample records let&#8217;s say we have just two contacts that are duplicates.  Contact A has four service calls associated with it.  Each of these service calls has relevant data that you want to retain.  Contact B has three service calls associated with it and each service call has data that needs to be retained.</p>
<p>Upon merging Contact A with Contact B (so in this case B is the keeper record), there will be seven service calls associated with Contact B.  This is accomplished through the use of three data elements in each Contact transaction. These fields are MasterID, Merged, and Statecode. </p>
<p>Merged is an indicator field where 1 indicates that the transaction is indeed merged.  Statecode is another indicator field indicating active and inactive transactions.  In Dynamics a statecode value of 1 is inactive and 0 is active.  Yes, you read that right.  Zero is active. </p>
<p>The magic of the duplicate consolidation lies in the MasterID field.  For consolidated records the MasterID is equal to the unique identifier of the keeper record.  So in our example if the unique identifier of Contact B was 1234, the MasterID of Contact A will be 1234.  Contact A&#8217;s Merged field would be 1 and its Statecode would be 1.  This allows all associated transactions of Contact A to be &#8220;realigned&#8221; to Contact B without having to perform massive updates to many tables.  The MasterID functions as a re-pointer: the related transactions themselves are not changed, and anything related to the non-keeper record is directed to the keeper record.</p>
<p>For those transactions which are not merged the MasterID is NULL (no need to store the unique identifier twice).</p>
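<p>The re-pointer idea can be sketched in a few lines of Python.  This is only an in-memory illustration of the behavior described above, not how the CRM is implemented internally.  The field names come from this post; the identifier 5678 for Contact A, the call labels, and the helper functions are made up for the example.</p>
<pre>
```python
# Illustrative model only: field names follow the post, the data is invented.
ACTIVE, INACTIVE = 0, 1  # Dynamics statecode: 0 is active, 1 is inactive

contacts = {
    # Contact B is the keeper, so its MasterID stays empty.
    1234: {"name": "Contact B", "masterid": None, "merged": 0, "statecode": ACTIVE},
    # Contact A was merged into B (5678 is a hypothetical identifier).
    5678: {"name": "Contact A", "masterid": 1234, "merged": 1, "statecode": INACTIVE},
}

# Service calls keep pointing at the contact they were created against.
service_calls = (
    [{"call": f"A-{n}", "contactid": 5678} for n in range(1, 5)]
    + [{"call": f"B-{n}", "contactid": 1234} for n in range(1, 4)]
)


def resolve_keeper(contact_id):
    """Follow MasterID until reaching a record that has not been merged."""
    record = contacts[contact_id]
    while record["masterid"] is not None:
        contact_id = record["masterid"]
        record = contacts[contact_id]
    return contact_id


def calls_for(keeper_id):
    """List the calls whose contact resolves to the given keeper record."""
    return [c["call"] for c in service_calls
            if resolve_keeper(c["contactid"]) == keeper_id]


print(len(calls_for(1234)))  # 7
```
</pre>
<p>Nothing in <code>service_calls</code> is rewritten; the seven calls are found by resolving each contact through its MasterID.</p>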
Posted in data quality Tagged: data quality, deduplication, master data management, MasterID, merging duplicates, Microsoft Dynamics CRM</div>]]></content:encoded>
			<wfw:commentRss>http://dqchronicle.wordpress.com/2009/12/01/microsoft-dynamics-crm-duplicate-consolidation-management/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/57521ea1af5d6a8b1abbfa6208c125f6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">wesharp</media:title>
		</media:content>
	</item>
		<item>
		<title>Removing duplicates in Microsoft Dynamics CRM</title>
		<link>http://dqchronicle.wordpress.com/2009/10/17/removing-duplicates-in-microsoft-dynamics-crm/</link>
		<comments>http://dqchronicle.wordpress.com/2009/10/17/removing-duplicates-in-microsoft-dynamics-crm/#comments</comments>
		<pubDate>Sat, 17 Oct 2009 14:49:28 +0000</pubDate>
		<dc:creator>wesharp</dc:creator>
				<category><![CDATA[data quality]]></category>
		<category><![CDATA[CRM]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[duplicate detection rules]]></category>
		<category><![CDATA[master data management]]></category>
		<category><![CDATA[merge]]></category>
		<category><![CDATA[merging duplicates]]></category>
		<category><![CDATA[Microsoft CRM]]></category>
		<category><![CDATA[Microsoft Dynamics CRM]]></category>

		<guid isPermaLink="false">http://dqchronicle.wordpress.com/?p=297</guid>
		<description><![CDATA[In last month&#8217;s edition of the DQC I reviewed some data quality features built into Microsoft&#8217;s CRM package, namely detect a duplicate upon create or update, duplicate detection rules and duplicate detection jobs.  I left off with a promise to dive deeper into how you remove the duplicates once you&#8217;ve detected them. 
Before I get into the [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>In <a title="last month's edition of the DQC " href="http://dqchronicle.wordpress.com/2009/10/02/data-quality-and-microsoft-dynamics-crm/" target="_blank">last month&#8217;s edition of the DQC</a> I reviewed some data quality features built into Microsoft&#8217;s CRM package, namely detect a duplicate upon create or update, duplicate detection rules and duplicate detection jobs.  I left off with a promise to dive deeper into how you remove the duplicates once you&#8217;ve <a title="detected" href="http://www.youtube.com/watch?v=YiNHe7BUqhc" target="_blank">detected</a> them. </p>
<p>Before I get into the details, I want to emphasize that without customization, removing duplicates is not a batch process.  In other words, you remove duplicates one at a time.  Don&#8217;t kill the messenger; learn from the message.  If there is one area within the data quality space that Microsoft needs to improve on, it&#8217;s this one. </p>
<p><img class="alignleft size-thumbnail wp-image-299" title="home_1_01" src="http://dqchronicle.files.wordpress.com/2009/10/home_1_01.png?w=150&#038;h=107" alt="home_1_01" width="150" height="107" />Duplicate consolidation, in my experience, is rarely so exception based that it can be done in such a tedious manner.  Not to mention that those organizations that are most afflicted with duplicates generally have  a large customer base.  When you have a customer base in the millions, duplication ratios can be as high as 10% or more.  Consolidating 100,000 duplicates one at a time is almost pointless.  By the time you catch up, you&#8217;ve created more duplicates.</p>
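<p>To put rough numbers on that, here is a back-of-the-envelope calculation.  The customer count and duplication ratio follow the figures above; the two minutes per manual merge and the eight-hour workday are purely hypothetical assumptions, so treat the result as an order of magnitude only.</p>
<pre>
```python
customers = 1_000_000     # "a customer base in the millions"
duplicate_percent = 10    # duplication ratio mentioned in the post
minutes_per_merge = 2     # assumption: hypothetical time for one manual merge
hours_per_workday = 8     # assumption

duplicates = customers * duplicate_percent // 100
total_hours = duplicates * minutes_per_merge / 60
workdays = total_hours / hours_per_workday

print(duplicates)         # 100000
print(round(workdays))    # 417
```
</pre>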
<p><img class="alignleft size-thumbnail wp-image-303" title="soapbox" src="http://dqchronicle.files.wordpress.com/2009/10/soapbox1.jpg?w=80&#038;h=150" alt="soapbox" width="80" height="150" /></p>
<p>  That said, let&#8217;s move on.  So you&#8217;ve detected duplicates and now you want to eliminate them from your data. </p>
<p>If you remember from last month&#8217;s post (read up <a title="here" href="http://dqchronicle.wordpress.com/2009/10/02/data-quality-and-microsoft-dynamics-crm/" target="_blank">here</a> if you don&#8217;t), a duplicate detection job returns potential duplicates and allows you to browse each one along with its potential match.  Consult the screenshot below for a view of what that looks like.</p>
<div id="attachment_306" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-306 " title="clip_image013_thumb" src="http://dqchronicle.files.wordpress.com/2009/10/clip_image013_thumb.jpg?w=500&#038;h=303" alt="Duplicate Detection Job results" width="500" height="303" /><p class="wp-caption-text">Duplicate Detection Job results</p></div>
<p>In the lower pane of the screenshot above, the toolbar includes an icon (3rd from the left) to merge the two highlighted records.  This is where the consolidation effort begins.</p>
<p>One of the best features of the merge functionality is that it has the flexibility to build a composite, or best of available information, master record.  Briefly, the master record is the record which is retained as the active record.  It also allows the end user to select one record over another.  An example of these features is outlined in the screenshot below.  First let&#8217;s look at the option where every element is taken from the selected master record.</p>
<div id="attachment_312" class="wp-caption alignleft" style="width: 510px"><img class="size-full wp-image-312" title="Master Record Selection" src="http://dqchronicle.files.wordpress.com/2009/10/master_record12.jpg?w=500&#038;h=341" alt="An example of an all inclusive master record selection" width="500" height="341" /><p class="wp-caption-text">An example of an all inclusive master record selection</p></div>
<p>Here&#8217;s a look at the composite option:</p>
<div id="attachment_315" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-315" title="Composite Master Record Selection " src="http://dqchronicle.files.wordpress.com/2009/10/master_recordcomp.jpg?w=500&#038;h=142" alt="An example of a composite master record selection " width="500" height="142" /><p class="wp-caption-text">An example of a composite master record selection </p></div>
<p>Notice in the all inclusive example the entire left hand column is highlighted in blue, whereas in the composite option only those elements selected via the radio button are highlighted in blue.  This is a visual indication of what data elements will be retained in the master record.</p>
<p>This is one of my favorite pieces of functionality with regard to the merge option.  Often end users vary in the data they provide and it is always better for an organization to retain as much information about their customers as is possible.</p>
<p>I specifically chose the composite screenshot presented because it illustrates one of those important aspects in customer data quality.  Notice that the element selected from the right hand side was a middle initial.  This data element is invaluable when performing data matching and having that element can make an important distinction between two different customers later on down the road.</p>
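<p>The difference between the all inclusive and composite selections can be sketched as a small Python function.  This is only an illustration of the idea; the function name, field names, and sample values are invented for the example and are not the CRM schema or API.</p>
<pre>
```python
def build_master(master, subordinate, take_from_subordinate=()):
    """Return the master record, overriding only the fields picked from the other record."""
    merged = dict(master)
    for field in take_from_subordinate:
        merged[field] = subordinate[field]
    return merged


# Hypothetical pair of duplicate contacts (left column and right column).
left = {"firstname": "John", "middlename": "", "lastname": "Smith"}
right = {"firstname": "Jon", "middlename": "Q", "lastname": "Smith"}

# All inclusive: every element comes from the left hand record.
all_inclusive = build_master(left, right)

# Composite: keep the left hand record but take the middle initial from the right.
composite = build_master(left, right, take_from_subordinate=["middlename"])

print(composite["firstname"], composite["middlename"], composite["lastname"])
```
</pre>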
<p>Once you&#8217;ve defined what your master record looks like, either via the all inclusive or composite method, it is time to commit that selection.  The screenshot below illustrates how this is performed.</p>
<div id="attachment_316" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-316" title="commit your master record selection" src="http://dqchronicle.files.wordpress.com/2009/10/master_recordconfirm.jpg?w=500&#038;h=72" alt="How to commit your master record selection" width="500" height="72" /><p class="wp-caption-text">How to commit your master record selection</p></div>
<p>An important option in the commit process is the checkbox visible in the screenshot above.  Not every field in a record is exposed via the merge utility.  Checking the box, as its label indicates, selects every field with data from the chosen master record even if there is a different value in the other record.  Simply put, it is an overwrite function that retains all the data from the selected master record beyond what is visible in the merge screen.</p>
<p>Once you&#8217;ve reviewed and are confident in your selection, you simply click on the OK button.  Provided there are no commit locks on the record, which indicates that another user has one of the two records open and is actively working on it, you will receive the following dialog box confirming your consolidation success.</p>
<div id="attachment_317" class="wp-caption aligncenter" style="width: 451px"><img class="size-full wp-image-317" title="master record complete" src="http://dqchronicle.files.wordpress.com/2009/10/master-record-complete.jpg?w=441&#038;h=127" alt="Duplicate elimination success!" width="441" height="127" /><p class="wp-caption-text">Duplicate elimination success!</p></div>
<p>It is critical to note that the subordinate, or non-master record, is NOT deleted from the system.  It is simply deactivated.  This is to say that a flag (statecode) is changed to inactive.  One important note about the statecode field is that unlike conventional notation a value of &#8216;1&#8217; is not active in Microsoft Dynamics CRM.  Instead Microsoft chose the value &#8216;0&#8217; as active and &#8216;1&#8217; as inactive.  Consequently all non-master records in CRM have a statecode value of &#8216;1&#8217;.  This little fact can save hours of data analysis and preserve the sanity of your DBAs, so it is worth noting.</p>
<p>I hope this information was beneficial to you Microsoft Dynamics CRM users and administrators.  As usual I welcome all comments, questions, and suggestions.  So please feel free to comment on this post and I&#8217;ll try to reply in a timely manner.</p>
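<p>One way to avoid tripping over the inverted flag in your own analysis scripts is to name the values instead of testing them as true or false.  The Python sketch below assumes rows already extracted into dictionaries; the identifiers are hypothetical and the enum is my own convenience, not something the CRM provides.</p>
<pre>
```python
from enum import IntEnum


class StateCode(IntEnum):
    """Statecode values as described in the post."""
    ACTIVE = 0
    INACTIVE = 1


# Hypothetical extract: one master record and one merged (deactivated) record.
rows = [
    {"contactid": 1234, "statecode": 0},
    {"contactid": 5678, "statecode": 1},
]

# Compare against the named value to select the active records.
active_ids = [row["contactid"] for row in rows
              if row["statecode"] == StateCode.ACTIVE]

# A plain truthiness test treats 1 as "on" and picks the inactive record instead.
wrong_ids = [row["contactid"] for row in rows if row["statecode"]]

print(active_ids)  # [1234]
print(wrong_ids)   # [5678]
```
</pre>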
Posted in data quality Tagged: CRM, data quality, deduplication, duplicate detection rules, master data management, merge, merging duplicates, Microsoft CRM, Microsoft Dynamics CRM</div>]]></content:encoded>
			<wfw:commentRss>http://dqchronicle.wordpress.com/2009/10/17/removing-duplicates-in-microsoft-dynamics-crm/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/57521ea1af5d6a8b1abbfa6208c125f6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">wesharp</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/home_1_01.png?w=150" medium="image">
			<media:title type="html">home_1_01</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/soapbox1.jpg?w=80" medium="image">
			<media:title type="html">soapbox</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/clip_image013_thumb.jpg" medium="image">
			<media:title type="html">clip_image013_thumb</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/master_record12.jpg" medium="image">
			<media:title type="html">Master Record Selection</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/master_recordcomp.jpg" medium="image">
			<media:title type="html">Composite Master Record Selection </media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/master_recordconfirm.jpg" medium="image">
			<media:title type="html">commit your master record selection</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/master-record-complete.jpg" medium="image">
			<media:title type="html">master record complete</media:title>
		</media:content>
	</item>
		<item>
		<title>Data Quality and Microsoft Dynamics CRM</title>
		<link>http://dqchronicle.wordpress.com/2009/10/02/data-quality-and-microsoft-dynamics-crm/</link>
		<comments>http://dqchronicle.wordpress.com/2009/10/02/data-quality-and-microsoft-dynamics-crm/#comments</comments>
		<pubDate>Fri, 02 Oct 2009 22:21:11 +0000</pubDate>
		<dc:creator>wesharp</dc:creator>
				<category><![CDATA[data quality]]></category>
		<category><![CDATA[customer data integration]]></category>
		<category><![CDATA[duplicate detection]]></category>
		<category><![CDATA[Microsoft CRM]]></category>

		<guid isPermaLink="false">http://dqchronicle.wordpress.com/?p=269</guid>
		<description><![CDATA[This month I'd like to talk about my recent experiences with some of the data quality features of Microsoft's Dynamics CRM package and how to put them to use in the typical enterprise environment.]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><div class="mceTemp mceIEcenter" style="text-align:left;">First I&#8217;d like to thank all of those who submitted blog entries to the August edition of the IAIDQ Festival del IDQ Bloggers.  I enjoyed reviewing the submissions and look forward to hosting more IAIDQ related material.</div>
<p>This month I&#8217;d like to talk about my recent experiences with some of the data quality features of Microsoft&#8217;s Dynamics CRM package and how to put them to use in the typical enterprise environment.</p>
<p>One of my favorite features is one that attempts to prevent duplication from occurring by executing a match scan on create or update of a record.  One of the reasons this is a favorite of mine is that it is proactive and aims to prevent duplication where every data quality expert recommends: at the beginning.  One of the issues I have seen with this feature is that in high data volumes it causes significant delays in record creation. </p>
<p>The feature can be enabled by checking the &#8220;When a record is created or updated&#8221; option in the Duplicate Detection Settings panel of the Data Management console. </p>
<div id="attachment_276" class="wp-caption alignleft" style="width: 390px"><img class="size-full wp-image-276  " style="border:black 1px solid;" title="duplicate1" src="http://dqchronicle.files.wordpress.com/2009/10/duplicate1.jpg?w=380&#038;h=157" alt="Data Management Console" width="380" height="157" /><p class="wp-caption-text">Data Management Console</p></div>
<div class="mceTemp"><img class="size-full wp-image-290" title="clip_image008_thumb" src="http://dqchronicle.files.wordpress.com/2009/10/clip_image008_thumb2.jpg?w=441&#038;h=397" alt="Configuration setting for preventing the duplication of data upon create or update" width="441" height="397" /></div>
<div class="mceTemp"> </div>
<div class="mceTemp"> In the event that duplicates do get created in your data, Microsoft Dynamics CRM also has some reactive features that allow you to identify these records and consolidate them.</div>
<p>One of these features is duplicate detection rules, which are sets of criteria used for matching records.  For instance, it is very common for organizations to accumulate more than one record for a single customer.  In this example, a duplicate detection rule can be built to identify all customer transactions that have a match on customer&#8217;s first and last name as well as their zip code.  As you may already know, I am a diligent advocate of requiring more than first and last name to truly identify an individual.  Based on research regarding change of address (COA), it makes practical sense to limit these criteria to a geographic area like those from which postal codes are generated.  You may also want to throw in more qualifying data elements such as street name and number but for the purposes of this posting, we&#8217;ll stick to a customer&#8217;s full name and zip code.</p>
<p>This rule will identify each set of records that share the same values for these three data elements.  It is possible to require an exact match on the data values or a substring of the characters of the values.  Again, I am an advocate of using partial matches on data like last name due to the frequency of data entry errors.</p>
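<p>As a rough illustration of what such a rule does, the Python sketch below groups records on first name, last name, and zip code, with an optional partial match on the leading characters of the last name.  It is a simplified stand-in, not the CRM matching engine: the function names, the sample records, the lowercasing and trimming, and the choice of leading characters as the substring are all assumptions made for the example.</p>
<pre>
```python
from collections import defaultdict


def match_key(record, last_name_chars=None):
    """Build the comparison key; optionally keep only the first N characters of the last name."""
    last = record["lastname"].strip().lower()
    if last_name_chars is not None:
        last = last[:last_name_chars]
    return (record["firstname"].strip().lower(), last, record["zip"].strip())


def find_duplicates(records, last_name_chars=None):
    """Return lists of record ids that share the same key (two or more members)."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record, last_name_chars)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) >= 2]


# Hypothetical sample data, including a data entry error in the last name.
records = [
    {"id": 1, "firstname": "Mary", "lastname": "Johnson", "zip": "75201"},
    {"id": 2, "firstname": "Mary", "lastname": "Johnsen", "zip": "75201"},
    {"id": 3, "firstname": "Mary", "lastname": "Johnson", "zip": "10001"},
    {"id": 4, "firstname": "mary", "lastname": "Johnson ", "zip": "75201"},
]

print(find_duplicates(records))                     # [[1, 4]]
print(find_duplicates(records, last_name_chars=5))  # [[1, 2, 4]]
```
</pre>
<p>The exact rule misses the misspelled &#8220;Johnsen&#8221; record, while the partial match on the first five characters catches it; the record in a different zip code is excluded either way.</p>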
<p>Once you create your match criteria you&#8217;ll want to initialize, or publish, the rule.  You can publish a rule, with the proper permissions, by clicking on the green arrowed icon labeled &#8220;Publish&#8221; on the tool bar after the rule is saved.  Be forewarned: some rules take quite a long time and a lot of resources to publish, so you may want to perform this action as part of your off-hours operations.</p>
<p style="text-align:center;"><img class="size-full wp-image-291 aligncenter" title="publish" src="http://dqchronicle.files.wordpress.com/2009/10/publish1.png?w=500&#038;h=96" alt="Publish icon" width="500" height="96" /></p>
<p>Once the rule is published, it is time to schedule when it will be utilized.  This is done by building a duplicate detection job which includes the start time, a setting for execution recurrence, and an option to provide an email for notification of when the job finishes.  The following is a snapshot of the interface for developing detection jobs.</p>
<div id="attachment_280" class="wp-caption aligncenter" style="width: 473px"><img class="size-full wp-image-280 " style="border:black 1px solid;" title="clip_image010_2" src="http://dqchronicle.files.wordpress.com/2009/10/clip_image010_2.jpg?w=463&#038;h=445" alt="clip_image010_2" width="463" height="445" /><p class="wp-caption-text">Duplicate Detection Job </p></div>
<p>Once you have your rules and jobs created, you have completed the basic steps to remove duplicates from your data.  After the job completes and you receive your email you&#8217;ll want to review the duplicate matches. </p>
<p>This can be done by opening the duplicate detection job from the System Jobs queue and double-clicking on it.  Once the job is open you&#8217;ll see an option labeled &#8220;View Duplicates&#8221;.  <img class="size-full wp-image-294" title="clip_image012_thumb" src="http://dqchronicle.files.wordpress.com/2009/10/clip_image012_thumb.jpg?w=500&#038;h=303" alt="View Duplicates interface" width="500" height="303" /></p>
<p>Next month, I&#8217;ll dive deeper into the details on how to remove the duplicates with a posting on the merge feature.  I hope this was informative and enough to get most of you started.  I&#8217;ll address detailed questions if you have them, so please feel free to comment!</p>
Posted in data quality Tagged: customer data integration, data quality, duplicate detection, Microsoft CRM</div>]]></content:encoded>
			<wfw:commentRss>http://dqchronicle.wordpress.com/2009/10/02/data-quality-and-microsoft-dynamics-crm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/57521ea1af5d6a8b1abbfa6208c125f6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">wesharp</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/duplicate1.jpg" medium="image">
			<media:title type="html">duplicate1</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/clip_image008_thumb2.jpg" medium="image">
			<media:title type="html">clip_image008_thumb</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/publish1.png" medium="image">
			<media:title type="html">publish</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/clip_image010_2.jpg" medium="image">
			<media:title type="html">clip_image010_2</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/10/clip_image012_thumb.jpg" medium="image">
			<media:title type="html">clip_image012_thumb</media:title>
		</media:content>
	</item>
		<item>
		<title>August Edition of IAIDQ Festival del IDQ Bloggers</title>
		<link>http://dqchronicle.wordpress.com/2009/09/01/august-edition-of-iaidq-festival-del-idq-bloggers/</link>
		<comments>http://dqchronicle.wordpress.com/2009/09/01/august-edition-of-iaidq-festival-del-idq-bloggers/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 17:52:05 +0000</pubDate>
		<dc:creator>wesharp</dc:creator>
				<category><![CDATA[data quality]]></category>
		<category><![CDATA[business case]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[data governance]]></category>
		<category><![CDATA[data matching]]></category>
		<category><![CDATA[empathetic listening]]></category>
		<category><![CDATA[IAIDQ]]></category>
		<category><![CDATA[IAIDQ Festival del IDQ Bloggers]]></category>
		<category><![CDATA[identity resolution]]></category>
		<category><![CDATA[information quality management]]></category>
		<category><![CDATA[master data management]]></category>
		<category><![CDATA[recession proof profession]]></category>
		<category><![CDATA[reference data]]></category>
		<category><![CDATA[value proposition]]></category>

		<guid isPermaLink="false">http://dqchronicle.wordpress.com/?p=214</guid>
		<description><![CDATA[This year the IAIDQ, an international not-for-profit dedicated to developing the profession of Information Quality Management, is 5 years old and is having a series of rolling celebrations, the Blog Carnival “Festival del IDQ Bloggers” being one of the strands of those celebrations. 
I am glad to be hosting the Festival del IDQ Bloggers this month!  I&#8217;ve [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>This year the <a title="IAIDQ" href="http://iaidq.org/" target="_blank">IAIDQ</a>, an international not-for-profit dedicated to developing the profession of Information Quality Management, is 5 years old and is having a series of rolling celebrations, the Blog Carnival “Festival del IDQ Bloggers” being one of the strands of those celebrations. </p>
<p>I am glad to be hosting the Festival del IDQ Bloggers this month!  I&#8217;ve tried to capture the core of each message, but each of these is worth a deeper look.  Don&#8217;t forget to follow the submission links and get all the details!</p>
<p>This month&#8217;s first submission comes to us from <a title="Daragh O'Brien" href="http://twitter.com/daraghobrien" target="_blank">Daragh O&#8217;Brien</a>.  Daragh poses an interesting question when he asks, <a href="http://obriend.info/2009/08/25/is-information-quality-management-a-recession-proof-profession/" target="_blank">Is information quality management a recession proof profession</a>?</p>
<p>One clear takeaway from this post is that those who express the value proposition of an information quality initiative are more likely to be regarded as valuable and necessary.  In that way those who participate in these initiatives can be thought of as recession proof.</p>
<p>About Daragh: Obsessive blogger, information quality consultant and Director of IAIDQ with over 12 years experience at the sharp end of Information Quality. Taoiseach (CEO) of Castlebridge Associates, a specialist Information Quality consulting business based in Ireland.</p>
<p>Continuing on a theme our second post comes to us from <a title="Dylan Jones" href="http://twitter.com/dataqualitypro" target="_blank">Dylan Jones</a> and shows us <a href="http://www.dataqualitypro.com/data-quality-home/how-to-deliver-a-compelling-data-quality-business-case.html" target="_blank">How To Deliver A Compelling Data Quality Business Case</a>.</p>
<p> Dylan recommends several excellent ways to build and deliver the value proposition such as:</p>
<ul>
<li>use time/date stamps to show information quality as a long term problem and not a short lived glitch</li>
<li>use cause and effect analysis to link data quality issues with business process gaps</li>
<li>review the annual report to gain insight into corporate strategies that can benefit from information quality services</li>
<li>don&#8217;t use PowerPoint</li>
<li>keep it simple to get the point across</li>
</ul>
<p>About Dylan: Dylan is the founder and editor of <a title="Data Quality Pro" href="http://www.dataqualitypro.com" target="_blank">Data Quality Pro</a>, which is dedicated to &#8220;helping data quality professionals take their career or business to the next level.&#8221;</p>
<p>Switching gears a little, <a href="http://www.linkedin.com/in/jimharris" target="_blank">Jim Harris</a>, in his post <a href="http://www.ocdqblog.com/home/hailing-frequencies-open.html">Hailing Frequencies Open</a>, reminds us to keep our theories in check until after we&#8217;ve taken the time to really listen to what&#8217;s being said. </p>
<p>In this post Jim points out the difference between waiting to talk and empathetic listening, where we listen with the intent to understand the other person&#8217;s frame of reference. </p>
<p>About Jim:  Jim Harris is an independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in data quality.  <a href="http://www.ocdqblog.com/" target="_blank">Obsessive-Compulsive Data Quality</a> is an independent blog offering a vendor-neutral perspective on data quality.</p>
<p>While we are on the topic of communication, <a href="http://twitter.com/stevesarsfield" target="_blank">Steve Sarsfield</a> recommends some data governance dialog for CEOs in his post, <a href="http://data-governance.blogspot.com/2009/08/9-questions-ceos-should-ask-about-data.html">9 Questions CEOs Should Ask About Data Governance</a>.</p>
<p> In this post Steve points out that the executive team is responsible for lowering risk and gaining control through its influence on data governance.  The following are a few of the questions Steve suggests:</p>
<ol>
<li>Do we have a data management strategy?</li>
<li>Are we in compliance with all laws regarding our governance of data?</li>
<li>Do you have the access to data you need?</li>
</ol>
<p>About Steve: Steve Sarsfield is a data quality evangelist and author of the book <a href="http://www.itgovernance.co.uk/products/2445" target="_blank"><em>The Data Governance Imperative</em></a>. </p>
<p>Finally, in <a href="http://liliendahl.wordpress.com/2009/08/05/sweden-meets-united-states/" target="_blank">Sweden meets United States</a>, <a href="http://www.linkedin.com/in/henrikliliendahlsoerensen" target="_blank">Henrik</a> Liliendahl Sørensen writes about data matching and how character sets, address formats and naming conventions are just a few of the complexities when the data originates in a different language.  Maybe, as he suggests, centralized reference data is a step in the right direction toward solving some of these issues?</p>
<p>About Henrik: <a href="http://www.linkedin.com/in/henrikliliendahlsoerensen" target="_blank">Henrik</a> is a Data Quality and Master Data Management professional who also does Data Architecture.  You can check out what is on his mind at <a href="http://liliendahl.wordpress.com/" target="_blank">Liliendahl on Data Quality</a>.</p>
<p>I hope you enjoyed this month&#8217;s edition of the blog carnival!  I want to thank all those who&#8217;ve submitted postings and encourage those who have not done so yet to participate in this opportunity.</p>
Posted in data quality Tagged: business case, communication, data governance, data matching, data quality, empathetic listening, IAIDQ, IAIDQ Festival del IDQ Bloggers, identity resolution, information quality management, master data management, recession proof profession, reference data, value proposition</div>]]></content:encoded>
			<wfw:commentRss>http://dqchronicle.wordpress.com/2009/09/01/august-edition-of-iaidq-festival-del-idq-bloggers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/57521ea1af5d6a8b1abbfa6208c125f6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">wesharp</media:title>
		</media:content>
	</item>
		<item>
		<title>The DQ Two Step!</title>
		<link>http://dqchronicle.wordpress.com/2009/08/05/the-dq-two-step/</link>
		<comments>http://dqchronicle.wordpress.com/2009/08/05/the-dq-two-step/#comments</comments>
		<pubDate>Wed, 05 Aug 2009 03:00:00 +0000</pubDate>
		<dc:creator>wesharp</dc:creator>
				<category><![CDATA[data quality]]></category>
		<category><![CDATA[customer data integration]]></category>
		<category><![CDATA[data matching]]></category>
		<category><![CDATA[master data management]]></category>

		<guid isPermaLink="false">http://dqchronicle.wordpress.com/?p=200</guid>
		<description><![CDATA[In order to positively identify a non-unique individual you need to pair their name with an additional piece of identifying information, usually an address. 
In other words, it is a two-part match on name and address that can, with a relatively high confidence level, identify a true duplicate.  If we only used a match on name to identify duplicates, we'd consolidate all the John Smiths in the dataset into one customer.  One brief glance in the local phone directory will be enough to demonstrate how non-unique names really are.]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Howdy Folks!  I&#8217;d like to sing you a tune about matching customer data, if you don&#8217;t mind.  It&#8217;s called the <em><strong>&#8220;data quality two step&#8221;</strong></em>. </p>
<p>Heck! You can even grab your partner if the mood strikes you!</p>
<div id="attachment_203" class="wp-caption alignleft" style="width: 95px"><img class="size-full wp-image-203" title="success" src="http://dqchronicle.files.wordpress.com/2009/08/success1.jpg?w=85&#038;h=127" alt="Feels good!" width="85" height="127" /><p class="wp-caption-text">Feels good!</p></div>
<p>I&#8217;ve recently cleansed some customer data for a great bunch of folks that I <em>love</em> calling my client. </p>
<p>We had a consolidation ratio of 1:6, or around 17%, which equated to roughly a million duplicates removed.  That&#8217;s a lot of savings on postage stamps for the marketing department, so they were psyched!  We validated over 90% of the addresses and built reports to identify those that did not meet the requirements for a valid address.  Not too shabby, if I do say so myself! </p>
<p>Now that we&#8217;ve deployed the data into User Acceptance Testing (UAT) I find myself in a familiar place: the business logic. </p>
<div id="attachment_202" class="wp-caption alignleft" style="width: 110px"><img class="size-thumbnail wp-image-202" title="What's this?" src="http://dqchronicle.files.wordpress.com/2009/08/computer_screen_big.jpg?w=100&#038;h=150" alt="What's this?" width="100" height="150" /><p class="wp-caption-text">You missed something</p></div>
<p>You can spend all the time you need, or even care to, on rules for consolidation but it usually is not until the data hits the screen that the ramifications are easily understood by the average business user.</p>
<p>Case in point, I recently received an email from a stakeholder asking me to look over some data with him.  I was curious what I&#8217;d find when I reached his office, as I had analyzed this data and the processing code more than a few times by then.  On my walk over I went through many possible scenarios in my head.</p>
<p>Was it something I missed?  Surely not.  I&#8217;d performed several test runs in order to validate the business logic.  With my curiosity piqued I rounded the corner and into his office I went.</p>
<div id="attachment_204" class="wp-caption alignleft" style="width: 160px"><img class="size-thumbnail wp-image-204" title="chitchat" src="http://dqchronicle.files.wordpress.com/2009/08/chitchat.jpg?w=150&#038;h=135" alt="&quot;Good point!&quot;" width="150" height="135" /><p class="wp-caption-text">&quot;Good point!&quot;</p></div>
<p>After a little chit-chat (like I said, I <em>love</em> this client), we got down to business.  He proceeded to type a few parameters into the search utility and I waited with anticipation.</p>
<p>However, after a second, maybe less, my anticipation was replaced with relief and more than a little disbelief.  I&#8217;d been over this a time or two, which is why I was in a state of shock.  Not to mention my client was not someone who needed &#8220;Data for Dummies&#8221;. </p>
<p>With identities masked to protect the innocent, below is a sample of the records he was concerned about and wanted me to see.</p>
<div class="mceTemp mceIEcenter">
<div class="mceTemp mceIEcenter">
<div id="attachment_208" class="wp-caption aligncenter" style="width: 478px"><img class="size-full wp-image-208" title="wtf" src="http://dqchronicle.files.wordpress.com/2009/08/wtf2.jpg?w=468&#038;h=58" alt="Duplicate?" width="468" height="58" /><p class="wp-caption-text">Duplicate?</p></div>
</div>
<div class="mceTemp mceIEcenter" style="text-align:left;">So if you&#8217;ve been wondering about this two step thing, here it comes. </div>
<div class="mceTemp mceIEcenter" style="text-align:left;">In order to positively identify a non-unique individual you need to pair their name with an additional piece of <em>identifying</em> information, usually an address. </div>
<div class="mceTemp mceIEcenter" style="text-align:center;">In other words, it is a <strong>two-part match</strong> on name <em>and</em> address that can, with a relatively high confidence level, identify a <em>true duplicate</em>. </div>
<div class="mceTemp mceIEcenter" style="text-align:left;">If we only used a match on name to identify duplicates, we&#8217;d consolidate all the John Smiths in the dataset into one customer.  Talk about lost opportunity!  This approach could turn millions of customers into thousands in an instant.</div>
<div class="mceTemp mceIEcenter" style="text-align:left;">One brief glance in the local phone directory will be enough to demonstrate how non-unique names really are.  </div>
<div class="mceTemp mceIEcenter" style="text-align:left;">
<div id="attachment_211" class="wp-caption aligncenter" style="width: 478px"><img class="size-full wp-image-211" title="yellow" src="http://dqchronicle.files.wordpress.com/2009/08/yellow.jpg?w=468&#038;h=197" alt="They've been through this before!" width="468" height="197" /><p class="wp-caption-text">They&#39;ve been through this before!</p></div>
</div>
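<div class="mceTemp mceIEcenter" style="text-align:left;">For the code-minded, the two-part match described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the logic used on the client project: the <code>NICKNAMES</code> and <code>SUFFIXES</code> tables, the record layout and the sample customers are all made up, and real address validation is far more involved than this simple clean-up.</div>

```python
import re
from collections import defaultdict

# Hypothetical nickname table; a real project would use a much larger reference list.
NICKNAMES = {"bill": "william", "bob": "robert", "jim": "james"}

# Hypothetical street-suffix abbreviations, standing in for real address validation.
SUFFIXES = {"street": "st", "avenue": "ave", "road": "rd"}


def normalize_name(first, last):
    """Lowercase the name and swap a known nickname for its formal equivalent."""
    first = first.strip().lower()
    return NICKNAMES.get(first, first), last.strip().lower()


def normalize_address(address):
    """Lowercase, drop punctuation, and abbreviate common street suffixes."""
    words = re.sub(r"[^a-z0-9 ]", "", address.lower()).split()
    return " ".join(SUFFIXES.get(word, word) for word in words)


def find_duplicates(records):
    """Group records that agree on BOTH the normalized name and the normalized address."""
    groups = defaultdict(list)
    for record in records:
        key = (normalize_name(record["first"], record["last"]),
               normalize_address(record["address"]))
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample data: two John Smiths at different addresses, one Bill/William Jones.
customers = [
    {"id": 1, "first": "John", "last": "Smith", "address": "12 Oak Street"},
    {"id": 2, "first": "John", "last": "Smith", "address": "98 Elm Avenue"},
    {"id": 3, "first": "Bill", "last": "Jones", "address": "7 Pine Rd."},
    {"id": 4, "first": "William", "last": "Jones", "address": "7 Pine Road"},
]

print(find_duplicates(customers))
```

<div class="mceTemp mceIEcenter" style="text-align:left;">In this sketch the two John Smiths stay separate because their addresses differ, while Bill and William Jones at the same address are grouped as one candidate duplicate.</div>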
<div class="mceTemp mceIEcenter" style="text-align:left;">Go a step further and ask your local DBA to run some counts on first-last name combinations and you&#8217;ll be surprised at the results.</div>
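<div class="mceTemp mceIEcenter" style="text-align:left;">If no DBA is handy, a rough version of that count can be tried on an extract of the data.  The sketch below is an assumption-laden example: the sample rows and the function names are invented for illustration, and it simply tallies first-last name pairs after trimming spaces and ignoring case.</div>

```python
from collections import Counter

# Made-up sample rows; in practice these would come from the customer table.
names = [
    ("John", "Smith"),
    ("john", "smith"),
    ("Mary", "Jones"),
    ("John", "Smith "),
    ("Ann", "Lee"),
]


def count_name_combinations(rows):
    """Count how often each (first, last) pair occurs, ignoring case and stray spaces."""
    return Counter((first.strip().lower(), last.strip().lower()) for first, last in rows)


def shared_names(rows):
    """Return only the name pairs carried by more than one record, most common first."""
    counts = count_name_combinations(rows)
    return [(name, total) for name, total in counts.most_common() if total > 1]


print(shared_names(names))
```

<div class="mceTemp mceIEcenter" style="text-align:left;">Any pair that shows up here more than once is a name that cannot identify a customer on its own, which is exactly why the second half of the match is needed.</div>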
<div class="mceTemp mceIEcenter" style="text-align:left;">Just in case this little story wasn&#8217;t enough to remind you, here is that tune I promised you:</div>
<div class="mceTemp mceIEcenter" style="text-align:left;"> </div>
<div class="mceTemp mceIEcenter" style="text-align:left;">The two step matching ditty goes a little like this &#8230;</div>
<div class="mceTemp mceIEcenter" style="text-align:left;"> </div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Grab your partner&#8217;s name and twirl it around</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Make sure the nickname&#8217;s proper equal is found</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Then grab you their address and scrub with the care</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Make sure their mail can be delivered there</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Don&#8217;t get rid of your partner until you are sure</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">That you&#8217;ve got a match on more</div>
<div class="mceTemp mceIEcenter" style="text-align:center;">Than the name <em>or</em> the door</div>
<div class="mceTemp mceIEcenter" style="text-align:left;">lyrics by Data Pickins</div>
<div class="mceTemp mceIEcenter" style="text-align:left;">music by YouToo?<span id="more-200"></span></div>
<div class="mceTemp mceIEcenter" style="text-align:left;"> </div>
</div>
Posted in data quality Tagged: customer data integration, data matching, data quality, master data management</div>]]></content:encoded>
			<wfw:commentRss>http://dqchronicle.wordpress.com/2009/08/05/the-dq-two-step/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/57521ea1af5d6a8b1abbfa6208c125f6?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">wesharp</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/08/success1.jpg" medium="image">
			<media:title type="html">success</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/08/computer_screen_big.jpg?w=100" medium="image">
			<media:title type="html">What's this?</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/08/chitchat.jpg?w=150" medium="image">
			<media:title type="html">chitchat</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/08/wtf2.jpg" medium="image">
			<media:title type="html">wtf</media:title>
		</media:content>

		<media:content url="http://dqchronicle.files.wordpress.com/2009/08/yellow.jpg" medium="image">
			<media:title type="html">yellow</media:title>
		</media:content>
	</item>
	</channel>
</rss>