Channel: SCN : Unanswered Discussions - Data Services and Data Quality

Detailed question - French data cleansing


I'm working on a team doing data cleansing and de-duplication within SAP. We've noticed some unusual behaviour, specifically relating to how Primary Names are handled.

 

Some examples are in the table below:

 

RAW_DATA | CLEANSED_DATA
2 allée de Longchamps | 2 all de Longchamp
92, rue Réaumur | 92 rue Reaumur
33, rue Juliette Récamier 69006 Lyon | 33 rue Juliette Recamier
12 Rue du Général Patton | 12 rue du General Patton
253, avenue du Président Wilson | 253 av du President Wilson
40-42 rue de la Boétie | 40-42 rue la Boetie
8 Avenue Delcassé | 8 av Delcasse
22 Boulevard Maréchal Foch | 22 bd Marechal Foch
9-11 allée de l'Arche  Tour Egée | 9-11 all de l'Arche, Tour Egée
14 rue Avaulée | 14 rue Avaulee
405 avenue Galilée | 405 av Galilee

 

Each record has the status code description "Data Quality corrected the following address components: region, locality, primary name".

 

In each case the cleansed version is factually incorrect: dropping the accents and changing "Longchamps" to "Longchamp" makes the addresses lower in quality than the originals.

 

I cannot see any parameter that has been set to cause this, and the behaviour seems to affect only Primary Name; other address components are fine. "Convert Latin Output to Ascii" is set to NO, as evidenced by the fact that it is just this address component that is affected.

 

As far as I can see, there are two possible explanations:

 

  • There is a codepage issue which is preventing diacritical characters from being output on Primary Name, but this does not explain "Longchamps" being changed to "Longchamp"
  • The address directory itself contains these inaccurate entries
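
One way to separate the two explanations is to check, record by record, whether the cleansed primary name differs from the raw one only by its accents or in some other way. The following is a rough sketch in Python, run outside Data Services on exported values; the function names and the idea of comparing already-extracted primary names are my own assumptions, not anything provided by the product.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after NFD decomposition (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify(raw_name: str, cleansed_name: str) -> str:
    """Label how a cleansed primary name differs from the raw one.

    Hypothetical helper: assumes both values are the primary name only,
    with house number and street type already removed.
    """
    if raw_name == cleansed_name:
        return "unchanged"
    if strip_accents(raw_name).casefold() == cleansed_name.casefold():
        return "accents only"
    return "other change"


# A few primary names taken from the table above.
pairs = [
    ("Réaumur", "Reaumur"),
    ("Général Patton", "General Patton"),
    ("Longchamps", "Longchamp"),
]

for raw, cleansed in pairs:
    print(f"{raw} -> {cleansed}: {classify(raw, cleansed)}")
```

Records labelled "accents only" are consistent with either explanation, whereas records labelled "other change" (such as "Longchamps") cannot be explained by a codepage problem alone.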

 

I suspect that we'll need to raise it with the product group, but has anyone else encountered similar issues?

 

Best regards,

 

Barry Carlino

