Channel: SCN : Unanswered Discussions - Data Services and Data Quality

Handling XML nodes with mixed content


Hi.

I'm trying to load XML nodes with mixed content into DS, but I haven't succeeded in loading the free-text content of the nodes. I've tried defining my XML with both an XML Schema and a DTD; in both cases only the child tags are loaded, not the free text.

I appreciate your help.

 

Input XML:

 

<TailParas>

            <Para>It isn't just <En cat="keyphrase" instRef="4">chip makers</En> that need engineers and <En cat="keyphrase" instRef="5">math whizzes</En> they say they can't find in the U.S. or farmers who need workers to pluck the oranges no one here wants to pick. It's a number of industries and, some argue, the economy in general that needs a revamping of <En cat="keyphrase" instRef="6">immigration rules</En> so the U.S. can compete with the rest of the world.</Para>

</TailParas>
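
To make clear which pieces I expect to survive the load, here is a minimal sketch outside DS, using Python's standard `xml.etree.ElementTree` on a shortened copy of the paragraph above. It is only an illustration of how a generic parser exposes mixed content (the text before the first child in `.text`, the text after each child in that child's `.tail`); the helper name `mixed_content_pieces` is hypothetical and says nothing about how DS handles this internally.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the input paragraph (assumption: trimmed for readability).
SAMPLE = """<TailParas>
<Para>It isn't just <En cat="keyphrase" instRef="4">chip makers</En> that need engineers and <En cat="keyphrase" instRef="5">math whizzes</En> they say they can't find in the U.S.</Para>
</TailParas>"""


def mixed_content_pieces(para):
    """Return the free-text and child-element pieces of a node in document order."""
    pieces = []
    if para.text:
        # Free text that precedes the first child element.
        pieces.append(("text", para.text))
    for child in para:
        pieces.append((child.tag, child.text or ""))
        if child.tail:
            # Free text that follows this child, up to the next child or the end tag.
            pieces.append(("text", child.tail))
    return pieces


root = ET.fromstring(SAMPLE)
pieces = mixed_content_pieces(root.find("Para"))
for kind, value in pieces:
    print(f"{kind}: {value!r}")
```

The `text` entries are exactly the fragments that go missing in my DS output.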

 

DTD chunk:

<!ELEMENT Para (#PCDATA | En)*>  <-- Defines mixed content

<!ELEMENT En (#PCDATA)>

<!ATTLIST En

          cat CDATA #FIXED "keyphrase"

          instRef CDATA #REQUIRED>

<!ELEMENT TailParas (Para* )>

 

 

(equivalent XSD chunk:

  <xs:element name="Para">       <-- Defines Mixed Content

    <xs:complexType mixed="true">

      <xs:choice minOccurs="0" maxOccurs="unbounded">

        <xs:element ref="En"/>

      </xs:choice>

    </xs:complexType>

  </xs:element>

  <xs:element name="En">

    <xs:complexType>

      <xs:simpleContent>

        <xs:extension base="xs:string">

          <xs:attribute name="cat" type="EntityTypeName" use="required"/>

          <xs:attribute name="instRef" type="xs:unsignedInt" use="required"/>

        </xs:extension>

      </xs:simpleContent>

    </xs:complexType>

  </xs:element>

  <xs:element name="TailParas">

    <xs:complexType>

      <xs:sequence>

        <xs:element ref="Para"/>

      </xs:sequence>

    </xs:complexType>

  </xs:element>

)

 

Content being loaded by DS and passed to the Java Adapter that consumes these XMLs:

 

<TailParas><Para><En cat = "keyphrase"  instRef = "4" >chip makers</En>

<En cat = "keyphrase"  instRef = "5" >math whizzes</En>

<En cat = "keyphrase"  instRef = "6" >immigration rules</En>

<En cat = "keyphrase"  instRef = "7" >factory migration</En>

<En cat = "keyphrase"  instRef = "8" >bipartisan momentum</En>

<En cat = "keyphrase"  instRef = "9" >immigration reform</En>

</Para>

</TailParas>
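
For anyone who wants to check their own output for the same symptom, a rough sketch of the comparison follows, again in plain Python rather than DS. It collects the non-whitespace free text sitting directly inside each `Para` and compares source against loaded output. The function name `free_text` and the two one-line sample strings are assumptions made for the illustration (trimmed from the XML above).

```python
import xml.etree.ElementTree as ET


def free_text(xml_string, tag="Para"):
    """Collect non-whitespace free-text fragments found directly inside each <tag>."""
    root = ET.fromstring(xml_string)
    fragments = []
    for node in root.iter(tag):
        # Leading text of the node plus the tail text after each child element.
        candidates = [node.text] + [child.tail for child in node]
        fragments.extend(c.strip() for c in candidates if c and c.strip())
    return fragments


# Trimmed source paragraph (assumption: shortened for the example).
source = (
    '<TailParas><Para>It isn\'t just '
    '<En cat="keyphrase" instRef="4">chip makers</En>'
    ' that need engineers</Para></TailParas>'
)
# Trimmed version of what arrives at the adapter: tags only, free text gone.
loaded = (
    '<TailParas><Para><En cat = "keyphrase"  instRef = "4" >chip makers</En>\n'
    '</Para></TailParas>'
)

missing = [t for t in free_text(source) if t not in free_text(loaded)]
print(missing)
```

An empty `missing` list would mean the free text came through; in my case every fragment is missing.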

