XML::Simple oversimplification

Perl’s XML::Simple module is an easy way to get your application to talk some basic XML. It uses expat to parse data, so the parsing itself rests on solid ground. Where things tend to go awry is in round-tripping: what gets written back out is not consistent with what was read in.

Example

Let us take the following sample XML document:

<data>
  <survey key="123">
    <description>A survey with a description.</description>
    <qa key="1" type="radio" text="Only one valid option in answers">
      <a key="1">Enie</a>
      <a key="2">Menie</a>
      <a key="3">Be</a>
    </qa>
    <qa key="2" type="text" limit="255" text="A long comment field, with a set length limit (default to 500 characters)">
      <a />
    </qa>
  </survey>
</data>

Once read in and dumped through Data::Dumper, it is represented as follows (the keeproot option of XML::Simple was set to 0):

$VAR1 = {
  'survey' => {
      'qa' => {
          '1' => {
                   'a' => {
                            '1' => {
                                     'content' => 'Enie'
                                   },
                            '3' => {
                                     'content' => 'Be'
                                   },
                            '2' => {
                                     'content' => 'Menie'
                                   }
                          },
                   'text' => 'Only one valid option in answers',
                   'type' => 'radio'
                 },
          '2' => {
                   'a' => {},
                   'text' => 'A long comment field, with a set length limit (default to 500 characters)',
                   'type' => 'text',
                   'limit' => '255'
                 }
        },
      'description' => 'A survey with a description.',
      'key' => '123'
    }
  };
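To make the shape of that structure concrete, here is a minimal sketch of walking it. It uses a trimmed-down copy of the sample document (the shortened attributes are my own simplification) and relies on XML::Simple's default KeyAttr setting, which folds repeated elements into a hash keyed on their name, key or id attribute.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Trimmed-down version of the sample document (illustrative only).
my $xml = <<'END';
<data>
  <survey key="123">
    <description>A survey with a description.</description>
    <qa key="1" type="radio">
      <a key="1">Enie</a>
      <a key="2">Menie</a>
    </qa>
    <qa key="2" type="text">
      <a />
    </qa>
  </survey>
</data>
END

my $ref = XMLin($xml, ForceArray => 0, KeepRoot => 0);

# Repeated <qa> and <a> elements were folded on their key attribute,
# so the attribute values are now hash keys.
my $answer = $ref->{survey}{qa}{1}{a}{2}{content};

# The single <survey> was not folded, so its key is still a plain entry,
# sitting right next to the text of the <description> child element.
my $survey_key  = $ref->{survey}{key};
my $description = $ref->{survey}{description};

print "$answer\n$survey_key\n$description\n";
```

Note that folding only happens when an element repeats: with a single <qa> in the document, the same path would not exist and key would stay an ordinary hash entry. That is one more way the structure depends on the data rather than on the schema.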

However, writing it back out with XMLout does not reproduce the XML document that was just read in:

<opt>
  <survey key="123" description="A survey with a description.">
    <qa name="1" text="Only one valid option in answers" type="radio">
      <a name="1">Enie</a>
      <a name="2">Menie</a>
      <a name="3">Be</a>
    </qa>
    <qa name="2" limit="255" text="A long comment field, with a set length limit (default to 500 characters)" type="text">
      <a></a>
    </qa>
  </survey>
</opt>

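XML::Simple cannot tell from the nested Perl hash whether a scalar value came from an element attribute or from a child tag, so it writes every scalar out as an attribute: see how <description> has become an attribute of <survey>? The key attributes have also come back as name, because their values were folded into hash keys on the way in, and the root element is now <opt> because keeproot was off. Forcing array output helps, but it means that, for instance, the content of that same <description> tag gets wrapped in a single-element array.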
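As a rough sketch of what forcing arrays looks like in practice, the snippet below reads a trimmed-down copy of the sample with ForceArray => 1, KeyAttr => [] and KeepRoot => 1, then writes it back with matching options. The shortened document and the variable names are mine; this is one possible combination of options, not the only way to tame the round trip.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Trimmed-down version of the sample document (illustrative only).
my $xml = <<'END';
<data>
  <survey key="123">
    <description>A survey with a description.</description>
    <qa key="1" type="radio">
      <a key="1">Enie</a>
      <a key="2">Menie</a>
    </qa>
  </survey>
</data>
END

# ForceArray => 1 : every child element becomes an array, so XMLout can
#                   tell elements (arrays) from attributes (scalars).
# KeyAttr => []   : no folding, so key stays an ordinary attribute.
# KeepRoot => 1   : keep <data> instead of falling back to <opt>.
my $ref = XMLin($xml, ForceArray => 1, KeyAttr => [], KeepRoot => 1);

# The price: even a one-off element is now a single-element array.
my $description = $ref->{data}[0]{survey}[0]{description}[0];

my $out = XMLout($ref, KeyAttr => [], KeepRoot => 1);
print $out;
```

With these options <description> stays an element and key stays key, at the cost of the extra [0] indexes on every access. The output is still not byte-for-byte identical to the input, since attribute order and empty-element syntax may differ.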

Perl code to read and write the above:

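#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# The sample document, held in a here-document
my $xmlDoc = <<'END';
<data>
  <survey key="123">
    <description>A survey with a description.</description>
    <qa key="1" type="radio" text="Only one valid option in answers">
      <a key="1">Enie</a>
      <a key="2">Menie</a>
      <a key="3">Be</a>
    </qa>
    <qa key="2" type="text" limit="255" text="A long comment field, with a set length limit (default to 500 characters)">
      <a />
    </qa>
  </survey>
</data>
END

# forcearray => 0: single child elements are not wrapped in arrays
# keeproot   => 0: the root element <data> is discarded on input
my $xs = XML::Simple->new(forcearray => 0,
                          keeproot   => 0);
my $xRef = $xs->XMLin($xmlDoc);   # parse into a nested hash
print Dumper($xRef);              # show the Perl data structure
print XMLout($xRef);              # write it back out as XML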
