XML: Conditional XML reorganization

I'm looking for an efficient way to reorganize the parts of an XML document that contain more than one child of type 'Audiovisual' or 'Gallery'. In addition, the process must update the value in the <Version> tag and add a new <Usage> tag with a fixed value (INET). Lastly, any other data that does not match the selection criteria must pass through unmodified.
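To make the two simpler requirements concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper names `needs_split` and `update_compatibility` are hypothetical, the tiny inline document is a stand-in for the real manifest, and the new version value is passed in by the caller rather than assumed.

```python
import xml.etree.ElementTree as ET

# Element names taken from the question.
SPLIT_TAGS = ("Audiovisual", "Gallery")


def needs_split(experience):
    """Return True when an Experience holds more than one Audiovisual/Gallery child."""
    return sum(1 for child in experience if child.tag in SPLIT_TAGS) > 1


def update_compatibility(root, version, usage="INET"):
    """Overwrite <Version> and append a <Usage> element inside <Compatibility>."""
    compat = root.find("Compatibility")
    compat.find("Version").text = version
    ET.SubElement(compat, "Usage").text = usage


# Illustrative stand-in for the manifest, reduced to the relevant elements.
sample = ET.fromstring(
    "<Manifest><Compatibility><Version>1.4</Version></Compatibility>"
    "<Experiences>"
    "<Experience ExperienceID='a'><Audiovisual/><Gallery/></Experience>"
    "<Experience ExperienceID='b'><Gallery/></Experience>"
    "</Experiences></Manifest>"
)
update_compatibility(sample, "2")
flags = [needs_split(e) for e in sample.iter("Experience")]
```

Only the first stand-in experience is flagged for reorganization; the second has a single Gallery and would pass through untouched.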

My input sample looks like the following:

  <Manifest>
      <!-- Version will be modified, and a <Usage/> tag will be added -->
      <Compatibility>
          <Version>1.4</Version>
      </Compatibility>

      <!-- Copied to the output as-is -->
      <Presentation PresentationID='presentationid:clip.1'>
          <TrackMetadata>
              <TrackSelectionNumber>0</TrackSelectionNumber>
              <VideoTrackReference>
                  <VideoTrackID>vidtrackid:clip.1.video</VideoTrackID>
              </VideoTrackReference>
              <AudioTrackReference>
                  <AudioTrackID>audtrackid:clip.1.audio.en.primary</AudioTrackID>
              </AudioTrackReference>
          </TrackMetadata>
      </Presentation>

      <Experiences>
          <!-- This experience contains more than 1 element (Audiovisual and Gallery),
               it will be reorganized as a set of 2 discrete experience children -->
          <Experience ExperienceID="experiencedid:bonus.1">
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <ContentID>cid:bonus.1</ContentID>
              <Audiovisual ContentID="cid:clip.1">
                  <Type>Clip</Type>
                  <SubType>Cats and Dogs</SubType>
                  <PresentationID>presentationid:clip.1</PresentationID>
                  <ContentID>cid:clip.1</ContentID>
              </Audiovisual>
              <Gallery GalleryID="galleryid:gallery.1">
                  <Type>Bonus</Type>
                  <PictureGroupID>picturegroupid:gallery.1</PictureGroupID>
                  <GalleryName>Cats with Phasers</GalleryName>
                  <ContentID>cid:egallery.1</ContentID>
              </Gallery>
              <TimedSequenceID>timedsequenceid:related</TimedSequenceID>
          </Experience>

          <!-- This node has a single element (Audiovisual),
               it will be copied to the output as-is -->
          <Experience ExperienceID='experiencedid:bonus.2.ce.1'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Audiovisual ContentID="cid:clip.2">
                  <Type>Clip</Type>
                  <SubType>Puppet Parade</SubType>
                  <PresentationID>presentationid:clip.2</PresentationID>
                  <ContentID>cid:clip.2</ContentID>
              </Audiovisual>
          </Experience>

          <!-- This node has a single element (Gallery),
               it will be copied to the output as-is -->
          <Experience ExperienceID='experiencedid:bonus.2.ce.2'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Gallery GalleryID="galleryid:gallery.1">
                  <Type>Bonus</Type>
                  <PictureGroupID>picturegroupid:gallery.1</PictureGroupID>
                  <GalleryName>Cats with Phasers</GalleryName>
                  <ContentID>cid:egallery.1</ContentID>
              </Gallery>
          </Experience>

          <!-- There are no duplicates of Audiovisual or Gallery in this node,
               it will be copied to the output as-is -->
          <Experience ExperienceID="experiencedid:bonus.2">
              <ContentID>cid:bonus.1</ContentID>
              <TimedSequenceID>timedsequenceid:related</TimedSequenceID>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>1</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.2.ce.1</ExperienceID>
              </Child>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>2</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.2.ce.2</ExperienceID>
              </Child>
          </Experience>
      </Experiences>
  </Manifest>

Based on this input, the output of the process should look like:

  <Manifest>
      <!-- Modified to increase the Version and add the <Usage>INET</Usage> tag -->
      <Compatibility>
          <Version>2</Version>
          <Usage>INET</Usage>
      </Compatibility>

      <!-- Copied to the output as-is -->
      <Presentation PresentationID='presentationid:clip.1'>
          <TrackMetadata>
              <TrackSelectionNumber>0</TrackSelectionNumber>
              <VideoTrackReference>
                  <VideoTrackID>vidtrackid:clip.1.video</VideoTrackID>
              </VideoTrackReference>
              <AudioTrackReference>
                  <AudioTrackID>audtrackid:clip.1.audio.en.primary</AudioTrackID>
              </AudioTrackReference>
          </TrackMetadata>
      </Presentation>

      <Experiences>
          <!-- Standalone experience named bonus.1.ce.1 based on clip.1 -->
          <Experience ExperienceID='experiencedid:bonus.1.ce.1'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Audiovisual ContentID="cid:clip.1">
                  <Type>Clip</Type>
                  <SubType>Cats and Dogs</SubType>
                  <PresentationID>presentationid:clip.1</PresentationID>
                  <ContentID>cid:clip.1</ContentID>
              </Audiovisual>
          </Experience>
          <!-- Standalone experience named bonus.1.ce.2 based on gallery.1 -->
          <Experience ExperienceID='experiencedid:bonus.1.ce.2'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Gallery GalleryID="galleryid:gallery.1">
                  <Type>Bonus</Type>
                  <PictureGroupID>picturegroupid:gallery.1</PictureGroupID>
                  <GalleryName>Cats with Phasers</GalleryName>
                  <ContentID>cid:egallery.1</ContentID>
              </Gallery>
          </Experience>
          <!-- The bonus.1 experience now references content by way of discrete child experiences -->
          <Experience ExperienceID="experiencedid:bonus.1">
              <ContentID>cid:bonus.1</ContentID>
              <TimedSequenceID>timedsequenceid:related</TimedSequenceID>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>1</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.1.ce.1</ExperienceID>
              </Child>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>2</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.1.ce.2</ExperienceID>
              </Child>
          </Experience>
          <!-- bonus.2.ce.1 is copied literally from the input -->
          <Experience ExperienceID='experiencedid:bonus.2.ce.1'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Audiovisual ContentID="cid:clip.2">
                  <Type>Clip</Type>
                  <SubType>Puppet Parade</SubType>
                  <PresentationID>presentationid:clip.2</PresentationID>
                  <ContentID>cid:clip.2</ContentID>
              </Audiovisual>
          </Experience>
          <!-- bonus.2.ce.2 is copied literally from the input -->
          <Experience ExperienceID='experiencedid:bonus.2.ce.2'>
              <Language>en</Language>
              <Region>
                  <country>US</country>
              </Region>
              <Gallery GalleryID="galleryid:gallery.1">
                  <Type>Bonus</Type>
                  <PictureGroupID>picturegroupid:gallery.1</PictureGroupID>
                  <GalleryName>Cats with Phasers</GalleryName>
                  <ContentID>cid:egallery.1</ContentID>
              </Gallery>
          </Experience>
          <!-- bonus.2 is copied literally from the input -->
          <Experience ExperienceID="experiencedid:bonus.2">
              <ContentID>cid:bonus.1</ContentID>
              <TimedSequenceID>timedsequenceid:related</TimedSequenceID>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>1</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.2.ce.1</ExperienceID>
              </Child>
              <Child>
                  <Relationship>ispartof</Relationship>
                  <SequenceInfo>
                      <Number>2</Number>
                  </SequenceInfo>
                  <ExperienceID>experiencedid:bonus.2.ce.2</ExperienceID>
              </Child>
          </Experience>
      </Experiences>
  </Manifest>

The ideal solution would use XSLT, but any solution (bash, JavaScript, PHP, Python, Ruby, Go, etc.) that gets the job done is a worthy contender.
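For reference, the reorganization step could be sketched in Python with the standard `xml.etree.ElementTree` module. This is only one possible reading of the expected output, not a definitive implementation: the function name `split_experiences` is hypothetical, and it assumes that `Language` and `Region` move from the parent into each new child experience while everything else stays on the parent. The inline document is a trimmed stand-in for the real manifest.

```python
import copy
import xml.etree.ElementTree as ET

MEDIA_TAGS = ("Audiovisual", "Gallery")
# Assumption inferred from the expected output: these move to the children.
INHERITED_TAGS = ("Language", "Region")


def split_experiences(root):
    """Split each Experience with 2+ media children into discrete child experiences."""
    experiences = root.find("Experiences")
    for parent in list(experiences):  # snapshot, since we insert while looping
        if parent.tag != "Experience":
            continue
        media = [el for el in parent if el.tag in MEDIA_TAGS]
        if len(media) < 2:
            continue  # passes through unmodified
        inherited = [el for el in parent if el.tag in INHERITED_TAGS]
        parent_id = parent.get("ExperienceID")
        position = list(experiences).index(parent)
        for number, element in enumerate(media, start=1):
            child_id = f"{parent_id}.ce.{number}"
            child = ET.Element("Experience", {"ExperienceID": child_id})
            child.extend([copy.deepcopy(el) for el in inherited])
            child.append(element)
            parent.remove(element)
            # Place the new experience just before its parent.
            experiences.insert(position, child)
            position += 1
            # Reference the new experience from the parent.
            ref = ET.SubElement(parent, "Child")
            ET.SubElement(ref, "Relationship").text = "ispartof"
            seq = ET.SubElement(ref, "SequenceInfo")
            ET.SubElement(seq, "Number").text = str(number)
            ET.SubElement(ref, "ExperienceID").text = child_id
        for el in inherited:
            parent.remove(el)


# Trimmed stand-in for the manifest shown above.
manifest = ET.fromstring("""
<Manifest>
  <Experiences>
    <Experience ExperienceID="experiencedid:bonus.1">
      <Language>en</Language>
      <ContentID>cid:bonus.1</ContentID>
      <Audiovisual ContentID="cid:clip.1"/>
      <Gallery GalleryID="galleryid:gallery.1"/>
    </Experience>
    <Experience ExperienceID="experiencedid:bonus.2.ce.1">
      <Language>en</Language>
      <Audiovisual ContentID="cid:clip.2"/>
    </Experience>
  </Experiences>
</Manifest>
""")
split_experiences(manifest)
ids = [e.get("ExperienceID") for e in manifest.iter("Experience")]
```

Note that the default `ElementTree` parser drops XML comments, so the explanatory comments in the sample would not survive a round trip through this sketch.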
