I have an XML file that is structured like this:

<?xml version="1.0" encoding="utf-8"?>
<ScheduleMessage DtdVersion="3" DtdRelease="0">
  <MessageIdentification v="ETSOVista-DMinus1TotalLoadForecast-DE-2012-1" />
  <MessageVersion v="1" />
  <MessageType v="A11" />
  <ScheduleTimeSeries>
    <SendersTimeSeriesIdentification v="10YCB-GERMANY--8" />
    <SendersTimeSeriesVersion v="1" />
    <BusinessType v="A05" />
    <Period>
      <TimeInterval v="2012-11-15T23:00Z/2012-11-16T23:00Z" />
      <Resolution v="PT60M" />
      <Interval>
        <Pos v="1" />
        <Qty v="52452" />
      </Interval>
      <Interval>
        <Pos v="2" />
        <Qty v="50527" />
      </Interval>
      <Interval>
        <Pos v="3" />
        <Qty v="49221" />
      </Interval>
      <Interval>
        <Pos v="4" />
        <Qty v="49344" />
      </Interval>
    </Period>
  </ScheduleTimeSeries>
  <ScheduleTimeSeries>
    <SendersTimeSeriesIdentification v="10YCB-GERMANY--8" />
    <SendersTimeSeriesVersion v="1" />
    <BusinessType v="A05" />
    <Period>
      <TimeInterval v="2012-11-16T23:00Z/2012-11-17T23:00Z" />
      <Resolution v="PT60M" />
      <Interval>
        <Pos v="1" />
        <Qty v="50935" />
      </Interval>
      <Interval>
        <Pos v="2" />
        <Qty v="48918" />
      </Interval>
      <Interval>
        <Pos v="3" />
        <Qty v="47347" />
      </Interval>
      <Interval>
        <Pos v="4" />
        <Qty v="46382" />
      </Interval>
    </Period>
  </ScheduleTimeSeries>
</ScheduleMessage>

I only need the Qty values. So far my code looks like this:

library(XML)

# Parse the file and collect the direct children of the root node
xml <- xmlInternalTreeParse(file = "test.xml")
xml_top <- xmlRoot(xml)
xml_children <- xmlChildren(x = xml_top)

But when I try to go deeper into the file with:

xml_children2 <- xmlChildren(x = xml_children)

I receive the following error:

Error in UseMethod("xmlChildren") : 
no applicable method for 'xmlChildren' applied to an object of class "c('XMLInternalNodeList', 'XMLNodeList')"

I also tried subsetting the node list with [] or [[]], but that always leads to the same error.

max.mustermann

2 Answers

This is much simpler with an XQuery processor such as xqilla:

$ echo 'for $v in //Qty/@v return xs:string($v)' | xqilla -i test.xml /dev/stdin
52452
50527
49221
49344
50935
48918
47347
46382

The output can then be read into R easily using read.table. You might also be able to use the RXQuery package to run the query from within R, or as shown in this answer.
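
As a minimal sketch of the read.table step, the output listed above is pasted in here as a text string so the example is self-contained. The variable name xq_output and the column name qty are assumptions made for illustration; in practice the output would more likely come from a file or a captured command.

```r
# Output of the xqilla call above, pasted in as text for illustration.
xq_output <- "52452
50527
49221
49344
50935
48918
47347
46382"

# One value per line and no header row; "qty" is a hypothetical column name.
qty <- read.table(text = xq_output, header = FALSE, col.names = "qty")

# The values as a plain numeric vector
qty$qty
```

If the output was redirected to a file instead, passing the file name as the first argument of read.table in place of text = should work the same way.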

Credits: Answer to Extract value of attribute node via XPath

krlmlr

I solved my problem by using:

# The XPath expression goes in the `path` argument of xpathSApply
xpathSApply(doc = xml_top, 
            path = "//ScheduleMessage/ScheduleTimeSeries/Period/Interval/Qty", 
            fun = xmlAttrs)
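
For anyone who wants to try this without the original file, here is a small self-contained sketch. It parses a shortened sample with the same structure as the file in the question (the sample text and the names xml_text, doc, qty_chr and qty are made up for illustration) and uses xmlGetAttr instead of xmlAttrs, which is one way to get only the "v" attribute.

```r
library(XML)

# Shortened, hypothetical sample with the same structure as the file in the question.
xml_text <- '<?xml version="1.0" encoding="utf-8"?>
<ScheduleMessage DtdVersion="3" DtdRelease="0">
  <ScheduleTimeSeries>
    <Period>
      <Interval><Pos v="1" /><Qty v="52452" /></Interval>
      <Interval><Pos v="2" /><Qty v="50527" /></Interval>
    </Period>
  </ScheduleTimeSeries>
</ScheduleMessage>'

# Parse from a string rather than a file
doc <- xmlParse(xml_text, asText = TRUE)

# xmlGetAttr extracts just the "v" attribute of every Qty node.
qty_chr <- xpathSApply(doc, "//Qty", xmlGetAttr, "v")

# Attribute values come back as character, so convert them for calculations.
qty <- as.numeric(qty_chr)
```

The error in the question comes from xmlChildren returning a list of nodes rather than a single node, so it cannot be called on its own result directly. An XPath query avoids walking the tree level by level.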
max.mustermann