CFI problem

Submitted by emn178 on October 4, 2012 - 6:27pm

I am studying the CFI spec and have some questions about it.
Here are some cases that confuse me; I don't know how their paths resolve.

case 1:
<body>
<b>ABC</b>
DEF
</body>

Q: What is the index of text node "DEF"?
I know <b> element is /4/2, but is "DEF" /4:1 or /4:3?

case 2:
<body>
<![CDATA[ABC]]>
DEF
</body>

Q: How do I assign the indexes?

I use an XML parser that treats them as one text node, like this:
<body>
ABC
DEF
</body>

Is that OK?
Should I treat the CDATA section and the text DEF as two nodes?

Thanks for any answers.

Submitted by matt.garrish on October 5, 2012 - 4:05am

I'll take a stab at your questions, but interpreting CFIs makes my head spin...

To your first question: the specification treats the non-element content before, between, and after elements as distinct positions, even when they are empty. That suggests /4:3 is correct, since /4:1 would refer to the location before the first b tag. See the first bullet in s. 3.1.1.

As for your second question, CDATA blocks are treated as text nodes, so you have only a single instance of non-element data. (See the second bullet.)

Submitted by matt.garrish on October 5, 2012 - 8:47am

Sorry, I made a typo in my post. Those should be /4/1 and /4/3 to refer to the non-element locations. A colon indicates a character offset within the referenced location.
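
To make the slash/colon distinction concrete, here is a minimal Python sketch. The function names are hypothetical, and it only handles the simple form discussed in this thread (slash steps plus an optional trailing colon offset). It does not attempt the rest of the CFI grammar.

```python
def split_simple_cfi_path(path):
    """Split a simple path such as '/4/3:2' into ([4, 3], 2).

    Hypothetical helper: only '/' steps and one optional ':' offset
    are understood. Anything else in the CFI syntax is out of scope.
    """
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    steps_part, colon, offset_part = path.partition(":")
    # The leading '/' yields an empty first item, so skip it.
    steps = [int(s) for s in steps_part.split("/")[1:]]
    offset = int(offset_part) if colon else None
    return steps, offset


def describe_step(step):
    """Even steps name elements, odd steps name non-element content."""
    return "element" if step % 2 == 0 else "non-element content"
```

With this, "/4/3" splits into steps [4, 3] with no offset, while "/4:3" splits into the single step [4] with a character offset of 3, which is why the colon form does not point at the content after the b element.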

And if it helps, the sequencing should never be out of order. You can think of the sequencing for referencing into any element abstractly like this:

<element (#)> non-element content (#/1) <child (#/2)> non-element content (#/3) ... <child (#/n)> non-element content (#/n+1) </element>

Again, it doesn't matter if there is no character data before the first child element. The position is still referenceable as /1.
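
If it is useful, the sequencing above can be sketched in Python with the standard library's xml.dom.minidom. This is only an illustration under a few assumptions: cfi_child_steps is a made-up name, comments and processing instructions are simply skipped, and the numbers are relative to the parent (so step 3 below is /4/3 only when body itself is /4).

```python
from xml.dom import Node, minidom

# Text and CDATA are both treated as plain character data.
TEXT_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def cfi_child_steps(parent):
    """Return (step, content) pairs for the children of parent.

    Elements get even steps (2, 4, ...). Each run of text/CDATA
    between elements is merged into one chunk with an odd step.
    Empty chunks are not listed, although they stay referenceable.
    """
    steps = []
    elements_seen = 0
    for child in parent.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            elements_seen += 1
            steps.append((2 * elements_seen, "<%s>" % child.tagName))
        elif child.nodeType in TEXT_TYPES:
            index = 2 * elements_seen + 1
            if steps and steps[-1][0] == index:
                # Same chunk as the previous text/CDATA node: merge.
                steps[-1] = (index, steps[-1][1] + child.data)
            else:
                steps.append((index, child.data))
        # Assumption: comments and processing instructions are ignored.
    return steps


# The two cases from the original question.
case1 = minidom.parseString("<body>\n<b>ABC</b>\nDEF\n</body>").documentElement
case2 = minidom.parseString("<body>\n<![CDATA[ABC]]>\nDEF\n</body>").documentElement

print(cfi_child_steps(case1))  # [(1, '\n'), (2, '<b>'), (3, '\nDEF\n')]
print(cfi_child_steps(case2))  # [(1, '\nABC\nDEF\n')]
```

Note that the whitespace before the b element in case 1 still occupies step 1, and that in case 2 the CDATA section and the following text end up in the same chunk whether or not the parser reports them as separate nodes.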

Submitted by emn178 on October 6, 2012 - 5:01am

I got it, thanks again.
