CFI problem

I am studying the CFI spec and have some questions about it.
Here are some cases that confuse me; I don't know how their paths resolve.

case 1:
<body>
<b>ABC</b>
DEF
</body>

Q: What is the index of text node "DEF"?
I know the <b> element is /4/2, but is "DEF" /4:1 or /4:3?

case 2:
<body>
<![CDATA[ABC]]>
DEF
</body>

Q: How do I assign the indexes?

I use an XML parser that treats them as one text node, like this:
<body>
ABC
DEF
</body>

Is that OK?
Or should I treat the CDATA section and the text DEF as two nodes?

Thanks for your answers.

I'll take a stab at your questions, but interpreting CFIs makes my head spin...

To your first question: the specification treats the non-element content before, between, and after elements as distinct positions, even when they are empty, so that would suggest that /4:3 is correct, as /4:1 would refer to the location before the first b tag. See the first bullet in section 3.1.1.

As for your second question, CDATA sections are treated as text nodes, so you have only a single run of non-element data. (See the second bullet.)
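If it helps to see the CDATA case concretely, here is a minimal sketch using Python's standard xml.etree.ElementTree. It only shows how that one parser behaves and is not taken from the spec. The document string is case 2 from the question, and the name merged_text is just for illustration.

```python
import xml.etree.ElementTree as ET

# Case 2 from the question: a CDATA section immediately followed by plain text.
doc = "<body><![CDATA[ABC]]>\nDEF\n</body>"
body = ET.fromstring(doc)

# ElementTree does not keep the CDATA boundary; the whole run of
# character data comes back as a single string on the parent element.
merged_text = body.text
print(repr(merged_text))  # 'ABC\nDEF\n'
```

So a parser that hands you one merged text node here matches the reading above: one run of non-element data, one position.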

Sorry, I made a typo in my post. Those should be /4/1 and /4/3 to refer to the non-element locations. A colon indicates a character offset within the referenced content.

And if it helps, the sequencing should never be out of order. You can think of the sequencing for referencing into any element abstractly like this:

<element (#)> non-element content(#/1) <child (#/2)> non-element content (#/3) ... <child (#/n)> non-element content(#/n+1) </element>

Again, it doesn't matter if there is no character data before the first child element. The position is still referenceable as /1.
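The numbering scheme above can be sketched in a few lines of Python with xml.etree.ElementTree. This is only an illustration of the odd/even pattern, not a full CFI implementation: the helper name child_steps is made up, the /4 prefix for body is taken from the question, and comments and processing instructions are ignored.

```python
import xml.etree.ElementTree as ET

def child_steps(parent):
    """Return (step, content) pairs for the direct children of `parent`.

    Child elements get even steps (2, 4, ...). The character data before,
    between, and after them gets odd steps (1, 3, ...), even when empty.
    """
    steps = [(1, parent.text or "")]
    for i, child in enumerate(parent, start=1):
        steps.append((2 * i, "<%s>" % child.tag))
        # `tail` is the character data that follows this child element.
        steps.append((2 * i + 1, child.tail or ""))
    return steps

# Case 1 from the question, written without leading whitespace so that
# the position before <b> is empty but still gets step 1.
body = ET.fromstring("<body><b>ABC</b>\nDEF\n</body>")
steps = child_steps(body)
for step, content in steps:
    print("/4/%d" % step, repr(content))
```

With this input the loop prints /4/1 for the empty position before the b element, /4/2 for the b element itself, and /4/3 for the text containing DEF.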

I got it, thanks again.
