<tr class="from" id="n1" >
<td>
String Zero
</td>
<td>
String One
</td>
<td>
String Two
</td>
<td>
String Three
</td>
<td>
String Four
</td>
</tr>
<tr class="from" id="n2" >
<td>
String Zero
</td>
<td>
String One
</td>
<td>
String Two
</td>
<td>
String Three
</td>
<td>
String Four
</td>
</tr>
<tr class="from" id="n3" >
<td>
String Zero
</td>
<td>
String One
</td>
<td>
String Two
</td>
<td>
String Three
</td>
<td>
String Four
</td>
</tr>
And so on.
For every table, I need to extract only String Two and String Three.
For this task, is it better to use regex, libxml++, or some other library?
Can someone give me some ideas on how to do this?
Thanks!
I have found it to be easier to do it yourself if the data is in a VERY simple format. When the format becomes nested or complicated, you should use a library.
This looks simple enough to handle with a regex or even just a find/substring loop, something like: find "<td>" a few times to skip String Zero and String One, extract String Two, find the next "<td>", extract String Three, find "</tr>", repeat...
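For the regex route, here is a rough sketch using std::regex. It assumes the markup really is as regular as the sample (no nested tables, every cell closed with </td>). The names extract_with_regex and trim are made up for illustration. Note that std::regex can be slow, or even run out of stack, on very large inputs, so treat this as a starting point rather than a finished solution.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: strips the newlines/spaces that surround each cell's text.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical function: returns (third cell, fourth cell) for every <tr> row
// that has at least four <td> cells.
std::vector<std::pair<std::string, std::string>>
extract_with_regex(const std::string& html) {
    // [\s\S] matches any character including newlines; *? keeps the match lazy
    // so it stops at the first closing tag.
    static const std::regex row_re(R"(<tr\b[^>]*>([\s\S]*?)</tr>)", std::regex::icase);
    static const std::regex cell_re(R"(<td\b[^>]*>([\s\S]*?)</td>)", std::regex::icase);

    std::vector<std::pair<std::string, std::string>> result;
    for (std::sregex_iterator row(html.begin(), html.end(), row_re), end;
         row != end; ++row) {
        const std::string body = (*row)[1].str();  // text between <tr ...> and </tr>
        std::vector<std::string> cells;
        for (std::sregex_iterator cell(body.begin(), body.end(), cell_re);
             cell != end; ++cell) {
            cells.push_back(trim((*cell)[1].str()));
        }
        if (cells.size() >= 4) {
            result.emplace_back(cells[2], cells[3]);  // String Two, String Three
        }
    }
    return result;
}
```

Call extract_with_regex on the whole page text and loop over the returned pairs; .first is String Two and .second is String Three for each row.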
Yes, you can use libxml(++/2/whatever) if you think the job is worth pulling in a library. Otherwise, like jonnin said, if the HTML is simple enough, just find <td> and extract the characters after it until you find </td>.
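A minimal sketch of the plain find/substring approach, with no library at all. It assumes the cells are written exactly as "<td>" and "</td>" (lowercase, no attributes), as in the sample. The names next_cell and extract_with_find are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reads the next <td>...</td> cell that lies entirely
// before `limit`, stores its trimmed text in `out`, and moves `pos` past the
// closing tag. Returns false if there is no such cell.
static bool next_cell(const std::string& html, std::size_t& pos,
                      std::size_t limit, std::string& out) {
    const std::string open = "<td>", close = "</td>";
    const std::size_t start = html.find(open, pos);
    if (start == std::string::npos || start >= limit) return false;
    const std::size_t text_begin = start + open.size();
    const std::size_t stop = html.find(close, text_begin);
    if (stop == std::string::npos || stop >= limit) return false;

    out = html.substr(text_begin, stop - text_begin);
    // Trim the surrounding whitespace/newlines.
    const char* ws = " \t\r\n";
    const std::size_t first = out.find_first_not_of(ws);
    if (first == std::string::npos) {
        out.clear();
    } else {
        out = out.substr(first, out.find_last_not_of(ws) - first + 1);
    }
    pos = stop + close.size();
    return true;
}

// Hypothetical function: returns (third cell, fourth cell) for every row.
std::vector<std::pair<std::string, std::string>>
extract_with_find(const std::string& html) {
    const std::string row_close = "</tr>";
    std::vector<std::pair<std::string, std::string>> result;
    std::size_t pos = 0;
    while (true) {
        const std::size_t row_start = html.find("<tr", pos);
        if (row_start == std::string::npos) break;
        const std::size_t row_end = html.find(row_close, row_start);
        if (row_end == std::string::npos) break;

        std::size_t cursor = row_start;
        std::string cell, two, three;
        bool ok = true;
        // Walk the first four cells; keep index 2 and 3, skip the rest.
        for (int i = 0; i < 4 && ok; ++i) {
            ok = next_cell(html, cursor, row_end, cell);
            if (ok && i == 2) two = cell;
            if (ok && i == 3) three = cell;
        }
        if (ok) result.emplace_back(two, three);
        pos = row_end + row_close.size();  // continue after this row
    }
    return result;
}
```

Rows with fewer than four cells are skipped rather than reported. If the real pages have attributes on the cells or nested tables, this will break, and that is the point where a real parser such as libxml2 is worth the effort.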