Scholar Repository
Home>Manual>Parse XML data into an array structure

Parse XML data into an array structure


xml_parse_into_struct

(PHP 4, PHP 5)

xml_parse_into_structParse XML data into an array structure

Description

int xml_parse_into_struct ( resource $parser , string $data , array &$values [, array &$index ] )

This function parses an XML file into 2 parallel array structures, one (index) containing pointers to the location of the appropriate values in the values array. These last two parameters must be passed by reference.

Parameters

parser

data

values

index

Return Values

xml_parse_into_struct() returns 0 for failure and 1 for success. This is not the same as FALSE and TRUE, be careful with operators such as ===.

Examples

Below is an example that illustrates the internal structure of the arrays being generated by the function. We use a simple note tag embedded inside a para tag, and then we parse this and print out the structures generated:

Example #1 xml_parse_into_struct() example

<?php
$simple 
"<para><note>simple note</note></para>";
$p xml_parser_create();
xml_parse_into_struct($p$simple$vals$index);
xml_parser_free($p);
echo 
"Index array\n";
print_r($index);
echo 
"\nVals array\n";
print_r($vals);
?>

When we run that code, the output will be:

Index array
Array
(
    [PARA] => Array
        (
            [0] => 0
            [1] => 2
        )

    [NOTE] => Array
        (
            [0] => 1
        )

)

Vals array
Array
(
    [0] => Array
        (
            [tag] => PARA
            [type] => open
            [level] => 1
        )

    [1] => Array
        (
            [tag] => NOTE
            [type] => complete
            [level] => 2
            [value] => simple note
        )

    [2] => Array
        (
            [tag] => PARA
            [type] => close
            [level] => 1
        )

)

Event-driven parsing (based on the expat library) can get complicated when you have an XML document that is complex. This function does not produce a DOM style object, but it generates structures amenable of being transversed in a tree fashion. Thus, we can create objects representing the data in the XML file easily. Let's consider the following XML file representing a small database of aminoacids information:

Example #2 moldb.xml - small database of molecular information

<?xml version="1.0"?>
<moldb>

  <molecule>
      <name>Alanine</name>
      <symbol>ala</symbol>
      <code>A</code>
      <type>hydrophobic</type>
  </molecule>

  <molecule>
      <name>Lysine</name>
      <symbol>lys</symbol>
      <code>K</code>
      <type>charged</type>
  </molecule>

</moldb>
And some code to parse the document and generate the appropriate objects:

Example #3 parsemoldb.php - parses moldb.xml into an array of molecular objects

<?php

class AminoAcid {
    var 
$name;  // aa name
    
var $symbol;    // three letter symbol
    
var $code;  // one letter code
    
var $type;  // hydrophobic, charged or neutral
    
    
function AminoAcid ($aa
    {
        foreach (
$aa as $k=>$v)
            
$this->$k $aa[$k];
    }
}

function 
readDatabase($filename
{
    
// read the XML database of aminoacids
    
$data implode(""file($filename));
    
$parser xml_parser_create();
    
xml_parser_set_option($parserXML_OPTION_CASE_FOLDING0);
    
xml_parser_set_option($parserXML_OPTION_SKIP_WHITE1);
    
xml_parse_into_struct($parser$data$values$tags);
    
xml_parser_free($parser);

    
// loop through the structures
    
foreach ($tags as $key=>$val) {
        if (
$key == "molecule") {
            
$molranges $val;
            
// each contiguous pair of array entries are the 
            // lower and upper range for each molecule definition
            
for ($i=0$i count($molranges); $i+=2) {
                
$offset $molranges[$i] + 1;
                
$len $molranges[$i 1] - $offset;
                
$tdb[] = parseMol(array_slice($values$offset$len));
            }
        } else {
            continue;
        }
    }
    return 
$tdb;
}

function 
parseMol($mvalues
{
    for (
$i=0$i count($mvalues); $i++) {
        
$mol[$mvalues[$i]["tag"]] = $mvalues[$i]["value"];
    }
    return new 
AminoAcid($mol);
}

$db readDatabase("moldb.xml");
echo 
"** Database of AminoAcid objects:\n";
print_r($db);

?>
After executing parsemoldb.php, the variable $db contains an array of AminoAcid objects, and the output of the script confirms that:
** Database of AminoAcid objects:
Array
(
    [0] => aminoacid Object
        (
            [name] => Alanine
            [symbol] => ala
            [code] => A
            [type] => hydrophobic
        )

    [1] => aminoacid Object
        (
            [name] => Lysine
            [symbol] => lys
            [code] => K
            [type] => charged
        )

)

Home>Manual>Parse XML data into an array structure