Regular Expression to Parse Text Between Simple Tags (XML)

September 6, 2008

It is often necessary to extract text from a variable that contains HTML or XML code. I’ve created a simple regular expression that extracts all text between a given pair of tags into an array. It is a PHP solution, though the regular expression itself can be used in other programming languages.

preg_match_all('/<tag>(.*?)<\/tag>/', $source, $results);

This construction creates an array with the extracted data. All you need to do is change “tag” to any tag you like. The expression was written to parse XML files, but it will also work for simple HTML tags without attributes.
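Swapping the tag name by hand can be wrapped in a small function. The sketch below is only an illustration: the helper name extract_between_tags is hypothetical and not part of the original post, and the "s" modifier (so that "." also matches newlines) is an added assumption for content that spans several lines.

```php
<?php
// Hypothetical helper for illustration; not from the original post.
function extract_between_tags(string $tag, string $source): array
{
    // preg_quote escapes regex metacharacters in the tag name,
    // including the "/" used here as the pattern delimiter.
    $name = preg_quote($tag, '/');

    // Assumption: the "s" modifier is added so "." also matches newlines.
    $pattern = '/<' . $name . '>(.*?)<\/' . $name . '>/s';

    preg_match_all($pattern, $source, $results);

    // Index 1 holds the text captured by the parenthesized group.
    return $results[1];
}

$xml = '<items><name>apple</name><name>pear</name></items>';
$names = extract_between_tags('name', $xml);
print_r($names);
```

As with the one-line version, this only handles tags written without attributes.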

The function above extracts all occurrences of the regular expression match. $results will contain two arrays: $results[0] holds the full matches, including the tags, and $results[1] holds only the text between the tags. Run var_dump on $results to check what is in it.
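To see the difference between the full matches and the captured text, here is a minimal sketch with a made-up input string (the sample text is an assumption for illustration only). Note the escaped slash in the closing tag, which is required when "/" is used as the pattern delimiter.

```php
<?php
// Sample input invented for this illustration.
$source = '<tag>one</tag> and <tag>two</tag>';

preg_match_all('/<tag>(.*?)<\/tag>/', $source, $results);

// $results[0]: the full matches, tags included.
var_dump($results[0]);

// $results[1]: only the text captured by (.*?).
var_dump($results[1]);
```

If you only want the values between the tags, read them from $results[1].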

2 thoughts on “Regular Expression to Parse Text Between Simple Tags (XML)”

  1. Tester

    (.*?) does not work. It gives the entire tag, not the value of the tag. For example:

    text = “one”

    Here I want the result “one”, not “one”

  2. admin Post author

    I don’t quite understand your question: do you need to avoid getting results with quotes? Did you run var_dump on your resulting array? Can I have a more specific example of the text you’re trying to parse?
