Sunday, February 24, 2013

ESV API - pulling in some verses...

I'm playing around with getting the Bible going on my site in a couple different ways.  My particular usage requires that I be able to identify each verse rather than just requesting a bunch of verses and dumping them to the screen, so that's a bit of a special requirement.

Here's the code, called via AJAX, that takes $_GET['ref'] and gives back the text:
<?php
// set up configuration and get some useful functions available
include_once("config.php");
include_once("byheart.php");
echo "DATE: ".date('r')."<br>\n"; // for testing purposes so I know when it's done
bhPrepDB();
if(get_magic_quotes_gpc()) {
    $ref  = stripslashes($_GET['ref']);
    $text = stripslashes($_GET['text']);
} else {
    $ref  = $_GET['ref'];
    $text = $_GET['text'];
}
$ref  = mysql_real_escape_string($ref);
list($sbk, $sch, $svs, $ebk, $ech, $evs) = ParseReference($ref);
$sbook = BookCode2Name($sbk);
$ebook = BookCode2Name($ebk);
$VerseList = GenerateVerseList($sbk,$sch,$svs,$ebk,$ech,$evs);
$text = mysql_real_escape_string($text);
$text = preg_replace("/\[[a-z]\]/i", "", $text); // get rid of footnotes
    //build query
$url = "http://www.esvapi.org/v2/rest/passageQuery?key=IP&include-footnotes=0&include-passage-references=0&include-headings=0&output-format=crossway-xml-1.0&passage=".urlencode($ref);
$xml = simplexml_load_file($url);
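$ch = $VerseList[0]['chapter']; // start with the chapter of the first requested verse
foreach($xml->xpath('//verse-unit') as $verse)
{
    // the string cast keeps only the text directly inside <verse-unit>
    $text = preg_replace("/^\s*$/m", "", (string)$verse);
    $vnum = $verse->xpath('child::verse-num');
    $v = $vnum[0];
    // update the chapter when verse-num has a begin-chapter attribute
    if ($x = $v['begin-chapter']) {
        $ch = (string)$x;
    }
    $v = "$ch:$v";
    echo "REF:" . $v . ": " . $text . "<br>";
}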
echo "DEBUG: asXML: ".$xml->asXML()."<br>\n";

This works great in that I get each verse with the appropriate REF:1:2: prefixing it.

HOWEVER, ESV API puts things like this:

<verse-unit>Jesus, the <span ...>Lord</span> said, <span class="woc">"I am the way, the Truth, and the Life"</span></verse-unit>

And SimpleXML has no way (apparently?) of sending those span tags (or other tags) back out when a verse is read as a string -- it treats them as child elements which can only be reported separately, so the text inside them goes missing.
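
To make the problem concrete, here is a minimal sketch. The XML string is a simplified sample I made up in the shape of the ESV output, not a real API response. It shows what a string cast returns for a verse with a child span. For comparison it also prints asXML() on the same element, which does keep the child tags but hands back raw markup rather than clean text.

```php
<?php
// Simplified sample modeled on the ESV API output (an assumption, not a real response)
$xml = simplexml_load_string(
    '<passage><verse-unit>Jesus said, <span class="woc">I am the way</span>.</verse-unit></passage>'
);

$found  = $xml->xpath('//verse-unit');
$verse  = $found[0];

$direct = (string) $verse;   // only the text nodes directly inside <verse-unit>
$markup = $verse->asXML();   // the whole element as markup, child tags included

echo "CAST:  " . $direct . "\n";  // the words inside the span are missing here
echo "ASXML: " . $markup . "\n";
```

So the words of the verse that sit inside a span never show up in the string cast, which is exactly what the REF: lines above were suffering from.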

CONCLUSION: Don't try to use SimpleXML with the ESV API if you are using their XML approach.  Clearly the easiest way to handle this would be to use their plain-text output format:

http://...?output-format=plain-text

But I'm trying to learn how to properly handle XML so I'll hold out for a bit before going with the relatively simpler parsing solution...

Stay tuned!

===EDIT 2/25/2013===

I figured out that if I use a slightly modified version of this code from esvapi.org (referenced from this page), I can get rid of those sub-tags and correctly pull the text in via SimpleXML.  The remaining difficulty is getting the formatting right (particularly poetry formatting).
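
The esvapi.org code isn't reproduced here, so the following is not that code. It is just a minimal sketch of one way to get rid of the sub-tags while keeping their text. It assumes the DOM extension is available, and verseText() is a hypothetical helper name of my own. The XML is again a simplified made-up sample.

```php
<?php
// Hypothetical helper (not from esvapi.org): flatten a verse element to plain
// text, keeping the words inside child tags such as <span> but dropping the tags.
function verseText(SimpleXMLElement $verse)
{
    $node = dom_import_simplexml($verse);  // the same node, viewed through the DOM extension
    $text = $node->textContent;            // text of the element and all of its descendants
    return trim(preg_replace('/\s+/', ' ', $text)); // collapse runs of whitespace
}

// Simplified sample modeled on the ESV API output (an assumption, not a real response)
$xml = simplexml_load_string(
    '<passage><verse-unit>Jesus said, <span class="woc">I am the way</span>.</verse-unit></passage>'
);

foreach ($xml->xpath('//verse-unit') as $verse) {
    echo verseText($verse) . "\n";
}
```

Note that collapsing whitespace like this throws away any line structure, so it does nothing for the poetry formatting problem.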

I actually tried the

http://...?output-format=plain-text
and discovered that getting the formatting right is just about impossible.  The output includes hard-coded newlines, which work very nicely for poetry but mess up normal paragraph formatting.  And I don't know how to tell which of the two contexts I'm in, so I can't just strip out the newlines.
