ESV API - pulling in some verses...
I'm playing around with getting the Bible going on my site in a couple different ways. My particular usage requires that I be able to identify each verse rather than just requesting a bunch of verses and dumping them to the screen, so that's a bit of a special requirement.
Here's the code, called via AJAX, that takes $_GET['ref'] and gives back the text:
<?php
// set up configuration and get some useful functions available
include_once("config.php");
include_once("byheart.php");
echo "DATE: ".date('r')."<br>\n"; // for testing purposes so I know when it's done
bhPrepDB();
// undo magic quotes if the server still has them turned on
if (get_magic_quotes_gpc()) {
    $ref = stripslashes($_GET['ref']);
    $text = stripslashes($_GET['text']);
} else {
    $ref = $_GET['ref'];
    $text = $_GET['text'];
}
$ref = mysql_real_escape_string($ref);
list($sbk, $sch, $svs, $ebk, $ech, $evs) = ParseReference($ref);
$sbook = BookCode2Name($sbk);
$ebook = BookCode2Name($ebk);
$VerseList = GenerateVerseList($sbk, $sch, $svs, $ebk, $ech, $evs);
$text = mysql_real_escape_string($text);
$text = preg_replace("/\[[a-z]\]/i", "", $text); // get rid of footnotes
// build query
$url = "http://www.esvapi.org/v2/rest/passageQuery?key=IP&include-footnotes=0&include-passage-references=0&include-headings=0&output-format=crossway-xml-1.0&passage=".urlencode($ref);
$xml = simplexml_load_file($url);
// start with the first chapter of the range; updated whenever a verse-num carries begin-chapter
$ch = $VerseList[0]['chapter'];
foreach ($xml->xpath('//verse-unit') as $verse) {
    // the string cast only picks up text sitting directly inside <verse-unit>
    $text = preg_replace("/^\s*$/m", "", (string) $verse); // drop blank lines
    $vnum = $verse->xpath('child::verse-num');
    $v = $vnum[0];
    if ($x = $v['begin-chapter']) {
        $ch = (string) $x;
    }
    $v = "$ch:$v"; // chapter:verse
    echo "REF:" . $v . ": " . $text . "<br>";
}
echo "DEBUG: asXML: ".$xml->asXML()."<br>\n";
This works great in that I get each verse with the appropriate REF:1:2: prefixing it.
HOWEVER, the ESV API returns markup like this:
<verse-unit>Jesus, the <span ...>Lord</span> said, <span class="woc">"I am the way, the Truth, and the Life"</span></verse-unit>
And SimpleXML has no way (apparently?) of sending those span tags (or other tags) back out. It treats them as child elements, which can only be reported separately, so both the tags and the text inside them drop out of the verse text.
CONCLUSION: Don't try to use SimpleXML with the ESV API if you are using their XML approach. Clearly the easiest way to handle this would be to use their plain-text output:
http://...?output-format=plain-text
But I'm trying to learn how to properly handle XML so I'll hold out for a bit before going with the relatively simpler parsing solution...
Stay tuned!
===EDIT 2/25/2013===
I figured out that if I use a slightly modified version of this code from esvapi.org (referenced from this page), I can get rid of those sub-tags and correctly pull the text in via SimpleXML. The difficulty is getting the formatting right (particularly handling poetry formatting).
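For anyone hitting the same wall, here is a minimal sketch of one way to deal with the sub-tags. It is not the esvapi.org code mentioned above; it just hands each element to PHP's DOM extension via dom_import_simplexml(). The sample XML is a cut-down stand-in for a verse-unit (real responses carry more structure, and the span attributes are simplified), and verseContents() is a name made up for this example.

```php
<?php
// Hypothetical helper: returns three views of one <verse-unit> element.
function verseContents(SimpleXMLElement $verse)
{
    // Casting to string gives only the text sitting directly inside the
    // element; text inside child tags such as <span> is left out.
    $direct = (string) $verse;

    // The same node seen through DOM.
    $node = dom_import_simplexml($verse);

    // textContent flattens the element: all descendant text, no tags.
    $plain = $node->textContent;

    // Serializing each child node keeps the inner markup intact.
    $markup = '';
    foreach ($node->childNodes as $child) {
        $markup .= $node->ownerDocument->saveXML($child);
    }

    return array('direct' => $direct, 'plain' => $plain, 'markup' => $markup);
}

// Assumed sample data, not a real API response.
$sample = '<passage><verse-unit>Jesus, the <span>Lord</span> said, '
        . '<span class="woc">I am the way</span></verse-unit></passage>';

$doc = simplexml_load_string($sample);
foreach ($doc->xpath('//verse-unit') as $verse) {
    $parts = verseContents($verse);
    echo "DIRECT: " . $parts['direct'] . "\n";
    echo "PLAIN:  " . $parts['plain'] . "\n";
    echo "MARKUP: " . $parts['markup'] . "\n";
}
```

The 'plain' version is the one to use if the tags should simply disappear; the 'markup' version keeps the spans, which would also pull in things like the verse-num element in a real response, so it would need more filtering.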
I actually tried the
http://...?output-format=plain-text
approach and discovered that getting the formatting right is just about impossible. The output includes hard-coded newlines, which work very nicely for poetry but mess up normal paragraph formatting. And I don't know how to correctly identify which context I'm in, so I can't just strip out the newlines.